Daily R: match

match 

pmatch

intersect

%in%

setdiff

===================================================

match package:base R Documentation

Value Matching

Description:

‘match‘ returns a vector of the positions of (first) matches of
its first argument in its second.

‘%in%‘ is a more intuitive interface as a binary operator, which
returns a logical vector indicating if there is a match or not for
its left operand.

Usage:

match(x, table, nomatch = NA_integer_, incomparables = NULL)

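x: a vector, the values to be matched;

table: a vector, the values to match against;

nomatch: the value returned when there is no match; it must be an integer (it is coerced to integer);

incomparables: values that are excluded from matching.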

x %in% table

This form returns TRUE or FALSE for each element of x:

> rep(1, 3) %in% rep(1, 5)
[1] TRUE TRUE TRUE

match returns positions instead:

> match(rep(1, 3), rep(1, 5))
[1] 1 1 1
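To make the annotated arguments concrete, here is a minimal sketch of how nomatch and incomparables change the result. The vectors x and tbl are made-up illustration data, not taken from the help page.

```r
# Illustration data (hypothetical): "z" is absent from tbl, NA is present in both.
x   <- c("a", "z", NA, "b")
tbl <- c("b", "a", NA)

pos_default <- match(x, tbl)                      # 2 NA 3 1
pos_zero    <- match(x, tbl, nomatch = 0L)        # 2 0 3 1
pos_skip_na <- match(x, tbl, incomparables = NA)  # 2 NA NA 1 (NA is no longer matched)
present     <- x %in% tbl                         # TRUE FALSE TRUE TRUE
```

Using nomatch = 0L is handy when the result is used as an index, because a zero index simply drops the element.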

Arguments:

x: vector or ‘NULL‘: the values to be matched. Long vectors are
supported.

table: vector or ‘NULL‘: the values to be matched against. Long
vectors are not supported.

nomatch: the value to be returned in the case when no match is found.
Note that it is coerced to ‘integer‘.

incomparables: a vector of values that cannot be matched. Any value in
‘x‘ matching a value in this vector is assigned the ‘nomatch‘
value. For historical reasons, ‘FALSE‘ is equivalent to
‘NULL‘.

Details:

‘%in%‘ is currently defined as
‘"%in%" <- function(x, table) match(x, table, nomatch = 0) > 0‘

So this is how the operator is defined. Dropping the "> 0" and then restoring it shows what each part does:

> "%in%" <- function(x, table) match(x, table, nomatch = 0)
> 1:10 %in% c(1,3,5,9)
 [1] 1 0 2 0 3 0 0 0 4 0
> "%in%" <- function(x, table) match(x, table, nomatch = 0)>0
> 1:10 %in% c(1,3,5,9)
 [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE

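Without the "> 0", the result is the position of each left-hand value in the right-hand vector (0 where there is no match).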

Factors, raw vectors and lists are converted to character vectors,
and then ‘x‘ and ‘table‘ are coerced to a common type (the later
of the two types in R's ordering, logical < integer < numeric <
complex < character) before matching. If ‘incomparables‘ has
positive length it is coerced to the common type.
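The coercion rule can be checked with a few small cases. This is a sketch with made-up values; the variable names are only for illustration.

```r
# Integer vs character: both sides are compared as character strings.
int_vs_chr <- match(1L, c("2", "1"))        # 2

# Logical vs numeric: TRUE is compared as 1.
lgl_vs_num <- match(TRUE, c(0, 1))          # 2

# Factors are converted to character vectors first, so the labels are
# matched, not the underlying integer codes.
f <- factor(c("lo", "hi"), levels = c("lo", "hi"))
fct_vs_chr <- match(f, c("hi", "lo"))       # 2 1
```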

Matching for lists is potentially very slow and best avoided
except in simple cases.

Exactly what matches what is to some extent a matter of
definition. For all types, ‘NA‘ matches ‘NA‘ and no other value.
For real and complex values, ‘NaN‘ values are regarded as matching
any other ‘NaN‘ value, but not matching ‘NA‘.
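A short sketch of the NA and NaN rules described above, using made-up vectors:

```r
na_pos    <- match(NA, c(1, NA))      # 2: NA matches NA
nan_pos   <- match(NaN, c(NA, NaN))   # 2: NaN matches NaN, not the NA before it
na_vs_nan <- match(NA, c(NaN, 1))     # NA: NA does not match NaN

# Contrast with ==, which propagates NA instead of matching it.
eq_result <- NA == NA                 # NA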

That ‘%in%‘ never returns ‘NA‘ makes it particularly useful in
‘if‘ conditions.
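As an illustration of why this matters in if conditions, the sketch below compares == with %in% on a missing value. The variable x and the labels are hypothetical.

```r
x <- NA

# x == 1 evaluates to NA, and if (NA) is an error; tryCatch records that here.
with_eq <- tryCatch(
  if (x == 1) "found" else "not found",
  error = function(e) "error"
)                                                          # "error"

# x %in% ... is always TRUE or FALSE, so the condition is safe.
with_in <- if (x %in% c(1, 2)) "found" else "not found"    # "not found"
```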

Character strings will be compared as byte sequences if any input
is marked as ‘"bytes"‘ (see ‘Encoding‘).

Value:

A vector of the same length as ‘x‘.

‘match‘: An integer vector giving the position in ‘table‘ of the
first match if there is a match, otherwise ‘nomatch‘.

If ‘x[i]‘ is found to equal ‘table[j]‘ then the value returned in
the ‘i‘-th position of the return value is ‘j‘, for the smallest
possible ‘j‘. If no match is found, the value is ‘nomatch‘.
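Because match returns positions, a common use is looking up values in a second vector. The sketch below assumes a small made-up lookup table; the names lookup, ids and scores are not from the help page.

```r
# Only the first matching position is returned.
first_pos <- match("a", c("b", "a", "a"))     # 2

# Hypothetical lookup table keyed by id.
lookup <- data.frame(id = c("a", "b", "c"), score = c(10, 20, 30))
ids    <- c("b", "c", "a")

# Positions of ids within lookup$id, then used as an index.
scores <- lookup$score[match(ids, lookup$id)] # 20 30 10
```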

‘%in%‘: A logical vector, indicating if a match was located for
each element of ‘x‘: thus the values are ‘TRUE‘ or ‘FALSE‘ and
never ‘NA‘.

References:

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
Language_. Wadsworth & Brooks/Cole.

See Also:

‘pmatch‘ and ‘charmatch‘ for (_partial_) string matching,
‘match.arg‘, etc for function argument matching. ‘findInterval‘
similarly returns a vector of positions, but finds numbers within
intervals, rather than exact matches.

‘is.element‘ for an S-compatible equivalent of ‘%in%‘.

Examples:

## The intersection of two sets can be defined via match():
## Simple version:

## intersect <- function(x, y) y[match(x, y, nomatch = 0)]
intersect # the R function in base is slightly more careful
intersect(1:10, 7:20)

1:10 %in% c(1,3,5,9)
sstr <- c("c","ab","B","bba","c",NA,"@","bla","a","Ba","%")
sstr[sstr %in% c(letters, LETTERS)]

> sstr <- c("c","ab","B","bba","c",NA,"@","bla","a","Ba","%")
> sstr[sstr %in% c(letters, LETTERS)]
[1] "c" "B" "c" "a"

c(letters, LETTERS)
This is how to get all lowercase and uppercase letters in one vector.

"%w/o%" <- function(x, y) x[!x %in% y] #-- x without y
(1:10) %w/o% c(3,7,12)

## Note that setdiff() is very similar and typically makes more sense:
c(1:6,7:2) %w/o% c(3,7,12) # -> keeps duplicates

setdiff(c(1:6,7:2), c(3,7,12)) # -> unique values

> setdiff(c(1:6,7:2), c(3,7,12))
[1] 1 2 4 5 6

setdiff treats its arguments as sets, so the result contains no duplicates.
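The difference between the match-based helpers and the set functions shows up as soon as the input has duplicates. The sketch below reuses the %w/o% operator and the "simple version" of intersect from the examples; simple_intersect is just a name chosen here for that one-liner.

```r
`%w/o%` <- function(x, y) x[!x %in% y]   # x without y, duplicates kept
simple_intersect <- function(x, y) y[match(x, y, nomatch = 0)]

v <- c(1:6, 7:2)

kept <- v %w/o% c(3, 7, 12)          # 1 2 4 5 6 6 5 4 2
uniq <- setdiff(v, c(3, 7, 12))      # 1 2 4 5 6

dup_int  <- simple_intersect(c(1, 1, 2), c(1, 2, 3))  # 1 1 2
base_int <- intersect(c(1, 1, 2), c(1, 2, 3))         # 1 2
```

The simple version repeats an element once for every time it occurs in x, which is one of the ways the base function is "slightly more careful".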

#=====================================================

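> ?pmatch
> pmatch("", "") # returns NA
[1] NA
> pmatch("m", c("mean", "median", "mode")) # returns NA
[1] NA
# NA because "m" is neither an exact match nor a unique partial match
> pmatch("med", c("mean", "median", "mode")) # returns 2
[1] 2
# "med" partially matches only "median"; a partial match to more than one element returns NA
> pmatch(c("", "ab", "ab"), c("abc", "ab"), dup = FALSE)
[1] NA  2  1
# "" matches nothing. The first "ab" matches position 2 exactly, and that entry is then used up; the second "ab" can only partially match "abc", so it gets position 1. This form does not seem to be needed often.
> pmatch(c("", "ab", "ab"), c("abc", "ab"), dup = TRUE)
[1] NA  2  2
> ## compare
> charmatch(c("", "ab", "ab"), c("abc", "ab"))
[1] 0 2 2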

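pmatch performs partial matching: it takes the elements of x one at a time and matches each against table. When an element matches, the matched table entry is removed and takes no part in later matches; the duplicates.ok argument controls whether this removal happens. For a single element, matching proceeds in three steps:

1. If there is an exact match, the element is considered matched and the position in table is returned;
2. Otherwise, if there is a unique partial match, the position in table is returned;
3. Otherwise, the element is considered to have no match.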
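The three steps can be sketched with the same table used in the help examples. The inputs "me", "max" and "mo" are made up for illustration.

```r
tbl <- c("mean", "median", "mode")

exact     <- pmatch("mean", tbl)   # 1: exact match
partial   <- pmatch("med", tbl)    # 2: unique partial match
ambiguous <- pmatch("me", tbl)     # NA: partially matches "mean" and "median"
none      <- pmatch("max", tbl)    # NA: no match at all

# With the default duplicates.ok = FALSE a table entry can be used only once.
used_up <- pmatch(c("mo", "mo"), tbl)                        # 3 NA
reused  <- pmatch(c("mo", "mo"), tbl, duplicates.ok = TRUE)  # 3 3
```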

#===========================================================================

This article draws on:

R base Documentation

http://blog.sina.com.cn/s/blog_73206f7b0102vyox.html

Time: 2024-10-20 05:18:01
