Keep or remove data frame columns in R

You can use either indexing or the subset function. For example:

R> df <- data.frame(x=1:5, y=2:6, z=3:7, u=4:8)
R> df
  x y z u
1 1 2 3 4
2 2 3 4 5
3 3 4 5 6
4 4 5 6 7
5 5 6 7 8

Then you can use the which function and the - operator in column indexing:

R> df[ , -which(names(df) %in% c("z","u"))]
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
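
One caveat with the -which(...) idiom, shown here as a minimal sketch: if none of the names match, which() returns an empty vector, and negating it selects zero columns rather than all of them. Logical negation or setdiff() on the names avoids this. The drop_names vector below is a hypothetical example, not part of the original answer.

```r
df <- data.frame(x = 1:5, y = 2:6, z = 3:7, u = 4:8)

# Hypothetical case: none of these names exist in df
drop_names <- c("a", "b")

# -which() yields integer(0) here, which selects no columns at all
bad <- df[, -which(names(df) %in% drop_names)]
ncol(bad)   # 0

# Logical negation keeps every column when nothing matches
ok <- df[, !names(df) %in% drop_names]
ncol(ok)    # 4

# setdiff() on the column names behaves the same way
ok2 <- df[, setdiff(names(df), drop_names)]
ncol(ok2)   # 4
```

When the names to drop are computed rather than typed by hand, the logical or setdiff() form is probably the safer default.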

Or, much simpler, use the select argument of the subset function: you can then use the - operator directly on a vector of column names, and you can even omit the quotes around the names!

R> subset(df, select=-c(z,u))
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
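
Because select is evaluated with the column names standing in for their positions, ranges of adjacent columns also work. A brief sketch on the same df:

```r
df <- data.frame(x = 1:5, y = 2:6, z = 3:7, u = 4:8)

# Drop the adjacent columns z through u
kept <- subset(df, select = -(z:u))
names(kept)     # "x" "y"

# Keep the adjacent columns x through z
first3 <- subset(df, select = x:z)
names(first3)   # "x" "y" "z"
```

The help page for subset describes it as a convenience function intended for interactive use. Inside your own functions, standard indexing with [ is usually the more predictable choice.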

Note that you can also select the columns you want to keep instead of dropping the others:

R> df[ , c("x","y")]
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6

R> subset(df, select=c(x,y))
  x y
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
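
A point worth checking when only one column remains: matrix-style indexing with a comma simplifies the result to a plain vector unless drop = FALSE is given. The sketch below compares the options on the same df.

```r
df <- data.frame(x = 1:5, y = 2:6, z = 3:7, u = 4:8)

# Selecting a single column with [ , ] simplifies to a vector
v <- df[, "x"]
is.data.frame(v)    # FALSE

# drop = FALSE keeps the data frame structure
d1 <- df[, "x", drop = FALSE]
is.data.frame(d1)   # TRUE

# List-style indexing (no comma) also returns a data frame
d2 <- df["x"]
is.data.frame(d2)   # TRUE

# subset() returns a data frame as well
d3 <- subset(df, select = x)
is.data.frame(d3)   # TRUE
```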

===============================

Simple R functions to keep or remove data frame columns

This function removes columns from a data frame by name:

removeCols <- function(data, cols) {
  data[, !names(data) %in% cols, drop = FALSE]  # drop = FALSE always returns a data frame
}

This function keeps columns of a data frame by name:

keepCols <- function(data, cols) {
  data[, names(data) %in% cols, drop = FALSE]  # keep only the named columns
}

Or combine both into a single function:

colKeepRemove <- function(data, cols, remove = 1) {
  inCols <- names(data) %in% cols
  # remove == 1 drops the named columns; any other value keeps only them
  if (remove == 1) data[, !inCols, drop = FALSE] else data[, inCols, drop = FALSE]
}
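
A quick check of how a combined helper should behave, as a sketch only: the keep_remove_cols name and its logical remove argument are hypothetical variations on the functions above, and drop = FALSE is an added assumption so that a single remaining column stays a data frame.

```r
# Hypothetical variant: remove = TRUE drops the named columns,
# remove = FALSE keeps only the named columns
keep_remove_cols <- function(data, cols, remove = TRUE) {
  matched <- names(data) %in% cols
  if (remove) matched <- !matched
  data[, matched, drop = FALSE]
}

df <- data.frame(x = 1:5, y = 2:6, z = 3:7, u = 4:8)

removed <- keep_remove_cols(df, c("z", "u"))
names(removed)   # "x" "y"

kept <- keep_remove_cols(df, c("z", "u"), remove = FALSE)
names(kept)      # "z" "u"

# A single remaining column is still a data frame
single <- keep_remove_cols(df, c("y", "z", "u"))
is.data.frame(single)   # TRUE
```

Note that the two modes must return different columns; if both branches of such a function negate the match, the keep mode silently behaves like the remove mode.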



===============================

REF:
http://stackoverflow.com/questions/4605206/drop-columns-r-data-frame
http://stackoverflow.com/questions/5234117/how-to-drop-columns-by-name-in-a-data-frame
http://ewens.caltech.edu/2011/05/17/simple-r-functions-to-keep-or-remove-data-frame-columns/

Time: 2024-10-13 22:21:52
