R Programming week1-Data Type

Objects

R has five basic or “atomic” classes of objects:

character

numeric (real numbers)

integer

complex

logical (TRUE/FALSE)

The most basic object is a vector

A vector can only contain objects of the same class

BUT: The one exception is a list, which is represented as a vector but can contain objects of different classes (indeed, that’s usually why we use them)

Empty vectors can be created with the vector() function.
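
As a quick check of the points above, the sketch below asks class() for the class of one value of each atomic type and creates two empty vectors with vector(). The variable names are illustrative and not part of the original notes.

```r
# One value of each atomic class, held in a list so they are not coerced.
examples <- list("a", 1.5, 2L, 1 + 2i, TRUE)
classes <- sapply(examples, class)
print(classes)

# vector() fills the new vector with the default value for its mode:
# "" for character, FALSE for logical.
empty_chr <- vector("character", length = 3)
empty_lgl <- vector("logical", length = 2)
print(empty_chr)
print(empty_lgl)
```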

Numbers

Numbers in R are generally treated as numeric objects (i.e. double precision real numbers)

If you explicitly want an integer, you need to specify the L suffix

Ex: Entering 1 gives you a numeric object; entering 1L explicitly gives you an integer.

There is also a special number Inf which represents infinity, e.g. 1 / 0. Inf can be used in ordinary calculations; e.g. 1 / Inf is 0

The value NaN represents an undefined value (“not a number”), e.g. 0 / 0. NaN can also be thought of as a missing value (more on that later)
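
The following short sketch illustrates the statements about numbers: the L suffix, Inf, and NaN. The variable names are made up for the example.

```r
a <- 1    # numeric (double precision)
b <- 1L   # integer, because of the L suffix
print(class(a))
print(class(b))

pos_inf <- 1 / 0          # Inf
back_to_zero <- 1 / pos_inf   # Inf can be used in ordinary calculations
undefined <- 0 / 0        # NaN

print(pos_inf)
print(back_to_zero)
print(is.nan(undefined))
```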

Attributes

R objects can have attributes

names, dimnames

dimensions (e.g. matrices, arrays)

class

length

other user-defined attributes/metadata

Attributes of an object can be accessed using the attributes() function.
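
A minimal sketch of attributes() in use. The matrix, its dimnames, and the "note" attribute are hypothetical examples chosen for illustration.

```r
# A plain vector carries no attributes.
plain_attrs <- attributes(1:3)
print(plain_attrs)

# A matrix has a dim attribute; here dimnames are supplied as well.
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))

# A user-defined attribute (the name "note" is an arbitrary choice).
attr(m, "note") <- "example metadata"

attrs <- attributes(m)
print(names(attrs))
print(attrs$dim)
```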

Creating Vectors

The c() function can be used to create vectors of objects.
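
For instance, c() can build a vector of each atomic class. This is an illustrative sketch; the variable names and values are arbitrary.

```r
x_num <- c(0.5, 0.6)         # numeric
x_lgl <- c(TRUE, FALSE)      # logical
x_chr <- c("a", "b", "c")    # character
x_int <- 9:29                # integer sequence built with the : operator
x_cpl <- c(1 + 0i, 2 + 4i)   # complex

print(class(x_num))
print(class(x_int))
print(length(x_int))
```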

Using the vector() function

> x <- vector("numeric", length = 10)

> x

[1] 0 0 0 0 0 0 0 0 0 0

Mixing Objects

> y <- c(1.7, "a") ## character

> y <- c(TRUE, 2) ## numeric

> y <- c("a", TRUE) ## character

When different objects are mixed in a vector, coercion occurs so that every element in the vector is of the same class.
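
The sketch below inspects the results of the three mixed vectors to show what the implicit coercion produced. Values are coerced to the more general class, in the order logical, integer, numeric, complex, character. The names y1 to y3 are illustrative.

```r
y1 <- c(1.7, "a")    # the number becomes the string "1.7"
y2 <- c(TRUE, 2)     # TRUE becomes 1
y3 <- c("a", TRUE)   # TRUE becomes the string "TRUE"

print(y1)
print(y2)
print(y3)
print(c(class(y1), class(y2), class(y3)))
```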

Explicit Coercion

Objects can be explicitly coerced from one class to another using the as.* functions, if available.

> x <- 0:6

> class(x)

[1] "integer"

> as.numeric(x)

[1] 0 1 2 3 4 5 6

> as.logical(x)

[1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE

> as.character(x)

[1] "0" "1" "2" "3" "4" "5" "6"

Nonsensical coercion results in NAs.

> x <- c("a", "b", "c")

> as.numeric(x)

[1] NA NA NA

Warning message:

NAs introduced by coercion

> as.logical(x)

[1] NA NA NA

> as.complex(x)

[1] NA NA NA

Warning message:

NAs introduced by coercion
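
A small sketch of how explicit coercion behaves on mixed input, where only some elements can be converted. suppressWarnings() is used here only to keep the output quiet; the input values are made up for the example.

```r
# Elements that cannot be parsed as numbers become NA.
nums <- suppressWarnings(as.numeric(c("1", "2.5", "abc")))
print(nums)

# as.integer() truncates toward zero rather than rounding.
truncated <- as.integer(3.9)
print(truncated)

# as.logical() understands strings such as "TRUE", "false" and "T";
# anything else becomes NA.
flags <- as.logical(c("TRUE", "false", "T", "yes"))
print(flags)
```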

Lists

Lists are a special type of vector that can contain elements of different classes. Lists are a very important data type in R and you should get to know them well.

> x <- list(1, "a", TRUE, 1 + 4i)

> x

[[1]]

[1] 1

[[2]]

[1] "a"

[[3]]

[1] TRUE

[[4]]

[1] 1+4i
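
To confirm that each list element keeps its own class, the sketch below applies class() to every element. It also contrasts double brackets, which extract an element, with single brackets, which return a sub-list. This is an illustration, not part of the original notes.

```r
x <- list(1, "a", TRUE, 1 + 4i)

# Each element keeps its own class; nothing is coerced.
element_classes <- sapply(x, class)
print(element_classes)

second <- x[[2]]      # the element itself
second_list <- x[2]   # a list of length 1 containing the element
print(class(second))
print(class(second_list))
```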

Matrices

Matrices are vectors with a dimension attribute. The dimension attribute is itself an integer vector of length 2 (nrow, ncol)

> m <- matrix(nrow = 2, ncol = 3)

> m

     [,1] [,2] [,3]

[1,]   NA   NA   NA

[2,]   NA   NA   NA

> dim(m)

[1] 2 3

> attributes(m)

$dim

[1] 2 3

Matrices (cont’d)

Matrices are constructed column-wise, so entries can be thought of starting in the “upper left” corner and running down the columns.

> m <- matrix(1:6, nrow = 2, ncol = 3)

> m

     [,1] [,2] [,3]

[1,]    1    3    5

[2,]    2    4    6

Matrices can also be created directly from vectors by adding a dimension attribute.

> m <- 1:10

> m

[1] 1 2 3 4 5 6 7 8 9 10

> dim(m) <- c(2, 5)

> m

     [,1] [,2] [,3] [,4] [,5]

[1,]    1    3    5    7    9

[2,]    2    4    6    8   10
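
The column-wise layout can be checked by indexing, as in the sketch below. It also shows the byrow argument of matrix() as a contrast; byrow is not covered in these notes and is included here only for comparison.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)

# Values run down the columns first.
print(m[2, 1])   # second value supplied
print(m[1, 2])   # third value supplied
print(m[3])      # single-index access follows the same column order

# Setting dim on a vector gives the same layout.
v <- 1:6
dim(v) <- c(2, 3)
print(all(m == v))

# byrow = TRUE fills across the rows instead.
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
print(by_row[1, 2])
```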

cbind-ing and rbind-ing

Matrices can be created by column-binding or row-binding with cbind() and rbind().

> x <- 1:3

> y <- 10:12

> cbind(x, y)

     x  y

[1,] 1 10

[2,] 2 11

[3,] 3 12

> rbind(x, y)

  [,1] [,2] [,3]

x    1    2    3

y   10   11   12
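
A short sketch comparing the two results: cbind() treats each vector as a column and rbind() treats each as a row, so the dimensions swap and the vector names end up as column names or row names. The names cols and rows are illustrative.

```r
x <- 1:3
y <- 10:12

cols <- cbind(x, y)   # 3 rows, 2 columns
rows <- rbind(x, y)   # 2 rows, 3 columns

print(dim(cols))
print(dim(rows))
print(colnames(cols))
print(rownames(rows))

# One result is the transpose of the other.
print(all(t(cols) == rows))
```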

Factors

Factors are used to represent categorical data. Factors can be unordered or ordered. One can think of a factor as an integer vector where each integer has a label.

Factors are treated specially by modelling functions like lm() and glm().

Using factors with labels is better than using integers because factors are self-describing; having a variable that has values “Male” and “Female” is better than a variable that has values 1 and 2.

> x <- factor(c("yes", "yes", "no", "yes", "no"))

> x

[1] yes yes no yes no

Levels: no yes

> table(x)

x

 no yes

  2   3

> unclass(x)

[1] 2 2 1 2 1

attr(,"levels")

[1] "no" "yes"

The order of the levels can be set using the levels argument to factor(). This can be important in linear modelling because the first level is used as the baseline level.

> x <- factor(c("yes", "yes", "no", "yes", "no"),

levels = c("yes", "no"))

> x

[1] yes yes no yes no

Levels: yes no
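
The effect of the levels argument on the underlying integer codes can be seen in the sketch below. Without it, the levels are sorted alphabetically. The variable names are illustrative.

```r
answers <- c("yes", "yes", "no", "yes", "no")

default_f <- factor(answers)                           # levels sorted: no, yes
ordered_f <- factor(answers, levels = c("yes", "no"))  # levels as given

print(levels(default_f))
print(as.integer(default_f))
print(levels(ordered_f))
print(as.integer(ordered_f))
```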

Missing Values

Missing values are denoted by NA; NaN denotes the result of an undefined mathematical operation.

is.na() tests whether the elements of an object are NA

is.nan() tests whether the elements are NaN

NA values have a class also, so there are integer NA, character NA, etc.

A NaN value is also NA, but the converse is not true

> x <- c(1, 2, NA, 10, 3)

> is.na(x)

[1] FALSE FALSE TRUE FALSE FALSE

> is.nan(x)

[1] FALSE FALSE FALSE FALSE FALSE

> x <- c(1, 2, NaN, NA, 4)

> is.na(x)

[1] FALSE FALSE TRUE TRUE FALSE

> is.nan(x)

[1] FALSE FALSE TRUE FALSE FALSE
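
The statement that NA values have a class can be checked with R's typed NA constants, as sketched below. The last lines count how many elements each test flags in a vector containing both NaN and NA. The variable names are illustrative.

```r
# NA on its own is logical; the typed constants carry other classes.
na_classes <- c(class(NA), class(NA_integer_),
                class(NA_real_), class(NA_character_))
print(na_classes)

x <- c(1, 2, NaN, NA, 4)
n_na <- sum(is.na(x))     # counts both NaN and NA
n_nan <- sum(is.nan(x))   # counts only NaN
print(n_na)
print(n_nan)
```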

Data Frames

Data frames are used to store tabular data

They are represented as a special type of list where every element of the list has to have the same length

Each element of the list can be thought of as a column and the length of each element of the list is the number of rows

Unlike matrices, data frames can store different classes of objects in each column (just like lists); matrices must have every element be the same class

Data frames also have a special attribute called row.names

Data frames are usually created by calling read.table() or read.csv()

A data frame can be converted to a matrix by calling data.matrix()

> x <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))

> x

  foo   bar

1   1  TRUE

2   2  TRUE

3   3 FALSE

4   4 FALSE

> nrow(x)

[1] 4

> ncol(x)

[1] 2
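
A brief sketch of the properties listed above: the data frame is a list, each column keeps its own class, row.names holds the row labels, and data.matrix() forces everything into one numeric matrix (logical columns become integers). The variable names are illustrative.

```r
x <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))

print(is.list(x))                 # a data frame is a special kind of list
col_classes <- sapply(x, class)   # one class per column
print(col_classes)
print(row.names(x))

m <- data.matrix(x)               # every element now has the same type
print(dim(m))
print(m[, "bar"])
```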

Names

R objects can also have names, which is very useful for writing readable code and self-describing objects.

> x <- 1:3

> names(x)

NULL

> names(x) <- c("foo", "bar", "norf")

> x

 foo  bar norf

   1    2    3

> names(x)

[1] "foo" "bar" "norf"
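
Names are not limited to atomic vectors. As a sketch, lists can carry element names and matrices can carry row and column names through dimnames(). The objects below are made up for illustration.

```r
# Lists can have named elements, which can be accessed with $.
lst <- list(a = 1, b = 2, c = 3)
print(names(lst))
print(lst$b)

# Matrices take a list of row names and column names.
m <- matrix(1:4, nrow = 2, ncol = 2)
dimnames(m) <- list(c("a", "b"), c("c", "d"))
print(m["a", "d"])   # row "a", column "d"
```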

Summary

Data Types

atomic classes: numeric, logical, character, integer, complex

vectors, lists

factors

missing values

data frames

names

Date: 2024-08-03 19:09:09
