R Programming Week 3: Loop Functions

Looping on the Command Line

Writing for and while loops is useful when programming but not particularly easy when working interactively on the command line. R provides several functions that implement looping to make life easier:

lapply: Loop over a list and evaluate a function on each element

sapply: Same as lapply but try to simplify the result

apply: Apply a function over the margins of an array

tapply: Apply a function over subsets of a vector

mapply: Multivariate version of lapply

An auxiliary function, split, is also useful, particularly in conjunction with lapply.

lapply

lapply takes three arguments: (1) a list X; (2) a function (or the name of a function) FUN; (3) other arguments via its ... argument. If X is not a list, it will be coerced to a list using as.list.

## function (X, FUN, ...)

## {

## FUN <- match.fun(FUN)

## if (!is.vector(X) || is.object(X))

## X <- as.list(X)

## .Internal(lapply(X, FUN))

## }

## <bytecode: 0x7ff7a1951c00>

## <environment: namespace:base>

The actual looping is done internally in C code.

lapply always returns a list, regardless of the class of the input.

x <- list(a = 1:5, b = rnorm(10))

lapply(x, mean)

x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))

lapply(x, mean)

> x <- 1:4

> lapply(x, runif)
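Arguments placed after FUN are forwarded through ... to every call of the function. A minimal sketch, assuming we want uniform draws between 0 and 10 instead of the default 0 to 1 (the min and max values and the seed are illustrative, not from the original notes):

```r
set.seed(1)  # illustrative seed so the draws are reproducible

x <- 1:4

# min and max are not arguments of lapply itself;
# they travel through ... and are handed to runif on each call
draws <- lapply(x, runif, min = 0, max = 10)

lengths(draws)  # element i holds i draws
```

The result is still a list, because lapply never simplifies its output.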

lapply and friends make heavy use of anonymous functions.

> x <- list(a = matrix(1:4, 2, 2), b = matrix(1:6, 3, 2))

> x

$a

[,1] [,2]

[1,] 1 3

[2,] 2 4

$b

[,1] [,2]

[1,] 1 4

[2,] 2 5

[3,] 3 6

An anonymous function for extracting the first column of each matrix.

> lapply(x, function(elt) elt[,1])

$a

[1] 1 2

$b

[1] 1 2 3

sapply

> x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))

> lapply(x, mean)
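sapply calls lapply and then tries to simplify the result: if every element has length 1 it returns a vector, if every element has the same length greater than 1 it returns a matrix, and otherwise it falls back to a list. A short sketch on the same kind of list (the seed and the variable names means and ranges are illustrative):

```r
set.seed(2)  # illustrative seed

x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))

# Each mean has length 1, so the list collapses to a named numeric vector
means <- sapply(x, mean)

# Each range has length 2, so the result is a 2 x 4 matrix,
# with one column per list element
ranges <- sapply(x, range)
```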

apply

apply is used to evaluate a function (often an anonymous one) over the margins of an array.

It is most often used to apply a function to the rows or columns of a matrix.

It can be used with general arrays, e.g. taking the average of an array of matrices.

It is not really faster than writing a loop, but it works in one line!

> str(apply)

function (X, MARGIN, FUN, ...)

X is an array

MARGIN is an integer vector indicating which margins should be “retained”.

FUN is a function to be applied

... is for other arguments to be passed to FUN

> x <- matrix(rnorm(200), 20, 10)

> apply(x, 2, mean)

[1] 0.04868268 0.35743615 -0.09104379

[4] -0.05381370 -0.16552070 -0.18192493

[7] 0.10285727 0.36519270 0.14898850

[10] 0.26767260

col/row sums and means

For sums and means of matrix dimensions, we have some shortcuts.

rowSums = apply(x, 1, sum)

rowMeans = apply(x, 1, mean)

colSums = apply(x, 2, sum)

colMeans = apply(x, 2, mean)

The shortcut functions are much faster, but you won’t notice unless you’re using a large matrix.
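As a quick check that the shortcuts agree with their apply counterparts, here is a small sketch (the seed and the variable names are illustrative):

```r
set.seed(3)  # illustrative seed

x <- matrix(rnorm(200), 20, 10)

# Compare each shortcut with the equivalent apply call
same_row_sums  <- all.equal(rowSums(x),  apply(x, 1, sum))
same_col_means <- all.equal(colMeans(x), apply(x, 2, mean))
```

all.equal is used rather than identical because the two routes may differ by floating-point rounding.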

Other Ways to Apply

Quantiles of the rows of a matrix.

> x <- matrix(rnorm(200), 20, 10)

> apply(x, 1, quantile, probs = c(0.25, 0.75))
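When FUN returns a vector, apply stacks the results as columns. The sketch below inspects the shape of the quantile result and then averages over an array of matrices, as mentioned earlier; the array a and the seed are assumptions made for illustration:

```r
set.seed(4)  # illustrative seed

x <- matrix(rnorm(200), 20, 10)

# One column per row of x, one row per requested quantile
q <- apply(x, 1, quantile, probs = c(0.25, 0.75))
dim(q)
rownames(q)

# A stack of ten 2 x 2 matrices
a <- array(rnorm(2 * 2 * 10), c(2, 2, 10))

# Keep dimensions 1 and 2 and average over the third
avg <- apply(a, c(1, 2), mean)

# rowMeans with dims = 2 computes the same averages
avg_fast <- rowMeans(a, dims = 2)
```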

mapply

mapply is a multivariate apply of sorts which applies a function in parallel over a set of arguments.

> str(mapply)

function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)

FUN is a function to apply

... contains arguments to apply over

MoreArgs is a list of other arguments to FUN

SIMPLIFY indicates whether the result should be simplified

The following is tedious to type:

list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))

Instead we can do

> mapply(rep, 1:4, 4:1)

Vectorizing a Function

> noise <- function(n, mean, sd) {

+ rnorm(n, mean, sd)

+ }

> noise(5, 1, 2)

[1] 2.4831198 2.4790100 0.4855190 -1.2117759

[5] -0.2743532

> noise(1:5, 1:5, 2)

[1] -4.2128648 -0.3989266 4.2507057 1.1572738

[5] 3.7413584

Instant Vectorization

> mapply(noise, 1:5, 1:5, 2)

Which is the same as

list(noise(1, 1, 2), noise(2, 2, 2), noise(3, 3, 2), noise(4, 4, 2), noise(5, 5, 2))
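Calling noise(1:5, 1:5, 2) directly returned only five numbers, whereas mapply calls noise once per pair of arguments. A minimal sketch that checks the lengths and also passes the fixed sd through MoreArgs (the seed and the names res and res2 are illustrative):

```r
noise <- function(n, mean, sd) {
  rnorm(n, mean, sd)
}

set.seed(5)  # illustrative seed

# The scalar 2 is recycled, so every call uses sd = 2
res <- mapply(noise, 1:5, 1:5, 2)
lengths(res)  # the i-th element holds i draws

# The same idea with the fixed argument supplied once via MoreArgs
res2 <- mapply(noise, n = 1:5, mean = 1:5, MoreArgs = list(sd = 2))
```

Because the elements have different lengths, mapply cannot simplify them and returns a list even though SIMPLIFY defaults to TRUE.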

tapply

tapply is used to apply a function over subsets of a vector. I don’t know why it’s called tapply.

> str(tapply)

function (X, INDEX, FUN = NULL, ..., simplify = TRUE)

X is a vector

INDEX is a factor or a list of factors (or else they are coerced to factors)

FUN is a function to be applied

... contains other arguments to be passed to FUN

simplify indicates whether the result should be simplified

Take group means.

> x <- c(rnorm(10), runif(10), rnorm(10, 1))

> f <- gl(3, 10)

> f

[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3

[24] 3 3 3 3 3 3 3

Levels: 1 2 3

> tapply(x, f, mean)

1 2 3

0.1144464 0.5163468 1.2463678

Take group means without simplification.

> tapply(x, f, mean, simplify = FALSE)

$`1`

[1] 0.1144464

$`2`

[1] 0.5163468

$`3`

[1] 1.246368

Find group ranges.

> tapply(x, f, range)

$`1`

[1] -1.097309 2.694970

$`2`

[1] 0.09479023 0.79107293

$`3`

[1] 0.4717443 2.5887025

split

split takes a vector or other object and splits it into groups determined by a factor or list of factors.

> str(split)
function (x, f, drop = FALSE, ...)

x is a vector (or list) or data frame

f is a factor (or coerced to one) or a list of factors

drop indicates whether empty factor levels should be dropped

A common idiom is split followed by an lapply.

> lapply(split(x, f), mean)
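For group means, split followed by sapply gives the same numbers as tapply. A small sketch, assuming the same x and f construction used in the tapply example (the seed and variable names are illustrative):

```r
set.seed(6)  # illustrative seed

x <- c(rnorm(10), runif(10), rnorm(10, 1))
f <- gl(3, 10)

# split returns a list with one element per factor level
groups <- split(x, f)

# Two routes to the same group means
by_split  <- sapply(groups, mean)
by_tapply <- tapply(x, f, mean)
```

The advantage of split is that the intermediate list can hold more complicated objects, such as the pieces of a data frame.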

Splitting a Data Frame

> library(datasets)

> head(airquality)

> s <- split(airquality, airquality$Month)

> lapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))

> sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))

> sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")], na.rm = TRUE))
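The last call differs only in na.rm = TRUE, which matters because airquality contains missing values. A short sketch comparing the two results (the names cols, with_na, and no_na are illustrative):

```r
library(datasets)

# One data frame per month
s <- split(airquality, airquality$Month)

cols <- c("Ozone", "Solar.R", "Wind")

# Without na.rm, any column containing a missing value has mean NA
with_na <- sapply(s, function(d) colMeans(d[, cols]))

# With na.rm = TRUE, missing values are dropped before averaging
no_na <- sapply(s, function(d) colMeans(d[, cols], na.rm = TRUE))

dim(no_na)  # one row per variable, one column per month
```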

Splitting on More than One Level

> x <- rnorm(10)

> f1 <- gl(2, 5)

> f2 <- gl(5, 2)

Interactions can create empty levels.

> str(split(x, list(f1, f2)))


Empty levels can be dropped.

> str(split(x, list(f1, f2), drop = TRUE))

List of 6

$ 1.1: num [1:2] -0.378 0.445

$ 1.2: num [1:2] 1.4066 0.0166

$ 1.3: num -0.355

$ 2.3: num 0.315

$ 2.4: num [1:2] -0.907 0.723

$ 2.5: num [1:2] 0.732 0.360
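To see where the empty levels come from, note that crossing a 2-level factor with a 5-level factor yields 10 combinations, of which only 6 occur in the data. A minimal sketch (the seed and the names all_groups and kept are illustrative):

```r
set.seed(7)  # illustrative seed

x  <- rnorm(10)
f1 <- gl(2, 5)
f2 <- gl(5, 2)

# Every combination of levels, observed or not
nlevels(interaction(f1, f2))  # 10

all_groups <- split(x, list(f1, f2))               # includes empty groups
kept       <- split(x, list(f1, f2), drop = TRUE)  # only observed groups

names(kept)
```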


Time: 2024-10-06 12:11:24
