My "Top 5 R Functions" (repost)

In preparation for an R Workgroup meeting, I started thinking about what would be my "Top 5 R Functions". I ruled out the functions for basic mechanics (save, load, mean, etc.): they're obviously critical, but every programming language has them, so there's nothing especially "R" about them. I also ruled out the fancy statistical analysis functions like (g)lmer. Most people (including me) start using R because they want to run those analyses, so it seemed a little redundant. I started using R because I wanted to do growth curve analysis, so it seems like a weak endorsement to say that I like R because it can do growth curve analysis. No, I like R because it makes (many) somewhat complex data operations really, really easy. Understanding how to take advantage of these R functions is what transformed my view of R from purely functional (I need to do analysis X and R has functions for doing analysis X) to an all-purpose tool that allows me to do data processing, management, analysis, and visualization extremely quickly and easily. So, here are the 5 functions that did that for me:

  1. subset() for making subsets of data (natch)
  2. merge() for combining data sets in a smart and easy way
  3. melt() for converting from wide to long data formats
  4. dcast() for converting from long to wide data formats, and for making summary tables
  5. ddply() for doing split-apply-combine operations, which covers a huge swath of the most tricky data operations
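As a rough illustration of how the five functions fit together, here is a minimal sketch on a made-up data set. It assumes that melt() and dcast() come from the reshape2 package and ddply() from the plyr package; the data frames (`scores`, `groups`) and all column names are hypothetical and are not taken from the workgroup notes.

```r
library(reshape2)  # assumed source of melt() and dcast()
library(plyr)      # assumed source of ddply()

# Hypothetical wide-format data: one row per subject, one column per time point
scores <- data.frame(
  subject = c("s1", "s2", "s3", "s4"),
  t1 = c(10, 12, 9, 14),
  t2 = c(13, 15, 11, 18)
)
# Hypothetical lookup table assigning each subject to a group
groups <- data.frame(
  subject = c("s1", "s2", "s3", "s4"),
  group = c("control", "control", "treatment", "treatment")
)

# 1. subset(): keep only the rows that meet a condition
high_t1 <- subset(scores, t1 > 9)

# 2. merge(): combine the two data sets on their shared key column
combined <- merge(scores, groups, by = "subject")

# 3. melt(): wide to long, one row per subject-by-time observation
long <- melt(combined, id.vars = c("subject", "group"),
             variable.name = "time", value.name = "score")

# 4. dcast(): long back to wide ...
wide <- dcast(long, subject + group ~ time, value.var = "score")
# ... or, with an aggregation function, a summary table of group means
group_means <- dcast(long, group ~ time, value.var = "score",
                     fun.aggregate = mean)

# 5. ddply(): split by group and time, summarise each piece, combine the results
summary_tbl <- ddply(long, .(group, time), summarise,
                     mean_score = mean(score), n = length(score))
```

The same long-format data frame feeds both dcast() and ddply(), which is one reason melt() tends to be the first step in this kind of workflow.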

For anyone interested, I posted my R Workgroup notes on how to use these functions on RPubs. Side note: after a little configuration, I found it super easy to write these using knitr, "knit" them into a webpage, and post that page on RPubs.

Conspicuously missing from the above list is ggplot, which I think deserves a special lifetime achievement award for how it has transformed how I think about data exploration and data visualization. I'm planning that for the next R Workgroup meeting.

Time: 2024-08-02 09:45:40
