RMySQL数据库编程指南

R语言作为统计学一门语言一直在小众领域闪耀着光芒。直到大数据的爆发R语言变成了一门炙手可热的数据分析的利器。随着越来越多的工程背景的人的加入R语言的社区在迅速扩大成长。现在已不仅仅是统计领域教育银行电商互联网….都在使用R语言。

要成为有理想的极客我们不能停留在语法上要掌握牢固的数学概率统计知识同时还要有创新精神把R语言发挥到各个领域。让我们一起动起来吧开始R的极客理想。

 

转载请注明出处:
http://blog.fens.me/r-mysql-rmysql/

前言

MySQL是一款最常用到开源数据库软件安装简单运行稳定非常适用于中小型的数据存储。R作为数据分析的工具当然要支持数据库驱动接口。让R和MySQL配合在一起所能爆发出的能量是巨大的。

由于操作系统的原因让Win和Linux有不一样的字符集不一样的运行时环境。所以今天我们讲一下如何在Linux和Win上面安装和使用RMySQL。

目录

  1. RMySQL介绍
  2. RMySQL在Linux下安装
  3. RMySQL在Win7下安装
  4. RMySQL函数使用
  5. RMySQL案例实践

1. RMySQL介绍

RMySQL一个R语言程序包提供了访问MySQL数据库的R语言接口程序RMySQL需求依赖于DBI项目。RMySQL不仅提供了基本的数据库访问SQL查询还封装了一些方法。比较读整表分页data.frame快速插入等等的功能。掌握好RMySQL数据库编辑将得心应手!!

2. RMySQL在Linux下安装

Linux系统环境:

  • Linux: Ubuntu 12.04.2 LTS 64bit server
  • Linux字符集: en_US.UTF-8
  • R: 3.0.1, x86_64-pc-linux-gnu (64-bit)
  • MySQL: Ver 14.14 Distrib 5.5.29 64bit server
  • MySQL字符集: utf8

~ uname -a
Linux conan 3.5.0-23-generic #35~precise1-Ubuntu SMP Fri Jan 25 17:13:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

~ cat /etc/issue
Ubuntu 12.04.2 LTS \n \l

~ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8

~ R --version
R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
http://www.gnu.org/licenses/.

~ mysql --version
mysql  Ver 14.14 Distrib 5.5.29, for debian-linux-gnu (x86_64) using readline 6.2

mysql> show variables like ‘%char%‘;
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

在R环境中安装RMySQL


~ R
> install.packages(‘RMySQL‘)

also installing the dependencyDBI’

trying URL ‘http://cran.dataguru.cn/src/contrib/DBI_0.2-7.tar.gz‘
Content type ‘application/x-gzip‘ length 194699 bytes (190 Kb)
opened URL
==================================================
downloaded 190 Kb

trying URL ‘http://cran.dataguru.cn/src/contrib/RMySQL_0.9-3.tar.gz‘
Content type ‘application/x-gzip‘ length 165363 bytes (161 Kb)
opened URL
==================================================
downloaded 161 Kb

...

Configuration error:
  could not find the MySQL installation include and/or library
  directories.  Manually specify the location of the MySQL
  libraries and the header files and re-run R CMD INSTALL.

INSTRUCTIONS:

1. Define and export the 2 shell variables PKG_CPPFLAGS and
   PKG_LIBS to include the directory for header files (*.h)
   and libraries, for example (using Bourne shell syntax):

      export PKG_CPPFLAGS="-I"
      export PKG_LIBS="-L -lmysqlclient"

   Re-run the R INSTALL command:

      R CMD INSTALL RMySQL_.tar.gz

2. Alternatively, you may pass the configure arguments
      --with-mysql-dir= (distribution directory)
   or
      --with-mysql-inc= (where MySQL header files reside)
      --with-mysql-lib= (where MySQL libraries reside)
   in the call to R INSTALL --configure-args=‘...‘

   R CMD INSTALL --configure-args=‘--with-mysql-dir=DIR‘ RMySQL_.tar.gz

ERROR: configuration failed for package ‘RMySQL’
* removing ‘/home/conan/R/x86_64-pc-linux-gnu-library/3.0/RMySQL’

The downloaded source packages are in
        ‘/tmp/Rtmpu0Gn88/downloaded_packages’
Warning message:
In install.packages("RMySQL") :
  installation of package ‘RMySQL’ had non-zero exit status

安装出错了提示我们需要增加MySQL安装目录的配置参数


# 安装mysql类库
~ sudo apt-get install libdbd-mysql libmysqlclient-dev

# 找到mysql的安装目录
~ whereis mysql
mysql: /usr/bin/mysql /etc/mysql /usr/lib/mysql /usr/bin/X11/mysql /usr/share/mysql /usr/share/man/man1/mysql.1.gz

# 找到刚刚下载的RMySQL_.tar.gz
~ ls /tmp/Rtmpu0Gn88/downloaded_packages
DBI_0.2-7.tar.gz  RMySQL_0.9-3.tar.gz

# 通过命令安装RMySQL
~ R CMD INSTALL --configure-args=‘--with-mysql-dir=/usr/lib/mysql‘ /tmp/Rtmpu0Gn88/downloaded_packages/RMySQL_0.9-3.tar.gz

* installing to library ‘/home/conan/R/x86_64-pc-linux-gnu-library/3.0’
* installing *source* package ‘RMySQL’ ...
** package ‘RMySQL’ successfully unpacked and MD5 sums checked
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ANSI C... none needed
checking how to run the C preprocessor... gcc -E
checking for compress in -lz... yes
checking for getopt_long in -lc... yes
checking for mysql_init in -lmysqlclient... yes
checking for egrep... grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking mysql.h usability... no
checking mysql.h presence... no
checking for mysql.h... no
checking /usr/local/include/mysql/mysql.h usability... no
checking /usr/local/include/mysql/mysql.h presence... no
checking for /usr/local/include/mysql/mysql.h... no
checking /usr/include/mysql/mysql.h usability... yes
checking /usr/include/mysql/mysql.h presence... yes
checking for /usr/include/mysql/mysql.h... yes
configure: creating ./config.status
config.status: creating src/Makevars
** libs
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -I/usr/include/mysql     -fpic  -O3 -pipe  -g  -c RS-DBI.c -o RS-DBI.o
gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG -I/usr/include/mysql     -fpic  -O3 -pipe  -g  -c RS-MySQL.c -o RS-MySQL.o
gcc -std=gnu99 -shared -o RMySQL.so RS-DBI.o RS-MySQL.o -lmysqlclient -lz -L/usr/lib/R/lib -lR
installing to /home/conan/R/x86_64-pc-linux-gnu-library/3.0/RMySQL/libs
** R
** inst
** preparing package for lazy loading
Creating a generic function for ‘format’ from package ‘base’ in package ‘RMySQL’
Creating a generic function for ‘print’ from package ‘base’ in package ‘RMySQL’
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
* DONE (RMySQL)

RMySQL安装成功.

在MySQL中建库建表


~ mysql -uroot -p

mysql> create database rmysql;
Query OK, 1 row affected (0.00 sec)

mysql> grant all on rmysql.* to [email protected]‘%‘ identified by ‘rmysql‘;
Query OK, 0 rows affected (0.00 sec)

mysql> grant all on rmysql.* to [email protected] identified by ‘rmysql‘;
Query OK, 0 rows affected (0.00 sec)

mysql> use rmysql
Database changed

mysql> CREATE TABLE t_user(
    -> id INT PRIMARY KEY AUTO_INCREMENT,
    -> user varchar(12) NOT NULL UNIQUE
    -> )ENGINE=INNODB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.07 sec)

mysql> INSERT INTO t_user(user) values(‘A1‘),(‘AB‘),(‘fens.me‘);
Query OK, 3 rows affected (0.04 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM t_user;
+----+---------+
| id | user    |
+----+---------+
|  1 | A1      |
|  2 | AB      |
|  3 | fens.me |
+----+---------+
3 rows in set (0.00 sec)

通过R程序读MySQL数据库数据


~ R

> library(RMySQL)
Loading required package: DBI

> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql", password="rmysql")
> users = dbGetQuery(conn, "SELECT * FROM t_user")
> dbDisconnect(conn)
[1] TRUE
> users
  id    user
1  1      A1
2  2      AB
3  3 fens.me

好了我们实现了在Linux下R和MySQL的连接。

3. RMySQL在Win7下安装

Win系统环境:

  • Win7: 64位 旗舰版
  • Win字符集: gbk,utf8
  • R: 3.0.1, x86_64-w64-mingw32/x64 (64-bit)
  • MySQL: mysql Ver 14.14 Distrib 5.6.11, for Win64 (x86_64)

~ R --version
R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
http://www.gnu.org/licenses/.

~ mysql --version
mysql  Ver 14.14 Distrib 5.6.11, for Win64 (x86_64)

mysql> show variables like ‘%char%‘;
+--------------------------+------------------------------------+
| Variable_name            | Value                              |
+--------------------------+------------------------------------+
| character_set_client     | gbk                                |
| character_set_connection | gbk                                |
| character_set_database   | utf8                               |
| character_set_filesystem | binary                             |
| character_set_results    | gbk                                |
| character_set_server     | utf8                               |
| character_set_system     | utf8                               |
| character_sets_dir       | D:\toolkit\mysql56\share\charsets\ |
+--------------------------+------------------------------------+
8 rows in set (0.07 sec)

在R环境中安装RMySQL


~ D:\workspace\R\mysql>R

> install.packages(‘RMySQl‘)
package ‘RMySQl‘ is not available (for R version 3.0.1)

我们看到提示没有对应的RMySQL安装版本。


# 下载RMySQL源代码包
> install.packages("RMySQL", type="source")
URLhttp://cran.dataguru.cn/src/contrib/RMySQL_0.9-3.tar.gz‘
Content type ‘application/x-gzip‘ length 165363 bytes (161 Kb)
URL
downloaded 161 Kb

* installing *source* package ‘RMySQL‘ ...
** ‘RMySQL‘MD5
checking for $MYSQL_HOME... not found... searching registry...

cygwin warning:
  MS-DOS style path detected: C:/PROGRA~1/R/R-30~1.1/bin/x64/Rscript
  Preferred POSIX equivalent is: /cygdrive/c/PROGRA~1/R/R-30~1.1/bin/x64/Rscript
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user‘s guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
readRegistry("SOFTWARE\\MySQL AB", hive = "HLM", maxdepth = 2) :
  Registry key ‘SOFTWARE\MySQL AB‘ not found

ERROR: configuration failed for package ‘RMySQL‘
* removing ‘C:/Program Files/R/R-3.0.1/library/RMySQL‘
        ‘C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages‘

In install.packages("RMySQL", type = "source") :
  ‘RMySQL‘B0

找到源代码包:RMySQL_0.9-3.tar.gz


~ dir C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages
2013-09-24  13:16           165,363 RMySQL_0.9-3.tar.gz

通过源代码包安装


~ D:\workspace\R\mysql>R CMD INSTALL C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages\RMySQL_0.9-3
.tar.gz
* installing to library ‘C:/Program Files/R/R-3.0.1/library‘
* installing *source* package ‘RMySQL‘ ...
** ‘RMySQL‘MD5
checking for $MYSQL_HOME... not found... searching registry...

cygwin warning:
  MS-DOS style path detected: C:/PROGRA~1/R/R-30~1.1/bin/x64/Rscript
  Preferred POSIX equivalent is: /cygdrive/c/PROGRA~1/R/R-30~1.1/bin/x64/Rscript
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user‘s guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
readRegistry("SOFTWARE\\MySQL AB", hive = "HLM", maxdepth = 2) :
  Registry key ‘SOFTWARE\MySQL AB‘ not found

ERROR: configuration failed for package ‘RMySQL‘
* removing ‘C:/Program Files/R/R-3.0.1/library/RMySQL‘

设置MYSQL_HOME的环境变量


set MYSQL_HOME=D:\toolkit\mysql56

注: MYSQL_HOME建议设置在系统环境变量中。

再一次安装RMySQL


D:\workspace\R\mysql>R CMD INSTALL C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages\RMySQL_0.9-3
.tar.gz
* installing to library ‘C:/Program Files/R/R-3.0.1/library‘
* installing *source* package ‘RMySQL‘ ...
** ‘RMySQL‘MD5
checking for $MYSQL_HOME... D:\toolkit\mysql56
cygwin warning:
  MS-DOS style path detected: D:\toolkit\mysql56
  Preferred POSIX equivalent is: /cygdrive/d/toolkit/mysql56
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user‘s guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
** libs
: this package has a non-empty ‘configure.win‘ file,
so building only the main architecture

cygwin warning:
  MS-DOS style path detected: C:/PROGRA~1/R/R-30~1.1/etc/x64/Makeconf
  Preferred POSIX equivalent is: /cygdrive/c/PROGRA~1/R/R-30~1.1/etc/x64/Makeconf
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user‘s guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
gcc -m64 -I"C:/PROGRA~1/R/R-30~1.1/include" -DNDEBUG -I"D:\toolkit\mysql56"/include    -I"d:/RCompile/CRANpkg/extralibs6
4/local/include"     -O2 -Wall  -std=gnu99 -mtune=core2 -c RS-DBI.c -o RS-DBI.o
RS-DBI.c: In function ‘RS_na_set‘:
RS-DBI.c:1219:11: warning: variable ‘c‘ set but not used [-Wunused-but-set-variable]
gcc -m64 -I"C:/PROGRA~1/R/R-30~1.1/include" -DNDEBUG -I"D:\toolkit\mysql56"/include    -I"d:/RCompile/CRANpkg/extralibs6
4/local/include"     -O2 -Wall  -std=gnu99 -mtune=core2 -c RS-MySQL.c -o RS-MySQL.o
RS-MySQL.c: In function ‘RS_MySQL_fetch‘:
RS-MySQL.c:657:13: warning: variable ‘fld_nullOk‘ set but not used [-Wunused-but-set-variable]
RS-MySQL.c: In function ‘RS_DBI_invokeBeginGroup‘:
RS-MySQL.c:1137:30: warning: variable ‘val‘ set but not used [-Wunused-but-set-variable]
RS-MySQL.c: In function ‘RS_DBI_invokeNewRecord‘:
RS-MySQL.c:1158:20: warning: variable ‘val‘ set but not used [-Wunused-but-set-variable]
RS-MySQL.c: In function ‘RS_MySQL_dbApply‘:
RS-MySQL.c:1219:38: warning: variable ‘fld_nullOk‘ set but not used [-Wunused-but-set-variable]
gcc -m64 -shared -s -static-libgcc -o RMySQL.dll tmp.def RS-DBI.o RS-MySQL.o D:\toolkit\mysql56/bin/libmySQL.dll -Ld:/RC
ompile/CRANpkg/extralibs64/local/lib/x64 -Ld:/RCompile/CRANpkg/extralibs64/local/lib -LC:/PROGRA~1/R/R-30~1.1/bin/x64 -l
R
gcc.exe: error: D:\toolkit\mysql56/bin/libmySQL.dll: No such file or directory
ERROR: compilation failed for package ‘RMySQL‘
* removing ‘C:/Program Files/R/R-3.0.1/library/RMySQL‘

错误为没有找到动态链接库:D:\toolkit\mysql56/bin/libmySQL.dll


# 复制动态链接库libmySQL.dll
cp D:\toolkit\mysql56\lib\libmysql.dll D:\toolkit\mysql56\binmv D:\toolkit\mysql56\bin\libmysql.dll D:\toolkit\mysql56\bin\libmySQL.dll

再一次安装RMySQL


~ D:\workspace\R\mysql>R CMD INSTALL C:\Users\Administrator\AppData\Local\Temp\RtmpsfqQjK\downloaded_packages\RMySQL_0.9-3
.tar.gz
* installing to library ‘C:/Program Files/R/R-3.0.1/library‘
* installing *source* package ‘RMySQL‘ ...
** ‘RMySQL‘MD5
checking for $MYSQL_HOME... D:\toolkit\mysql56
cygwin warning:
  MS-DOS style path detected: D:\toolkit\mysql56
  Preferred POSIX equivalent is: /cygdrive/d/toolkit/mysql56
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user‘s guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
** libs
: this package has a non-empty ‘configure.win‘ file,
so building only the main architecture

cygwin warning:
  MS-DOS style path detected: C:/PROGRA~1/R/R-30~1.1/etc/x64/Makeconf
  Preferred POSIX equivalent is: /cygdrive/c/PROGRA~1/R/R-30~1.1/etc/x64/Makeconf
  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user‘s guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
gcc -m64 -I"C:/PROGRA~1/R/R-30~1.1/include" -DNDEBUG -I"D:\toolkit\mysql56"/include    -I"d:/RCompile/CRANpkg/extralibs6
4/local/include"     -O2 -Wall  -std=gnu99 -mtune=core2 -c RS-DBI.c -o RS-DBI.o
RS-DBI.c: In function ‘RS_na_set‘:
RS-DBI.c:1219:11: warning: variable ‘c‘ set but not used [-Wunused-but-set-variable]
gcc -m64 -I"C:/PROGRA~1/R/R-30~1.1/include" -DNDEBUG -I"D:\toolkit\mysql56"/include    -I"d:/RCompile/CRANpkg/extralibs6
4/local/include"     -O2 -Wall  -std=gnu99 -mtune=core2 -c RS-MySQL.c -o RS-MySQL.o
RS-MySQL.c: In function ‘RS_MySQL_fetch‘:
RS-MySQL.c:657:13: warning: variable ‘fld_nullOk‘ set but not used [-Wunused-but-set-variable]
RS-MySQL.c: In function ‘RS_DBI_invokeBeginGroup‘:
RS-MySQL.c:1137:30: warning: variable ‘val‘ set but not used [-Wunused-but-set-variable]
RS-MySQL.c: In function ‘RS_DBI_invokeNewRecord‘:
RS-MySQL.c:1158:20: warning: variable ‘val‘ set but not used [-Wunused-but-set-variable]
RS-MySQL.c: In function ‘RS_MySQL_dbApply‘:
RS-MySQL.c:1219:38: warning: variable ‘fld_nullOk‘ set but not used [-Wunused-but-set-variable]
gcc -m64 -shared -s -static-libgcc -o RMySQL.dll tmp.def RS-DBI.o RS-MySQL.o D:\toolkit\mysql56/bin/libmySQL.dll -Ld:/RC
ompile/CRANpkg/extralibs64/local/lib/x64 -Ld:/RCompile/CRANpkg/extralibs64/local/lib -LC:/PROGRA~1/R/R-30~1.1/bin/x64 -l
R
installing to C:/Program Files/R/R-3.0.1/library/RMySQL/libs/x64
** R
** inst
** preparing package for lazy loading
Creating a generic function for ‘format‘ from package ‘base‘ in package ‘RMySQL‘
Creating a generic function for ‘print‘ from package ‘base‘ in package ‘RMySQL‘
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
MYSQL_HOME defined as D:\toolkit\mysql56
* DONE (RMySQL)

安装成功!

在MySQL中建库建表


~ mysql -uroot -p

mysql> create database rmysql;
Query OK, 1 row affected (0.04 sec)

mysql> grant all on rmysql.* to [email protected]‘%‘ identified by ‘rmysql‘;
Query OK, 0 rows affected (0.00 sec)

mysql> grant all on rmysql.* to [email protected] identified by ‘rmysql‘;
Query OK, 0 rows affected (0.00 sec)

mysql> use rmysql
Database changed
mysql> CREATE TABLE t_user(
    -> id INT PRIMARY KEY AUTO_INCREMENT,
    -> user varchar(12) NOT NULL UNIQUE
    -> )ENGINE=INNODB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (1.01 sec)

mysql>
mysql> INSERT INTO t_user(user) values(‘A1‘),(‘AB‘),(‘fens.me‘);
Query OK, 3 rows affected (0.05 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM t_user;
+----+---------+
| id | user    |
+----+---------+
|  1 | A1      |
|  2 | AB      |
|  3 | fens.me |
+----+---------+
3 rows in set (0.03 sec)

通过R程序读MySQL数据库数据。

注:如果刚才没有把MYSQL_HOME的变量写到环境变更中每次在启动R之前要先设置变量。


~ set MYSQL_HOME=D:\toolkit\mysql56
~ R

> library(RMySQL)
DBI
MYSQL_HOME defined as D:\toolkit\mysql56

> conn <- dbConnect(MySQL(), dbname = "rmysql", username="root", password="",client.flag=CLIENT_MULTI_STATEMENTS)
> users = dbGetQuery(conn, "SELECT * FROM t_user")
> dbDisconnect(conn)
[1] TRUE
> users
  id    user
1  1      A1
2  2      AB
3  3 fens.me

好了我们实现了在Win7下R和MySQL的连接。

4. RMySQL函数使用

环境都安装好了接下来我们具体使用一下RMySQL的包。

  • RMySQL辅助操作
  • RMySQL数据库操作
  • 针对win的字符集设置

1). RMySQL辅助操作

加载类库

> library(RMySQL)

建立本地连接


> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql", password="rmysql",client.flag=CLIENT_MULTI_STATEMENTS)

建立远程连接


> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql", password="rmysql",host="192.168.1.201",port=3306)

关闭连接

dbDisconnect(conn)

查看数据库的表


> dbListTables(conn)
[1] "t_user"

查看表的字段


> dbListFields(conn, "t_user")
[1] "id"   "user"

查询MySQL信息


> summary(MySQL(), verbose = TRUE)
<MySQLDriver:(23864)>
  Driver name:  MySQL
  Max  connections: 16
  Conn. processed: 3
  Default records per fetch: 500
  DBI API version:  

# MySQL连接实例信息
> summary(conn, verbose = TRUE)
<MySQLConnection:(23864,2)>
  User: root
  Host: localhost
  Dbname: rmysql
  Connection type: localhost via TCP/IP
  MySQL server version:  5.6.11
  MySQL client version:  5.6.11
  MySQL protocol version:  10
  MySQL server thread id:  35
  No resultSet available

# MySQL连接信息
> dbListConnections(MySQL())
[[1]]
<MySQLConnection:(23864,2)>

2). RMySQL数据库操作
RMySQL数据库操作


# 建表并插入数据
> t_demo<-data.frame(
  a=seq(1:10),
  b=letters[1:10],
  c=rnorm(10)
)
> dbWriteTable(conn, "t_demo", t_demo)

# 获得整个表数据
> dbReadTable(conn, "t_demo")
    a b           c
1   1 a  0.98868164
2   2 b -0.66935770
3   3 c  0.27703638
4   4 d  1.36137156
5   5 e -0.70291017
6   6 f  1.61235088
7   7 g  0.17616068
8   8 h  0.29700017
9   9 i  0.19032719
10 10 j -0.06222173

# 插入新数据
> dbWriteTable(conn, "t_demo", t_demo, append=TRUE)
> dbReadTable(conn, "t_demo")
   row_names  a b           c
1          1  1 a  0.98868164
2          2  2 b -0.66935770
3          3  3 c  0.27703638
4          4  4 d  1.36137156
5          5  5 e -0.70291017
6          6  6 f  1.61235088
7          7  7 g  0.17616068
8          8  8 h  0.29700017
9          9  9 i  0.19032719
10        10 10 j -0.06222173
11         1  1 a  0.98868164
12         2  2 b -0.66935770
13         3  3 c  0.27703638
14         4  4 d  1.36137156
15         5  5 e -0.70291017
16         6  6 f  1.61235088
17         7  7 g  0.17616068
18         8  8 h  0.29700017
19         9  9 i  0.19032719
20        10 10 j -0.06222173

# 覆盖原表数据
> dbWriteTable(conn, "t_demo", t_demo, overwrite=TRUE)

# 1). 查询数据
> d0 <- dbGetQuery(conn, "SELECT * FROM t_demo where c>0")
> class(d0)
[1] "data.frame"

> d0
  row_names a b         c
1         1 1 a 0.9886816
2         3 3 c 0.2770364
3         4 4 d 1.3613716
4         6 6 f 1.6123509
5         7 7 g 0.1761607
6         8 8 h 0.2970002
7         9 9 i 0.1903272

# 2). 执行SQL脚本查询并分页
> rs <- dbSendQuery(conn, "SELECT * FROM t_demo where c>0")
> class(rs)
[1] "MySQLResult"
attr(,"package")
[1] "RMySQL"
> mysqlCloseResult(rs)
[1] TRUE

> d1 <- fetch(rs, n = 3)
> d1
  row_names a b         c
1         1 1 a 0.9886816
2         3 3 c 0.2770364
3         4 4 d 1.3613716

# 3). 查看集统计信息
> summary(rs, verbose = TRUE)
  row_names               a              b                   c
 Length:7           Min.   :1.000   Length:7           Min.   :0.1762
 Class :character   1st Qu.:3.500   Class :character   1st Qu.:0.2337
 Mode  :character   Median :6.000   Mode  :character   Median :0.2970
                    Mean   :5.429                      Mean   :0.7004
                    3rd Qu.:7.500                      3rd Qu.:1.1750
                    Max.   :9.000                      Max.   :1.6124

# 不插入row.names字段
> dbWriteTable(conn, "t_demo", t_demo,row.names=FALSE,overwrite=TRUE)
> dbGetQuery(conn, "SELECT * FROM t_demo where c>0")
  a b         c
1 1 a 0.9886816
2 3 c 0.2770364
3 4 d 1.3613716
4 6 f 1.6123509
5 7 g 0.1761607
6 8 h 0.2970002
7 9 i 0.1903272

# 删除表
> if(dbExistsTable(conn,‘t_demo‘)){
+     dbRemoveTable(conn, "t_demo")
+ }
[1] TRUE

执行SQL语句dbSendQuery


> query<-dbSendQuery(conn, "show tables")
> data <- fetch(query, n = -1)
> data
  Tables_in_rmysql
1           t_demo
2           t_user
> mysqlCloseResult(query)
[1] TRUE

4). win的字符集设置
在win7中向MySQL插入中文


mysql> INSERT INTO t_user(user) values(‘小朋友‘),(‘你好‘),(‘正确了‘);
Query OK, 3 rows affected (0.07 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from t_user;
+----+---------+
| id | user    |
+----+---------+
|  1 | A1      |
|  2 | AB      |
|  3 | fens.me |
|  5 | 你好    |
|  4 | 小朋友  |
|  6 | 正确了  |
+----+---------+
6 rows in set (0.07 sec)

通过RMySQL查询


> dbGetQuery(conn, "SELECT * FROM t_user")
  id    user
1  1      A1
2  2      AB
3  3 fens.me
4  5      ??
5  4     ???
6  6     ???

设置GKB字符集


> dbDisconnect(conn)
> conn <- dbConnect(MySQL(), dbname = "rmysql", username="root", password="",client.flag=CLIENT_MULTI_STATEMENTS)
> dbSendQuery(conn,‘SET NAMES gbk‘)
<mysqlresult:(23864,43,2)>
> query<-dbSendQuery(conn, "SELECT * FROM t_user")
> data <- fetch(query, n = -1)
> mysqlCloseResult(query)
[1] TRUE
> data
  id    user
1  1      A1
2  2      AB
3  3 fens.me
4  5    你好
5  4  小朋友
6  6  正确了

OK我们在win下面修正字符编号的问题。

5. RMySQL案例实践

系统需求描述:Linux MySQLWin7的R环境远程连接

  • 1. 通过SQL新建表t_blog主键索引唯一键索引
  • 2. 用RMySQL插入数据包括中文字段
  • 3. 再用RMySQL取出数据

1). 通过SQL新建表t_blog主键索引唯一键索引
建表语句


CREATE TABLE t_blog(
id INT PRIMARY KEY AUTO_INCREMENT,
title varchar(12) NOT NULL UNIQUE,
author varchar(12) NOT NULL,
length int NOT NULL,
create_date timestamp NOT NULL DEFAULT now()
)ENGINE=INNODB DEFAULT CHARSET=UTF8;

mysql> desc t_blog;
+-------------+-------------+------+-----+-------------------+----------------+
| Field       | Type        | Null | Key | Default           | Extra          |
+-------------+-------------+------+-----+-------------------+----------------+
| id          | int(11)     | NO   | PRI | NULL              | auto_increment |
| title       | varchar(12) | NO   | UNI | NULL              |                |
| author      | varchar(12) | NO   |     | NULL              |                |
| length      | int(11)     | NO   |     | NULL              |                |
| create_date | timestamp   | NO   |     | CURRENT_TIMESTAMP |                |
+-------------+-------------+------+-----+-------------------+----------------+
5 rows in set (0.00 sec)

mysql> show indexes from t_blog;
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t_blog |          0 | PRIMARY  |            1 | id          | A         |           3 |     NULL | NULL   |      | BTREE      |         |               |
| t_blog |          0 | title    |            1 | title       | A         |           3 |     NULL | NULL   |      | BTREE      |         |               |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)

INSERT INTO t_blog(title,author,length) values(‘你好第一篇‘,‘Conan‘,20),(‘RMySQL数据库编程‘,‘Conan‘,99),(‘R的极客理想系列文章‘,‘Conan‘,15);

mysql> select * from t_blog;
+----+------------------------------+--------+--------+---------------------+
| id | title                        | author | length | create_date         |
+----+------------------------------+--------+--------+---------------------+
|  1 | 你好第一篇                 | Conan  |     20 | 2013-08-15 00:13:13 |
|  2 | RMySQL数据库编程             | Conan  |     99 | 2013-08-15 00:13:13 |
|  3 | R的极客理想系列文章          | Conan  |     15 | 2013-08-15 00:13:13 |
+----+------------------------------+--------+--------+---------------------+
3 rows in set (0.00 sec)

2). 用RMySQL插入数据包括中文字段再取出数据


> library(RMySQL)
> conn <- dbConnect(MySQL(), dbname = "rmysql", username="rmysql", password="rmysql",host="192.168.1.201",port=3306)
>
> dbSendQuery(conn,‘SET NAMES gbk‘)
<mysqlresult:(23864,49,0)>
> dbSendQuery(conn,"INSERT INTO t_blog(title,author,length) values(‘R插入的新文章‘,‘Conan‘,50)");
<mysqlresult:(23864,49,1)>
>
> query<-dbSendQuery(conn, "SELECT * FROM t_blog")
Warning message:
In mysqlExecStatement(conn, statement, ...) :
  RS-DBI driver warning: (unrecognized MySQL field type 7 in column 4 imported as character)
> data <- fetch(query, n = -1)
> mysqlCloseResult(query)
[1] TRUE
> print(data)
  id               title author length         create_date
1  1        你好第一篇  Conan     20 2013-08-15 00:13:13
2  2    RMySQL数据库编程  Conan     99 2013-08-15 00:13:13
3  3 R的极客理想系列文章  Conan     15 2013-08-15 00:13:13
4  4       R插入的新文章  Conan     50 2013-08-15 00:29:45
>
> dbDisconnect(conn)
[1] TRUE

特别提示不能用dbWriteTable函数!

时间: 2024-12-11 02:21:33

RMySQL数据库编程指南的相关文章

数据库编程4 Oracle Pro*C/C++开发

[本文谢绝转载原文来自http://990487026.blog.51cto.com] 1,上次难点复习 1 求所有部门的平均奖金 2 求各部门的平均薪水 3 求各部门每个工种的平均薪水 4 求各部门每个工种大于2000的平均薪水 5 求10号部门的平均工资(2种写法) 6 创建一个学生表 7 并向表中插入一条数据 8 列出不是管理层的员工: 9 工资最高前10名的员工信息(rownum是一个属性值) 10 找到各部门大于本部门平均薪水的员工: 11 创建索引 Linux 环境 Oracle  

Archive for the ‘Erlang’ Category 《Erlang编程指南》读后感

http://timyang.net/category/erlang/ 在云时代,我们需要有更好的能利用多核功能及分布式能力的编程语言,Erlang在这方面具有天生的优势,因此我们始终对它保持强烈关注. 按:此为客座文章,投稿人为新浪微博基础研发工程师赵鹏城(http://weibo.com/iamzpc),以下为原文.在对一个分布式KV存储系统的研究过程中,我有幸遇到了Erlang语言.因此,我研究工作的第一目标就是快速入门Erlang语言并在实际研究过程中进一步深入理解Erlang的精髓.在

Apache Spark 2.2.0 中文文档 - Spark Streaming 编程指南 | ApacheCN

Spark Streaming 编程指南 概述 一个入门示例 基础概念 依赖 初始化 StreamingContext Discretized Streams (DStreams)(离散化流) Input DStreams 和 Receivers(接收器) DStreams 上的 Transformations(转换) DStreams 上的输出操作 DataFrame 和 SQL 操作 MLlib 操作 缓存 / 持久性 Checkpointing Accumulators, Broadcas

Spark(1.6.1) Sql 编程指南+实战案例分析

首先看看从官网学习后总结的一个思维导图 概述(Overview) Spark SQL是Spark的一个模块,用于结构化数据处理.它提供了一个编程的抽象被称为DataFrames,也可以作为分布式SQL查询引擎. 开始Spark SQL Spark SQL中所有功能的入口点是SQLContext类,或者它子类中的一个.为了创建一个基本的SQLContext,你所需要的是一个SparkContext. 除了基本的SQLContext,你还可以创建一个HiveContext,它提供了基本的SQLCon

Apache Spark 2.2.0 中文文档 - Structured Streaming 编程指南 | ApacheCN

Structured Streaming 编程指南 概述 快速示例 Programming Model (编程模型) 基本概念 处理 Event-time 和延迟数据 容错语义 API 使用 Datasets 和 DataFrames 创建 streaming DataFrames 和 streaming Datasets Input Sources (输入源) streaming DataFrames/Datasets 的模式接口和分区 streaming DataFrames/Dataset

(转载)JDOM/XPATH编程指南

JDOM/XPATH编程指南 本文分别介绍了 JDOM 和 XPATH,以及结合两者进行 XML 编程带来的好处. 前言 XML是一种优秀的数据打包和数据交换的形式,在当今XML大行于天下,如果没有听说过它的大名,那可真是孤陋寡闻了.用XML描述数据的优势显而易见,它具有结构简单,便于人和机器阅读的双重功效,并弥补了关系型数据对客观世界中真实数据描述能力的不足.W3C组织根据技术领域的需要,制定出了XML的格式规范,并相应的建立了描述模型,简称DOM.各种流行的程序设计语言都纷纷根据这一模型推出

高级Bash脚本编程指南

http://tldp.org/LDP/abs/html/ 高级Bash脚本编程指南对脚本语言艺术的深入探索 本教程不承担以前的脚本或编程知识,但进展迅速走向一个中级/高级水平的指令...一直偷偷在细小的UNIX®智慧和学识.它作为一本教科书,一本手册,自学,并作为一个参考和知识的来源,壳牌的脚本技术.练习和大量的评论实例请读者参与,在这样的前提下,真正学习脚本的唯一途径是编写脚本.这本书是适合课堂使用的一般介绍编程的概念.本文件被授予公共领域.没有版权! 奉献对于安妮塔,所有魔术的来源内容表第

[转]ODBC编程指南

DM4 ODBC编程指南本章结合DM4数据库的特点,比较全面系统的介绍ODBC的基本概念以及DM4 ODBC DRIVER的使用方法,以便用户更好地使用DM4 ODBC编写应用程序.ODBC提供给你访问不同类型的数据库的途径.结构化查询语言SQL是一种用来访问数据库的语言.通过使用ODBC,应用程序能够使用相同的源代码和各种各样的数据库交互.这使得开发者不需要以特殊的数据库管理系统DBMS为目标,或者了解不同支撑背景的数据库的详细细节,就能够开发和发布客户/服务器应用程序.DM4 ODBC3.0

使用结构(C# 编程指南)

struct 类型适于表示 Point.Rectangle 和 Color 等轻量对象. 尽管使用自动实现的属性将一个点表示为类同样方便,但在某些情况下使用结构更加有效. 例如,如果声明一个 1000 个 Point 对象组成的数组,为了引用每个对象,则需分配更多内存:这种情况下,使用结构可以节约资源. 因为 .NET Framework 包含一个名为 Point 的对象,所以本示例中的结构命名为“CoOrds”. 1 public struct CoOrds 2 { 3 public int