Alex's Hadoop Newbie Tutorial, Lesson 8: Sqoop1 Installation / Import / Export

Damn it! Sqoop2's documentation is sparse, and it doesn't even support HBase; it's just too bare-bones. So I gave up on Sqoop2 in frustration and switched to Sqoop1. If you followed my earlier lessons, please don't throw bricks at me; I was an unsuspecting bystander too.

Uninstalling Sqoop2

This step is optional. If you installed Sqoop2 by following my earlier tutorial, you need to uninstall it first; if you never installed it, skip this step.

$ sudo su -
$ service sqoop2-server stop
$ yum -y remove sqoop2-server
$ yum -y remove sqoop2-client

Installing Sqoop1

# yum install -y sqoop

Test whether the installation worked with help:

# sqoop help
Warning: /usr/lib/sqoop/../hive-hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
14/11/28 11:33:11 INFO sqoop.Sqoop: Running Sqoop version: 1.4.4-cdh5.0.1
usage: sqoop COMMAND [ARGS]

Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

See 'sqoop help COMMAND' for information on a specific command.

Copying the driver to /usr/lib/sqoop/lib

Download the MySQL JDBC driver (Connector/J) from the MySQL site.

After downloading, unpack it, find the driver jar inside, upload it to the server, then move it into place:

mv /home/alex/mysql-connector-java-5.1.34-bin.jar /usr/lib/sqoop/lib
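
A quick sanity check that the jar landed where Sqoop looks for drivers (the exact file name depends on the Connector/J version you downloaded):

# ls /usr/lib/sqoop/lib/ | grep mysql-connector
mysql-connector-java-5.1.34-bin.jar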

Import

Data preparation

Create a table in MySQL:

CREATE TABLE `student` (
  `id` int(11) NOT NULL,
  `name` varchar(20) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8;  

Insert a few rows:

insert into student (id,name) values (1,'jim');
insert into student (id,name) values (2,'billy');
insert into student (id,name) values (3,'jackson'); 
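
Double-check that the rows made it in; you should see something like:

mysql> select * from student;
+----+---------+
| id | name    |
+----+---------+
|  1 | jim     |
|  2 | billy   |
|  3 | jackson |
+----+---------+
3 rows in set (0.00 sec)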

Importing from MySQL to HDFS

Let's not rush into the import; a few warm-up steps first will also make troubleshooting easier.

List all databases

# sqoop list-databases --connect jdbc:mysql://localhost:3306/sqoop_test --username root --password root
Warning: /usr/lib/sqoop/../hive-hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
14/12/01 09:20:28 INFO sqoop.Sqoop: Running Sqoop version: 1.4.4-cdh5.0.1
14/12/01 09:20:28 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/12/01 09:20:28 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
cacti
metastore
mysql
sqoop_test
wordpress
zabbix

Next, connect to the database with sqoop and list all its tables:

# sqoop list-tables --connect jdbc:mysql://localhost/sqoop_test --username root --password root
Warning: /usr/lib/sqoop/../hive-hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
14/11/28 11:46:11 INFO sqoop.Sqoop: Running Sqoop version: 1.4.4-cdh5.0.1
14/11/28 11:46:11 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/11/28 11:46:11 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
employee
student
workers

This command works without a driver class name because Sqoop supports MySQL out of the box. If you need to pass the JDBC driver class explicitly, use:

# sqoop list-tables --connect jdbc:mysql://localhost/sqoop_test --username root --password root --driver com.mysql.jdbc.Driver

Importing data into HDFS

# sqoop import --connect jdbc:mysql://localhost:3306/sqoop_test --username root --password root --table employee --m 1 --target-dir /user/test
Warning: /usr/lib/sqoop/../hive-hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
14/12/01 14:15:41 INFO sqoop.Sqoop: Running Sqoop version: 1.4.4-cdh5.0.1
14/12/01 14:15:41 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/12/01 14:15:41 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/12/01 14:15:41 INFO tool.CodeGenTool: Beginning code generation
14/12/01 14:15:42 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `employee` AS t LIMIT 1
14/12/01 14:15:42 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `employee` AS t LIMIT 1
14/12/01 14:15:42 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/7b8091924ce8deb4f2ccae14c404a5bf/employee.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/12/01 14:15:45 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/7b8091924ce8deb4f2ccae14c404a5bf/employee.jar
14/12/01 14:15:45 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/12/01 14:15:45 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/12/01 14:15:45 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/12/01 14:15:45 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
14/12/01 14:15:45 INFO mapreduce.ImportJobBase: Beginning import of employee
14/12/01 14:15:46 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/12/01 14:15:47 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/12/01 14:15:47 INFO client.RMProxy: Connecting to ResourceManager at xmseapp01/10.172.78.111:8032
14/12/01 14:15:50 INFO db.DBInputFormat: Using read commited transaction isolation
14/12/01 14:15:51 INFO mapreduce.JobSubmitter: number of splits:1
14/12/01 14:15:51 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1406097234796_0019
14/12/01 14:15:52 INFO impl.YarnClientImpl: Submitted application application_1406097234796_0019
14/12/01 14:15:52 INFO mapreduce.Job: The url to track the job: http://xmseapp01:8088/proxy/application_1406097234796_0019/
14/12/01 14:15:52 INFO mapreduce.Job: Running job: job_1406097234796_0019
14/12/01 14:16:08 INFO mapreduce.Job: Job job_1406097234796_0019 running in uber mode : false
14/12/01 14:16:08 INFO mapreduce.Job:  map 0% reduce 0%
14/12/01 14:16:19 INFO mapreduce.Job:  map 100% reduce 0%
14/12/01 14:16:20 INFO mapreduce.Job: Job job_1406097234796_0019 completed successfully
14/12/01 14:16:21 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=99855
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=87
		HDFS: Number of bytes written=16
		HDFS: Number of read operations=4
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Launched map tasks=1
		Other local map tasks=1
		Total time spent by all maps in occupied slots (ms)=8714
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=8714
		Total vcore-seconds taken by all map tasks=8714
		Total megabyte-seconds taken by all map tasks=8923136
	Map-Reduce Framework
		Map input records=2
		Map output records=2
		Input split bytes=87
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=58
		CPU time spent (ms)=1560
		Physical memory (bytes) snapshot=183005184
		Virtual memory (bytes) snapshot=704577536
		Total committed heap usage (bytes)=148897792
	File Input Format Counters
		Bytes Read=0
	File Output Format Counters
		Bytes Written=16
14/12/01 14:16:21 INFO mapreduce.ImportJobBase: Transferred 16 bytes in 33.6243 seconds (0.4758 bytes/sec)
14/12/01 14:16:21 INFO mapreduce.ImportJobBase: Retrieved 2 records.

Take a look at the result:

# hdfs dfs -ls /user/test
Found 2 items
-rw-r--r--   2 root supergroup          0 2014-12-01 14:16 /user/test/_SUCCESS
-rw-r--r--   2 root supergroup         16 2014-12-01 14:16 /user/test/part-m-00000
# hdfs dfs -cat /user/test/part-m-00000
1,michael
2,ted

In case you're wondering why MySQL had 3 rows but only 2 got imported: the 3 rows we inserted went into the student table, while the import above pulled from the employee table, which happened to already hold 2 rows (michael and ted). So nothing was lost; we just imported a different table than the one we prepared.
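
By the way, the import above ran with --m 1, a single map task, so Sqoop never had to split the input. To import in parallel, raise the mapper count and tell Sqoop which column to split on (with a primary key it can pick one itself, but being explicit doesn't hurt). A sketch against the student table (/user/student_import is just a hypothetical target directory):

sqoop import --connect jdbc:mysql://localhost:3306/sqoop_test --username root -P --table student --split-by id -m 2 --target-dir /user/student_import

The -P flag prompts for the password instead of putting it on the command line, which also silences that "Setting your password on the command-line is insecure" warning you see in the logs above.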

Problems I ran into

If you run into the following error:

14/12/01 10:12:42 INFO mapreduce.Job: Task Id : attempt_1406097234796_0017_m_000000_0, Status : FAILED
Error: employee : Unsupported major.minor version 51.0

Run ps aux | grep hadoop and you'll find Hadoop is running on JDK 1.6. My CDH is 5.0.1 and my Sqoop is 1.4.4, and I hit exactly this problem.

Cause: Sqoop is compiled against JDK 1.7, so if ps aux | grep hadoop shows Hadoop running on 1.6, Sqoop cannot work.

Note: CDH 4.7 and later are already compatible with JDK 1.7, but if you upgraded from 4.5 you may find Hadoop still on JDK 1.6. You need to switch the whole Hadoop stack over to JDK 1.7, which is also the officially recommended pairing.
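
A quick way to confirm which JDK the running Hadoop daemons actually use, just a shell one-liner, assuming (as on my boxes) the JDKs live under /usr/java:

# ps aux | grep hadoop | grep -o '/usr/java/[^ ]*/bin/java' | sort -u
/usr/java/jdk1.6.0_31/bin/java

(The path above is example output; any 1.6 path here means Sqoop will keep failing no matter what else you tweak.)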

How to change the JDK

The official docs offer two methods:

http://www.cloudera.com/content/cloudera/en/documentation/cdh4/latest/CDH4-Requirements-and-Supported-Versions/cdhrsv_topic_3.html

This one tells you to create a symlink named default under /usr/java/ pointing to the JDK you want. I did that; it had no effect.

http://www.cloudera.com/content/cloudera/en/documentation/archives/cloudera-manager-4/v4-5-3/Cloudera-Manager-Enterprise-Edition-Installation-Guide/cmeeig_topic_16_2.html

This one tells you to add an environment variable. I did that too; still no effect.

In the end I went with the crude but effective approach: stop all related services, delete that damned JDK 1.6, and restart everything. After that, /usr/java/default finally got picked up.

Commands to stop all Hadoop-related services:

for x in `cd /etc/init.d ; ls hive-*` ; do sudo service $x stop ; done
for x in `cd /etc/init.d ; ls hbase-*` ; do sudo service $x stop ; done
/etc/init.d/zookeeper-server stop
for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x stop ; done

If you didn't install zookeeper, hbase, or hive, skip those lines. I suggest running ps aux | grep jre1.6 to find every service still on the old JDK, then stopping them one by one: the others first, hadoop last.

Start everything back up:

for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x start ; done
/etc/init.d/zookeeper-server start
for x in `cd /etc/init.d ; ls hbase-*` ; do sudo service $x start ; done
for x in `cd /etc/init.d ; ls hive-*` ; do sudo service $x start ; done
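
Once everything is back up, reuse the same check to make sure nothing is still riding on the old JDK:

# ps aux | grep jre1.6

If that returns nothing (besides the grep itself), you're clean.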

Exporting from HDFS to MySQL

This continues from the example above.

Data preparation

Empty the employee table:

TRUNCATE TABLE employee;
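
Confirm it's really empty before exporting:

mysql> select * from employee;
Empty set (0.00 sec)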

Export the data to MySQL

# sqoop export --connect jdbc:mysql://localhost:3306/sqoop_test --username root --password root --table employee --m 1 --export-dir /user/test
Warning: /usr/lib/sqoop/../hive-hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
14/12/01 15:16:50 INFO sqoop.Sqoop: Running Sqoop version: 1.4.4-cdh5.0.1
14/12/01 15:16:50 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/12/01 15:16:51 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/12/01 15:16:51 INFO tool.CodeGenTool: Beginning code generation
14/12/01 15:16:51 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `employee` AS t LIMIT 1
14/12/01 15:16:52 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `employee` AS t LIMIT 1
14/12/01 15:16:52 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/f4a75fdefe1eb604181d47d6bc827e48/employee.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/12/01 15:16:55 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/f4a75fdefe1eb604181d47d6bc827e48/employee.jar
14/12/01 15:16:55 INFO mapreduce.ExportJobBase: Beginning export of employee
14/12/01 15:16:55 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/12/01 15:16:57 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/12/01 15:16:57 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
14/12/01 15:16:57 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/12/01 15:16:57 INFO client.RMProxy: Connecting to ResourceManager at xmseapp01/10.172.78.111:8032
14/12/01 15:17:00 INFO input.FileInputFormat: Total input paths to process : 1
14/12/01 15:17:00 INFO input.FileInputFormat: Total input paths to process : 1
14/12/01 15:17:00 INFO mapreduce.JobSubmitter: number of splits:1
14/12/01 15:17:00 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1406097234796_0021
14/12/01 15:17:01 INFO impl.YarnClientImpl: Submitted application application_1406097234796_0021
14/12/01 15:17:01 INFO mapreduce.Job: The url to track the job: http://xmseapp01:8088/proxy/application_1406097234796_0021/
14/12/01 15:17:01 INFO mapreduce.Job: Running job: job_1406097234796_0021
14/12/01 15:17:13 INFO mapreduce.Job: Job job_1406097234796_0021 running in uber mode : false
14/12/01 15:17:13 INFO mapreduce.Job:  map 0% reduce 0%
14/12/01 15:17:21 INFO mapreduce.Job: Task Id : attempt_1406097234796_0021_m_000000_0, Status : FAILED
Error: java.io.IOException: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown database 'sqoop_test'
	at org.apache.sqoop.mapreduce.ExportOutputFormat.getRecordWriter(ExportOutputFormat.java:79)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:624)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:744)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown database 'sqoop_test'
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at com.mysql.jdbc.Util.handleNewInstance(Util.java:377)
	at com.mysql.jdbc.Util.getInstance(Util.java:360)
	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:978)
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3887)
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3823)
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:870)
	at com.mysql.jdbc.MysqlIO.proceedHandshakeWithPluggableAuthentication(MysqlIO.java:1659)
	at com.mysql.jdbc.MysqlIO.doHandshake(MysqlIO.java:1206)
	at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2234)
	at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2265)
	at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2064)
	at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:790)
	at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:44)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at com.mysql.jdbc.Util.handleNewInstance(Util.java:377)
	at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:395)
	at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:325)
	at java.sql.DriverManager.getConnection(DriverManager.java:571)
	at java.sql.DriverManager.getConnection(DriverManager.java:215)
	at org.apache.sqoop.mapreduce.db.DBConfiguration.getConnection(DBConfiguration.java:302)
	at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.<init>(AsyncSqlRecordWriter.java:76)
	at org.apache.sqoop.mapreduce.ExportOutputFormat$ExportRecordWriter.<init>(ExportOutputFormat.java:95)
	at org.apache.sqoop.mapreduce.ExportOutputFormat.getRecordWriter(ExportOutputFormat.java:77)
	... 8 more

14/12/01 15:17:29 INFO mapreduce.Job: Task Id : attempt_1406097234796_0021_m_000000_1, Status : FAILED
Error: java.io.IOException: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown database 'sqoop_test'
	... (same stack trace as the first failed attempt)

14/12/01 15:17:40 INFO mapreduce.Job:  map 100% reduce 0%
14/12/01 15:17:41 INFO mapreduce.Job: Job job_1406097234796_0021 completed successfully
14/12/01 15:17:41 INFO mapreduce.Job: Counters: 32
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=99542
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=139
		HDFS: Number of bytes written=0
		HDFS: Number of read operations=4
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=0
	Job Counters
		Failed map tasks=2
		Launched map tasks=3
		Other local map tasks=2
		Rack-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=21200
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=21200
		Total vcore-seconds taken by all map tasks=21200
		Total megabyte-seconds taken by all map tasks=21708800
	Map-Reduce Framework
		Map input records=2
		Map output records=2
		Input split bytes=120
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=86
		CPU time spent (ms)=1330
		Physical memory (bytes) snapshot=177094656
		Virtual memory (bytes) snapshot=686768128
		Total committed heap usage (bytes)=148897792
	File Input Format Counters
		Bytes Read=0
	File Output Format Counters
		Bytes Written=0
14/12/01 15:17:41 INFO mapreduce.ExportJobBase: Transferred 139 bytes in 43.6687 seconds (3.1831 bytes/sec)
14/12/01 15:17:41 INFO mapreduce.ExportJobBase: Exported 2 records.

About that burst of exceptions: most likely the JDBC URL points at localhost, so the map tasks that got scheduled on other nodes tried to connect to a MySQL on their own machine and failed with Unknown database 'sqoop_test'; the retry that landed on the node actually running MySQL succeeded (note the Rack-local map tasks=1 counter). A way to avoid this is sketched after the query output below. In any case, checking MySQL confirms the 2 records made it over:

mysql> select * from employee;
+----+---------+
| id | name    |
+----+---------+
|  1 | michael |
|  2 | ted     |
+----+---------+
2 rows in set (0.00 sec)
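
To dodge those transient failures, point the JDBC URL at the actual MySQL host instead of localhost, so every map task reaches the same server no matter which node it runs on. A sketch: I'm assuming MySQL lives on xmseapp01, the host that shows up in the logs, and that MySQL grants the root account access from the worker nodes:

# sqoop export --connect jdbc:mysql://xmseapp01:3306/sqoop_test --username root -P --table employee --m 1 --export-dir /user/test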

OK, class dismissed!
