Syncing MySQL to HDFS with Sqoop

Download link: http://pan.baidu.com/s/1gfHnaVL (password: 7j12)

mysql-connector version 5.1.32
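If you are unsure which driver version a downloaded jar actually contains, its manifest can be inspected (a quick check, assuming unzip is available on the machine):

```
# Print the version recorded in the jar's manifest.
unzip -p mysql-connector-java-5.1.32-bin.jar META-INF/MANIFEST.MF | grep -i version
```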

If you run into problems during installation, see http://dbspace.blog.51cto.com/6873717/1875955, which covers solutions to some common issues.

Download and install:

```
cd /usr/local/
tar -zxvf sqoop2-1.99.3-cdh5.0.0.tar.gz
mv sqoop2-1.99.3-cdh5.0.0 sqoop
```

Add sqoop2 to the system environment variables:

```
export SQOOP_HOME=/usr/local/sqoop
export CATALINA_BASE=$SQOOP_HOME/server
export PATH=$PATH:/usr/local/sqoop/bin
```

Copy the MySQL driver jar into $SQOOP_HOME/server/lib:

```
cp mysql-connector-java-5.1.32-bin.jar /usr/local/sqoop/server/lib/
```

Edit the configuration files:

```
vim /usr/local/sqoop/server/conf/sqoop.properties
# point this at Hadoop's configuration directory
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/local/hadoop/etc/hadoop
```

In catalina.properties, comment out the original line 58 and replace it; this configures the paths to the Hadoop jars:

```
vim /usr/local/sqoop/server/conf/catalina.properties

common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar,/usr/local/hadoop/share/hadoop/common/*.jar,/usr/local/hadoop/share/hadoop/common/lib/*.jar,/usr/local/hadoop/share/hadoop/hdfs/*.jar,/usr/local/hadoop/share/hadoop/hdfs/lib/*.jar,/usr/local/hadoop/share/hadoop/mapreduce/*.jar,/usr/local/hadoop/share/hadoop/mapreduce/lib/*.jar,/usr/local/hadoop/share/hadoop/tools/*.jar,/usr/local/hadoop/share/hadoop/tools/lib/*.jar,/usr/local/hadoop/share/hadoop/yarn/*.jar,/usr/local/hadoop/share/hadoop/yarn/lib/*.jar
```

Start/stop sqoop:

```
/usr/local/sqoop/bin/sqoop2-server start
/usr/local/sqoop/bin/sqoop2-server stop
```

Verify the server started. Method one: check the running processes with jps (the Sqoop2 server runs as a Bootstrap process):
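To make those environment variables survive a re-login, one option is appending them to /etc/profile (a minimal sketch; adjust the paths to match your install):

```
# Persist the Sqoop2 environment variables, then reload the profile.
cat >> /etc/profile <<'EOF'
export SQOOP_HOME=/usr/local/sqoop
export CATALINA_BASE=$SQOOP_HOME/server
export PATH=$PATH:$SQOOP_HOME/bin
EOF
source /etc/profile
```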

```
# jps
25505 SqoopShell
13080 SecondaryNameNode
12878 NameNode
26568 Jps
```

Method two: request the version endpoint at http://192.168.1.114:12000/sqoop/version. Sqoop uses port 12000 by default; it can be changed in /usr/local/sqoop/server/conf/server.xml.
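If you prefer the command line to a browser, the same version endpoint can be queried with curl (assuming the server runs on 192.168.1.114 with the default port):

```
# A JSON response from this endpoint means the Sqoop2 server is up.
curl http://192.168.1.114:12000/sqoop/version
```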

#### Testing an import from MySQL into Hadoop

1. Log in with the client:

```
# sqoop2-shell
Sqoop home directory: /usr/local/sqoop
Sqoop Shell: Type 'help' or '\h' for help.
sqoop:000>
```

2. Create a MySQL connection. In this version, `create` supports only [connection|job]; note that different versions create connections in different ways.

List the supported connectors:

```
sqoop:000> show connector
+----+------------------------+-----------------+------------------------------------------------------+
| Id |          Name          |     Version     |                        Class                         |
+----+------------------------+-----------------+------------------------------------------------------+
| 1  | generic-jdbc-connector | 1.99.3-cdh5.0.0 | org.apache.sqoop.connector.jdbc.GenericJdbcConnector |
+----+------------------------+-----------------+------------------------------------------------------+
```

(Version 1.99.7 displays more connectors and more detail.)

```
sqoop:000> create connection --cid 1
Creating connection for connector with id 1
Please fill following values to create new connection object
Name: mysql_to_hadoop
```

Connection configuration

```
JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://192.168.1.107:3306/sqoop
Username: sqoop
Password: *******
JDBC Connection Properties:
There are currently 0 values in the map:
entry#

Security related configuration options

Max connections:
New connection was successfully created with validation status ACCEPTABLE and persistent id 2
```

The `sqoop` database must already exist on 192.168.1.107, and the `sqoop` user must already be created in MySQL with permission to connect.
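A minimal sketch for creating both (the password and the '%' host wildcard are placeholders; tighten them for production):

```
# Create the sqoop database and a sqoop user for the Sqoop2 server to connect as.
mysql -h 192.168.1.107 -u root -p <<'EOF'
CREATE DATABASE IF NOT EXISTS sqoop;
GRANT ALL PRIVILEGES ON sqoop.* TO 'sqoop'@'%' IDENTIFIED BY 'sqoop_password';
FLUSH PRIVILEGES;
EOF
```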

3. Create the job (--xid 2 is the id of the connection created above):

```
sqoop:000> create job --xid 2 --type import
Creating job for connection with id 2
Please fill following values to create new job object
Name: mysql_to_hadoop
```

Database configuration

```
Schema name: sqoop
Table name: wangyuan
Table SQL statement:
Table column names:
Partition column name:
Nulls in partition column:
Boundary query:
```

Here `sqoop` is the MySQL database and `wangyuan` is a table inside it.
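The wangyuan table is assumed to already exist. Judging from the connector schema printed when the job starts (columns id and c_time), a matching test table could be created like this (the column types are inferred from that output, not taken from the original post):

```
# Create a test table matching the schema Sqoop reports later, and seed a few rows.
mysql -h 192.168.1.107 -u sqoop -p sqoop <<'EOF'
CREATE TABLE IF NOT EXISTS wangyuan (
    id     INT NOT NULL PRIMARY KEY,
    c_time DATETIME
);
INSERT INTO wangyuan VALUES (1, NOW()), (2, NOW()), (3, NOW());
EOF
```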

Output configuration

```
Storage type:
  0 : HDFS
Choose: 0
Output format:
  0 : TEXT_FILE
  1 : SEQUENCE_FILE
Choose: 0
Compression format:
  0 : NONE
  1 : DEFAULT
  2 : DEFLATE
  3 : GZIP
  4 : BZIP2
  5 : LZO
  6 : LZ4
  7 : SNAPPY
Choose: 0
Output directory: hdfs://192.168.1.114:9000/home/mysql_to_hdfs2
```

Note: mysql_to_hdfs2 must not already exist under /home in HDFS, but the /home path itself must exist. Port 9000 comes from your Hadoop configuration; confirm it via the web UI at http://ip:50070, which shows Overview 'mycat:9000' (active).

Create the HDFS path:

```
/usr/local/hadoop/bin/hadoop fs -mkdir /home
```

List the new directory:

```
/usr/local/hadoop/bin/hadoop fs -ls /home
```

or check via the web UI at http://ip:50070.

```
Throttling resources
Extractors:
Loaders:
New job was successfully created with validation status FINE and persistent id 2
sqoop:000>
```

Start the job:

```
sqoop:000> start job --jid 2
Exception has occurred during processing command
Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0001:Server has returned exception
```

This message says nothing about the actual cause, so enable verbose output and retry:

```
sqoop:000> set option --name verbose --value true
sqoop:000> start job --jid 2
Submission details
Job ID: 2
Server URL: http://localhost:12000/sqoop/
Created by: root
Creation date: 2016-11-23 21:15:27 CST
Lastly updated by: root
External ID: job_1479653943050_0007
	http://haproxy:8088/proxy/application_1479653943050_0007/
Connector schema: Schema{name=wangyuan,columns=[
	FixedPoint{name=id,nullable=null,byteSize=null,unsigned=null},
	Date{name=c_time,nullable=null,fraction=null,timezone=null}]}
2016-11-23 21:15:27 CST: BOOTING  - Progress is not available
```

Output like this means the job started successfully.
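While the job runs, its progress can also be polled from the same shell; in the 1.99.x client this should be the status command (and stop job --jid 2 will kill a stuck job):

```
sqoop:000> status job --jid 2
```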

Check the result via the web UI, or list the output directory:

```
/usr/local/hadoop/bin/hadoop fs -ls /home/
```
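To inspect the imported rows directly, the text files under the output directory can be printed (the exact file names vary, so a glob is used here):

```
# Print every file Sqoop wrote under the output directory.
/usr/local/hadoop/bin/hadoop fs -cat '/home/mysql_to_hdfs2/*'
```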
