Getting Started with Sqoop2: Importing Relational Database Data into HDFS (sqoop2-1.99.4)

Operation in sqoop2-1.99.4 differs slightly from sqoop2-1.99.3: the newer version uses a link where the older version used a connection; everything else works much the same.

For setting up the sqoop2-1.99.4 environment, see: Sqoop2 Environment Setup

For the sqoop2-1.99.3 version of this walkthrough, see: Getting Started with Sqoop2: Importing Relational Database Data into HDFS

Start the sqoop2-1.99.4 client and point it at the server:

$SQOOP2_HOME/bin/sqoop.sh client
set server --host hadoop000 --port 12000 --webapp sqoop
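
A quick way to confirm that the client is actually talking to the server is to print the client and server versions (show version --all is a standard Sqoop2 shell command; the exact output depends on your build):

show version --all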

List all connectors:

show connector --all
2 connector(s) to show:
        Connector with id 1:
            Name: hdfs-connector
            Class: org.apache.sqoop.connector.hdfs.HdfsConnector
            Version: 1.99.4-cdh5.3.0

        Connector with id 2:
            Name: generic-jdbc-connector
            Class: org.apache.sqoop.connector.jdbc.GenericJdbcConnector
            Version: 1.99.4-cdh5.3.0

List all links:

show link

Delete a specified link:

delete link --lid x

List all jobs:

show job

Delete a specified job:

delete job --jid 1
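
Before deleting anything it can help to inspect a single link or job first. This is a sketch that assumes show link and show job accept the same --lid/--jid flags that delete uses in this version:

show link --lid x
show job --jid 1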

Create a link based on the generic-jdbc-connector:

create link --cid 2
    Name: First Link
    JDBC Driver Class: com.mysql.jdbc.Driver
    JDBC Connection String: jdbc:mysql://hadoop000:3306/hive
    Username: root
    Password: ****
    JDBC Connection Properties:
    There are currently 0 values in the map:
    entry# protocol=tcp
    There are currently 1 values in the map:
    protocol = tcp
    entry#
    New link was successfully created with validation status OK and persistent id 3
show link
+----+-------------+-----------+---------+
| Id |    Name     | Connector | Enabled |
+----+-------------+-----------+---------+
| 3  | First Link  | 2         | true    |
+----+-------------+-----------+---------+
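
If the JDBC URL, username, or password needs to change later, the link can be edited in place instead of being deleted and recreated. A minimal sketch, assuming update link takes the same --lid flag and re-prompts for the fields shown above:

update link --lid 3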

Create a link based on the hdfs-connector:

create link --cid 1
    Name: Second Link
    HDFS URI: hdfs://hadoop000:8020
    New link was successfully created with validation status OK and persistent id 4
show link
+----+-------------+-----------+---------+
| Id |    Name     | Connector | Enabled |
+----+-------------+-----------+---------+
| 3  | First Link  | 2         | true    |
| 4  | Second Link | 1         | true    |
+----+-------------+-----------+---------+
show link --all
    2 link(s) to show:
    link with id 3 and name First Link (Enabled: true, Created by null at 15-2-2 11:28, Updated by null at 15-2-2 11:28)
    Using Connector id 2
      Link configuration
        JDBC Driver Class: com.mysql.jdbc.Driver
        JDBC Connection String: jdbc:mysql://hadoop000:3306/hive
        Username: root
        Password:
        JDBC Connection Properties:
          protocol = tcp
    link with id 4 and name Second Link (Enabled: true, Created by null at 15-2-2 11:32, Updated by null at 15-2-2 11:32)
    Using Connector id 1
      Link configuration
        HDFS URI: hdfs://hadoop000:8020

Create a job using the from/to link ids (here 3 is the JDBC link and 4 is the HDFS link):

create job -f 3 -t 4
    Creating job for links with from id 3 and to id 4
    Please fill following values to create new job object
    Name: Sqoopy

    From database configuration

    Schema name: hive
    Table name: TBLS
    Table SQL statement:
    Table column names:
    Partition column name:
    Null value allowed for the partition column:
    Boundary query: 

    ToJob configuration

    Output format:
      0 : TEXT_FILE
      1 : SEQUENCE_FILE
    Choose: 0
    Compression format:
      0 : NONE
      1 : DEFAULT
      2 : DEFLATE
      3 : GZIP
      4 : BZIP2
      5 : LZO
      6 : LZ4
      7 : SNAPPY
      8 : CUSTOM
    Choose: 0
    Custom compression format:
    Output directory: hdfs://hadoop000:8020/sqoop2/tbls_import_demo_sqoop1.99.4

    Throttling resources

    Extractors:
    Loaders:
    New job was successfully created with validation status OK  and persistent id 2

List all jobs:

show job
+----+--------+----------------+--------------+---------+
| Id |  Name  | From Connector | To Connector | Enabled |
+----+--------+----------------+--------------+---------+
| 2  | Sqoopy | 2              | 1            | true    |
+----+--------+----------------+--------------+---------+
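
Job settings such as the output directory or the extractor/loader counts can likewise be changed without recreating the job. A sketch, assuming update job accepts the same --jid flag as start/stop/status:

update job --jid 2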

Start the specified job (after it finishes, check the files on HDFS: hdfs dfs -ls hdfs://hadoop000:8020/sqoop2/tbls_import_demo_sqoop1.99.4/):

start job --jid 2

Check the execution status of the specified job:

status job --jid 2

Stop the specified job:

stop job --jid 2
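
Once the job reports success, the imported records themselves can be read back with the standard HDFS shell; the path is the output directory configured when the job was created:

hdfs dfs -cat hdfs://hadoop000:8020/sqoop2/tbls_import_demo_sqoop1.99.4/*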