如何快速把hdfs数据动态导入到hive表

1. hdfs 文件

?

{"retCode":1,"retMsg":"Success","data":[{"secID":"000001.XSHE","ticker":"000001","secShortName":"深发展A","exchangeCD":"XSHE","tradeDate":"1991-10-21","preClosePrice":24,"actPreClosePrice":24,"openPrice":24,"highestPrice":24.4,"lowestPrice":23.85,"closePrice":23.9,"turnoverVol":355700,"turnoverValue":8582250,"turnoverRate":0.0058,"accumAdjFactor":0.0117201563,"negMarketValue":1462295257.8,"marketValue":2145064267.7,"PB":2.2666,"isOpen":1},{"secID":"000002.XSHE","ticker":"000002","secShortName":"深万科A","exchangeCD":"XSHE","tradeDate":"1991-10-21","preClosePrice":8,"actPreClosePrice":8,"openPrice":8,"highestPrice":8,"lowestPrice":7.7,"closePrice":7.9,"turnoverVol":375000,"turnoverValue":2944200,"turnoverRate":0.0066,"accumAdjFactor":0.0117337592,"negMarketValue":451011000,"marketValue":615927450,"PB":1.0001,"isOpen":1},{"secID":"000004.XSHE","ticker":"000004","secShortName":"深安达A","exchangeCD":"XSHE","tradeDate":"1991-10-21","preClosePrice":7.25,"actPreClosePrice":7.25,"openPrice":7.25,"highestPrice":7.25,"lowestPrice":7.2,"closePrice":7.2,"turnoverVol":92000,"turnoverValue":665125,"turnoverRate":0.0078,"accumAdjFactor":0.2649084628,"negMarketValue":84977100,"marketValue":175500000,"PB":7.4199,"isOpen":1},{"secID":"000005.XSHE","ticker":"000005","secShortName":"深原野A","exchangeCD":"XSHE","tradeDate":"1991-10-21","preClosePrice":6.46,"actPreClosePrice":6.46,"openPrice":6.49,"highestPrice":6.49,"lowestPrice":6.49,"closePrice":6.49,"turnoverVol":94500,"turnoverValue":613305,"turnoverRate":0.0021,"accumAdjFactor":0.1016459912,"negMarketValue":287756865,"marketValue":584100000,"PB":9.1783,"isOpen":1},{"secID":"000009.XSHE","ticker":"000009","secShortName":"深宝安A","exchangeCD":"XSHE","tradeDate":"1991-10-21","preClosePrice":5.75,"actPreClosePrice":5.75,"openPrice":5.7,"highestPrice":5.8,"lowestPrice":5.65,"closePrice":5.75,"turnoverVol":767500,"turnoverValue":4382245,"turnoverRate":0.0084,"accumAdjFactor":0.1026538759,"negMarketValue":524745000,"marketValue":1293922500,"PB":2.4503,"isOpen":1},{"secID":"600601.XSHG","ticker":"600601","secShortName":"延中实业","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":65.7,"actPreClosePrice":65.7,"openPrice":66.4,"highestPrice":66.4,"lowestPrice":66.4,"closePrice":66.4,"turnoverVol":5333,"turnoverValue":354111,"dealAmount":81,"turnoverRate":0.0053,"accumAdjFactor":0.0010592167,"negMarketValue":66400000,"marketValue":66400000,"PB":40.7703,"isOpen":1},{"secID":"600602.XSHG","ticker":"600602","secShortName":"真空电子","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":640.6,"actPreClosePrice":640.6,"openPrice":647,"highestPrice":647,"lowestPrice":647,"closePrice":647,"turnoverVol":2589,"turnoverValue":1675083,"dealAmount":227,"turnoverRate":0.0051,"accumAdjFactor":0.0019640692,"negMarketValue":330552300,"marketValue":1294000000,"PB":287.6707,"isOpen":1},{"secID":"600651.XSHG","ticker":"600651","secShortName":"飞乐音响","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":119.6,"actPreClosePrice":119.6,"openPrice":120.8,"highestPrice":120.8,"lowestPrice":120.8,"closePrice":120.8,"turnoverVol":1102,"turnoverValue":133122,"dealAmount":14,"turnoverRate":0.0022,"accumAdjFactor":0.0008192464,"negMarketValue":60400000,"marketValue":60400000,"PB":39.6397,"isOpen":1},{"secID":"600652.XSHG","ticker":"600652","secShortName":"爱使电子","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":83.2,"actPreClosePrice":83.2,"openPrice":0,"highestPrice":0,"lowestPrice":0,"closePrice":83.2,"turnoverVol":0,"turnoverValue":0,"dealAmount":0,"turnoverRate":0,"accumAdjFactor":0.0006920481,"negMarketValue":22464000,"marketValue":22464000,"PB":33.8019,"isOpen":0},{"secID":"600653.XSHG","ticker":"600653","secShortName":"申华电工","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":103.4,"actPreClosePrice":103.4,"openPrice":104.4,"highestPrice":104.4,"lowestPrice":104.4,"closePrice":104.4,"turnoverVol":240,"turnoverValue":25056,"dealAmount":4,"turnoverRate":0.0005,"accumAdjFactor":0.0009289199,"negMarketValue":52200000,"marketValue":52200000,"PB":97.279,"isOpen":1},{"secID":"600654.XSHG","ticker":"600654","secShortName":"飞乐股份","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":633.2,"actPreClosePrice":633.2,"openPrice":639.5,"highestPrice":639.5,"lowestPrice":639.5,"closePrice":639.5,"turnoverVol":101,"turnoverValue":64590,"dealAmount":26,"turnoverRate":0.0048,"accumAdjFactor":0.000663586,"negMarketValue":13429500,"marketValue":134358950,"PB":282.9834,"isOpen":1},{"secID":"600656.XSHG","ticker":"600656","secShortName":"浙江凤凰","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":1242.9,"actPreClosePrice":1242.9,"openPrice":1255.3,"highestPrice":1255.3,"lowestPrice":1255.3,"closePrice":1255.3,"turnoverVol":140,"turnoverValue":175742,"dealAmount":7,"turnoverRate":0.0031,"accumAdjFactor":0.0007136096,"negMarketValue":56502308.3,"marketValue":321798665.6,"PB":-604.4303,"isOpen":1}]}
 
?

2. 创建 hive 临时表

CREATE EXTERNAL TABLE if not exists sensitop.equd_json_tmp (
  retCode string,
  retMsg string,
  data array<struct<
			  secID: string,
			  tradeDate: date,
			  ticker: string,
			  secShortName: string,
			  exchangeCD: string,
			  preClosePrice: double,
			  actPreClosePrice: double,
			  openPrice: double,
			  highestPrice: double,
			  lowestPrice: double,
			  closePrice: double,
			  turnoverVol: double,
			  turnoverValue: double,
			  dealAmount: int,
			  turnoverRate: double,
			  accumAdjFactor: double,
			  negMarketValue: double,
			  marketValue: double,
			  PE: double,
			  PE1: double,
			  PB: double,
			  isOpen: int>>)
ROW FORMAT SERDE ‘org.apache.hive.hcatalog.data.JsonSerDe‘
LOCATION ‘hdfs://hdfs1.wdp:8020/sensitop/finance/equd;
?

3. 创建 hive 表

需要把上面表里数组里的数据一条一条放入这个表:
CREATE TABLE if not exists sensitop.equd_h(
                        secID string,
                        ticker string,
                        secShortName string,
                        exchangeCD string,
                        tradeDate date,
                        preClosePrice double,
                        actPreClosePrice double,
                        openPrice double,
                        highestPrice double,
                        lowestPrice double,
                        closePrice double,
                        turnoverVol double,
                        turnoverValue double,
                        dealAmount int,
                        turnoverRate double,
                        accumAdjFactor double,
                        negMarketValue double,
                        marketValue double,
                        PE double,
                        PE1 double,
                        PB double,
                        isOpen int)
partitioned by (year string)
ROW FORMAT SERDE ‘org.apache.hive.hcatalog.data.JsonSerDe‘
?
注意:这里的字段顺序和上面临时表的顺序要一致。
?

4. 用 Partition 更新数据

insert overwrite table sensitop.equd_h
partition (year=‘2016‘)
select b.dt.secID,
b.dt.ticker,
b.dt.secShortName,
b.dt.exchangeCD,
b.dt.tradeDate,
b.dt.preClosePrice,
b.dt.actPreClosePrice,
b.dt.openPrice,
b.dt.highestPrice,
b.dt.lowestPrice,
b.dt.closePrice,
b.dt.turnoverVol,
b.dt.turnoverValue,
b.dt.dealAmount,
b.dt.turnoverRate,
b.dt.accumAdjFactor,
b.dt.negMarketValue,
b.dt.marketValue,
b.dt.PE,
b.dt.PE1,
b.dt.PB,
b.dt.isOpen
from sensitop.equd_json_tmp LATERAL VIEW explode(equd_json_tmp.data) b AS dt
where dt.tradedate >= ‘2016-01-01‘ and dt.tradedate <= ‘2016-12-31‘;
?
?

5. 用nifi实现动态插入数据

?
这里有二个分支,左边一个是每天20:00更新当年的partion; 右边一个是更新1990 到 2015 年的数据,而且只需要更新一次。
?
insert overwrite table sensitop.equd_h
partition (year=‘${year}‘)
select b.dt.secID,
b.dt.ticker,
b.dt.secShortName,
b.dt.exchangeCD,
b.dt.tradeDate,
b.dt.preClosePrice,
b.dt.actPreClosePrice,
b.dt.openPrice,
b.dt.highestPrice,
b.dt.lowestPrice,
b.dt.closePrice,
b.dt.turnoverVol,
b.dt.turnoverValue,
b.dt.dealAmount,
b.dt.turnoverRate,
b.dt.accumAdjFactor,
b.dt.negMarketValue,
b.dt.marketValue,
b.dt.PE,
b.dt.PE1,
b.dt.PB,
b.dt.isOpen
from sensitop.equd_json_tmp LATERAL VIEW explode(equd_json_tmp.data) b AS dt
where dt.tradedate >= ‘${year}-01-01‘ and dt.tradedate <= ‘${year}-12-31‘
?
?
?

?

时间: 2024-11-05 15:51:27

如何快速把hdfs数据动态导入到hive表的相关文章

mysql中把一个表的数据批量导入另一个表中(不同情况)

mysql中把一个表的数据批量导入另一个表中 不管是在网站开发还是在应用程序开发中,我们经常会碰到需要将MySQL某个表的数据批量导入到另一个表的情况,甚至有时还需要指定导入字段. 本文就将以MySQL数据库为例,介绍如何通过SQL命令行将某个表的所有数据或指定字段的数据,导入到目标表 中. 类别一. 如果两张张表(导出表和目标表)的字段一致,并且希望插入全部数据,可以用这种方法:(此方法只适合导出两表在同一database) INSERT INTO 目标表 SELECT * FROM 来源表;

好程序员大数据学习路线之hive表的查询

好程序员大数据学习路线之hive表的查询 1.join 查询 1.永远是小结果集驱动大结果集(小表驱动大表,小表放在左表). 2.尽量不要使用join,但是join是难以避免的. left join . left outer join . left semi join(左半开连接,只显示左表信息) hive在0.8版本以后开始支持left join left join 和 left outer join 效果差不多 hive的join中的on只能跟等值连接 "=",不能跟< &g

用sqoop将mysql的数据导入到hive表

一.先将mysql一张表的数据用sqoop导入到hdfs 1.1.先在mysql中准备一张测试用的表 mysql> desc user_info; +-----------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-----------+-------------+------+-----+---------+-------+ | id | int

ETL工具—Kettle数据的导入导出—Excel表到数据库

案例目的:当Excel中有几万条记录或者更多数据时,使用Kettle导入到数据库中.速度就能感到比复制粘贴快很多. 这里我会演示将"data.xlsx"表中数据有15万多条记录,使用Kettle导入到oracle数据库的testuser用户下. 准备工作:1.data.xlsx表: 2.testuser用户下要有一个与data.xlsx表结构(列数与字段类型)相同的表. 开始!!! 1.新建一个"转换",将"Excel输入"."表输出&

使用数据泵导入并重命名表名

--查询数据泵使用的目录 SQL> select * from dba_directories; --导出数据库使用的导出目录 EXPDP_DUMP /u02/expdp_dump --导出scott下表T_A的命令 expdp system/oracle directory=expdp_dump dumpfile=expdp_T_A.dmp logfile=expdp_T_A.log tables=scott.T_A --导出操作记录 [[email protected] ~]$ expdp

从Oracle导出数据并导入到Hive

1.配置源和目标的数据连接 源(oracle): 目标(Hive 2.1.1),需要事先将hive的驱动程序导入HHDI的lib目录中. Hive2.1.1需要的jar包如下:可根据自身情况更换hadoop和hive的jar包. hadoop-common-2.6.0.jar hive-common-2.1.0.jar hive-jdbc-2.1.0.jar hive-metastore-2.1.0.jar hive-serde-2.1.0.jar hive-service-2.1.0.jar

ORACLE使用数据泵导入导出部分表

1.导出名字为A的表(这里会导出SEQUENCES及FUNCTION等内容) expdp TEST/TEST@orcl schemas=TEST dumpfile=TEST.dmp DIRECTORY=db_bak EXCLUDE=TABLE:\"NOT IN\(\'A\')\" 2.导出名字不为A的表(去掉NOT与我们习惯的正好相反) expdp TEST/TEST@orcl schemas=TEST dumpfile=TEST.dmp DIRECTORY=db_bak EXCLUD

第3节 sqoop:4、sqoop的数据导入之导入数据到hdfs和导入数据到hive表

注意: (1)\001 是hive当中默认使用的分隔符,这个玩意儿是一个asc 码值,键盘上面打不出来 (2)linux中一行写不下,可以末尾加上 一些空格和 “ \ ”,换行继续写余下的命令: bin/sqoop import --connect jdbc:mysql://192.168.25.24:3306/userdb --username root --password admin --table \emp --fields-terminated-by '\001' \--hive-im

hive建表与数据的导入导出

建表: create EXTERNAL table tabtext(IMSI string,MDN string,MEID string,NAI string,DestinationIP string,DestinationPort string,SourceIP string,SourcePort string,PDSNIP string,PCFIP string,HAIP string,UserZoneID string,BSID string,Subnet string,ServiceOp