[Gandalf] A shell script for synchronizing and updating Hive data

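Introduction:

The previous article, "[Gandalf] Using Sqoop 1.4.4 to import incremental data from Oracle 10g into Hive 0.13.1 and update the main Hive table" (http://blog.csdn.net/u010967382/article/details/38735381), described the principle behind incrementally updating a Hive table, together with the Sqoop and Hive commands involved. This article builds on that one and wraps the procedure in a shell script that can be used in a real project with minor modifications.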

***Reprints are welcome; please credit the source***

http://blog.csdn.net/u010967382/article/details/38824327

Shell script

#!/bin/bash

#Please set the synchronization interval, in hours.

update_interval=24

#Please set the RDBMS connection params

rdbms_connstr="jdbc:oracle:thin:@192.168.0.147:1521:ORCLGBK"

rdbms_username="SP"

rdbms_pwd="fulong"

rdbms_table="OMP_SERVICE"

rdbms_columns="ID,SERVICE_NAME,SERVICE_PROCESS,CREATE_TIME,ENABLE_ORG,ENABLE_PLATFORM,IF_DEL"

#Please set the hive params

hive_increment_table="SERVICE_TMP"

hive_full_table="service_all"

#---------------------------------------------------------

#Import increment data in RDBMS into Hive

#GNU date is assumed; the date format matches the to_date() mask used below

enddate=$(date '+%Y-%m-%d %H:%M:%S')

startdate=$(date '+%Y-%m-%d %H:%M:%S' -d "-${update_interval} hours")

"$SQOOP_HOME/bin/sqoop" import --connect "${rdbms_connstr}" --username "${rdbms_username}" --password "${rdbms_pwd}" --table "${rdbms_table}" --columns "${rdbms_columns}" --where "CREATE_TIME > to_date('${startdate}','yyyy-mm-dd hh24:mi:ss') and CREATE_TIME < to_date('${enddate}','yyyy-mm-dd hh24:mi:ss')" --hive-import --hive-overwrite --hive-table "${hive_increment_table}"

#---------------------------------------------------------

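#Update the old full data table to latest status

#Rows from the increment table replace rows with the same key in the full table.

#The join key (service_code here) must exist in both Hive tables; adjust it to the table's actual key column.

"$HIVE_HOME/bin/hive" -e "insert overwrite table ${hive_full_table} select * from ${hive_increment_table} union all select a.* from ${hive_full_table} a left outer join ${hive_increment_table} b on a.service_code = b.service_code where b.service_code is null;"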
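The time window used by the import step can be built and inspected separately before it is handed to Sqoop. The following is a minimal sketch, not part of the original script: `build_window_clause` is a hypothetical helper name, and GNU `date` is assumed (the `-d` option with a relative offset is not portable to BSD/macOS).

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): build the Oracle WHERE
# clause for a time window.
# Arguments: column name, window start, window end ('YYYY-MM-DD HH:MM:SS').
build_window_clause() {
  local column="$1" start="$2" end="$3"
  local fmt="yyyy-mm-dd hh24:mi:ss"
  printf "%s > to_date('%s','%s') and %s < to_date('%s','%s')" \
    "$column" "$start" "$fmt" "$column" "$end" "$fmt"
}

# Window bounds, assuming GNU date for the relative "-N hours" offset.
update_interval=24
enddate=$(date '+%Y-%m-%d %H:%M:%S')
startdate=$(date '+%Y-%m-%d %H:%M:%S' -d "-${update_interval} hours")

# Print the clause so it can be checked before passing it to sqoop --where.
build_window_clause "CREATE_TIME" "$startdate" "$enddate"
echo
```

Printing the clause first makes it easy to confirm that the date strings and the `to_date` mask use the same separators.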

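The merge performed by the Hive statement (all increment rows, plus the full-table rows whose key does not appear in the increment) can be tried out on plain text files. This is only a toy illustration under stated assumptions: `merge_by_key` is a hypothetical name, the data is CSV rather than Hive tables, and the first field plays the role of the join key.

```shell
#!/bin/bash
# Toy illustration of the merge semantics using CSV files instead of Hive tables.
# merge_by_key INCREMENT_FILE FULL_FILE: the first comma-separated field is the key.
merge_by_key() {
  local increment="$1" full="$2"
  # While reading the increment file: print each row and remember its key.
  # While reading the full file: print only rows whose key was not seen.
  awk -F',' 'FILENAME == ARGV[1] { seen[$1] = 1; print; next }
             !($1 in seen)' "$increment" "$full"
}

# Demo data (made up for illustration).
tmpdir=$(mktemp -d)
printf '1,old-a\n2,old-b\n3,old-c\n' > "$tmpdir/full.csv"
printf '2,new-b\n4,new-d\n' > "$tmpdir/increment.csv"

# Key 2 is replaced by its new version, key 4 is added, keys 1 and 3 are kept.
merge_by_key "$tmpdir/increment.csv" "$tmpdir/full.csv"
rm -rf "$tmpdir"
```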
Note:

To run a Hive HQL statement from a shell script, use the form hive -e "select ..."
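Since `hive -e` takes the whole statement as one quoted string, it can help to assemble that string first and look at it before running it. A minimal sketch, assuming a hypothetical helper named `build_merge_hql`; the table and key names passed in below are the ones used in the script above.

```shell
#!/bin/bash
# Hypothetical helper: assemble the merge statement so it can be inspected
# before being handed to `hive -e`.
# Arguments: full table name, increment table name, join key column.
build_merge_hql() {
  local full="$1" increment="$2" key="$3"
  cat <<EOF
insert overwrite table ${full}
select * from ${increment}
union all
select a.* from ${full} a
left outer join ${increment} b on a.${key} = b.${key}
where b.${key} is null;
EOF
}

hql=$(build_merge_hql "service_all" "SERVICE_TMP" "service_code")
echo "$hql"
# To execute the statement instead of printing it:
#   "$HIVE_HOME/bin/hive" -e "$hql"
```

Keeping the statement in a variable also means the double quotes around it are written once, at the `hive -e` call.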

Cron job

Add a scheduled task that runs the script every day at 2 a.m.

0 2 * * * /home/fulong/shell/dataSync.sh
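When a script is run from cron, a slow run can still be in progress when the next one starts. One possible guard is a small lock wrapper. The sketch below is an assumption, not part of the original setup: `run_locked` is a hypothetical name, and it relies on `mkdir` failing when the directory already exists.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command only if no earlier run still holds the lock.
# mkdir either creates the directory or fails, so it works as a simple lock.
run_locked() {
  local lockdir="$1"; shift
  if ! mkdir "$lockdir" 2>/dev/null; then
    echo "previous run still in progress, skipping" >&2
    return 1
  fi
  "$@"
  local status=$?
  rmdir "$lockdir"
  return "$status"
}

# Demo with a throwaway lock location; a real setup would use a fixed path
# and call the sync script instead of echo.
demo_dir=$(mktemp -d)
run_locked "$demo_dir/dataSync.lock" echo "sync would run here"
rmdir "$demo_dir"
```

In a crontab entry the output can also be appended to a log file of your choosing with `>> some.log 2>&1`, which makes failed runs easier to diagnose.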

Time: 2025-01-10 01:14:54
