Loading Data From Oracle To Hive By ODI 12c

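This article describes how to use ODI to synchronize data from an Oracle table into Hive.
1. Preparation
Install Oracle Big Data Connectors on every node of the Hadoop cluster. The components are shown in the figure below:

Only two of them need to be installed here: Oracle Loader for Hadoop (oraloader) and Oracle SQL Connector for Hadoop Distributed File System (oraosch). Installation is simple: unpack the archives and they are ready to use. (ODI, oraosch, and oraloader were all installed as the oracle user.)
2. Create the target table
Create the target table in Hive as follows: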

CREATE TABLE `RHNPACKAGE2`(
  `id` bigint,
  `org_id` bigint,
  `name_id` bigint,
  `evr_id` bigint,
  `package_arch_id` bigint,
  `package_group` bigint,
  `rpm_version` string,
  `description` string,
  `summary` string,
  `package_size` bigint,
  `payload_size` bigint,
  `installed_size` bigint,
  `build_host` string,
  `build_time` timestamp,
  `source_rpm_id` bigint,
  `checksum_id` bigint,
  `vendor` string,
  `payload_format` string,
  `compat` bigint,
  `path` string,
  `header_sig` string,
  `copyright` string,
  `cookie` string,
  `last_modified` timestamp,
  `created` timestamp,
  `header_start` bigint,
  `header_end` bigint,
  `modified` timestamp);

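Before building the mapping, it can help to confirm that the table exists with the expected columns. The statements below are a minimal sketch, assuming they are run from the Hive CLI or Beeline in the same database where RHNPACKAGE2 was created:

```sql
-- List the columns and data types of the target table.
DESCRIBE RHNPACKAGE2;

-- Show the full DDL, including storage format and location.
SHOW CREATE TABLE RHNPACKAGE2;
```
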
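3. Create the mapping
The Oracle and Hive models were created earlier, so the mapping is built directly on them, as shown in the figure below:

Integration Type properties:

Join properties:

Filter properties:

LKM properties:

IKM properties:

If TRUNCATE is set to True, the target table is emptied before each load. The default is false.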
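As a rough illustration of what enabling TRUNCATE does to the target, the HiveQL below empties the table by hand. It is not the code that the IKM generates, only a sketch of the equivalent effect, and it assumes RHNPACKAGE2 is a managed (non-external) table, as in the DDL above:

```sql
-- Remove all rows from the target table while keeping its definition.
TRUNCATE TABLE RHNPACKAGE2;
```

With TRUNCATE left at its default of false, existing rows stay in place and each run adds to them.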
4. Run the mapping
The result is shown in the figure below:
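
Beyond the ODI execution result, the load can be spot-checked from Hive itself. The queries below are a sketch under the assumption that the mapping wrote to RHNPACKAGE2 in the current database. The row count can then be compared by hand with a count on the Oracle source table.

```sql
-- Count the rows now present in the target table.
SELECT COUNT(*) FROM RHNPACKAGE2;

-- Inspect a few rows to check that columns were mapped as expected.
SELECT id, name_id, build_time, last_modified
FROM RHNPACKAGE2
LIMIT 10;
```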

Original article: http://blog.51cto.com/candon123/2088516

Time: 2024-11-03 01:37:08
