Hive CREATE TABLE statements for various data formats

XML format

CREATE EXTERNAL TABLE Gateway_pmsarisoap(
  BookingSoapLogID STRING,
  GuidNo STRING,
  SoapType STRING,
  SoapContent STRING,
  InsertDate STRING,
  SourceOpsType STRING)
PARTITIONED BY (
  `dt` string)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
  "column.xpath.BookingSoapLogID"="/HWSoapBase/BookingSoapLogID/text()",
  "column.xpath.GuidNo"="/HWSoapBase/GuidNo/text()",
  "column.xpath.SoapType"="/HWSoapBase/SoapType/text()",
  "column.xpath.SoapContent"="/HWSoapBase/SoapContent/*",
  "column.xpath.InsertDate"="/HWSoapBase/InsertDate/text()",
  "column.xpath.SourceOpsType"="/HWSoapBase/SourceOpsType/text()"
)
STORED AS
  INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION 'hdfs://ns1/wh/source/hw/Gateway/PmsARISoap'
TBLPROPERTIES (
  "xmlinput.start"="<HWSoapBase",
  "xmlinput.end"="</HWSoapBase>"
);
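
The XML SerDe is not bundled with Hive, so its JAR must be on the classpath before the table can be queried. A minimal usage sketch, assuming an illustrative JAR path and a daily partition value:

-- Assumed path and JAR name; point this at wherever the hivexmlserde JAR is deployed
ADD JAR /opt/hive/auxlib/hivexmlserde-1.0.5.3.jar;

-- Register a day's directory as a partition, then read a few parsed columns
ALTER TABLE Gateway_pmsarisoap ADD IF NOT EXISTS PARTITION (dt='2018-06-01');
SELECT BookingSoapLogID, SoapType, InsertDate
FROM Gateway_pmsarisoap
WHERE dt='2018-06-01'
LIMIT 10;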

JSON format

CREATE EXTERNAL TABLE QuhuhuGateway_pmsinvcountnotify(
  CountType string,
  Count string,
  HotelCode string,
  Start string,
  `End` string)
PARTITIONED BY (dt string)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 'hdfs://ns1/wh/source/hw/QuhuhuGateway/PmsInvCountNotify';
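
This JSON table is read through the LZO text input format, so both the JSON SerDe JAR and the hadoop-lzo library need to be available to Hive. A hedged sketch; the JAR names, paths, and partition value below are illustrative, not taken from the original post:

-- Assumed JAR names/paths for the Cloudera JSON SerDe and hadoop-lzo
ADD JAR /opt/hive/auxlib/hive-json-serde.jar;
ADD JAR /opt/hive/auxlib/hadoop-lzo-0.4.20.jar;

ALTER TABLE QuhuhuGateway_pmsinvcountnotify ADD IF NOT EXISTS PARTITION (dt='2018-06-01');
SELECT HotelCode, CountType, Count, Start, `End`
FROM QuhuhuGateway_pmsinvcountnotify
WHERE dt='2018-06-01'
LIMIT 10;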

ORC format

create external table BWAdmin_Log(
  `LogID` BIGINT,
  `AccountID` BIGINT,
  `VHotelID` BIGINT,
  `LogType` String,
  `LogComment` String,
  `OperateTime` INT
)
row format delimited
fields terminated by '\t'
STORED AS ORC
location
'hdfs://ns1/wh/source/bw/hotel/admin_log';
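
ORC files are binary, so an ORC table is normally populated with INSERT ... SELECT from another table rather than by loading raw text files into its directory. A minimal sketch, assuming a hypothetical text-format staging table with the same columns:

-- bwadmin_log_text is a hypothetical text-format staging table, not part of the original post
INSERT OVERWRITE TABLE BWAdmin_Log
SELECT LogID, AccountID, VHotelID, LogType, LogComment, OperateTime
FROM bwadmin_log_text;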

AVRO format

CREATE EXTERNAL TABLE `hotel_list`
PARTITIONED BY (`dt` string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
WITH SERDEPROPERTIES ('avro.schema.url'='hdfs://ns1/wh/config/schema/web/online/hotel_list.avsc')
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 'hdfs://ns1/wh/format/online_search';
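
The column list is omitted on purpose: the Avro SerDe derives the schema from the .avsc file referenced by avro.schema.url. Partitions are then attached to directories of Avro container files; the dt value and directory layout below are assumptions:

-- Assumed partition directory layout under the table location
ALTER TABLE hotel_list ADD IF NOT EXISTS PARTITION (dt='2018-06-01')
LOCATION 'hdfs://ns1/wh/format/online_search/dt=2018-06-01';

SELECT * FROM hotel_list WHERE dt='2018-06-01' LIMIT 10;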

LZO format

CREATE EXTERNAL TABLE online_test(
  sid int,
  pvid int,
  ts bigint)
PARTITIONED BY (dt string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 'hdfs://ns1/test/online';
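
LZO files only become splittable once they have been indexed (hadoop-lzo's com.hadoop.compression.lzo.DistributedLzoIndexer writes the .index files); without an index, each .lzo file is handled by a single mapper. Querying then follows the usual pattern; the partition value below is illustrative:

-- The partition directory is expected to contain .lzo files (plus .lzo.index files if indexed)
ALTER TABLE online_test ADD IF NOT EXISTS PARTITION (dt='2018-06-01');
SELECT sid, pvid, ts FROM online_test WHERE dt='2018-06-01' LIMIT 10;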

TEXT format

CREATE EXTERNAL TABLE `order_currenthis`(
  `orderid` string,
  `room` int)
PARTITIONED BY (`dt` string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormAT'
LOCATION 'hdfs://ns1/wh/format/otb/order_currenthis';
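
For a plain text table, a day's data can either be loaded from a local file or an existing HDFS directory can be registered as a partition. A minimal sketch; the file name and dt value are illustrative:

-- Option 1: copy a local tab-separated file into a new partition
LOAD DATA LOCAL INPATH '/tmp/order_currenthis_20180601.tsv'
INTO TABLE order_currenthis PARTITION (dt='2018-06-01');

-- Option 2: point a partition at data that is already on HDFS
ALTER TABLE order_currenthis ADD IF NOT EXISTS PARTITION (dt='2018-06-01');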

Original post: http://blog.51cto.com/10120275/2119243
