Creating a Matching Hive Table from JSON

  This article shows a Scala method for turning a JSON string into a Hive table definition. Since it is fairly simple, only the code is posted, without much explanation. Feel free to leave a comment if you have questions.

package com.gabry.hive

import org.json4s._
import org.json4s.native.JsonMethods._
import scala.io.Source
class Json2Hive{
  /**
    * sealed abstract class JValue
    *case object JNothing extends JValue // 'zero' for JValue
    *case object JNull extends JValue
    *case class JString(s: String) extends JValue
    *case class JDouble(num: Double) extends JValue
    *case class JDecimal(num: BigDecimal) extends JValue
    *case class JInt(num: BigInt) extends JValue
    *case class JBool(value: Boolean) extends JValue
    *case class JObject(obj: List[JField]) extends JValue
    *case class JArray(arr: List[JValue]) extends JValue
    *type JField = (String, JValue)
    *create table student_test(id INT, info struct< name:string,age:INT >)
    *jsonString:{ "people_type":1,"people":{"person_id": 5,"test_count": 5,"para":{"name":"jack","age":6}}}
    */

  /* Level 2 is a top-level column ("name TYPE"); deeper levels are struct fields ("name:TYPE") */
  private def fieldDelimiter(level:Int) = if ( level == 2 ) " " else ":"
  private def decodeJson(jv: Any,level:Int,hql:StringBuilder) :Unit = {
    jv match {
      case js:JString => hql.append(fieldDelimiter(level)+"string,")
      case jdo:JDouble => hql.append(fieldDelimiter(level)+"double,")
      case jde:JDecimal => hql.append(fieldDelimiter(level)+"decimal,")
      case ji:JInt => hql.append(fieldDelimiter(level)+"bigint,")
      case jb:JBool => hql.append(fieldDelimiter(level)+"int,") // booleans stored as int here; Hive's native BOOLEAN would also work
      case jf:JField=>
        hql.append(jf._1)
        decodeJson(jf._2,level+1,hql)
      case ja:JArray=>
          // A Hive array is typed by its element; use the first element as the template.
          // (The original appended the integer `level` into the DDL, which produced invalid HQL.)
          val elem = new StringBuilder()
          ja.arr.headOption.foreach(decodeJson(_,level+1,elem))
          val elemType = elem.toString().dropWhile(c => c == ' ' || c == ':').stripSuffix(",")
          hql.append(fieldDelimiter(level)+"array<"+elemType+">,")
      case jo:JObject=>
          if (level !=0) hql.append(" struct<")
          jo.obj.foreach(decodeJson(_,level+1,hql))
          if ( hql.endsWith(",") ) hql.deleteCharAt(hql.length-1)
          if (level !=0) hql.append(">,")
      case JNull=> hql.append(fieldDelimiter(level)+"string,")
      case _ => println(jv) // unhandled JSON value (e.g. JNothing); log and skip
    }
  }
  def toHive(jsonStr:String,tableName:String):String = {
    val jsonObj = parse(jsonStr)
    val hql = new StringBuilder()
    decodeJson(jsonObj,0,hql)
    "create table %s ( %s )".format(tableName,hql.toString())
  }
}
object Json2Hive{
  val json2hive = new Json2Hive()
  def main (args :Array[String]) : Unit = {
    if ( args.length != 2 ) {
      println("usage : json2hive jsonFile hiveTableName")
      sys.exit(1) // without exiting, args(0) below would throw ArrayIndexOutOfBoundsException
    }
    val jsonFile = args(0)
    val hiveTableName = args(1)
    //val jsonstr ="{ \"people_type\":0,\"people_num\":0.1,\"people\":{\"person_id\": 5,\"test_count\": 5,\"para\":{\"name\":\"jack\",\"age\":6}},\"gender\":1}"
    //val jsonstr ="{ \"people_type\":0,\"object\":{\"f1\":1,\"f2\":1},\"gender\":1}"
    /* A JSON string is awkward to pass as a command-line argument, so a JSON file is used instead */
    val file = Source.fromFile(jsonFile,"UTF-8")
    /* Convert each JSON line in the file into the corresponding Hive DDL */
    file.getLines().foreach(line=>println(json2hive.toHive(line,hiveTableName)))
    file.close()
  }
}

  

Sample output:

create table example ( people_type bigint,people_num double,people struct<person_id:bigint,test_count:bigint,para struct<name:string,age:bigint>>,gender bigint )
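The recursion above can be illustrated without json4s, using a minimal hand-rolled ADT (`JStr`, `JNum`, and `JObj` are hypothetical names for this sketch, not part of json4s) that mirrors the `JValue` hierarchy and the same top-level-column vs. struct-field distinction:

```scala
// Minimal sketch of the JSON-to-Hive type mapping, without json4s.
sealed trait JsonVal
case class JStr(s: String) extends JsonVal
case class JNum(n: BigInt) extends JsonVal
case class JObj(fields: List[(String, JsonVal)]) extends JsonVal

object MiniJson2Hive {
  // Map a JSON value to its Hive column type; nested objects become struct<...>
  def hiveType(jv: JsonVal): String = jv match {
    case _: JStr => "string"
    case _: JNum => "bigint"
    case JObj(fields) =>
      fields.map { case (name, v) => s"$name:${hiveType(v)}" }
            .mkString("struct<", ",", ">")
  }

  // Top-level columns use "name TYPE"; struct fields above use "name:TYPE"
  def toHive(obj: JObj, table: String): String = {
    val cols = obj.fields.map { case (name, v) => s"$name ${hiveType(v)}" }.mkString(",")
    s"create table $table ( $cols )"
  }
}

// Example: {"people_type":1,"para":{"name":"jack","age":6}}
val ddl = MiniJson2Hive.toHive(
  JObj(List(
    "people_type" -> JNum(1),
    "para" -> JObj(List("name" -> JStr("jack"), "age" -> JNum(6)))
  )),
  "example"
)
println(ddl)
// prints: create table example ( people_type bigint,para struct<name:string,age:bigint> )
```

The `" "` vs. `":"` split here plays the same role as `fieldDelimiter` in the full class: Hive separates a column name from its type with a space, but struct members with a colon.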

Date: 2024-08-08 22:01:19
