Using Variables in Hive

1. Hive configuration properties

(1) Command-line approach

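Hive configuration properties are stored in the hiveconf namespace, and properties in this namespace are both readable and writable. Insert '${hiveconf:variable_name}' into a query, and the value passed with hive -hiveconf replaces it when the query runs. For example, the query and the command that runs it look like this: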

[root]$cat test.sql    # show the contents of the file
SELECT * FROM ${hiveconf:tablename}
limit ${hiveconf:var_rows};
[root]$hive -hiveconf tablename='t1' -hiveconf var_rows=10 -f test.sql
Alternatively, let the shell expand its own variables before the query reaches Hive:
#!/bin/bash
tablename="student"
limitcount="8"

hive -S -e "use test; select * from ${tablename} limit ${limitcount};"
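
The two examples above rely on different substitution mechanisms: in test.sql Hive itself replaces ${hiveconf:...}, while in the bash script the shell expands ${tablename} inside the double-quoted string before hive starts. The sketch below only builds and prints the query strings (it does not call hive) to illustrate the quoting difference. The variable names query_expanded and query_literal are made up for this illustration.

```shell
#!/bin/sh
tablename="student"
limitcount="8"

# Double quotes: the shell substitutes its own variables first,
# so Hive would receive the final query text.
query_expanded="use test; select * from ${tablename} limit ${limitcount};"

# Single quotes: the shell leaves the text untouched, so the
# ${hiveconf:...} placeholders survive for Hive to resolve
# (for example via -hiveconf on the command line).
query_literal='select * from ${hiveconf:tablename} limit ${hiveconf:var_rows};'

echo "$query_expanded"
echo "$query_literal"
```

If a Hive placeholder such as ${hiveconf:tablename} is written inside double quotes, the shell tries to interpret it itself, so single quotes (or a script file run with -f) are the safer choice for Hive-side variables.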

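Note:

With multiple variables, each one needs its own -hiveconf flag.
No spaces are allowed around the equals sign in an assignment (for example, var_rows=10 must be written without spaces).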
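
As a minimal sketch of the two rules above, the hypothetical helper build_hive_command below assembles a command line with one -hiveconf flag per key=value pair and prints it instead of running it. The function name and its argument order are assumptions made for illustration; they are not part of Hive.

```shell
#!/bin/sh
# build_hive_command SCRIPT KEY=VALUE...
# Prints a hive command line; each KEY=VALUE pair gets its own -hiveconf flag.
# Hypothetical helper for illustration only.
build_hive_command() {
    script=$1
    shift
    cmd="hive"
    for pair in "$@"; do
        # Each pair is a single word such as var_rows=10,
        # with no spaces around the equals sign.
        cmd="$cmd -hiveconf $pair"
    done
    echo "$cmd -f $script"
}

build_hive_command test.sql tablename=t1 var_rows=10
# prints: hive -hiveconf tablename=t1 -hiveconf var_rows=10 -f test.sql
```

Printing the command first makes it easy to check the flags before executing anything against a real cluster.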

(2) HQL script approach

-- Set variables
SET startdate=20181201;
SET enddate=20181231;
SET event_name=('网商节_主会场', '网商节_微信分享','网商节_主会场','网商节_分会场');

-- Query
select
    event_name
    , count(1) pv
    , count(distinct ga_id) uv
from edw_log.user_trace_log_di
where dt between ${hiveconf:startdate} and ${hiveconf:enddate}
and  event_name  in ${hiveconf:event_name}
and data_source_id = '3'
group by event_name
;

2. Hive command-line variables

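Hive command-line variables are stored in the hivevar namespace, and variables in this namespace are both readable and writable. They are used much like Hive configuration properties, except that the query contains '${hivevar:variable_name}', and the namespace prefix "hivevar:" may be omitted. For example: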

[root]$cat test.sql
-- the hivevar: prefix below is optional
SELECT * FROM ${hivevar:tablename}
limit ${hiveconf:var_rows};
[root]$hive -hivevar tablename='t1' -hiveconf var_rows=10 -f test.sql

hivevar is the only namespace whose prefix can be omitted, so:

${hivevar:variable_name} is equivalent to ${variable_name}
Besides hive -hivevar, a variable can also be assigned with hive -d, where d is short for define. For example, the following three commands are equivalent:

[root]$hive -hivevar tablename='t1' -hiveconf var_rows=10 -f test.sql
[root]$hive -define tablename='t1' -hiveconf var_rows=10 -f test.sql
[root]$hive -d tablename='t1' -hiveconf var_rows=10 -f test.sql
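
To preview roughly what a script looks like after substitution, the following sketch emulates the replacement of one hivevar variable with sed. It only approximates Hive's behaviour and is not how Hive implements it; preview_hivevar is a hypothetical helper, and values containing / or & would need escaping first.

```shell
#!/bin/sh
# preview_hivevar NAME VALUE
# Reads query text on stdin and replaces both ${hivevar:NAME} and ${NAME}
# with VALUE, mirroring the rule that the hivevar: prefix is optional.
# Hypothetical helper for illustration only.
preview_hivevar() {
    name=$1
    value=$2
    # [$] matches a literal dollar sign; braces are literal in basic regex.
    sed -e "s/[$]{hivevar:${name}}/${value}/g" \
        -e "s/[$]{${name}}/${value}/g"
}

# Both spellings resolve to the same table name.
printf '%s\n' 'SELECT * FROM ${hivevar:tablename}' \
              'SELECT * FROM ${tablename}' | preview_hivevar tablename t1
```

Placeholders in other namespaces, such as ${hiveconf:var_rows}, are deliberately left untouched by this sketch.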

Other ways to substitute variables:
Setting variables in Hive queries with a shell script
Substituting variables in Hive queries with Python


Original article: https://www.cnblogs.com/shujuxiong/p/10265800.html

Time: 2024-10-12 15:00:28
