一、问题
hive如何将
a 1,2,3 b 4,7 c 5
转化成为:
a 1 a 2 a 3 b 4 b 7 c 5
二、原始数据
cat row_column.txt a 1,2,3 b 4,7 c 5
三、解决方案
3.1 遍历每一列
3.1.1 创建表
-- 创建表 create table tmp.row_column ( col1 string, col3 string ) row format delimited fields terminated by ‘\t‘ stored as textfile; -- 载入数据 load data local inpath ‘/tmp/row_column.txt‘ into table row_column;
3.1.2 查看数据:
hive> select * from row_column; OK a 1,2,3 b 4,7 c 5
3.1.3 遍历每一列
select col1,name from tmp.row_column lateral view explode(split(col3,‘,‘)) col3 as name; --------------------------------------------------------------- Total MapReduce CPU Time Spent: 2 seconds 20 msec OK a 1 a 2 a 3 b 4 b 7 c 5
3.2 数组遍历
3.2.1 创建表
create table tmp.row_column_array ( col1 string, col3 array<int> ) row format delimited fields terminated by ‘\t‘ collection items terminated by ‘,‘ stored as textfile;
3.2.2 加载数据
load data local inpath ‘/tmp/row_column.txt‘ into table tmp.row_column_array;
3.2.3 查看数据
hive> select * from tmp.row_column_array; OK a [1,2,3] b [4,7] c [5]
3.2.4 查看每一列
select col1,name from tmp.row_column_array lateral view explode(col3) col3 as name;
3.2.5 结果
a 1 a 2 a 3 b 4 b 7 c 5
四、补充
查看使用逗号分割的列
select t.list[0],t.list[1],t.list[2] from ( select (split(col3,‘,‘)) list from tmp.row_column)t; Total MapReduce CPU Time Spent: 1 seconds 740 msec OK 1 2 3 4 7 NULL 5 NULL NULL Time taken: 15.264 seconds, Fetched: 3 row(s)
查看长度
select col1, size(split(col3,‘,‘)) list from tmp.row_column; Total MapReduce CPU Time Spent: 1 seconds 690 msec OK a 3 b 2 c 1
时间: 2024-10-08 16:38:35