Integrating Hive with HBase
Configuration
Replace the HBase jar in Hive's lib/ with the jar from the actually installed HBase:
cd /opt/hive/lib/
rm -rf hbase-0.92*                # remove the jar that ships with Hive
cp /opt/hbase/hbase-0.94.2*.jar .
ls hbase-0.94.2*                  # verify the new jar is in place
Replace the zookeeper jar in Hive's lib/ with the one from HBase's lib/ (same procedure as above):
cd /opt/hive/lib/
rm -rf zookeeper-*.jar
cp /opt/hbase/lib/zookeeper-3.4.3.jar .
Add the following to hive-site.xml:
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///opt/hive/lib/hive-hbase-handler-0.9.0.jar,file:///opt/hive/lib/hbase-0.94.2.jar,file:///opt/hive/lib/zookeeper-3.4.3.jar</value>
</property>
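The jar versions in this value are installation-specific. As a hedged sketch, the comma-separated list can be assembled from whatever jars are actually present instead of hard-coding versions; the /tmp/demo-lib directory and jar files below are hypothetical stand-ins for /opt/hive/lib:

```shell
# Hypothetical demo directory standing in for /opt/hive/lib
mkdir -p /tmp/demo-lib
touch /tmp/demo-lib/hive-hbase-handler-0.9.0.jar \
      /tmp/demo-lib/hbase-0.94.2.jar \
      /tmp/demo-lib/zookeeper-3.4.3.jar

# Prefix each jar path with file:// and join with commas,
# the format hive.aux.jars.path expects
AUX_JARS=$(ls /tmp/demo-lib/*.jar | sed 's|^|file://|' | paste -sd, -)
echo "$AUX_JARS"
```

Running the same pipeline against the real /opt/hive/lib yields a value that can be pasted into the property above.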
Running
cd /opt/hive/bin
./hive -hiveconf hbase.master=master:60000
The workflow is as follows:
Start HBase first; only then can the table be created in Hive.
After the table is created in Hive, add data on the HBase side.
=========== Start HBase and add data to it ==============
$ cd /opt/hbase/bin
$ ./start-hbase.sh
$ ./hbase shell
Add data in HBase (the column qualifier must match the f:value mapping used below):
hbase(main):004:0> put 'htest','1','f:value','test'
hbase(main):005:0> scan 'htest'
=============== Start Hive and create the table ===========
cd /opt/hive/bin
./hive -hiveconf hbase.master=master:60000
hive> create table htest(key int, value string) stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties ('hbase.columns.mapping' = ':key,f:value') tblproperties ('hbase.table.name' = 'htest');
hive> show tables;
hive> select * from htest;
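One caveat worth noting: the HBaseStorageHandler distinguishes Hive-managed tables from pre-existing HBase tables. Since htest was already created and populated on the HBase side above, mapping it from Hive would normally use CREATE EXTERNAL TABLE; a plain CREATE TABLE expects the HBase table not to exist yet. A sketch, reusing the same names and column mapping as above:

```sql
CREATE EXTERNAL TABLE htest(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,f:value')
TBLPROPERTIES ('hbase.table.name' = 'htest');
```

Dropping an external table removes only the Hive metadata; the underlying HBase table and its data are left intact.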
Installing Pig
Extract and install
tar -zxvf pig-0.10.0.tar.gz -C /opt/
cd /opt/
mv pig-0.10.0/ pig
chown -R hadoop:hadoop pig
Configuration
Since pig/conf contains no *-env.sh file, edit the pig startup script in pig/bin instead:
cd /opt/pig/bin
vi pig
Add the following lines:
export JAVA_HOME=/usr/program/jdk1.6.0_13/
export PIG_INSTALL=/opt/pig
export HADOOP_INSTALL=/home/hadoop/hadoop-env/hadoop-1.0.1/
export PATH=$PIG_INSTALL/bin:$HADOOP_INSTALL/bin:$PATH
export PIG_CLASSPATH=$HADOOP_INSTALL/conf
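These exports can be sanity-checked in a scratch shell. Note the `$` before each variable name when extending PATH; without it, the literal text rather than the variable's value ends up on the path. The paths below mirror the ones used here:

```shell
PIG_INSTALL=/opt/pig
HADOOP_INSTALL=/home/hadoop/hadoop-env/hadoop-1.0.1
# $PIG_INSTALL and $HADOOP_INSTALL expand to their values;
# a bare HADOOP_INSTALL/bin would be taken literally
PATH=$PIG_INSTALL/bin:$HADOOP_INSTALL/bin:$PATH
echo "$PATH" | cut -d: -f1-2
```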
Execution
Start Hadoop first, then start Pig:
cd /opt/pig/bin
./pig
======= Upload data to HDFS =========================
Upload the local file to the Hadoop filesystem:
hadoop fs -copyFromLocal /opt/data/test.txt /opt/data/test.txt
hadoop fs -ls /opt/data/test.txt
hadoop fs -cat /opt/data/test.txt
========= Display the data in Pig ========================
grunt> A = LOAD '/opt/data/test.txt' USING PigStorage('#') AS (id, name);
grunt> B = FOREACH A GENERATE name;
grunt> STORE B INTO '/opt/data/dist.txt' USING PigStorage('\t');
grunt> dump A;
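The transformation in this grunt session (split each line on '#', keep the name field) can be sketched with plain shell tools to check the input format before running Pig. The /tmp paths and the two sample rows below are hypothetical:

```shell
# Hypothetical sample in the '#'-delimited format PigStorage('#') expects
printf '1#alice\n2#bob\n' > /tmp/test.txt

# Shell equivalent of: B = FOREACH A GENERATE name;
awk -F'#' '{print $2}' /tmp/test.txt > /tmp/dist.txt
cat /tmp/dist.txt
```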
Common Pig Latin commands:
LOAD ... USING PigStorage('') ... AS ...;
FOREACH ... GENERATE ...;
FILTER ... BY ...;
DUMP ...;
STORE ... INTO ...;
GROUP ... BY ...;
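The grunt session above already exercises LOAD, FOREACH, STORE, and DUMP. A short sketch tying the remaining operators to the same (id, name) schema; the field types and the id > 1 condition are assumed for illustration:

```pig
A = LOAD '/opt/data/test.txt' USING PigStorage('#') AS (id:int, name:chararray);
C = FILTER A BY id > 1;   -- keep only rows whose id exceeds 1
D = GROUP C BY name;      -- one bag of matching rows per distinct name
DUMP D;
```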
View the Hive data warehouse in HDFS:
$ hadoop fs -ls /user/hive/warehouse/my