这里使用的版本是cdh发行的pig-0.12.0-cdh5.1.2 下载地址点这里
1.Pig简介:
Pig是yahoo捐献给apache的一个项目,它是SQL-like语言,是在MapReduce上构建的一种高级查询语言,把一些运算编译进MapReduce模型的Map和Reduce中,并且用户可以定义自己的功能。这是Yahoo开发的又一个克隆Google的项目:Sawzall。
Pig是一个客户端应用程序,就算你要在Hadoop集群上运行Pig,也不需要在集群上装额外的东西
2.安装
解压下载完成的pig到指定目录,我这里将其解压到用户hadoop目录下
<span style="font-size:18px;">[email protected]:~/pig/conf$ tar -xzvf ~/Downloads/pig-0.12.0-cdh5.1.2.tar.gz -C ~/ </span>
为配置方便 这里将其建立软链接到pig
<span style="font-size:18px;">[email protected]:~/pig/conf$ ln -s pig-0.12.0-cdh5.1.2/ pig</span>
3.环境变量配置
通过编辑/etc/.profile文件或者是用户目录下面的~/.profile文件,我这里编辑hadoop用户目录下面的配置文件来配置
<span style="font-size:18px;">export PIG_HOME=/home/hadoop/pig export PIG_CLASSPATH=${HADOOP_HOME}/etc/hadoop export PATH=$PATH:$PIG_HOME/bin</span>
其中PIG_CLASSPATH指定了hadoop的配置文件路径,本地模式不要配置,如果需要访问hadoop的时候必须配置
通过source ~/.profile使配置生效
4.local运行
<span style="font-size:18px;">[email protected]:~/pig/conf$ pig -x local 2014-10-13 19:17:34,862 [main] INFO org.apache.pig.Main - Apache Pig version 0.12.0-cdh5.1.2 (rexported) compiled Aug 25 2014, 19:51:48 2014-10-13 19:17:34,863 [main] INFO org.apache.pig.Main - Logging error messages to: /home/hadoop/pig-0.12.0-cdh5.1.2/conf/pig_1413199054861.log 2014-10-13 19:17:34,905 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/hadoop/.pigbootup not found 2014-10-13 19:17:35,204 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS 2014-10-13 19:17:35,205 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address 2014-10-13 19:17:35,206 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:/// SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.3.0-cdh5.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/home/hadoop/hbase-0.98.1-cdh5.1.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 2014-10-13 19:17:35,732 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2014-10-13 19:17:35,918 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum 2014-10-13 19:17:35,922 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS grunt> </span>
出现grunt提示说明启动成功
5.hadoop运行
需要启动hadoop集群,pig会根据PIG_CLASSPATH的路径下面的配置文件自动识别hadoop集群
<span style="font-size:18px;">grunt> [email protected]:~/pig/conf$ pig 2014-10-13 19:18:36,511 [main] INFO org.apache.pig.Main - Apache Pig version 0.12.0-cdh5.1.2 (rexported) compiled Aug 25 2014, 19:51:48 2014-10-13 19:18:36,511 [main] INFO org.apache.pig.Main - Logging error messages to: /home/hadoop/pig-0.12.0-cdh5.1.2/conf/pig_1413199116510.log 2014-10-13 19:18:36,541 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/hadoop/.pigbootup not found 2014-10-13 19:18:36,849 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address 2014-10-13 19:18:36,849 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS 2014-10-13 19:18:36,849 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://192.168.118.168:9100 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.3.0-cdh5.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/home/hadoop/hbase-0.98.1-cdh5.1.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 2014-10-13 19:18:37,071 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2014-10-13 19:18:38,379 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS grunt> </span>
至此安装已经完成了。安装很简单但是功能不简单,使用会在后面一步步展开,并且可以使用pig为hdfs的数据建索引并推送到elasticsearch集群中。非常期待~
时间: 2024-10-29 13:43:27