I. Download Hadoop
Version 2.7.6 is used because that is the version running in the company's production environment.
cd /opt
wget http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.7.6/hadoop-2.7.6.tar.gz
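The later steps refer to $HADOOP_HOME, so the archive has to be unpacked before anything can be configured. A minimal sketch of that step follows; the helper name unpack_hadoop, the use of /opt/hadoop-2.7.6 as HADOOP_HOME, and the assumption that the archive's top-level directory matches the tarball name are illustrative, not taken from the original post.

```shell
# Hypothetical helper: unpack a Hadoop tarball into a target directory
# and print export lines one might add to ~/.bashrc or /etc/profile.
unpack_hadoop() {
    tarball=$1      # e.g. /opt/hadoop-2.7.6.tar.gz
    dest=$2         # e.g. /opt
    tar -xzf "$tarball" -C "$dest" || return 1
    # Assumption: the top-level directory is named after the tarball.
    home="$dest/$(basename "$tarball" .tar.gz)"
    echo "export HADOOP_HOME=$home"
    echo 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin'
}

# Usage (assumed paths):
#   unpack_hadoop /opt/hadoop-2.7.6.tar.gz /opt
```

Printing the export lines instead of writing them to a profile file keeps the helper free of side effects beyond the extraction itself.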
II. Configuration files
Reference documentation: https://hadoop.apache.org/docs/r2.7.6
Seven files in the $HADOOP_HOME/etc/hadoop directory need to be configured.
1. core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://pangu10:9000</value>
    <description>NameNode URI; the host and port HDFS exposes to clients</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hdfs/tmp</value>
    <description>Delete this temporary directory before reformatting HDFS (for example, after adding a DataNode)</description>
  </property>
</configuration>
2. hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/opt/hdfs/name</value>
    <description>Where the NameNode stores the HDFS namespace metadata</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/opt/hdfs/data</value>
    <description>Physical storage location of data blocks on each DataNode</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Number of HDFS block replicas; defaults to 3 if not set</description>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>pangu11:50090</value>
    <description>HTTP address (host and port) of the SecondaryNameNode</description>
  </property>
</configuration>
3. yarn-site.xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>pangu10</value>
    <description>Hostname of the machine running the ResourceManager</description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description>Auxiliary service run on each NodeManager; must be set to mapreduce_shuffle for MapReduce jobs to run</description>
  </property>
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
</configuration>
4. mapred-site.xml (the 2.7.6 distribution ships only mapred-site.xml.template; copy it to mapred-site.xml first)
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>Run MapReduce on the YARN framework</description>
  </property>
</configuration>
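To double-check what actually ended up in a site file, a property value can be pulled out with standard tools. The following is a rough sketch, not a real XML parser: it assumes each <name> and <value> element sits on its own line (the usual layout of Hadoop site files), and the helper name get_prop is hypothetical.

```shell
# Hypothetical helper: print the <value> of a named property in a
# Hadoop *-site.xml file. Assumes one XML element per line.
get_prop() {
    file=$1
    key=$2
    awk -v key="$key" '
        # Remember when the wanted <name> line has been seen.
        index($0, "<name>" key "</name>") { found = 1; next }
        # The first <value> line after it holds the setting.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$file"
}

# Usage:
#   get_prop "$HADOOP_HOME/etc/hadoop/core-site.xml" fs.defaultFS
```

On a machine where Hadoop is already installed, hdfs getconf -confKey fs.defaultFS reports the value Hadoop itself resolves, which is the more reliable check.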
5. slaves
pangu10
pangu11
pangu12
6. yarn-env.sh
Find line 23:
# export JAVA_HOME=/home/y/libexec/jdk1.6.0/
and replace it with:
export JAVA_HOME=/opt/jdk1.8.0_181/
7. hadoop-env.sh
Find line 25:
export JAVA_HOME=${JAVA_HOME}
and replace it with:
export JAVA_HOME=/opt/jdk1.8.0_181/
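The two JAVA_HOME edits can also be scripted instead of located by line number, which helps if the line numbers differ in another release. A sketch using sed; the helper name set_java_home is hypothetical, and it assumes each file contains exactly one export JAVA_HOME= line, possibly commented out.

```shell
# Hypothetical helper: point the JAVA_HOME export in an *-env.sh file
# at a given JDK, whether or not the line is currently commented out.
set_java_home() {
    file=$1
    jdk=$2
    # Match "export JAVA_HOME=..." with an optional leading "#".
    # Writing back with cat keeps the original file's permissions.
    sed "s|^#\{0,1\} *export JAVA_HOME=.*|export JAVA_HOME=$jdk|" "$file" > "$file.tmp" &&
        cat "$file.tmp" > "$file" &&
        rm -f "$file.tmp"
}

# Usage:
#   set_java_home "$HADOOP_HOME/etc/hadoop/yarn-env.sh"   /opt/jdk1.8.0_181/
#   set_java_home "$HADOOP_HOME/etc/hadoop/hadoop-env.sh" /opt/jdk1.8.0_181/
```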
III. Copy to the slaves
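This step amounts to making the same Hadoop directory, including the edited configuration files, available on every other node. A sketch using scp follows. It assumes passwordless SSH between the nodes, identical paths on all machines, and Hadoop unpacked at /opt/hadoop-2.7.6; the helper name copy_to_slaves and the RUN switch are hypothetical.

```shell
# Hypothetical helper: print (or, with RUN=1, execute) one scp command
# per target host, copying a local Hadoop directory into the same
# parent directory on that host.
copy_to_slaves() {
    src=$1          # e.g. /opt/hadoop-2.7.6
    shift
    for host in "$@"; do
        dest="$host:$(dirname "$src")/"
        if [ "${RUN:-0}" = 1 ]; then
            scp -r "$src" "$dest" || return 1
        else
            echo "scp -r $src $dest"    # dry run: only show the command
        fi
    done
}

# Usage: pangu10 is the master here, so only the other hosts are targets.
#   copy_to_slaves /opt/hadoop-2.7.6 pangu11 pangu12          # dry run
#   RUN=1 copy_to_slaves /opt/hadoop-2.7.6 pangu11 pangu12    # really copy
```

The JDK referenced by JAVA_HOME (/opt/jdk1.8.0_181/) also has to exist at the same path on each node.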
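IV. Format HDFS
Run the following command in a shell:
hadoop namenode -format
Formatting succeeded if the output includes log lines like the following, in particular the "Storage directory /opt/hdfs/name has been successfully formatted." line: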
18/10/12 12:38:33 INFO util.GSet: capacity = 2^15 = 32768 entries
18/10/12 12:38:33 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1164998719-192.168.56.10-1539362313584
18/10/12 12:38:33 INFO common.Storage: Storage directory /opt/hdfs/name has been successfully formatted.
18/10/12 12:38:33 INFO namenode.FSImageFormatProtobuf: Saving image file /opt/hdfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
18/10/12 12:38:33 INFO namenode.FSImageFormatProtobuf: Image file /opt/hdfs/name/current/fsimage.ckpt_0000000000000000000 of size 320 bytes saved in 0 seconds.
18/10/12 12:38:33 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
18/10/12 12:38:33 INFO util.ExitUtil: Exiting with status 0
18/10/12 12:38:33 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at pangu10/192.168.56.10
************************************************************/
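Rather than scanning the output by eye, the success line can be checked with grep. A small sketch, assuming the format output was saved to a file; the helper name format_succeeded and the log path are hypothetical.

```shell
# Hypothetical helper: succeed if a saved format log contains the
# "successfully formatted" line that marks a successful NameNode format.
format_succeeded() {
    grep -q 'Storage directory .* has been successfully formatted' "$1"
}

# Usage (assumed log path):
#   hadoop namenode -format 2>&1 | tee /tmp/format.log
#   format_succeeded /tmp/format.log && echo "format OK"
```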
V. Start Hadoop
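Hadoop 2.7.x ships start scripts in $HADOOP_HOME/sbin. A minimal sketch of starting HDFS and then YARN from the master node follows; it assumes HADOOP_HOME is set and passwordless SSH to the hosts in the slaves file is working, and the wrapper name start_cluster is hypothetical.

```shell
# Sketch: start HDFS first, then YARN, using the scripts shipped in
# $HADOOP_HOME/sbin. Run on the master (pangu10 in this setup).
start_cluster() {
    "$HADOOP_HOME/sbin/start-dfs.sh" || return 1
    "$HADOOP_HOME/sbin/start-yarn.sh" || return 1
}

# Usage:
#   start_cluster
#   jps    # lists the Java daemons running on the current host
```

With the configuration in this post, jps on pangu10 would be expected to list NameNode, DataNode, ResourceManager and NodeManager, and pangu11 would additionally run the SecondaryNameNode.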
Original article: https://www.cnblogs.com/Netsharp/p/9780971.html
Time: 2024-10-31 02:00:24