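Although Hadoop 2.x is now the mainstream release line, I chose to start learning big data with the older 0.20.2 release.
The steps below set up a pseudo-distributed environment.
Hadoop download URL: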
http://archive.apache.org/dist/hadoop/core/hadoop-0.20.2/hadoop-0.20.2.tar.gz
Linux distribution: CentOS 7
1. Configure the hostname
[root@master1 ~]# vi /etc/sysconfig/network
# Created by anaconda
master1
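[root@master1 ~]# hostname master1
Note: on CentOS 7 the persistent hostname is stored in /etc/hostname, so running hostnamectl set-hostname master1 is the usual way to keep the name across reboots.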
2. Create the group and user that will manage Hadoop
[root@master1 ~]# groupadd hduser
[root@master1 ~]# useradd -g hduser hduser
[root@master1 ~]# passwd hduser
3. Map the hostname to its IP address in hosts
[root@master1 ~]# vi /etc/hosts
192.168.11.131 master1
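If the hosts entry should be added from a script rather than by hand, something like the following could work. This is only a sketch: the function name ensure_host_entry is made up for illustration, and the file argument exists so the function can be tried on a copy before touching /etc/hosts.

```shell
# Hypothetical helper: append "IP NAME" to a hosts-format file unless
# NAME already appears as a hostname on a non-comment line.
ensure_host_entry() {
    ip="$1"; name="$2"; file="${3:-/etc/hosts}"
    if awk -v n="$name" '
        /^[[:space:]]*#/ { next }                                # skip comments
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }     # field 1 is the IP
        END { exit found ? 0 : 1 }' "$file" 2>/dev/null; then
        return 0                                                 # already present
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

For example, as root: ensure_host_entry 192.168.11.131 master1. Running it a second time leaves the file unchanged.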
4. Configure sudoers privileges for the Hadoop user (hduser)
[root@master1 ~]# vi /etc/sudoers
hduser ALL=(ALL) NOPASSWD:ALL
5. Disable SELinux and the firewall
[root@master1 ~]# vi /etc/sysconfig/selinux
SELINUX=enforcing --> SELINUX=disabled
[root@master1 ~]# systemctl stop firewalld
[root@master1 ~]# systemctl disable firewalld
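The SELinux edit can also be done non-interactively. The sketch below assumes GNU sed (the default on CentOS); the function name disable_selinux_in is hypothetical. On CentOS 7, /etc/sysconfig/selinux is a symlink to /etc/selinux/config, and plain sed -i would replace the symlink with a regular file, hence the --follow-symlinks option.

```shell
# Hypothetical helper: set SELINUX=disabled in the given config file.
# The pattern requires "=" right after SELINUX, so SELINUXTYPE= is untouched.
disable_selinux_in() {
    file="$1"
    sed -i --follow-symlinks 's/^SELINUX=.*/SELINUX=disabled/' "$file"
}
```

For example, as root: disable_selinux_in /etc/sysconfig/selinux. The file is only read at boot, so the change applies after a reboot; setenforce 0 switches the running system to permissive mode in the meantime.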
6. Unpack the archives
[root@master1 ~]# su hduser
[hduser@master1 root]$ cd
[hduser@master1 ~]$ ll *tar*
-rw-r--r--. 1 root root 44575568 Jun 16 17:24 hadoop-0.20.2.tar.gz
-rw-r--r--. 1 root root 288430080 Mar 16 2016 jdk1.7.0_79.tar
[hduser@master1 ~]$ tar xf jdk1.7.0_79.tar
[hduser@master1 ~]$ tar zxf hadoop-0.20.2.tar.gz
[hduser@master1 ~]$ mv jdk1.7.0_79 jdk
[hduser@master1 ~]$ mv hadoop-0.20.2 hadoop
7. Configure the Java environment
[hduser@master1 ~]$ vi .bashrc
export JAVA_HOME=/home/hduser/jdk
export JRE_HOME=$JAVA_HOME/jre
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=./:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
[hduser@master1 ~]$ source .bashrc
[hduser@master1 ~]$ java -version
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
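Since hadoop-env.sh will point at the same JDK directory, it may be worth confirming that JAVA_HOME really contains a java binary before moving on. A minimal sketch, with check_java_home being a made-up name:

```shell
# Hypothetical helper: verify that a JDK directory (default: $JAVA_HOME)
# contains an executable bin/java and print its path.
check_java_home() {
    home="${1:-${JAVA_HOME:-}}"
    if [ -z "$home" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$home/bin/java" ]; then
        echo "no executable at $home/bin/java" >&2
        return 1
    fi
    echo "$home/bin/java"
}
```

With the .bashrc above loaded, check_java_home should print /home/hduser/jdk/bin/java.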
8. Configure Hadoop
[hduser@master1 conf]$ pwd
/home/hduser/hadoop/conf
[hduser@master1 conf]$ vi hadoop-env.sh
export JAVA_HOME=/home/hduser/jdk
[hduser@master1 conf]$ vi core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master1:9000</value>
</property>
</configuration>
[hduser@master1 conf]$ sudo mkdir -p /data/hadoop/data
[hduser@master1 conf]$ sudo chown -R hduser:hduser /data/hadoop/data
[hduser@master1 conf]$ vi hdfs-site.xml
<configuration>
<property>
<name>dfs.data.dir</name>
<value>/data/hadoop/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
[hduser@master1 conf]$ vi mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master1:9001</value>
</property>
</configuration>
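The three site files share the same shape, so they could also be generated from a script instead of edited in vi. The following is a sketch only: write_site_xml is a hypothetical helper, it overwrites the target file, and it does no XML escaping, so it assumes names and values without characters such as & or <.

```shell
# Hypothetical helper: write a Hadoop-style site file from name/value pairs.
# Usage: write_site_xml FILE NAME1 VALUE1 [NAME2 VALUE2 ...]
write_site_xml() {
    file="$1"; shift
    {
        printf '<?xml version="1.0"?>\n<configuration>\n'
        while [ "$#" -ge 2 ]; do
            printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
            shift 2
        done
        printf '</configuration>\n'
    } > "$file"
}
```

Run from /home/hduser/hadoop/conf, the settings above would be:
write_site_xml core-site.xml fs.default.name hdfs://master1:9000
write_site_xml hdfs-site.xml dfs.data.dir /data/hadoop/data dfs.replication 1
write_site_xml mapred-site.xml mapred.job.tracker master1:9001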
9. Set up passwordless SSH authentication
[hduser@master1 conf]$ cd
[hduser@master1 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa):
Created directory '/home/hduser/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:qRJhPSF32QDs9tU3e0/mAx/EBC2MHamGv2WPvUw19/M hduser@master1
The key's randomart image is:
+---[RSA 2048]----+
| ..+.o+ +o= |
| +.o. .. = o |
| o.o ... + |
| . .o. o.o. oo |
| .. .S.o ..+o|
| . .. . +..O|
| . . + *B+|
| . . .o==|
| oE|
+----[SHA256]-----+
Press Enter at every prompt above to accept the defaults.
[hduser@master1 ~]$ cd .ssh
[hduser@master1 .ssh]$ ls
id_rsa id_rsa.pub
[hduser@master1 .ssh]$ cp id_rsa.pub authorized_keys
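Copying the public key works on a fresh account but would overwrite any keys already authorized. A slightly more defensive variant might append the key only if it is missing and tighten the permissions, since sshd can reject key files that are writable by others. This is a sketch; authorize_key is a made-up name.

```shell
# Hypothetical helper: add DIR/id_rsa.pub to DIR/authorized_keys once,
# then restrict permissions on the directory and the key list.
authorize_key() {
    dir="${1:-$HOME/.ssh}"
    pub="$dir/id_rsa.pub"
    auth="$dir/authorized_keys"
    [ -f "$pub" ] || { echo "missing $pub" >&2; return 1; }
    touch "$auth"
    # -x: whole-line match, -F: treat the key as a fixed string
    grep -qxF -- "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
    chmod 700 "$dir"
    chmod 600 "$auth"
}
```

After running authorize_key, ssh localhost should log in without asking for a password.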
10. Format the file system
[hduser@master1 .ssh]$ cd
[hduser@master1 ~]$ cd hadoop/bin
[hduser@master1 bin]$ ./hadoop namenode -format
18/06/19 04:02:12 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master1/192.168.11.131
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
18/06/19 04:02:13 INFO namenode.FSNamesystem: fsOwner=hduser,hduser
18/06/19 04:02:13 INFO namenode.FSNamesystem: supergroup=supergroup
18/06/19 04:02:13 INFO namenode.FSNamesystem: isPermissionEnabled=true
18/06/19 04:02:13 INFO common.Storage: Image file of size 96 saved in 0 seconds.
18/06/19 04:02:13 INFO common.Storage: Storage directory /tmp/hadoop-hduser/dfs/name has been successfully formatted.
18/06/19 04:02:13 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master1/192.168.11.131
************************************************************/
11. Start the services
[hduser@master1 bin]$ ./start-all.sh
starting namenode, logging to /home/hduser/hadoop/bin/../logs/hadoop-hduser-namenode-master1.out
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:OXYl4X6F6g4TV7YriZaSvuBIFM840h/qTg8/B7BUil0.
ECDSA key fingerprint is MD5:b6:b6:04:2d:49:70:8b:ed:65:00:e2:05:b0:95:5b:6d.
Are you sure you want to continue connecting (yes/no)? yes
localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
localhost: starting datanode, logging to /home/hduser/hadoop/bin/../logs/hadoop-hduser-datanode-master1.out
localhost: starting secondarynamenode, logging to /home/hduser/hadoop/bin/../logs/hadoop-hduser-secondarynamenode-master1.out
starting jobtracker, logging to /home/hduser/hadoop/bin/../logs/hadoop-hduser-jobtracker-master1.out
localhost: starting tasktracker, logging to /home/hduser/hadoop/bin/../logs/hadoop-hduser-tasktracker-master1.out
12. Check the running services
[hduser@master1 bin]$ jps
1867 JobTracker
1804 SecondaryNameNode
1597 NameNode
1971 TaskTracker
2011 Jps
1710 DataNode
[hduser@master1 bin]$
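A pseudo-distributed 0.20.2 node should show the five daemons listed above. Rather than reading the jps output by eye, a small filter could report which ones are absent. This is a sketch and missing_daemons is a hypothetical name; it reads jps output on standard input and prints one line per missing daemon.

```shell
# Hypothetical helper: read `jps` output on stdin and print the name of
# every expected Hadoop 0.20 daemon that is not listed.
missing_daemons() {
    out="$(cat)"
    for d in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
        # Compare the second column exactly, so NameNode does not
        # accidentally match SecondaryNameNode.
        printf '%s\n' "$out" |
            awk -v d="$d" '$2 == d { ok = 1 } END { exit ok ? 0 : 1 }' ||
            echo "$d"
    done
}
```

Used as jps | missing_daemons, it prints nothing when all five daemons are up. If a name is printed, the matching log file under hadoop/logs is the place to look.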
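13. Check service status in a browser
To view HDFS status in the web UI,
enter the following in a browser:
http://192.168.11.131:50070
To view MapReduce status in the web UI,
enter the following in a browser:
http://192.168.11.131:50030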
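In the 0.20 line the NameNode (HDFS) web UI listens on port 50070 by default and the JobTracker (MapReduce) web UI on port 50030. The same check can be made from a terminal. The sketch below assumes curl is installed; ui_url and check_ui are made-up names.

```shell
# Hypothetical helpers: build the default web UI URL for a service and
# probe it with curl, printing the HTTP status code.
ui_url() {
    # $1 = service (hdfs | mapreduce), $2 = host
    case "$1" in
        hdfs)      printf 'http://%s:50070\n' "$2" ;;   # NameNode UI
        mapreduce) printf 'http://%s:50030\n' "$2" ;;   # JobTracker UI
        *)         echo "unknown service: $1" >&2; return 1 ;;
    esac
}

check_ui() {
    url="$(ui_url "$1" "$2")" || return 1
    curl -s -o /dev/null --max-time 5 -w '%{http_code}\n' "$url"
}
```

For example, check_ui hdfs 192.168.11.131 prints the HTTP status returned by the NameNode page, or 000 if nothing answers within five seconds.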
Original article: http://blog.51cto.com/xiaoxiaozhou/2131518