Building a Pseudo-Distributed Hadoop 2.6 Cluster with a Dockerfile

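In "Setting up a single-node pseudo-distributed Hadoop 2.6 cluster in Docker" (《Docker中搭建Hadoop-2.6单机伪分布式集群》), the cluster was built by working interactively inside a container. This section does the same work mainly through Dockerfiles.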

1. Obtain a basic Docker system image and create a container

  1.1 Here I chose to download the CentOS image

docker pull centos

  1.2 Use the docker tag command to rename the downloaded CentOS image to centos, then delete the old tag

docker tag docker.io/centos centos
docker rmi docker.io/centos

2. Installing and configuring the JDK

  Download the required JDK from the Oracle website beforehand.

  Create a folder and move the JDK archive into it

[root@host ~]# mkdir centos-jdk
[root@host ~]# mv jdk-7u79-linux-x64.tar.gz ./centos-jdk/
[root@host ~]# cd centos-jdk/
[root@host centos-jdk]# ls
jdk-7u79-linux-x64.tar.gz

  Create a Dockerfile in the centos-jdk folder with the following content:

# CentOS with JDK 7
# Author        amei

# build a new image with basic  centos
FROM centos
# who is the author
MAINTAINER amei

# make a new directory to store the jdk files
RUN mkdir /usr/local/java

# copy the JDK archive into the image; ADD automatically extracts a local tar archive
ADD jdk-7u79-linux-x64.tar.gz /usr/local/java/

# create a symbolic link
RUN ln -s /usr/local/java/jdk1.7.0_79 /usr/local/java/jdk

# set environment variables
ENV JAVA_HOME /usr/local/java/jdk
ENV JRE_HOME ${JAVA_HOME}/jre
ENV CLASSPATH .:${JAVA_HOME}/lib:${JRE_HOME}/lib
ENV PATH ${JAVA_HOME}/bin:$PATH
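ADD resolves jdk-7u79-linux-x64.tar.gz relative to the build context, so the build fails if the archive or the Dockerfile is missing from the centos-jdk folder. The following is a minimal pre-flight sketch; check_build_context is a hypothetical helper name introduced here for illustration and is not part of Docker.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Docker): verify that the Dockerfile and
# every file it references are present in the build context directory.
# Usage: check_build_context <context-dir> <file>...
check_build_context() {
    local dir="$1"
    shift
    local missing=0
    local f
    for f in Dockerfile "$@"; do
        if [ ! -f "$dir/$f" ]; then
            # Report each missing file on stderr and remember the failure.
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    # 0 when everything was found, 1 otherwise.
    return "$missing"
}
```

From the parent directory it could be used as: check_build_context centos-jdk jdk-7u79-linux-x64.tar.gz && docker build -t centos-jdk centos-jdk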

  Build a new image from the Dockerfile:

# Do not forget the trailing "." (the build context)
[root@host centos-jdk]# docker build -t="centos-jdk" .
Sending build context to Docker daemon 153.5 MB
Step 1 : FROM centos
 ---> e8f1bdb3b6a7
.....................................
Step 9 : ENV PATH ${JAVA_HOME}/bin:$PATH
 ---> Running in 5ecbe2fac774
 ---> ad1110b84433
Removing intermediate container 5ecbe2fac774
Successfully built ad1110b84433

  List the newly built image

[root@host centos-jdk]# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
centos-jdk          latest              ad1110b84433        5 minutes ago       503 MB
centos              latest              e8f1bdb3b6a7        2 weeks ago         196.7 MB

  Create a container and check that the JDK in the new image is set up correctly

[root@host centos-jdk]# docker run -it centos-jdk /bin/bash
[root@container /]# java -version    # this output shows the configuration is correct
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
[root@container /]# echo $JAVA_HOME
/usr/local/java/jdk

3. Installing SSH on top of the previous image

  Create a new folder and create a Dockerfile in it with the following content:

# build a new image with centos-jdk
FROM centos-jdk
# who is the author
MAINTAINER amei

# install openssh
RUN yum -y install openssh-server openssh-clients

# generate host key files (the ed25519 key file needs key type ed25519, not dsa)
RUN ssh-keygen -q -t rsa -b 2048 -f /etc/ssh/ssh_host_rsa_key -N ''
RUN ssh-keygen -q -t ecdsa -f /etc/ssh/ssh_host_ecdsa_key -N ''
RUN ssh-keygen -q -t ed25519 -f /etc/ssh/ssh_host_ed25519_key -N ''

# log in to localhost without a password
RUN ssh-keygen -f /root/.ssh/id_rsa -N ''
RUN touch /root/.ssh/authorized_keys
RUN cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys

# set password of root
RUN echo "root:1234" | chpasswd

# open port 22
EXPOSE 22
# executed when a container starts
CMD ["/usr/sbin/sshd","-D"]

  Build the image from this Dockerfile:

[root@host centos-jdk-ssh]# docker build -t "centos-jdk-ssh" .
Sending build context to Docker daemon  2.56 kB
Step 1 : FROM centos-jdk
 ---> ad1110b84433
.....................................
Successfully built 5286623a6cc0

  Verify the new image:

# create a container from the image just built
[root@host centos-jdk-ssh]# docker run -it centos-jdk-ssh /bin/bash
[root@container /]# /usr/sbin/sshd      # start the sshd service
[root@container /]# ssh root@localhost    # log in to the local machine
The authenticity of host 'localhost (::1)' can't be established.    # note that no password is requested
ECDSA key fingerprint is b7:f0:33:15:c9:ca:12:8b:93:0d:45:95:6f:43:4f:78.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
[root@container ~]# exit    # log out of the SSH session
logout
Connection to localhost closed.
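The login above still stops at the host-key confirmation prompt, and scripts that connect over ssh (start-dfs.sh does this to reach localhost and 0.0.0.0) will hit the same prompt on first contact. One possible way around it, sketched below, is an ssh client config that disables StrictHostKeyChecking for the local addresses. The helper name write_ssh_config is hypothetical, and turning off host-key checking is only reasonable inside a throwaway test container.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write an ssh client config that skips the interactive
# host-key confirmation for the local addresses used in this walkthrough.
# Usage: write_ssh_config <ssh-dir>   (for example /root/.ssh)
write_ssh_config() {
    local dir="$1"
    # ssh expects the directory to be private to its owner.
    mkdir -p "$dir" && chmod 700 "$dir" || return 1
    cat > "$dir/config" <<'EOF'
Host localhost 127.0.0.1 0.0.0.0
    StrictHostKeyChecking no
EOF
    chmod 600 "$dir/config"
}
```

Inside the container it could be called as write_ssh_config /root/.ssh before the first ssh root@localhost.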

4. Installing Hadoop 2.6

  First download the Hadoop installation package.

  Create a folder and create the following files in it.

  Edit core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>file:/data/hadoop/tmp</value>
        </property>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://localhost:9000</value>
        </property>
</configuration>

  Edit hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/data/hadoop/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/data/hadoop/dfs/data</value>
        </property>
</configuration>
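Both files keep one tag per line, so a value can be read back with a few lines of awk before the image is built, for example to confirm that fs.defaultFS really is hdfs://localhost:9000. This is only a sketch: get_hadoop_prop is a hypothetical helper, it is not a real XML parser, and it assumes exactly the one-tag-per-line layout used in the files above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the <value> that follows a given <name> in a
# Hadoop *-site.xml file. Assumes each <name> and <value> sits on its own line.
# Usage: get_hadoop_prop <property-name> <xml-file>
get_hadoop_prop() {
    local name="$1" file="$2"
    awk -v want="$name" '
        # Remember whether the most recent <name> is the one we want.
        /<name>/ {
            n = $0
            gsub(/.*<name>|<\/name>.*/, "", n)
            found = (n == want)
        }
        # Print the first <value> seen after the matching <name>, then stop.
        /<value>/ && found {
            v = $0
            gsub(/.*<value>|<\/value>.*/, "", v)
            print v
            exit
        }
    ' "$file"
}
```

Example call: get_hadoop_prop fs.defaultFS core-site.xml. Once Hadoop is installed in the container, hdfs getconf -confKey fs.defaultFS reports the value Hadoop actually resolved.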

  Copy etc/hadoop/hadoop-env.sh from the Hadoop installation package, then edit it so that the export JAVA_HOME line points to the JDK install location

export JAVA_HOME=/usr/local/java/jdk
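The same edit can be scripted rather than done by hand. The sketch below assumes the copied hadoop-env.sh already contains a line starting with export JAVA_HOME= (if it does not, nothing is changed); set_java_home is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the JAVA_HOME export in a copy of hadoop-env.sh.
# Usage: set_java_home <path-to-hadoop-env.sh> <jdk-dir>
set_java_home() {
    local file="$1" jdk="$2"
    # Replace the existing "export JAVA_HOME=..." line. Writing through a
    # temporary file avoids the differences between GNU and BSD "sed -i".
    sed "s|^export JAVA_HOME=.*|export JAVA_HOME=${jdk}|" "$file" > "$file.tmp" \
        && mv "$file.tmp" "$file"
}
```

For this walkthrough the call would be: set_java_home hadoop-env.sh /usr/local/java/jdk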

  Create a Dockerfile in the same folder with the following content:

# build a new image with  centos-jdk-ssh
FROM centos-jdk-ssh
# who is the author
MAINTAINER amei

# install some important software
RUN yum -y install net-tools  which

# copy the Hadoop archive into the image; ADD automatically extracts a local tar archive
ADD hadoop-2.6.0.tar.gz /usr/local/

# create a symbolic link
RUN ln -s /usr/local/hadoop-2.6.0 /usr/local/hadoop

# copy the configuration file to image
COPY core-site.xml /usr/local/hadoop/etc/hadoop/
COPY hdfs-site.xml /usr/local/hadoop/etc/hadoop/
COPY hadoop-env.sh /usr/local/hadoop/etc/hadoop/

# set environment variables
ENV HADOOP_HOME /usr/local/hadoop
ENV PATH ${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH

  Build the image:

docker build -t "centos-hadoop" .

  List the images:

[root@host centos-hadoop]# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
centos-hadoop       latest              64b9d221973b        29 minutes ago      930 MB
centos-jdk-ssh      latest              5286623a6cc0        About an hour ago   600 MB
centos-jdk          latest              ad1110b84433        2 hours ago         503 MB

  Create a container to test the image:

[root@host centos-hadoop]# docker run -it centos-hadoop /bin/bash  # start a container
[root@889d94ef9cbc /]# /usr/sbin/sshd            # start the sshd service
[root@889d94ef9cbc /]# hdfs namenode -format        # format the NameNode
16/08/06 22:56:34 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = 889d94ef9cbc/172.17.0.2
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0
............................................................
16/08/06 22:56:36 INFO common.Storage: Storage directory /data/hadoop/dfs/name has been successfully formatted.
16/08/06 22:56:37 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
16/08/06 22:56:37 INFO util.ExitUtil: Exiting with status 0
16/08/06 22:56:37 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at 889d94ef9cbc/172.17.0.2
************************************************************/
[root@889d94ef9cbc /]# start-dfs.sh   # start HDFS
[root@889d94ef9cbc /]# jps      # list the running Java processes
576 SecondaryNameNode
410 DataNode
684 Jps
328 NameNode
[root@889d94ef9cbc /]# hadoop dfsadmin -report  # show the HDFS status
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Configured Capacity: 10726932480 (9.99 GB)
Present Capacity: 9748041728 (9.08 GB)
DFS Remaining: 9748037632 (9.08 GB)
DFS Used: 4096 (4 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Live datanodes (1):

Name: 127.0.0.1:50010 (localhost)
Hostname: 889d94ef9cbc
Decommission Status : Normal
Configured Capacity: 10726932480 (9.99 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 978890752 (933.54 MB)
DFS Remaining: 9748037632 (9.08 GB)
DFS Used%: 0.00%
DFS Remaining%: 90.87%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Aug 06 23:29:09 UTC 2016
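The session above formats the NameNode by hand on a fresh container. If these steps are ever scripted and the container is restarted, formatting should be skipped when the storage directory already holds metadata. A minimal guard is sketched below, under the assumption that a formatted name directory contains a current/VERSION file; namenode_needs_format is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit status 0) when the NameNode storage
# directory has not been formatted yet, judged by the absence of the
# current/VERSION file that "hdfs namenode -format" writes.
# Usage: namenode_needs_format <name-dir>
namenode_needs_format() {
    [ ! -f "$1/current/VERSION" ]
}

# Intended use inside the container (commented out so that loading this
# snippet has no side effects):
#   /usr/sbin/sshd
#   if namenode_needs_format /data/hadoop/dfs/name; then
#       hdfs namenode -format
#   fi
#   start-dfs.sh
```

The directory /data/hadoop/dfs/name is the dfs.namenode.name.dir value from hdfs-site.xml.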

  

Time: 2024-10-05 23:55:00
