Hadoop 2.7.2 + HBase 1.2.0 + ZooKeeper 3.4.8 HA High-Availability Cluster Configuration and Installation


IP               Hostname   Software                        Processes
192.168.128.11   h1         JDK, Hadoop, HBase              NameNode, DFSZKFailoverController, HMaster
192.168.128.12   h2         JDK, Hadoop, HBase              NameNode, DFSZKFailoverController, HMaster
192.168.128.13   h3         JDK, Hadoop                     ResourceManager
192.168.128.14   h4         JDK, Hadoop                     ResourceManager
192.168.128.15   h5         JDK, Hadoop, ZooKeeper, HBase   DataNode, NodeManager, JournalNode, QuorumPeerMain, HRegionServer
192.168.128.16   h6         JDK, Hadoop, ZooKeeper, HBase   DataNode, NodeManager, JournalNode, QuorumPeerMain, HRegionServer
192.168.128.17   h7         JDK, Hadoop, ZooKeeper, HBase   DataNode, NodeManager, JournalNode, QuorumPeerMain, HRegionServer

I won't spell out the preparation work item by item here. In short, it covers: setting hostnames and IPs, the hostname-to-IP mappings in /etc/hosts, disabling the firewall, passwordless SSH between nodes, and installing the JDK with its environment variables.
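As a rough sketch of that preparation (the key type and profile location below are assumptions; adjust to your environment), on each node:

# /etc/hosts — hostname-to-IP mappings, identical on every node
192.168.128.11 h1
192.168.128.12 h2
192.168.128.13 h3
192.168.128.14 h4
192.168.128.15 h5
192.168.128.16 h6
192.168.128.17 h7

# passwordless SSH: generate a key and copy it to the other nodes
ssh-keygen -t rsa
ssh-copy-id root@h2    # repeat for the remaining hosts

# JDK environment variables, e.g. appended to /etc/profile
export JAVA_HOME=/home/jdk
export PATH=$JAVA_HOME/bin:$PATH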

Install ZooKeeper on h5, h6, and h7

Edit zoo_sample.cfg under /home/zookeeper-3.4.8/conf:

cp zoo_sample.cfg zoo.cfg

# The number of milliseconds of each tick

tickTime=2000

# The number of ticks that the initial

# synchronization phase can take

initLimit=10

# The number of ticks that can pass between

# sending a request and getting an acknowledgement

syncLimit=5

# the directory where the snapshot is stored.

# do not use /tmp for storage, /tmp here is just

# example sakes.

dataDir=/home/zookeeper-3.4.8/data

# the port at which the clients will connect

clientPort=2181

# the maximum number of client connections.

# increase this if you need to handle more clients

#maxClientCnxns=60

#

# Be sure to read the maintenance section of the

# administrator guide before turning on autopurge.

#

# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance

#

# The number of snapshots to retain in dataDir

#autopurge.snapRetainCount=3

# Purge task interval in hours

# Set to "0" to disable auto purge feature

#autopurge.purgeInterval=1

server.1=h5:2888:3888

server.2=h6:2888:3888

server.3=h7:2888:3888

Create the data directory and, inside it, create a file named myid containing the number 1:

mkdir data

touch data/myid

echo 1 > data/myid

Copy the entire ZooKeeper directory to the other two nodes:

scp -r /home/zookeeper-3.4.8  h6:/home/

scp -r /home/zookeeper-3.4.8  h7:/home/

On the other two nodes, change myid to 2 and 3 respectively.
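If passwordless SSH is already in place, this can be done from h5 in one step (a sketch using the paths from this guide):

ssh h6 "echo 2 > /home/zookeeper-3.4.8/data/myid"

ssh h7 "echo 3 > /home/zookeeper-3.4.8/data/myid"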

Install Hadoop

The configuration files live under /home/hadoop-2.7.2/etc/hadoop.

hadoop-env.sh:

export JAVA_HOME=/home/jdk

core-site.xml:

<configuration>

<!-- Set the HDFS nameservice to "masters" -->

<property>

<name>fs.defaultFS</name>

<value>hdfs://masters</value>

</property>

<!-- Hadoop temporary directory -->

<property>

<name>hadoop.tmp.dir</name>

<value>/home/hadoop-2.7.2/tmp</value>

</property>

<!-- ZooKeeper quorum addresses -->

<property>

<name>ha.zookeeper.quorum</name>

<value>h5:2181,h6:2181,h7:2181</value>

</property>

</configuration>

hdfs-site.xml:

<configuration>

<!-- The HDFS nameservice "masters"; must match core-site.xml -->

<property>

<name>dfs.nameservices</name>

<value>masters</value>

</property>

<!-- The "masters" nameservice has two NameNodes: h1 and h2 -->

<property>

<name>dfs.ha.namenodes.masters</name>

<value>h1,h2</value>

</property>

<!-- RPC address of NameNode h1 -->

<property>

<name>dfs.namenode.rpc-address.masters.h1</name>

<value>h1:9000</value>

</property>

<!-- HTTP address of NameNode h1 -->

<property>

<name>dfs.namenode.http-address.masters.h1</name>

<value>h1:50070</value>

</property>

<!-- RPC address of NameNode h2 -->

<property>

<name>dfs.namenode.rpc-address.masters.h2</name>

<value>h2:9000</value>

</property>

<!-- HTTP address of NameNode h2 -->

<property>

<name>dfs.namenode.http-address.masters.h2</name>

<value>h2:50070</value>

</property>

<!-- Where the NameNode shared edit log is stored on the JournalNodes -->

<property>

<name>dfs.namenode.shared.edits.dir</name>

<value>qjournal://h5:8485;h6:8485;h7:8485/masters</value>

</property>

<!-- Local directory where the JournalNodes store their data -->

<property>

<name>dfs.journalnode.edits.dir</name>

<value>/home/hadoop-2.7.2/journal</value>

</property>

<!-- Enable automatic NameNode failover -->

<property>

<name>dfs.ha.automatic-failover.enabled</name>

<value>true</value>

</property>

<!-- Failover proxy provider used by clients -->

<property>

<name>dfs.client.failover.proxy.provider.masters</name>

<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

</property>

<!-- Fencing methods; multiple methods are separated by newlines, one per line -->

<property>

<name>dfs.ha.fencing.methods</name>

<value>

sshfence

shell(/bin/true)

</value>

</property>

<!-- sshfence requires passwordless SSH; location of the private key -->

<property>

<name>dfs.ha.fencing.ssh.private-key-files</name>

<value>/root/.ssh/id_rsa</value>

</property>

<!-- sshfence connection timeout -->

<property>

<name>dfs.ha.fencing.ssh.connect-timeout</name>

<value>30000</value>

</property>

</configuration>

mapred-site.xml:

<configuration>

<!-- Run MapReduce on YARN -->

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

</configuration>

yarn-site.xml:

<configuration>

<!-- Enable ResourceManager HA -->

<property>

<name>yarn.resourcemanager.ha.enabled</name>

<value>true</value>

</property>

<!-- ResourceManager cluster id -->

<property>

<name>yarn.resourcemanager.cluster-id</name>

<value>RM_HA_ID</value>

</property>

<!-- Logical ids of the ResourceManagers -->

<property>

<name>yarn.resourcemanager.ha.rm-ids</name>

<value>rm1,rm2</value>

</property>

<!-- Hostnames of the two ResourceManagers -->

<property>

<name>yarn.resourcemanager.hostname.rm1</name>

<value>h3</value>

</property>

<property>

<name>yarn.resourcemanager.hostname.rm2</name>

<value>h4</value>

</property>

<property>

<name>yarn.resourcemanager.recovery.enabled</name>

<value>true</value>

</property>

<property>

<name>yarn.resourcemanager.store.class</name>

<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>

</property>

<!-- ZooKeeper quorum addresses -->

<property>

<name>yarn.resourcemanager.zk-address</name>

<value>h5:2181,h6:2181,h7:2181</value>

</property>

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

</configuration>

slaves file contents:

h5

h6

h7

Then copy the Hadoop directory to the other nodes:

scp -r hadoop-2.7.2 h2:/home/    (and likewise for the remaining nodes)
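The copy can also be scripted in one loop (a sketch; it assumes the directory sits in /home and passwordless SSH works to every node):

for host in h2 h3 h4 h5 h6 h7; do scp -r /home/hadoop-2.7.2 $host:/home/; done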

Note: YARN HA runs on h3 and h4.

Startup sequence

Note: follow the steps below strictly in order.

1. Start the ZooKeeper cluster

[root@h5 ~]# cd /home/zookeeper-3.4.8/bin/

[root@h5 bin]# ./zkServer.sh start

The procedure is the same on h5, h6, and h7.

[root@h5 bin]# ./zkServer.sh status

Check the status.
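To check all three nodes in one go, something along these lines works (a sketch, assuming the same installation path on every node); typically one node reports Mode: leader and the other two Mode: follower:

for host in h5 h6 h7; do ssh $host /home/zookeeper-3.4.8/bin/zkServer.sh status; done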

2. Start the JournalNodes

[root@h5 bin]# cd /home/hadoop-2.7.2/sbin/

[root@h5 sbin]# ./hadoop-daemons.sh start journalnode

h5: starting journalnode, logging to /home/hadoop-2.7.2/logs/hadoop-root-journalnode-h5.out

h7: starting journalnode, logging to /home/hadoop-2.7.2/logs/hadoop-root-journalnode-h7.out

h6: starting journalnode, logging to /home/hadoop-2.7.2/logs/hadoop-root-journalnode-h6.out

[root@h5 sbin]# jps

2420 JournalNode

2309 QuorumPeerMain

2461 Jps

[root@h5 sbin]# ^C

3. Format HDFS

Run the following command on h1:

hdfs namenode -format

Formatting creates files under the directory configured as hadoop.tmp.dir in core-site.xml.

Copy the tmp directory to h2:

[root@h1 hadoop-2.7.2]# scp -r tmp/ h2:/home/hadoop-2.7.2/

4. Format ZK (run on h1):

[root@h1 hadoop-2.7.2]# hdfs zkfc -formatZK

5. Start HDFS (run on h1):

[root@h1 hadoop-2.7.2]# sbin/start-dfs.sh

16/02/25 05:01:14 WARN hdfs.DFSUtil: Namenode for ns1 remains unresolved for ID null.  Check your hdfs-site.xml file to ensure namenodes are configured properly.

16/02/25 05:01:14 WARN hdfs.DFSUtil: Namenode for ns2 remains unresolved for ID null.  Check your hdfs-site.xml file to ensure namenodes are configured properly.

16/02/25 05:01:14 WARN hdfs.DFSUtil: Namenode for ns3 remains unresolved for ID null.  Check your hdfs-site.xml file to ensure namenodes are configured properly.

Starting namenodes on [h1 h2 masters masters masters]

masters: ssh: Could not resolve hostname masters: Name or service not known

masters: ssh: Could not resolve hostname masters: Name or service not known

masters: ssh: Could not resolve hostname masters: Name or service not known

h2: starting namenode, logging to /home/hadoop-2.7.2/logs/hadoop-root-namenode-h2.out

h1: starting namenode, logging to /home/hadoop-2.7.2/logs/hadoop-root-namenode-h1.out

h5: starting datanode, logging to /home/hadoop-2.7.2/logs/hadoop-root-datanode-h5.out

h7: starting datanode, logging to /home/hadoop-2.7.2/logs/hadoop-root-datanode-h7.out

h6: starting datanode, logging to /home/hadoop-2.7.2/logs/hadoop-root-datanode-h6.out

Starting journal nodes [h5 h6 h7]

h5: journalnode running as process 2420. Stop it first.

h6: journalnode running as process 2885. Stop it first.

h7: journalnode running as process 2896. Stop it first.

Starting ZK Failover Controllers on NN hosts [h1 h2 masters masters masters]

masters: ssh: Could not resolve hostname masters: Name or service not known

masters: ssh: Could not resolve hostname masters: Name or service not known

masters: ssh: Could not resolve hostname masters: Name or service not known

h2: starting zkfc, logging to /home/hadoop-2.7.2/logs/hadoop-root-zkfc-h2.out

h1: starting zkfc, logging to /home/hadoop-2.7.2/logs/hadoop-root-zkfc-h1.out

[root@h1 hadoop-2.7.2]#

6. Start YARN (run start-yarn.sh on h3). The NameNodes and ResourceManagers are placed on separate machines for performance reasons: both consume a lot of resources, so they are kept apart, and being apart they have to be started separately on their own machines.

[root@h3 sbin]# ./start-yarn.sh

[root@h4 sbin]# ./yarn-daemon.sh start resourcemanager
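The HA state of the two ResourceManagers can be checked with the rmadmin tool (rm1 and rm2 are the ids configured in yarn-site.xml):

yarn rmadmin -getServiceState rm1

yarn rmadmin -getServiceState rm2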

Verification:

http://192.168.128.11:50070

Overview 'h1:9000' (active)

http://192.168.128.12:50070

Overview 'h2:9000' (standby)
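The same check can be done from the command line with the HA admin tool (h1 and h2 are the NameNode ids from dfs.ha.namenodes.masters):

hdfs haadmin -getServiceState h1

hdfs haadmin -getServiceState h2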

Upload a file:

[root@h1 bin]# hadoop fs -put /etc/profile /profile

[root@h1 bin]# hadoop fs -ls

ls: `.': No such file or directory

[root@h1 bin]# hadoop fs -ls /

Found 1 items

-rw-r--r--   3 root supergroup       1814 2016-02-26 19:08 /profile

[root@h1 bin]#

Kill the NameNode on h1:

[root@h1 sbin]# jps

2480 NameNode

2868 Jps

2775 DFSZKFailoverController

[root@h1 sbin]# kill -9 2480

[root@h1 sbin]# jps

2880 Jps

2775 DFSZKFailoverController

[root@h1 sbin]# hadoop fs -ls /

Found 1 items

-rw-r--r--   3 root supergroup       1814 2016-02-26 19:08 /profile

At this point h2 becomes active.

Manually start the NameNode on h1:

[root@h1 sbin]# ./hadoop-daemon.sh start namenode

starting namenode, logging to /home/hadoop-2.7.2/logs/hadoop-root-namenode-h1.out


Observe that h1 is now in the standby state.

Verify YARN:

[root@h1 sbin]# hadoop jar /home/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /profile /out

16/02/26 19:14:23 INFO input.FileInputFormat: Total input paths to process : 1

16/02/26 19:14:23 INFO mapreduce.JobSubmitter: number of splits:1

16/02/26 19:14:23 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1456484773347_0001

16/02/26 19:14:24 INFO impl.YarnClientImpl: Submitted application application_1456484773347_0001

16/02/26 19:14:24 INFO mapreduce.Job: The url to track the job: http://h3:8088/proxy/application_1456484773347_0001/

16/02/26 19:14:24 INFO mapreduce.Job: Running job: job_1456484773347_0001

16/02/26 19:14:49 INFO mapreduce.Job: Job job_1456484773347_0001 running in uber mode : false

16/02/26 19:14:49 INFO mapreduce.Job:  map 0% reduce 0%

16/02/26 19:15:05 INFO mapreduce.Job:  map 100% reduce 0%

16/02/26 19:15:22 INFO mapreduce.Job:  map 100% reduce 100%

16/02/26 19:15:23 INFO mapreduce.Job: Job job_1456484773347_0001 completed successfully

16/02/26 19:15:23 INFO mapreduce.Job: Counters: 49

File System Counters

FILE: Number of bytes read=2099

FILE: Number of bytes written=243781

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=1901

HDFS: Number of bytes written=1470

HDFS: Number of read operations=6

HDFS: Number of large read operations=0

HDFS: Number of write operations=2

Job Counters

Launched map tasks=1

Launched reduce tasks=1

Data-local map tasks=1

Total time spent by all maps in occupied slots (ms)=13014

Total time spent by all reduces in occupied slots (ms)=13470

Total time spent by all map tasks (ms)=13014

Total time spent by all reduce tasks (ms)=13470

Total vcore-milliseconds taken by all map tasks=13014

Total vcore-milliseconds taken by all reduce tasks=13470

Total megabyte-milliseconds taken by all map tasks=13326336

Total megabyte-milliseconds taken by all reduce tasks=13793280

Map-Reduce Framework

Map input records=80

Map output records=256

Map output bytes=2588

Map output materialized bytes=2099

Input split bytes=87

Combine input records=256

Combine output records=156

Reduce input groups=156

Reduce shuffle bytes=2099

Reduce input records=156

Reduce output records=156

Spilled Records=312

Shuffled Maps =1

Failed Shuffles=0

Merged Map outputs=1

GC time elapsed (ms)=395

CPU time spent (ms)=4100

Physical memory (bytes) snapshot=298807296

Virtual memory (bytes) snapshot=4201771008

Total committed heap usage (bytes)=138964992

Shuffle Errors

BAD_ID=0

CONNECTION=0

IO_ERROR=0

WRONG_LENGTH=0

WRONG_MAP=0

WRONG_REDUCE=0

File Input Format Counters

Bytes Read=1814

File Output Format Counters

Bytes Written=1470

[root@h1 sbin]# hadoop fs -ls /

Found 3 items

drwxr-xr-x   - root supergroup          0 2016-02-26 19:15 /out

-rw-r--r--   3 root supergroup       1814 2016-02-26 19:08 /profile

drwx------   - root supergroup          0 2016-02-26 19:14 /tmp

[root@h1 sbin]#
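To look at the word-count result itself (part-r-00000 is the default name of the single reducer's output file):

hadoop fs -cat /out/part-r-00000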

The Hadoop HA cluster setup is complete.

Install HBase

hbase-env.sh:

export JAVA_HOME=/home/jdk

export HBASE_MANAGES_ZK=false

hbase-site.xml:

<configuration>

<property>

<name>hbase.rootdir</name>

<value>hdfs://h1:9000/hbase</value>

</property>

<property>

<name>hbase.cluster.distributed</name>

<value>true</value>

</property>

<property>

<name>hbase.master</name>

<value>h1:60000</value>

</property>

<property>

<name>hbase.master.port</name>

<value>60000</value>

<description>The port master should bind to.</description>

</property>

<property>

<name>hbase.zookeeper.quorum</name>

<value>h5,h6,h7</value>

</property>

<property>

<name>dfs.replication</name>

<value>3</value>

</property>

</configuration>

Note: the host and port in hbase.rootdir in $HBASE_HOME/conf/hbase-site.xml must match fs.defaultFS in $HADOOP_HOME/etc/hadoop/core-site.xml. Since core-site.xml in this guide points at the HA nameservice (hdfs://masters), hbase.rootdir should strictly be hdfs://masters/hbase rather than hdfs://h1:9000/hbase; otherwise HBase will break whenever h1 is not the active NameNode.
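For HBase to resolve the masters nameservice, Hadoop's client configuration also has to be visible on HBase's classpath. One common way (a sketch, using the paths from this guide) is to copy the two files into HBase's conf directory on every HBase node:

cp /home/hadoop-2.7.2/etc/hadoop/core-site.xml /home/hadoop-2.7.2/etc/hadoop/hdfs-site.xml /home/hbase-1.2.0/conf/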

regionservers file contents:

h5

h6

h7

Copy the HBase directory to h2, h5, h6, and h7.

Full startup sequence

First bring up Hadoop HA following the startup order described above.

Then start HBase on h1 and h2:

./start-hbase.sh
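Running start-hbase.sh on h1 starts the HMaster there plus the HRegionServers listed in regionservers. For the standby HMaster on h2, you can run start-hbase.sh there as well, list h2 in conf/backup-masters before starting, or start it explicitly (a sketch, assuming the install path from this guide):

/home/hbase-1.2.0/bin/hbase-daemon.sh start master    # run on h2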

Test by entering the HBase shell:

[root@h1 bin]# hbase shell

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/home/hbase-1.2.0/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/home/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

HBase Shell; enter 'help<RETURN>' for list of supported commands.

Type "exit<RETURN>" to leave the HBase Shell

Version 1.2.0, r25b281972df2f5b15c426c8963cbf77dd853a5ad, Thu Feb 18 23:01:49 CST 2016

hbase(main):001:0> esit

NameError: undefined local variable or method `esit' for #<Object:0x7ad1caa2>

hbase(main):002:0> exit

That completes the setup.
