Principles of Hadoop NameNode HA Configuration

Architecture



In a typical HA cluster, two separate machines are configured as NameNodes. At any point in time, exactly one of the NameNodes is in an Active state, and the other is in a Standby state. The Active NameNode is responsible for all client operations in the cluster, while the Standby simply acts as a slave, maintaining enough state to provide a fast failover if necessary.
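
With the NameNode IDs nn1 and nn2 that are defined in the configuration later in this article, the current role of each node can be checked with the standard hdfs haadmin tool. A minimal sketch (each command prints "active" or "standby"):

# run from any node with the cluster configuration on its classpath
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2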

In order for the Standby node to keep its state synchronized with the Active node, both nodes communicate with a group of separate daemons called “JournalNodes” (JNs). When any namespace modification is performed by the Active node, it durably logs a record of the modification to a majority of these JNs. The Standby node is capable of reading the edits from the JNs, and is constantly watching them for changes to the edit log. As the Standby node sees the edits, it applies them to its own namespace. In the event of a failover, the Standby will ensure that it has read all of the edits from the JournalNodes before promoting itself to the Active state. This ensures that the namespace state is fully synchronized before a failover occurs.

In order to provide a fast failover, it is also necessary that the Standby node have up-to-date information regarding the location of blocks in the cluster. In order to achieve this, the DataNodes are configured with the location of both NameNodes, and send block location information and heartbeats to both.

It is vital for the correct operation of an HA cluster that only one of the NameNodes be Active at a time. Otherwise, the namespace state would quickly diverge between the two, risking data loss or other incorrect results. In order to ensure this property and prevent the so-called “split-brain scenario,” the JournalNodes will only ever allow a single NameNode to be a writer at a time. During a failover, the NameNode which is to become active will simply take over the role of writing to the JournalNodes, which will effectively prevent the other NameNode from continuing in the Active state, allowing the new Active to safely proceed with failover.
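
The configuration shown below is manual HA (there is no automatic failover controller), so an administrator triggers the role change. A minimal sketch using the standard haadmin command, assuming the nn1/nn2 IDs from the configuration below; the configured fencing method is applied to the old Active if it cannot be cleanly demoted:

# gracefully fail over: demote (and if necessary fence) nn1, promote nn2
hdfs haadmin -failover nn1 nn2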

Hardware Resources



In order to deploy an HA cluster, you should prepare the following:

  • NameNode machines - the machines on which you run the Active and Standby NameNodes should have equivalent hardware to each other, and equivalent hardware to what would be used in a non-HA cluster.
  • JournalNode machines - the machines on which you run the JournalNodes. The JournalNode daemon is relatively lightweight, so these daemons may reasonably be collocated on machines with other Hadoop daemons, for example NameNodes, the JobTracker, or the YARN ResourceManager. Note: There must be at least 3 JournalNode daemons, since edit log modifications must be written to a majority of JNs. This will allow the system to tolerate the failure of a single machine. You may also run more than 3 JournalNodes, but in order to actually increase the number of failures the system can tolerate, you should run an odd number of JNs (i.e. 3, 5, 7, etc.). Note that when running with N JournalNodes, the system can tolerate at most (N - 1) / 2 failures and continue to function normally.

Note that, in an HA cluster, the Standby NameNode also performs checkpoints of the namespace state, and thus it is not necessary to run a Secondary NameNode, CheckpointNode, or BackupNode in an HA cluster. In fact, to do so would be an error. This also allows one who is reconfiguring a non-HA-enabled HDFS cluster to be HA-enabled to reuse the hardware which they had previously dedicated to the Secondary NameNode.

Deployment



hdfs-site.xml

<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>

<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>

<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>192.168.153.201:8020</value>
</property>

<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>192.168.153.205:8020</value>
</property>

<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>192.168.153.201:50070</value>
</property>

<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>192.168.153.205:50070</value>
</property>

<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://192.168.153.202:8485;192.168.153.203:8485;192.168.153.204:8485/mycluster</value>
</property>

<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>

<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/centos/.ssh/id_rsa</value>
</property>

<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/home/centos/hadoop/hdfs/journal</value>
</property>

core-site.xml

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
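
The configuration above does not start or initialize anything by itself. The usual first-start sequence for a manual-HA setup like this one is sketched below; it assumes a Hadoop 2.x layout (hadoop-daemon.sh scripts) and the host addresses from hdfs-site.xml, so adjust to your environment:

# 1. Start a JournalNode on each JN host (192.168.153.202, .203, .204)
hadoop-daemon.sh start journalnode

# 2. On nn1 (192.168.153.201): format a new cluster, or initialize the shared
#    edits directory when converting an existing non-HA cluster, then start it
hdfs namenode -format                  # new cluster only
hdfs namenode -initializeSharedEdits   # existing cluster converted to HA
hadoop-daemon.sh start namenode

# 3. On nn2 (192.168.153.205): copy the namespace from nn1, then start it
hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode

# 4. Both NameNodes come up in Standby; promote nn1 manually
hdfs haadmin -transitionToActive nn1

# 5. Start the DataNodes (e.g. hadoop-daemons.sh start datanode)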

Notes



1. Currently, a maximum of two NameNodes may be configured per nameservice.

