[HA] High-Availability Cluster Configuration

Primary: hh.huangmingming.cn 192.168.1.31

Secondary: yo.huangmingming.cn 192.168.1.250

/etc/hosts entries on both the primary and the secondary:

192.168.1.31 hh hh.huangmingming.cn

192.168.1.250 yo yo.huangmingming.cn
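The two hosts entries above have to be present on both machines. As a minimal sketch (the helper name add_host_entry is hypothetical and not part of heartbeat), the function below appends an entry only when the file has no line for that IP yet, so it can be re-run safely:

```shell
# add_host_entry FILE IP NAME...
# Appends "IP NAME..." to FILE unless a line for that IP already exists.
# (Hypothetical helper for illustration; on a real node FILE would be /etc/hosts.)
add_host_entry() {
    file=$1
    ip=$2
    shift 2
    [ -f "$file" ] || : > "$file"
    # The first field of a hosts line is the IP address.
    if awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$*" >> "$file"
}

# Usage (as root, on both nodes):
#   add_host_entry /etc/hosts 192.168.1.31  hh hh.huangmingming.cn
#   add_host_entry /etc/hosts 192.168.1.250 yo yo.huangmingming.cn
```

Matching on the first field rather than on a substring avoids treating 192.168.1.31 as already present when only 192.168.1.310-style text appears elsewhere in the file.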

I. Install the EPEL repository

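[[email protected] ~]# wget http://mirrors.sohu.com/fedora-epel/6/i386/epel-release-6-8.noarch.rpm

(The downloaded package still has to be installed, for example with rpm -ivh epel-release-6-8.noarch.rpm, before yum can see the EPEL packages.)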

[[email protected] ~]# yum list |grep heartbeat

heartbeat.x86_64                            3.0.4-2.el6                  epel

heartbeat-devel.i686                        3.0.4-2.el6                  epel

heartbeat-devel.x86_64                      3.0.4-2.el6                  epel

heartbeat-libs.i686                         3.0.4-2.el6                  epel

heartbeat-libs.x86_64                       3.0.4-2.el6                  epel

II. Install heartbeat and libnet (on both the primary and the secondary)

[[email protected] ~]# yum -y install heartbeat

[[email protected] ~]# yum -y install libnet

[[email protected] ~]# yum -y install nginx  (install nginx for testing)

III. Configure heartbeat (on the primary)

1. Copy the configuration files authkeys, ha.cf, and haresources into /etc/ha.d/

[[email protected] ~]# cd /usr/share/doc/heartbeat-3.0.4/

[[email protected] heartbeat-3.0.4]# ls

apphbd.cf  authkeys  AUTHORS  ChangeLog  COPYING  COPYING.LGPL  ha.cf  haresources  README

[[email protected] heartbeat-3.0.4]# cp authkeys ha.cf haresources /etc/ha.d/

2. Configure the authkeys file

[[email protected] ~]# cd /etc/ha.d/

[[email protected] ha.d]# vim authkeys

auth 3

#1 crc

#2 sha1 HI!

3 md5 Hello!

[[email protected] ha.d]# chmod 600 authkeys   (note: the authkeys file must have its permissions set to 600)
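The authkeys file above uses md5 with the sample key "Hello!". As a hedged variation, not the author's setup, the sketch below writes an authkeys file that swaps in the sha1 method with a random key and creates the file with mode 600 from the start. The function name write_authkeys is hypothetical.

```shell
# write_authkeys PATH
# Writes a two-line heartbeat authkeys file using method sha1 and a random
# 40-hex-character key, readable by the owner only.
# (Illustrative helper; the source document uses "3 md5 Hello!" instead.)
write_authkeys() {
    path=$1
    # 20 random bytes rendered as hex, with od's spaces and newlines removed.
    key=$(head -c 20 /dev/urandom | od -An -tx1 | tr -d ' \n')
    # The subshell keeps the restrictive umask from leaking into the caller.
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$path" )
    chmod 600 "$path"
}

# Usage (as root, on the primary):
#   write_authkeys /etc/ha.d/authkeys
```

Both nodes must end up with the same file, so the key is generated once and then copied to the other node rather than generated on each machine.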

3. Configure the haresources file

[[email protected] ha.d]# vim haresources

hh.huangmingming.cn     192.168.1.13/24/eth0:0 nginx  (specifies the cluster IP)
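The haresources line packs several things into one row: the preferred node, the virtual IP with its prefix length and interface label, and the service to start. To make the fields explicit, here is a small sketch that splits such a line apart. The function name describe_haresources is hypothetical, and the parsing assumes the simple "node ip/prefix/iface service" shape used above.

```shell
# describe_haresources LINE
# Splits a haresources line of the form
#   NODE  IP/PREFIX/IFACE  SERVICES...
# into labelled fields. (Illustrative only; heartbeat does its own parsing.)
describe_haresources() {
    # read splits on runs of whitespace; the remainder lands in $services.
    read -r node vip services <<EOF
$1
EOF
    ip=${vip%%/*}          # text before the first slash
    rest=${vip#*/}         # text after the first slash
    prefix=${rest%%/*}     # prefix length, e.g. 24
    iface=${rest#*/}       # interface label, e.g. eth0:0
    printf 'node=%s ip=%s prefix=%s iface=%s services=%s\n' \
        "$node" "$ip" "$prefix" "$iface" "$services"
}

# Usage:
#   describe_haresources 'hh.huangmingming.cn     192.168.1.13/24/eth0:0 nginx'
```

The trailing annotation in parentheses shown in the listing is commentary for the reader and should not be typed into the real file.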

4. Configure the ha.cf file

[[email protected] ha.d]# vim ha.cf

debugfile /var/log/ha-debug

logfile /var/log/ha-log

logfacility     local0

keepalive 2

deadtime 30

warntime 10

initdead 60

udpport 694

ucast eth0 192.168.1.250

auto_failback on

node hh.huangmingming.cn

node yo.huangmingming.cn

ping 192.168.1.1

respawn hacluster /usr/lib64/heartbeat/ipfail
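Two properties of this ha.cf are easy to break when editing it: the timing values should be ordered sensibly (keepalive below warntime, warntime below deadtime, and, following the comments in the sample ha.cf, initdead at least twice deadtime), and udpport has to appear before the media statement, which is what the "udpport setting must precede media statements" message is about. The following sketch checks both. The function name check_hacf is hypothetical, and it assumes all four timing directives are present and given in plain seconds.

```shell
# check_hacf FILE
# Prints a message for each problem found and returns non-zero if any exist.
# Assumes keepalive, warntime, deadtime and initdead are all set, in seconds.
check_hacf() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }
        $1 == "keepalive" { keepalive = $2 }
        $1 == "warntime"  { warntime  = $2 }
        $1 == "deadtime"  { deadtime  = $2 }
        $1 == "initdead"  { initdead  = $2 }
        # Media statements: heartbeat expects udpport to come before these.
        $1 == "ucast" || $1 == "bcast" || $1 == "mcast" || $1 == "serial" { media_seen = 1 }
        $1 == "udpport" && media_seen {
            print "udpport appears after a media statement"; bad = 1
        }
        END {
            if (!(keepalive + 0 < warntime + 0)) { print "keepalive should be < warntime"; bad = 1 }
            if (!(warntime + 0 < deadtime + 0))  { print "warntime should be < deadtime"; bad = 1 }
            if (!(initdead + 0 >= 2 * deadtime)) { print "initdead should be >= 2 * deadtime"; bad = 1 }
            exit bad + 0
        }
    ' "$1"
}

# Usage:
#   check_hacf /etc/ha.d/ha.cf && echo "ha.cf looks consistent"
```

With the values shown above (keepalive 2, warntime 10, deadtime 30, initdead 60) the check passes.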

5. Copy these three files to yo (the secondary)

[[email protected] ha.d]# scp authkeys haresources ha.cf yo:/etc/ha.d/
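One point worth checking after the copy: the ucast line names the peer's address (192.168.1.250 as seen from hh). To my understanding of the ha.cf documentation, a ucast directive that points at the local machine's own address is ignored, so on yo the line would normally point back at the primary (192.168.1.31), or both addresses would be listed on both nodes. Treat this as an assumption to verify against your heartbeat version. A minimal sketch of retargeting the line, with the hypothetical helper name retarget_ucast:

```shell
# retarget_ucast FILE PEER_IP
# Prints FILE to stdout with the address on every "ucast DEV ADDR" line
# replaced by PEER_IP; all other lines pass through unchanged.
# (Illustrative helper; it does not modify FILE in place.)
retarget_ucast() {
    awk -v peer="$2" '
        $1 == "ucast" { print $1, $2, peer; next }
        { print }
    ' "$1"
}

# Usage (on the primary, writing the adjusted copy to the secondary):
#   retarget_ucast /etc/ha.d/ha.cf 192.168.1.31 | ssh yo 'cat > /etc/ha.d/ha.cf'
```

Writing to stdout instead of editing in place keeps the primary's own ha.cf untouched.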

6. Start the heartbeat service, on the primary first and then on the secondary

[[email protected] ha.d]# /etc/init.d/heartbeat start

Starting High-Availability services: INFO:  Resource is stopped

Done.

IV. Errors encountered at startup

1. respawn hacluster /usr/lib64/heartbeat/ipfail (check whether the system is 64-bit or 32-bit and use /usr/lib64 or /usr/lib accordingly; otherwise the following error occurs)

[[email protected] ha.d]# /etc/init.d/heartbeat start

heartbeat: udpport setting must precede media statements

heartbeat[4227]: 2016/01/01_05:04:26 ERROR: Client child command [/usr/lib/heartbeat/ipfail] is not executable

heartbeat[4227]: 2016/01/01_05:04:26 ERROR: Heartbeat not started: configuration error.

heartbeat[4227]: 2016/01/01_05:04:26 ERROR: Configuration error, heartbeat not started.
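The error above comes from a respawn line that names a path where ipfail does not exist as an executable. Before editing ha.cf, it can help to find out which of the two candidate paths is actually present. A minimal sketch, with the hypothetical helper name find_ipfail:

```shell
# find_ipfail CANDIDATE...
# Prints the first candidate path that exists and is executable;
# returns non-zero if none qualifies.
find_ipfail() {
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage: pick the path to put after "respawn hacluster" in ha.cf.
#   find_ipfail /usr/lib64/heartbeat/ipfail /usr/lib/heartbeat/ipfail
```

On a 64-bit install like the one in this walkthrough, the /usr/lib64 path is the one expected to be reported.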

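2. The following two errors are both caused by a wrong hostname: every place in the heartbeat configuration that names a host must match the machine's actual hostname (the output of uname -n), otherwise heartbeat will not start.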

[[email protected] ha.d]# /etc/init.d/heartbeat start

heartbeat: udpport setting must precede media statements

heartbeat: baudrate setting must precede media statements

heartbeat[28195]: 2015/10/29_21:31:14 info: Pacemaker support: false

heartbeat[28195]: 2015/10/29_21:31:14 ERROR: Current node [hh.huangmingming.com] not in configuration!

heartbeat[28195]: 2015/10/29_21:31:14 info: By default, cluster nodes are named by `uname -n` and must be declared with a 'node' directive in the ha.cf file.

heartbeat[28195]: 2015/10/29_21:31:14 info: See also: http://linux-ha.org/wiki/Ha.cf#node_directive

heartbeat[28195]: 2015/10/29_21:31:14 WARN: Logging daemon is disabled --enabling logging daemon is recommended

heartbeat[28195]: 2015/10/29_21:31:14 ERROR: Configuration error, heartbeat not started.

[[email protected] ha.d]# /etc/init.d/heartbeat start

Starting High-Availability services: INFO:  Resource is stopped

Heartbeat failure [rc=6]. Failed.

heartbeat: udpport setting must precede media statements

heartbeat: baudrate setting must precede media statements

heartbeat[30724]: 2015/10/29_21:56:29 info: Pacemaker support: false

heartbeat[30724]: 2015/10/29_21:56:29 WARN: Logging daemon is disabled --enabling logging daemon is recommended

heartbeat[30724]: 2015/10/29_21:56:29 info: **************************

heartbeat[30724]: 2015/10/29_21:56:29 info: Configuration validated. Starting heartbeat 3.0.4

heartbeat[30724]: 2015/10/29_21:56:29 ERROR: Bad nodename in /etc/ha.d//haresources [hh]

heartbeat[30724]: 2015/10/29_21:56:29 ERROR: Configuration error, heartbeat not started.
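Both failures above are name mismatches: in the first, uname -n returned hh.huangmingming.com while ha.cf declared hh.huangmingming.cn; in the second, haresources used the short name hh. A quick pre-flight check can catch either before starting the service. This is a sketch with hypothetical helper names (node_declared, haresources_nodes), not something heartbeat ships:

```shell
# node_declared NAME HACF
# Succeeds if NAME appears on a "node" directive in the ha.cf file HACF.
node_declared() {
    awk -v name="$1" '
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "$2"
}

# haresources_nodes FILE
# Prints the node name (first field) of every non-comment line in FILE.
haresources_nodes() {
    awk 'NF > 0 && $1 !~ /^#/ { print $1 }' "$1"
}

# Usage (on each node):
#   node_declared "$(uname -n)" /etc/ha.d/ha.cf || echo "hostname not in ha.cf"
#   for n in $(haresources_nodes /etc/ha.d/haresources); do
#       node_declared "$n" /etc/ha.d/ha.cf || echo "bad nodename in haresources: $n"
#   done
```

The comparison is an exact string match, which mirrors why the short name hh was rejected even though it resolves to the same machine.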

V. Failover test: if the primary goes down, check whether service switches to the secondary

1. Under normal conditions

[[email protected] ha.d]# /etc/init.d/heartbeat start

Starting High-Availability services: INFO:  Resource is stopped

Done.

[[email protected] ~]# ifconfig

eth0      Link encap:Ethernet  HWaddr 00:0C:29:97:EE:BF

inet addr:192.168.1.31  Bcast:192.168.1.255  Mask:255.255.255.0

inet6 addr: fe80::20c:29ff:fe97:eebf/64 Scope:Link

UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

RX packets:37359 errors:0 dropped:0 overruns:0 frame:0

TX packets:22139 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:1000

RX bytes:36402808 (34.7 MiB)  TX bytes:2377053 (2.2 MiB)

eth0:0    Link encap:Ethernet  HWaddr 00:0C:29:97:EE:BF

inet addr:192.168.1.13  Bcast:192.168.1.255  Mask:255.255.255.0

UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth0:1    Link encap:Ethernet  HWaddr 00:0C:29:97:EE:BF

inet addr:192.168.1.144  Bcast:192.168.1.255  Mask:255.255.255.0

UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

lo        Link encap:Local Loopback

inet addr:127.0.0.1  Mask:255.0.0.0

inet6 addr: ::1/128 Scope:Host

UP LOOPBACK RUNNING  MTU:16436  Metric:1

RX packets:28 errors:0 dropped:0 overruns:0 frame:0

TX packets:28 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:0

RX bytes:2727 (2.6 KiB)  TX bytes:2727 (2.6 KiB)
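In the listing above, the cluster IP 192.168.1.13 sits on eth0:0 of the primary. Rather than reading the full output by eye, the presence of the address can be tested directly. A minimal sketch, assuming the hypothetical helper name vip_present; it accepts both the ifconfig style ("inet addr:A") and the ip addr style ("inet A/24"):

```shell
# vip_present VIP
# Reads interface listing text on stdin and succeeds if VIP is configured.
vip_present() {
    # Escape the dots so they match literally rather than any character.
    vip_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    # The address must be followed by a space, a slash, or end of line,
    # so 192.168.1.13 does not match 192.168.1.131.
    grep -Eq "inet (addr:)?${vip_re}([ /]|\$)"
}

# Usage (on either node):
#   ifconfig | vip_present 192.168.1.13 && echo "this node holds the cluster IP"
```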

[[email protected] ha.d]# ps aux |grep nginx    (primary)

root      31367  0.0  0.1  96488  1728 ?        Ss   21:58   0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf

nginx     31368  0.0  0.2  96876  2500 ?        S    21:58   0:00 nginx: worker process

root      31643  0.0  0.0 103252   824 pts/0    S+   22:16   0:00 grep nginx

[[email protected] ha.d]# netstat -tnlp |grep nginx   (secondary)

[[email protected] ha.d]# ps aux |grep nginx

root       5217  0.0  0.0 103256   828 pts/0    S+   06:11   0:00 grep nginx

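2. Create a test page on each node (the first command on hh, the second on yo) and test access under normal conditions; at this point the primary serves the requests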

[[email protected] ha.d]# echo "AAAAAAAAAAAAAAAAhh192.168.1.31" >/usr/share/nginx/html/index.html

[[email protected] ha.d]# echo "AAAAAAAAAAAAAAAAyo192.168.1.250" >/usr/share/nginx/html/index.html
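Because each test page embeds its node's name and address, the page body fetched through the cluster IP tells you which node is currently serving. As a small sketch (the helper name which_node is hypothetical), the function below maps a page body to a node name:

```shell
# which_node BODY
# Prints "hh" or "yo" depending on which test page BODY came from;
# prints "unknown" and returns non-zero for anything else.
which_node() {
    case $1 in
        *hh192.168.1.31*)  echo hh ;;
        *yo192.168.1.250*) echo yo ;;
        *)                 echo unknown; return 1 ;;
    esac
}

# Usage (from a client on the same network, against the cluster IP):
#   which_node "$(curl -s http://192.168.1.13/)"
```

Run before and after the failover step, the result would be expected to change from hh to yo.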

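3. After the primary node goes down (test; simulated here by dropping ICMP on the primary, so that its check of the ping node 192.168.1.1 fails)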

[[email protected] ~]# iptables -A INPUT -p icmp -j DROP

[[email protected] ~]# tail /var/log/ha-log   (view the log)

ResourceManager(default)[32095]: 2015/10/29_22:24:20 info: Running /etc/ha.d/resource.d/IPaddr 192.168.1.13/24/eth0:0 start

IPaddr(IPaddr_192.168.1.13)[32256]: 2015/10/29_22:24:20 INFO: Adding inet address 192.168.1.13/24 with broadcast address 192.168.1.255 to device eth0 (with label eth0:0)

IPaddr(IPaddr_192.168.1.13)[32256]: 2015/10/29_22:24:20 INFO: Bringing device eth0 up

IPaddr(IPaddr_192.168.1.13)[32256]: 2015/10/29_22:24:20 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.1.13 eth0 192.168.1.13 auto not_used not_used

/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.1.13)[32230]: 2015/10/29_22:24:20 INFO:  Success

ResourceManager(default)[32095]: 2015/10/29_22:24:20 info: Running /etc/init.d/nginx  start

Oct 29 22:24:20 hh.huangmingming.cn ipfail: [30881]: info: NS: We are still alive!

Oct 29 22:24:20 hh.huangmingming.cn ipfail: [30881]: info: Link Status update: Link yo.huangmingming.cn/eth0 now has status dead

Oct 29 22:24:22 hh.huangmingming.cn ipfail: [30881]: info: Asking other side for ping node count.

Oct 29 22:24:22 hh.huangmingming.cn ipfail: [30881]: info: Checking remote count of ping nodes.

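4. On the secondary (yo), check whether nginx was started automatically, then access the site from a client; at this point the secondary serves the requests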

[[email protected] ha.d]# ps aux |grep nginx

root       5534  0.0  0.1  96496  1972 ?        Ss   06:20   0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf

nginx      5535  0.0  0.2  96884  2960 ?        S    06:20   0:00 nginx: worker process

root       5546  0.0  0.0 103256   828 pts/0    S+   06:24   0:00 grep nginx

[[email protected] ha.d]# ifconfig

eth0      Link encap:Ethernet  HWaddr 00:0C:29:8B:40:4A

inet addr:192.168.1.250  Bcast:192.168.1.255  Mask:255.255.255.0

inet6 addr: fe80::20c:29ff:fe8b:404a/64 Scope:Link

UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

RX packets:36106 errors:0 dropped:0 overruns:0 frame:0

TX packets:21435 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:1000

RX bytes:39751462 (37.9 MiB)  TX bytes:2077943 (1.9 MiB)

eth0:0    Link encap:Ethernet  HWaddr 00:0C:29:8B:40:4A

inet addr:192.168.1.13  Bcast:192.168.1.255  Mask:255.255.255.0

UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

lo        Link encap:Local Loopback

inet addr:127.0.0.1  Mask:255.0.0.0

inet6 addr: ::1/128 Scope:Host

UP LOOPBACK RUNNING  MTU:16436  Metric:1

RX packets:31 errors:0 dropped:0 overruns:0 frame:0

TX packets:31 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:0

RX bytes:3005 (2.9 KiB)  TX bytes:3005 (2.9 KiB)

Delete the firewall rule and test again

[[email protected] ~]# iptables -nvL

[[email protected] ~]# iptables -D INPUT -p icmp -j DROP

Time: 2024-12-25 23:33:08
