Be sure to install the openssh and openssh-clients packages first.
192.168.139.2
[[email protected] .ssh]# ssh-keygen -t rsa -P '' //set up mutual SSH trust between the two nodes
[[email protected] .ssh]# ssh-copy-id -i ./id_rsa.pub [email protected]
___________________________________________________________________________________________
192.168.139.4
[[email protected] .ssh]# ssh-keygen -t rsa -P ''
[[email protected] .ssh]# ssh-copy-id -i ./id_rsa.pub [email protected]
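Before going on, it is worth confirming that the key-based login really works in both directions. A minimal sketch, assuming the login user is root (the prompts above suggest it, but that is my assumption); check_ssh_trust is a hypothetical helper name:

```shell
# check_ssh_trust HOST: succeed only if key-based login works without a prompt.
# Assumption: the remote user is root.
check_ssh_trust() {
    # BatchMode=yes makes ssh fail instead of asking for a password;
    # ConnectTimeout=5 keeps the check from hanging on an unreachable host.
    ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true
}
```

On 192.168.139.2 this would be run as `check_ssh_trust 192.168.139.4 && echo trusted`, and the other way round on 192.168.139.4.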
___________________________________________________________________________________________
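Synchronize the clocks and add a cron job that re-syncs every five minutes. I use a public NTP server on the Internet, so both nodes must have Internet access.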
192.168.139.2
[[email protected] ~]# ntpdate 0.uk.pool.ntp.org
2 Nov 19:43:56 ntpdate[1715]: step time server 109.74.192.97 offset -28799.081856 sec
[[email protected] yum.repos.d]# vim /var/spool/cron/root
*/5 * * * * /usr/sbin/ntpdate 0.uk.pool.ntp.org > /dev/null
___________________________________________________________________________________________
192.168.139.4
[[email protected] ~]# ntpdate 0.uk.pool.ntp.org
2 Nov 19:43:56 ntpdate[1715]: step time server 109.74.192.97 offset -28799.081856 sec
[[email protected] yum.repos.d]# vim /var/spool/cron/root
*/5 * * * * /usr/sbin/ntpdate 0.uk.pool.ntp.org > /dev/null
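Editing the crontab by hand on every node is easy to get wrong, so the same step could be scripted. This is only a sketch: add_cron_line is a hypothetical helper that appends the line once and does nothing on later runs.

```shell
# add_cron_line FILE LINE: append LINE to FILE unless an identical line exists.
add_cron_line() {
    file=$1
    line=$2
    touch "$file"
    # -x matches the whole line, -F treats it as a fixed string
    # (the cron line contains * and / which must not act as a pattern).
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For this setup the call would be `add_cron_line /var/spool/cron/root '*/5 * * * * /usr/sbin/ntpdate 0.uk.pool.ntp.org > /dev/null'`.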
___________________________________________________________________________________________
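Install the software. Only the steps on 192.168.139.2 are shown; 192.168.139.4 is identical.
heartbeat website: www.linux-ha.org
pacemaker website: www.clusterlabs.org
EPEL: www.fedoraproject.org/wiki/EPEL (the Fedora project's open-source package site)
Alternatively, do as I did: add the fedoraproject EPEL repository as a third-party yum source, then install directly with yum.
[[email protected] ~]# rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm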
[[email protected] tool]# rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
[[email protected] tool]# yum -y install heartbeat
[[email protected] yum.repos.d]# rpm -q heartbeat //check the installed version: 3.0
heartbeat-3.0.4-2.el6.x86_64
[[email protected] yum.repos.d]# rpm -qi heartbeat //show the package information for heartbeat: version, vendor, description, etc.
Name : heartbeat Relocations: (not relocatable)
Version : 3.0.4 Vendor: Fedora Project
Release : 2.el6 Build Date: Tue 03 Dec 2013 12:37:21 AM CST
Install Date: Wed 02 Nov 2016 11:48:07 PM CST Build Host: buildvm-14.phx2.fedoraproject.org
Group : System Environment/Daemons Source RPM: heartbeat-3.0.4-2.el6.src.rpm
Size : 269152 License: GPLv2 and LGPLv2+
Signature : RSA/8, Tue 03 Dec 2013 06:59:13 AM CST, Key ID 3b49df2a0608b895
Packager : Fedora Project
URL : http://linux-ha.org/
Summary : Messaging and membership subsystem for High-Availability Linux
Description :
heartbeat is a basic high-availability subsystem for Linux-HA.
It will run scripts at initialization, and when machines go up or down.
This version will also perform IP address takeover using gratuitous ARPs.
Heartbeat contains a cluster membership layer, fencing, and local and
clusterwide resource management functionality.
When used with Pacemaker, it supports "n-node" clusters with significant
capabilities for managing resources and dependencies.
In addition it continues to support the older release 1 style of
2-node clustering.
It implements the following kinds of heartbeats:
- Serial ports
- UDP/IP multicast (ethernet, etc)
- UDP/IP broadcast (ethernet, etc)
- UDP/IP heartbeats
- "ping" heartbeats (for routers, switches, etc.)
(to be used for breaking ties in 2-node systems)
[[email protected] yum.repos.d]# rpm -ql heartbeat //files installed by the heartbeat package
/etc/ha.d
/etc/ha.d/README.config
/etc/ha.d/harc
/etc/ha.d/rc.d
/etc/ha.d/rc.d/ask_resources
/etc/ha.d/rc.d/hb_takeover
/etc/ha.d/rc.d/ip-request
/etc/ha.d/rc.d/ip-request-resp
/etc/ha.d/rc.d/status
/etc/ha.d/resource.d
/etc/ha.d/resource.d/AudibleAlarm
/etc/ha.d/resource.d/Delay
/etc/ha.d/resource.d/Filesystem
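[[email protected] ha.d]# ls /usr/share/doc/heartbeat-3.0.4/
apphbd.cf authkeys AUTHORS ChangeLog COPYING COPYING.LGPL ha.cf haresources
authkeys: must have mode 600. It holds the key the nodes use to authenticate each other's messages, so that nobody can simply add a server, configure the VIP on it, and have it join the cluster and take over resources.
ha.cf: the main configuration file of the heartbeat service.
haresources: the resource management configuration file, i.e. the CRM; in heartbeat v3 the CRM was split out into a separate project called pacemaker.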
[[email protected] etc]# cp /usr/share/doc/heartbeat-3.0.4/{authkeys,haresources,ha.cf} /etc/ha.d/ -p
[[email protected] etc]# dd if=/dev/random bs=512 count=1 |md5sum //read up to 512 bytes of random data and hash them with MD5 (a hash, not encryption)
0+1 records in
0+1 records out
53 bytes (53 B) copied, 0.000186312 s, 284 kB/s
e4b8f2837725f10ed16bfd1738b89541 -
[[email protected] etc]# vim /etc/ha.d/authkeys
auth 1
1 md5 e4b8f2837725f10ed16bfd1738b89541 //sign the heartbeat messages with MD5
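The key generation and the file edit can be combined into one step. The following is a sketch rather than the exact procedure used above: write_authkeys is a hypothetical helper, and it reads /dev/urandom instead of /dev/random (my substitution, so that the read never blocks; note that the dd output above shows only 53 bytes were actually read from /dev/random).

```shell
# write_authkeys PATH: create an authkeys file with a fresh random MD5 key.
write_authkeys() {
    path=$1
    # Assumption: /dev/urandom replaces the /dev/random used in the text.
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | awk '{print $1}')
    printf 'auth 1\n1 md5 %s\n' "$key" > "$path"
    # heartbeat expects the key file to be readable by root only.
    chmod 600 "$path"
}
```

It would be called as `write_authkeys /etc/ha.d/authkeys` on one node, and the resulting file then copied to the other node so that both share the same key.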
[[email protected] etc]# vim /etc/ha.d/ha.cf
#
# File to write debug messages to
#debugfile /var/log/ha-debug //debug log
#
#
# File to write other messages to
#
logfile /var/log/ha-log //heartbeat log
#
#
# Facility to use for syslog()/logger
#
#logfacility local0 //local0 is a syslog facility: log through syslog; should not be enabled together with logfile
#
#
# A note on specifying "how long" times below...
#
# The default time unit is seconds
# 10 means ten seconds
#
# You can also specify them in milliseconds
# 1500ms means 1.5 seconds
#
#
# keepalive: how long between heartbeats?
#
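keepalive 2 //send a heartbeat every 2 seconds
deadtime 30 //declare the peer dead after 30 seconds without a heartbeat
#udpport 694 //send heartbeats over UDP port 694
#bcast eth0 # Linux //send heartbeats as broadcasts on eth0
#bcast eth1 eth2 # Linux
#mcast eth0 225.0.0.1 694 1 0 //multicast to group 225.0.0.1 on eth0: UDP port 694, TTL 1, loop 0
#ucast eth0 192.168.1.2 //unicast to 192.168.1.2 via eth0
#auto_failback on //whether resources move back to the primary node once it recovers
#node ken3
#node kathy //add your cluster nodes below; the names must match the output of uname -n
node www.rs1.com
node www.rs2.com
#
ping 10.10.10.254 //a ping node, e.g. the gateway 192.168.139.1, used to judge whether this node itself has lost connectivity
#ping_group group1 10.10.10.254 10.10.10.253 //a ping group: a reply from any member counts as connectivity
#respawn hacluster /usr/lib/heartbeat/ipfail //run ipfail as user hacluster and restart it whenever it exits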
[[email protected] etc]# cat ./ha.d/ha.cf |grep -v "^#.*" //in the end only these lines need to be active; the node and bcast lines alone would even be enough
logfile /var/log/ha-log
keepalive 2
deadtime 30
bcast eth0 # Linux
node www.rs1.com
node www.rs2.com
ping 192.168.139.1
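The grep above still prints blank lines and indented comments. A slightly stricter filter, as a sketch (active_lines is a hypothetical helper name):

```shell
# active_lines FILE: print only lines that are neither blank nor comments.
active_lines() {
    # Drop lines that, after optional leading whitespace, are empty or start with #.
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}
```

Running `active_lines /etc/ha.d/ha.cf` on both nodes and comparing the output is a quick way to spot a directive that is active on one node only.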
Define the cluster resources
[[email protected] etc]# vim /etc/ha.d/haresources
#
# An example where a shared filesystem is to be used.
# Note that multiple aguments are passed to this script using
# the delimiter '::' to separate each argument.
#
#node1 10.0.0.170 Filesystem::/dev/sda1::/data1::ext2
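Each line defines one cluster service.
node1: the name of the primary node; it must match the output of uname -n
10.0.0.170: the VIP, the first resource
Filesystem: a resource agent, the second resource; its arguments are separated by ::
1 /dev/sda1: first argument of Filesystem, the device to mount
2 /data1: second argument of Filesystem, the mount point
3 ext2: third argument of Filesystem, the filesystem type
Taken together, 1, 2 and 3 mean: mount /dev/sda1 on /data1 as ext2.
#just.linux-ha.org 135.9.216.110 135.9.215.111 135.9.216.112 httpd //several VIPs for one httpd service: the service is reachable on all three addresses, all held by the node that owns the group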
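To make the line layout concrete, here is a small sketch that splits a haresources line the way it was just described: first word is the node, every following word is a resource, and '::' separates an agent from its arguments. explain_haresources is a hypothetical helper, not part of heartbeat.

```shell
# explain_haresources LINE: print the node, then each resource with its arguments.
explain_haresources() {
    # Split the line on whitespace; inside a function this only changes
    # the function's own positional parameters.
    set -- $1
    echo "node: $1"
    shift
    for res in "$@"; do
        # Turn every '::' into a space so agent and arguments read as separate words.
        echo "resource: $(printf '%s' "$res" | sed 's/::/ /g')"
    done
}
```

For the sample line, `explain_haresources 'node1 10.0.0.170 Filesystem::/dev/sda1::/data1::ext2'` prints the node, then `resource: 10.0.0.170`, then `resource: Filesystem /dev/sda1 /data1 ext2`.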
[[email protected] etc]# cd /etc/ha.d/resource.d/
[[email protected] resource.d]# ls //the resource agents, such as Filesystem and IPaddr, live here
apache AudibleAlarm db2 Delay Filesystem hto-mapfuncs ICP ids IPaddr IPaddr2 IPsrcaddr IPv6addr LinuxSCSI LVM MailTo OCF portblock Raid1 SendArp ServeRAID WAS WinPopup Xinetd
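[[email protected] resource.d]# chkconfig httpd off //never let the service start at boot; the CRM must decide where it runs. This lab uses haresources as the CRM
[[email protected] ha]# vim /etc/ha.d/haresources //define the resources
www.rs1.com IPaddr::192.168.139.10/24/eth0 httpd //www.rs1.com is the primary node; the VIP 192.168.139.10 with a 24-bit netmask is configured as an alias on eth0
[[email protected] ha]# scp -p authkeys ha.cf haresources 192.168.139.4:/etc/ha.d/
//the configuration files are identical on both nodes, so simply copy them over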
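Since both nodes must have identical files, a checksum comparison after the copy can catch a forgotten scp. A sketch only; config_digest is a hypothetical helper:

```shell
# config_digest FILE...: print one MD5 digest per file, in argument order.
config_digest() {
    md5sum "$@" | awk '{print $1}'
}
```

Run from /etc/ha.d on the primary, the local result of `config_digest authkeys ha.cf haresources` can be compared with the output of `ssh 192.168.139.4 'cd /etc/ha.d && md5sum authkeys ha.cf haresources' | awk '{print $1}'`; any difference means one of the three files is out of sync.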
[[email protected] etc]# service heartbeat start //start the primary node first
[[email protected] ha.d]# ssh 192.168.139.4 service heartbeat start //start the standby node over ssh
[[email protected] ~]# ip addr show //the VIP is now active
: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:1c:13:12 brd ff:ff:ff:ff:ff:ff
inet 192.168.139.2/24 brd 192.168.139.255 scope global eth0
inet 192.168.139.10/24 brd 192.168.139.255 scope global secondary eth0
inet6 fe80::20c:29ff:fe1c:1312/64 scope link
valid_lft forever preferred_lft forever
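Reading the ip addr output by eye works, but for the failover tests a scripted check is handy. A minimal sketch (has_vip is a hypothetical helper) that reads `ip addr show` output on standard input:

```shell
# has_vip IP: read `ip addr show` output on stdin; succeed if IP is configured.
has_vip() {
    # Escape the dots so they match literally, and require the trailing "/"
    # so that 192.168.139.10 does not match 192.168.139.100.
    pattern=$(printf '%s' "$1" | sed 's/\./\\./g')
    grep -qE "^[[:space:]]*inet ${pattern}/"
}
```

On either node, `ip addr show | has_vip 192.168.139.10 && echo "VIP is here"` tells which machine currently holds the address.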
[[email protected] ~]# netstat -unlp //heartbeat is running too
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address PID/Program name
udp 0 0 127.0.0.1:659 1323/rpc.statd
udp 0 0 0.0.0.0:694 1604/heartbeat: wri
udp 0 0 0.0.0.0:55890 1604/heartbeat: wri
udp 0 0 0.0.0.0:111 1301/rpcbind
udp 0 0 0.0.0.0:628 1301/rpcbind
udp 0 0 0.0.0.0:52727 1323/rpc.statd
udp 0 0 :::46780 1323/rpc.statd
udp 0 0 :::111 1301/rpcbind
udp 0 0 :::628 1301/rpcbind
[[email protected] ~]# netstat -tnlp |grep httpd //the httpd service has started as well
tcp 0 0 :::80 :::* LISTEN 2284/httpd
[[email protected] ~]# iptables -F //flush the iptables rules
Browsing to the VIP 192.168.139.10 shows the page of the primary node, RS1.
[[email protected] ~]# service heartbeat stop //stop heartbeat on 192.168.139.2 and see whether the resources move
Stopping High-Availability services: Done.
___________________________________________________________________________________________
192.168.139.4
[[email protected] ~]# vim /var/log/ha-log //the log shows that the VIP and the httpd service were brought up on 192.168.139.4
Nov 03 17:52:46 www.rs1.com heartbeat: [2393]: info: Local Resource acquisition completed.
harc(default)[2629]: 2016/11/03_17:52:46 info: Running /etc/ha.d//rc.d/status status
mach_down(default)[2645]: 2016/11/03_17:52:46 info: mach_down takeover complete for node www.rs2.com.
Nov 03 17:52:46 www.rs1.com heartbeat: [2385]: info: Initial resource acquisition complete (status)
harc(default)[2671]: 2016/11/03_17:52:46 info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp
ip-request-resp(default)[2671]: 2016/11/03_17:52:46 received ip-request-resp IPaddr::192.168.139.10/24/eth1 OK yes
ResourceManager(default)[2692]: 2016/11/03_17:52:46 info: Acquiring resource group: www.rs1.com IPaddr::192.168.139.10/24/eth1 httpd
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.139.10)[2719]: 2016/11/03_17:52:47 INFO: Resource is stopped
ResourceManager(default)[2692]: 2016/11/03_17:52:47 info: Running /etc/ha.d/resource.d/IPaddr 192.168.139.10/24/eth1 start
IPaddr(IPaddr_192.168.139.10)[2842]: 2016/11/03_17:52:47 INFO: Adding inet address 192.168.139.10/24 with broadcast address 192.168.139.255 to device eth1
IPaddr(IPaddr_192.168.139.10)[2842]: 2016/11/03_17:52:47 INFO: Bringing device eth1 up
IPaddr(IPaddr_192.168.139.10)[2842]: 2016/11/03_17:52:47 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.139.10 eth1 192.168.139.10 auto not_used not_used
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.139.10)[2816]: 2016/11/03_17:52:47 INFO: Success
ResourceManager(default)[2692]: 2016/11/03_17:52:47 info: Running /etc/init.d/httpd start
harc(default)[2954]: 2016/11/03_17:52:48 info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp
ip-request-resp(default)[2954]: 2016/11/03_17:52:48 received ip-request-resp IPaddr::192.168.139.10/24/eth1 OK yes
ResourceManager(default)[2983]: 2016/11/03_17:52:48 info: Acquiring resource group: www.rs1.com IPaddr::192.168.139.10/24/eth1 httpd
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.139.10)[3010]: 2016/11/03_17:52:48 INFO: Running OK
Nov 03 17:54:30 www.rs1.com heartbeat: [2385]: info: Link www.rs1.com:eth1 up.
[[email protected] ~]# ip addr show //the VIP has moved over
eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:5f:68:2f brd ff:ff:ff:ff:ff:ff
inet 192.168.139.4/24 brd 192.168.139.255 scope global eth1
inet 192.168.139.10/24 brd 192.168.139.255 scope global secondary eth1
inet6 fe80::20c:29ff:fe5f:682f/64 scope link
valid_lft forever preferred_lft forever
[[email protected] ~]# netstat -tnlp //the httpd service has moved over
tcp 0 0 :::80 :::* LISTEN 2952/httpd
[[email protected] ~]# iptables -F //flush the iptables rules
A browser test now shows RS2.
___________________________________________________________________________________________
192.168.139.2
[[email protected] ~]# service heartbeat start //start heartbeat again
Starting High-Availability services: INFO: Resource is stopped
Done.
___________________________________________________________________________________________
192.168.139.4
[[email protected] ~]# service heartbeat stop //stop heartbeat
Stopping High-Availability services: Done.
___________________________________________________________________________________________
192.168.139.2
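Testing in the browser again shows that the resources have moved back. With auto_failback on, the resources would move back automatically as soon as RS1 recovers; that option is not enabled in this lab.
Now add one more host, 192.168.139.3, as an NFS server that exports the shared web pages.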
___________________________________________________________________________________________
192.168.139.2
[[email protected] htdocs]# ssh 192.168.139.4 service heartbeat stop //stop heartbeat on the standby node first
Stopping High-Availability services: Done.
[[email protected] htdocs]# service heartbeat stop //then stop heartbeat on the primary node
Stopping High-Availability services: Done.
___________________________________________________________________________________________
192.168.139.3
[[email protected] htdocs]# vim /etc/exports //edit the NFS exports file
/web/htdocs 192.168.139.0/24(ro) //export /web/htdocs read-only to the 192.168.139.0/24 network
[[email protected] ~]# mkdir -pv /web/htdocs
[[email protected] ~]# cd /web/htdocs
[[email protected] htdocs]# vim index.html //create the home page that the browser will see once the share is mounted
<h1>www.NFS.com</h1>
[[email protected] ~]# service rpcbind start //start rpcbind
Starting rpcbind: [ OK ]
[[email protected] ~]# service nfs start //start NFS
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS mountd: [ OK ]
Starting NFS daemon: [ OK ]
Starting RPC idmapd: [ OK ]
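Use showmount -e to check that the directory is exported. When I ran it, it seemed that 192.168.139.2 also needed service nfs start before the export list showed up; otherwise the command failed with:
clnt_create: RPC: Port mapper failure - Unable to receive: errno 113 (No route to host)
(errno 113 here often means that a firewall is rejecting the rpcbind port, so the iptables rules on the NFS server are worth checking first.) Even so, the share could still be mounted from 192.168.139.2: # mount 192.168.139.3:/web/htdocs /mnt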
[[email protected] ~]# showmount -e 192.168.139.2
Export list for 192.168.139.2:
/web/htdocs 192.168.139.0/24
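The export list can also be checked from a script before heartbeat is asked to mount the share. A sketch, with is_exported as a hypothetical helper that reads showmount -e output on standard input:

```shell
# is_exported DIR: read `showmount -e` output on stdin; succeed if DIR is listed.
is_exported() {
    # Skip the "Export list for ..." header line, then compare the first column.
    awk -v dir="$1" 'NR > 1 && $1 == dir { found = 1 } END { exit !found }'
}
```

Usage would look like `showmount -e 192.168.139.3 | is_exported /web/htdocs && echo exported`, assuming 192.168.139.3 is the NFS server as set up above.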
___________________________________________________________________________________________
192.168.139.2
[[email protected] /]# mount 192.168.139.3:/web/htdocs /mnt
[[email protected] mnt]# ll
total 4
-rw-r--r--. 1 nobody nobody 21 Nov 4 12:55 index.html
[[email protected] /]# umount /mnt
[[email protected] /]# vim /etc/ha.d/haresources
www.rs1.com IPaddr::192.168.139.10/24/eth0 Filesystem::192.168.139.3:/web/htdocs::/var/www/html::nfs httpd
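The order of the resources on this line matters: as far as I know, heartbeat starts the resources of a group from left to right and stops them from right to left, so the VIP comes up first, then the NFS mount, then httpd, and httpd is stopped before the mount is released. A sketch that prints both orders for a given line (resource_order is a hypothetical helper):

```shell
# resource_order LINE: print the start order, then the reverse stop order.
resource_order() {
    set -- $1
    shift                       # drop the node name, keep the resources
    start="$*"
    stop=""
    for res in "$@"; do
        # Prepend each resource so the list ends up reversed.
        stop="$res${stop:+ $stop}"
    done
    echo "start: $start"
    echo "stop: $stop"
}
```

If httpd were listed before Filesystem, it would start before /var/www/html is mounted and serve the local directory instead of the NFS content.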
[[email protected] /]# scp /etc/ha.d/haresources 192.168.139.4:/etc/ha.d/haresources
[[email protected] mnt]# service heartbeat start
[[email protected] mnt]# ssh 192.168.139.4 service heartbeat start
[[email protected] html]# iptables -F
[[email protected] html]# setenforce 0
Test in the browser. The first request may show the Apache default page; refresh once.
[[email protected] html]# service heartbeat stop
Stopping High-Availability services:
___________________________________________________________________________________________
192.168.139.4
[[email protected] html]# netstat -tnlp |grep httpd //the resources have moved over
tcp 0 0 :::80 LISTEN 2445/httpd
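[[email protected] html]# iptables -F
[[email protected] html]# setenforce 0
Test in the browser.
In this way the primary and standby nodes share one NFS server. When the primary fails and the resources move to the standby, the standby mounts the same NFS export, so the page content stays the same. With three machines, a simple HA + NFS cluster is complete.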
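Instead of refreshing the browser during a failover test, the VIP can be polled from a third machine. A sketch under my own assumptions: curl is installed on the test machine, and watch_vip is a hypothetical helper.

```shell
# watch_vip URL COUNT: fetch URL COUNT times, one second apart, printing each result.
watch_vip() {
    i=0
    while [ "$i" -lt "$2" ]; do
        # -s keeps curl quiet, -m 2 gives up after two seconds,
        # so an unreachable VIP is reported as DOWN instead of hanging.
        body=$(curl -s -m 2 "$1") && echo "UP: $body" || echo "DOWN"
        i=$((i + 1))
        sleep 1
    done
}
```

Running `watch_vip http://192.168.139.10/ 60` while stopping heartbeat on the primary shows how long the page is unavailable and confirms that the same NFS-backed page comes back from the standby.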