HA Cluster Basics and Implementing HA with heartbeat
Environment
node1:192.168.1.121 CentOS6.7
node2:192.168.1.122 CentOS6.7
node3:192.168.1.123 CentOS6.7
vip 192.168.1.88
Preparation
# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.121 node1
192.168.1.122 node2
192.168.1.123 node3
# ssh-keygen -t rsa -P ''
# ssh-copy-id -i ~/.ssh/id_rsa.pub node1
# ssh-copy-id -i ~/.ssh/id_rsa.pub node2
# ssh-copy-id -i ~/.ssh/id_rsa.pub node3
# rpm -ivh epel-release-latest-6.noarch.rpm
# yum -y install ansible
# cat /etc/ansible/hosts
[ha]
192.168.1.121
192.168.1.122
192.168.1.123
# ansible ha -m copy -a 'src=/etc/hosts dest=/etc'
# ansible ha -m shell -a 'ntpdate 192.168.1.62'
# ansible ha -m cron -a 'minute="*/3" job="/usr/sbin/ntpdate 192.168.1.62" name="ntpdate"'
01 HA Cluster and Corosync
[[email protected] ~]# yum info corosync
Loaded plugins: fastestmirror, refresh-packagekit, security
Determining fastest mirrors
epel/metalink | 5.8 kB 00:00
* base: mirrors.163.com
* epel: mirrors.tuna.tsinghua.edu.cn
* extras: mirrors.zju.edu.cn
* updates: mirrors.163.com
base | 3.7 kB 00:00
extras | 3.4 kB 00:00
updates | 3.4 kB 00:00
updates/primary_db | 2.6 MB 00:00
Available Packages
Name : corosync
Arch : x86_64
Version : 1.4.7
Release : 5.el6
Size : 216 k
Repo : base
Summary : The Corosync Cluster Engine and Application Programming Interfaces
URL : http://ftp.corosync.org
License : BSD
Description : This package contains the Corosync Cluster Engine Executive,
: several default APIs and libraries, default configuration files,
: and an init script.
Run on each node:
[[email protected] ~]# yum -y install corosync pacemaker
[[email protected] ~]# cd /etc/corosync/
[[email protected] corosync]# cp corosync.conf.example corosync.conf
[[email protected] corosync]# vim corosync.conf
1. Enable authentication
Change
secauth: off
to
secauth: on
2. Set the network address
bindnetaddr: 192.168.1.0 # no change needed for this test
3. Append at the end of the file:
service {
ver: 0
name: pacemaker
use_mgmtd: yes
}
aisexec {
user: root
group: root
}
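In step 2, bindnetaddr takes the network address of the cluster subnet rather than a node's own IP. The sketch below shows one way to derive it from an address and netmask. The function name network_addr is hypothetical, and the 255.255.255.0 netmask is an assumption, since these notes do not state it.

```shell
# Hypothetical helper: print the IPv4 network address for a given
# address and netmask (both in dotted-quad form).
network_addr() {
    local ip=$1 mask=$2
    local IFS=.
    # Split both dotted quads into eight positional parameters.
    set -- $ip $mask
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# Usage (netmask assumed to be /24):
#   network_addr 192.168.1.121 255.255.255.0    # -> 192.168.1.0
```

With the assumed /24 netmask, all three nodes map to 192.168.1.0, which matches the example file's default and explains why no change was needed here.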
Verify that the NIC supports MULTICAST
Command: ip link show
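To avoid reading the flags by eye, the output of ip link show can be filtered for interfaces that carry the MULTICAST flag. This is only a sketch: the function name multicast_ifaces is hypothetical, and it assumes the usual "N: name: <FLAGS> ..." layout of iproute2 output.

```shell
# Hypothetical helper: read `ip link show` output on stdin and print
# the names of interfaces whose flag list contains MULTICAST.
multicast_ifaces() {
    awk -F': ' '/^[0-9]+: / && /<[^>]*MULTICAST[^>]*>/ { print $2 }'
}

# Usage:
#   ip link show | multicast_ifaces
```

If the interface used for cluster traffic is missing from the result, multicast transport will not work on it.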
Generate the authentication key file
[[email protected] corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Writing corosync key to /etc/corosync/authkey.
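# Note: if /dev/random cannot supply the 1024 bits of entropy and key generation stalls, generate some activity on the server (keystrokes, disk I/O) until it completes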
[[email protected] corosync]# ll
total 24
-r-------- 1 root root 128 Oct 10 09:49 authkey
[[email protected] corosync]# scp -p authkey corosync.conf node2:/etc/corosync/
[[email protected] corosync]# ansible ha -m shell -a 'service corosync start'
Check that the corosync engine started correctly:
[[email protected] corosync]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
Oct 10 09:55:06 corosync [MAIN ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
Oct 10 09:55:06 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Check that the initial membership notifications were sent:
[[email protected] corosync]# grep TOTEM /var/log/cluster/corosync.log
Oct 10 09:55:06 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
Oct 10 09:55:06 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Oct 10 09:55:06 corosync [TOTEM ] The network interface [192.168.1.121] is now up.
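Check whether any errors occurred during startup. The errors below mean that pacemaker will soon no longer run as a corosync plugin, so cman is recommended as the cluster infrastructure service; they can be safely ignored here.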
[[email protected] corosync]# grep ERROR: /var/log/cluster/corosync.log | grep -v unpack_resources
Oct 10 09:55:06 corosync [pcmk ] ERROR: process_ais_conf: You have configured a cluster using the Pacemaker plugin for Corosync. The plugin is not supported in this environment and will be removed very soon.
Oct 10 09:55:06 corosync [pcmk ] ERROR: process_ais_conf: Please see Chapter 8 of 'Clusters from Scratch' (http://www.clusterlabs.org/doc) for details on using Pacemaker with CMAN
Oct 10 09:55:07 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child process mgmtd exited (pid=3434, rc=100)
Check that pacemaker started correctly:
[[email protected] corosync]# grep pcmk_startup /var/log/cluster/corosync.log
Oct 10 09:55:06 corosync [pcmk ] info: pcmk_startup: CRM: Initialized
Oct 10 09:55:06 corosync [pcmk ] Logging: Initialized pcmk_startup
Oct 10 09:55:06 corosync [pcmk ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
Oct 10 09:55:06 corosync [pcmk ] info: pcmk_startup: Service: 9
Oct 10 09:55:06 corosync [pcmk ] info: pcmk_startup: Local hostname: node1
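The log checks above can be bundled into one helper that reports which startup messages are present. This is a sketch only: the function name check_corosync_log is hypothetical, and the search strings are simply the ones that appear in the log excerpts above, so they may differ on other corosync or pacemaker versions.

```shell
# Hypothetical helper: look for the startup messages shown above in a
# corosync log file. Returns non-zero if any of them is missing.
check_corosync_log() {
    local log=${1:-/var/log/cluster/corosync.log}
    local rc=0 pattern
    for pattern in \
        "Corosync Cluster Engine" \
        "Successfully read main configuration file" \
        "Initializing transport" \
        "pcmk_startup: CRM: Initialized"
    do
        # -F: match the pattern as a fixed string, not a regex.
        if grep -qF -- "$pattern" "$log"; then
            echo "ok: $pattern"
        else
            echo "missing: $pattern"
            rc=1
        fi
    done
    return $rc
}

# Usage:
#   check_corosync_log /var/log/cluster/corosync.log
```

A "missing" line only means that message was not found in the log; the ERROR lines discussed earlier still need to be reviewed separately.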
Run on each node:
[[email protected] ~]# yum --nogpgcheck localinstall crmsh-2.1-1.6.x86_64.rpm pssh-2.0-1.el6.rf.noarch.rpm
02 Configuring pacemaker with crmsh
crmsh could not be set up successfully, so this experiment could not be carried out.
03 DRBD Basics and Usage
Environment
node1:192.168.1.151 CentOS6.5
node2:192.168.1.152 CentOS6.5
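Prerequisites: time synchronization and hostname-based access between the nodes
Create a 5 GB disk partition on each of the two hosts (note: do not format it)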
[[email protected] ~]# rpm -ivh drbd-8.4.3-33.el6.x86_64.rpm drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64.rpm
[[email protected] ~]#
[[email protected] ~]# rpm -qa | grep drbd
drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64
drbd-8.4.3-33.el6.x86_64
[[email protected] ~]# cd /etc/drbd.d/
[[email protected] drbd.d]# vim global_common.conf
Change
usage-count yes;
to
usage-count no;
Add inside the disk {} section
on-io-error detach;
Add inside the net {} section
cram-hmac-alg "sha1";
shared-secret "mydrbdshared123";
Add after the net {} section
syncer {
rate 500M;
}
[[email protected] drbd.d]# vim mystore.res
resource mystore {
device /dev/drbd0;
disk /dev/sda4;
meta-disk internal;
on node1 {
address 192.168.1.151:7789;
}
on node2 {
address 192.168.1.152:7789;
}
}
[[email protected] ~]# scp -r /etc/drbd.* node2:/etc
drbd.conf 100% 133 0.1KB/s 00:00
global_common.conf 100% 1942 1.9KB/s 00:00
mystore.res 100% 169 0.2KB/s 00:00
Create the metadata on both nodes:
[[email protected] ~]# drbdadm create-md mystore
Start DRBD on both nodes:
[[email protected] ~]# service drbd start
Check DRBD's running state
[[email protected] ~]# cat /proc/drbd
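The fields that matter in /proc/drbd are the connection state (cs), the roles (ro) and the disk states (ds). The sketch below pulls just those out for each device. The function name drbd_state is hypothetical, and the parsing assumes the DRBD 8.4 layout, where each device line starts with its minor number followed by a colon.

```shell
# Hypothetical helper: read /proc/drbd content on stdin and print
# "<minor> <cs> <ro> <ds>" for every DRBD device line.
drbd_state() {
    awk '$1 ~ /^[0-9]+:$/ {
        minor = $1
        sub(/:$/, "", minor)
        cs = ro = ds = "?"
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^cs:/)      cs = substr($i, 4)
            else if ($i ~ /^ro:/) ro = substr($i, 4)
            else if ($i ~ /^ds:/) ds = substr($i, 4)
        }
        print minor, cs, ro, ds
    }'
}

# Usage:
#   drbd_state < /proc/drbd
```

Right after the first start, before any node has been promoted, both roles would be expected to read Secondary and both disks Inconsistent.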
Make the current node the primary
[[email protected] ~]# drbdadm primary --force mystore
Watch the synchronization progress
[[email protected] ~]# watch -n1 'cat /proc/drbd'
[[email protected] ~]# mke2fs -t ext4 /dev/drbd0
[[email protected] ~]# mount /dev/drbd0 /mnt
[[email protected] ~]# cd /mnt/
[[email protected] mnt]# ls
lost+found
[[email protected] mnt]# cp /etc/issue .
Switch the primary and secondary nodes
[[email protected] mnt]# cd
[[email protected] ~]# umount /mnt
[[email protected] ~]# drbdadm secondary mystore # demote this node to secondary
[[email protected] ~]# drbd-overview
0:mystore/0 Connected Secondary/Secondary UpToDate/UpToDate C r-----
[[email protected] ~]# drbdadm primary mystore
[[email protected] ~]# mount /dev/drbd0 /mnt
[[email protected] ~]# cd /mnt/
[[email protected] mnt]# ls
issue lost+found
[[email protected] mnt]# vim issue
Add
hello drbd # any content will do
Switch primary and secondary again and check the added content; it is the same on both nodes
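The manual switchover above always follows the same order: unmount and demote on the old primary, then promote and mount on the new one. A minimal sketch of that sequence follows. The function names run, drbd_demote and drbd_promote and the DRY_RUN variable are hypothetical; with DRY_RUN=1 the commands are only printed, which makes it possible to review them before touching the devices.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, else run it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the current primary: release the filesystem, then give up the role.
drbd_demote() {
    local res=$1 mnt=$2
    run umount "$mnt" && run drbdadm secondary "$res"
}

# On the other node: take the primary role, then mount the device.
drbd_promote() {
    local res=$1 dev=$2 mnt=$3
    run drbdadm primary "$res" && run mount "$dev" "$mnt"
}

# Usage:
#   old primary:  drbd_demote mystore /mnt
#   new primary:  drbd_promote mystore /dev/drbd0 /mnt
```

The order matters because only one node may be primary at a time in this setup, so the old primary has to be demoted before the other node is promoted.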
04 HA MySQL with DRBD
[[email protected] ~]# ansible ha -m shell -a 'service drbd stop'
[[email protected] ~]# ansible ha -m shell -a 'chkconfig drbd off'
Because of the crm problem, this test did not succeed.
05 corosync/pacemaker Cluster and pcs
[[email protected] ~]# ansible ha -m shell -a 'yum -y install corosync pacemaker'
[[email protected] ~]# cd /etc/corosync/
[[email protected] corosync]# cp corosync.conf.example corosync.conf
[[email protected] corosync]# vim corosync.conf
Change
secauth: off
to
secauth: on
Comment out to_syslog: yes by putting # in front of it
Append at the end
service {
ver: 0
name: pacemaker
}
Generate the key file
[[email protected] corosync]# corosync-keygen
[[email protected] corosync]# scp -p authkey corosync.conf node2:/etc/corosync/
Start corosync on both nodes:
[[email protected] corosync]# service corosync start
[[email protected] corosync]# yum -y install pcs
[[email protected] ~]# yum -y install pcs
[[email protected] corosync]# pcs status
[[email protected] corosync]# pcs property set stonith-enabled=false
[[email protected] corosync]# pcs property set no-quorum-policy=ignore
Start pcsd on both nodes:
[[email protected] ~]# service pcsd start
Define the webip resource
[[email protected] corosync]# pcs resource create webip ocf:heartbeat:IPaddr ip=192.168.1.88 op monitor interval=10s timeout=20s
[[email protected] corosync]# pcs status
Cluster name:
Last updated: Wed Oct 12 13:22:01 2016 Last change: Wed Oct 12 13:21:48 2016 by root via cibadmin on node1
Stack: classic openais (with plugin)
Current DC: node2 (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum
2 nodes and 1 resource configured, 2 expected votes
Online: [ node1 node2 ]
Full list of resources:
webip (ocf::heartbeat:IPaddr): Started node1
Define the webserver resource
[[email protected] corosync]# ansible ha -m shell -a 'yum -y install httpd'
[[email protected] corosync]# vim /var/www/html/index.html
<h1>node1</h1>
[[email protected] corosync]# chkconfig httpd off
[[email protected] ~]# vim /var/www/html/index.html
<h1>node2</h1>
[[email protected] ~]# chkconfig httpd off
[[email protected] ~]# pcs resource create webserver lsb:httpd op monitor interval=20s timeout=30s
[[email protected] ~]# pcs status
Cluster name:
Last updated: Wed Oct 12 13:30:34 2016 Last change: Wed Oct 12 13:30:14 2016 by root via cibadmin on node1
Stack: classic openais (with plugin)
Current DC: node2 (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum
2 nodes and 2 resources configured, 2 expected votes
Online: [ node1 node2 ]
Full list of resources:
webip (ocf::heartbeat:IPaddr): Started node1
webserver (lsb:httpd): Started node2
Add constraints
1) colocation constraint
[[email protected] ~]# pcs constraint colocation add webserver with webip
[[email protected] ~]# pcs status
Cluster name:
Last updated: Wed Oct 12 14:34:11 2016 Last change: Wed Oct 12 14:33:50 2016 by root via cibadmin on node1
Stack: classic openais (with plugin)
Current DC: node2 (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum
2 nodes and 2 resources configured, 2 expected votes
Online: [ node1 node2 ]
2) order constraint
[[email protected] ~]# pcs constraint order webip then webserver
Adding webip webserver (kind: Mandatory) (Options: first-action=start then-action=start)
3) location constraint
[[email protected] ~]# pcs constraint location webip prefers node1=300
View constraints
1) order constraint
[[email protected] ~]# pcs constraint order show
Ordering Constraints:
start webip then start webserver (kind:Mandatory)
2) colocation constraint
[[email protected] ~]# pcs constraint colocation show
Colocation Constraints:
webserver with webip (score:INFINITY)
3) location constraint
[[email protected] ~]# pcs constraint location show
Location Constraints:
Resource: webip
Enabled on: node1 (score:300)
Take node1 offline (standby)
[[email protected] ~]# pcs cluster standby node1
[[email protected] ~]# pcs status
Cluster name:
Last updated: Wed Oct 12 14:49:08 2016 Last change: Wed Oct 12 14:48:54 2016 by root via crm_attribute on node1
Stack: classic openais (with plugin)
Current DC: node2 (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum
2 nodes and 2 resources configured, 2 expected votes
Node node1: standby
Online: [ node2 ]
Full list of resources:
webip (ocf::heartbeat:IPaddr): Started node2
webserver (lsb:httpd): Started node2
Bring the offline node back online
[[email protected] ~]# pcs cluster unstandby node1
[[email protected] ~]# pcs status
Cluster name:
Last updated: Wed Oct 12 14:50:46 2016 Last change: Wed Oct 12 14:50:37 2016 by root via crm_attribute on node1
Stack: classic openais (with plugin)
Current DC: node2 (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum
2 nodes and 2 resources configured, 2 expected votes
Online: [ node1 node2 ]
Full list of resources:
webip (ocf::heartbeat:IPaddr): Started node1
webserver (lsb:httpd): Started node1
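After node1 comes back, both resources return to it: webip because of its location preference for node1 (score 300), and webserver because of the colocation constraint. To check placement without reading the whole status report, the resource lines can be reduced to "resource node" pairs. This is a sketch; the function name resource_nodes is hypothetical, and it assumes the "name (agent): Started node" line format shown in the output above.

```shell
# Hypothetical helper: read `pcs status` output on stdin and print
# "<resource> <node>" for every resource reported as Started.
resource_nodes() {
    awk 'NF >= 3 && $(NF-1) == "Started" { print $1, $NF }'
}

# Usage:
#   pcs status | resource_nodes
```

Running it before and after pcs cluster standby node1 gives a quick way to compare where webip and webserver are placed.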
[[email protected] ~]# vim /usr/lib/python2.6/site-packages/pcs/utils.py
[[email protected] src]# ls *rpm
crmsh-1.2.6-4.el6.x86_64.rpm pssh-2.3.1-2.el6.x86_64.rpm
[[email protected] src]# yum localinstall crmsh-1.2.6-4.el6.x86_64.rpm pssh-2.3.1-2.el6.x86_64.rpm --nogpgcheck -y
Test succeeded