15. Heartbeat+DRBD+MySQL High-Availability Architecture: Design and Implementation Details


Adapted from: http://oldboy.blog.51cto.com/2561410/1240412


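Heartbeat and keepalived: use cases and differences


Many readers ask why this setup uses Heartbeat, which has gone a long time without updates, instead of keepalived. Their use cases and differences are as follows:

1. For web servers, databases, and load balancers (LVS, HAProxy, nginx), either Heartbeat or keepalived can provide high availability.

2. LVS is best paired with keepalived, because keepalived was originally written for LVS. Heartbeat has no built-in health check for the real servers (RS); with Heartbeat, health checking has to be added through ldirectord.

3. Services such as MySQL dual-master with multiple slaves, or NFS/MFS storage, need data synchronization between nodes. Heartbeat suits these best because it ships with a DRBD resource script.

Summary:

For high availability of applications without data synchronization, choose keepalived.

For high availability of applications that need data synchronization, choose Heartbeat (with DRBD).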


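1. Installation and deployment preparation


(1) Architecture topology

Architecture notes:

One master with multiple slaves is the most common architecture; LVS can load-balance reads across the slaves.

This design removes the master as a single point of failure: when the master goes down, the standby node takes over automatically and all slaves automatically resynchronize with the new master, which gives the MySQL master a hot standby.

(2) System environment:

(3) Deployment environment

(4) Data partition layout on the master servers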


2. Heartbeat installation and deployment


(1) Configure the heartbeat routes between the servers

Primary node

[root@master1 ~]# route add -host 172.16.4.3 dev eth2    <== heartbeat route to the peer

[root@master1 ~]# route add -host 172.168.4.3 dev eth3   <== DRBD data route to the peer

Standby node

[root@master2 ~]# route add -host 172.16.4.2 dev eth2

[root@master2 ~]# route add -host 172.168.4.2 dev eth3

(2) Install Heartbeat

[root@master1 ~]# yum install heartbeat -y

[root@master1 ~]# yum install heartbeat -y

Note: the Heartbeat install command has to be run twice, as shown above.

(3) Configure Heartbeat

The configuration files (ha.cf, authkeys, haresources) are identical on the primary and standby nodes.

1) ha.cf

[root@master1 ~]# vim /etc/ha.d/ha.cf

#log configure

debugfile /var/log/ha-debug

logfile /var/log/ha-log

logfacility local1

#options configure

keepalive 2

deadtime 30

warntime 10

initdead 120

#bcast eth2

mcast eth2 225.0.0.7 694 1 0

#node configure

auto_failback on

node master1 <== hostname of the primary node

node master2 <== hostname of the standby node

crm no
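The timing directives above depend on each other: warntime is meant to fire before deadtime, and the sample ha.cf shipped with Heartbeat recommends setting initdead to at least twice deadtime. A minimal sketch of a sanity check, assuming plain integer seconds as in the file above; the function name check_ha_timing is made up for illustration and is not part of Heartbeat:

```shell
# check_ha_timing: read an ha.cf on stdin and verify that
# warntime < deadtime and initdead >= 2 * deadtime.
# Hypothetical helper; it assumes values are plain integers (seconds).
check_ha_timing() {
    awk '
        $1 == "warntime" { w = $2 }
        $1 == "deadtime" { d = $2 }
        $1 == "initdead" { i = $2 }
        END {
            if (w == "" || d == "" || i == "") { print "missing directive"; exit 2 }
            if (w + 0 >= d + 0) { print "warntime should be below deadtime"; exit 1 }
            if (i + 0 < 2 * d)  { print "initdead should be at least 2 x deadtime"; exit 1 }
            print "ok"
        }'
}
```

It can be run as check_ha_timing < /etc/ha.d/ha.cf. Commented-out directives such as #bcast are ignored because their first field does not match any of the names checked.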

2) Configure authkeys

[root@master1 ~]# vim /etc/ha.d/authkeys

auth 1

1 sha1 47e9336850f1db6fa58bc470bc9b7810eb397f04
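The sha1 key above is just a shared secret, and Heartbeat refuses to start unless authkeys is readable by root only (mode 600). A small sketch of one way to produce such a file; the function name make_authkeys and the use of /dev/urandom piped through sha1sum are assumptions for illustration, not the original author's method:

```shell
# make_authkeys: write an authkeys file with a random 40-hex-digit sha1 key
# to the path given as $1. Hypothetical helper; taking the path as an
# argument lets you try it on a scratch file before touching /etc/ha.d.
make_authkeys() {
    key=$(head -c 512 /dev/urandom | sha1sum | awk '{print $1}')
    # Create the file with a restrictive umask so it is never world-readable.
    ( umask 077 && printf 'auth 1\n1 sha1 %s\n' "$key" > "$1" )
    chmod 600 "$1"
}
```

For example, make_authkeys /etc/ha.d/authkeys on one node, then copy the resulting file unchanged to the other node so both sides share the same key.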

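3) Configure haresources

[root@master1 ~]# vim /etc/ha.d/haresources

master1 IPaddr::192.168.4.1/16/eth1

#master1 IPaddr::192.168.4.1/16/eth1 drbddisk::data Filesystem::/dev/drbd1::/data::ext3 mysqld

Notes:

The active line manages only the VIP, which is enough for the Heartbeat test in step (5). The commented-out line is the full resource group; it is the one that appears in the failover logs later, once DRBD and MySQL are in place.

drbddisk::data <== starts the DRBD resource "data"; equivalent to running /etc/ha.d/resource.d/drbddisk data stop/start

Filesystem::/dev/drbd1::/data::ext3 <== mounts the DRBD device on /data; equivalent to running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext3 stop/start, which in turn amounts to mount /dev/drbd1 /data

mysqld <== the MySQL service script; equivalent to /etc/init.d/mysqld stop/start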
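To make the haresources syntax concrete, the sketch below splits a resource line on "::" and prints the calls in the order Heartbeat makes them: left to right on start, right to left on stop (the same order the ha-log excerpts later in this article show). The function name expand_haresources is made up, and it prints only script names and arguments, not the real /etc/ha.d/resource.d or /etc/init.d paths:

```shell
# expand_haresources: illustrate how one haresources line maps to script calls.
#   $1: a full haresources line (node name followed by resources)
#   $2: "start" or "stop"
# Hypothetical helper for illustration only; Heartbeat does this internally.
expand_haresources() {
    line="$1"; action="$2"
    set -- $line          # split the line on whitespace
    shift                 # drop the preferred node name (e.g. master1)
    out=""
    for res in "$@"; do
        # "Filesystem::/dev/drbd1::/data::ext3" -> "Filesystem /dev/drbd1 /data ext3"
        cmd=$(printf '%s' "$res" | sed 's/::/ /g')
        if [ "$action" = "stop" ]; then
            out="$cmd $action
$out"                     # prepend: resources are released in reverse order
        else
            out="$out$cmd $action
"
        fi
    done
    printf '%s' "$out"
}
```

Calling it with the commented-out line above and "start" lists IPaddr, drbddisk, Filesystem, mysqld in that order; with "stop" the list is reversed, so MySQL is shut down before the filesystem is unmounted.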

(4) Start Heartbeat

[root@master1 ~]# /etc/init.d/heartbeat start

[root@master1 ~]# chkconfig heartbeat off

Note: this disables start on boot; after a server reboot, Heartbeat has to be started by hand.

(5) Test Heartbeat

Normal state

[root@master1 ~]# ip addr|grep eth1

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

inet 192.168.4.2/16 brd 192.168.255.255 scope global eth1

inet 192.168.4.1/16 brd 192.168.255.255 scope global secondary eth1:0

[root@master2 ~]# ip addr|grep eth1

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

inet 192.168.4.3/16 brd 192.168.255.255 scope global eth1

Note: master1 holds the VIP; master2 does not.

State after a simulated failure of the primary node

[root@master1 ~]# /etc/init.d/heartbeat stop

[root@master2 ~]# ip addr|grep eth1

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

inet 192.168.4.3/16 brd 192.168.255.255 scope global eth1

inet 192.168.4.1/16 brd 192.168.255.255 scope global secondary eth1:0

Note: after master1 goes down, the VIP moves to master2 and master2 becomes the primary node.

State after the primary node recovers

[root@master1 ~]# /etc/init.d/heartbeat start

[root@master1 ~]# ip addr|grep eth1

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

inet 192.168.4.2/16 brd 192.168.255.255 scope global eth1

inet 192.168.4.1/16 brd 192.168.255.255 scope global secondary eth1:0

Note: master1 takes the VIP back, because ha.cf sets auto_failback on.
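The three checks above all come down to "does this node currently hold 192.168.4.1?". A small sketch that answers this from ip addr output, which is handy in monitoring scripts; the function name has_vip is an assumption for illustration:

```shell
# has_vip: succeed if the address in $1 appears as an "inet <addr>/" entry
# in `ip addr` output read from stdin. Hypothetical helper.
has_vip() {
    # Escape the dots so 192.168.4.1 does not match e.g. 192.168.411.
    grep -Eq "inet $(printf '%s' "$1" | sed 's/\./\\./g')/"
}
```

Typical use would be ip addr show dev eth1 | has_vip 192.168.4.1 && echo "VIP is on this node". The trailing "/" in the pattern keeps 192.168.4.1 from matching 192.168.4.11.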


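3. DRBD installation and deployment


(1) Add a new disk

[root@master1 ~]# fdisk /dev/sdb               # split sdb into two partitions, sdb1 and sdb2

[root@master1 ~]# partprobe

[root@master1 ~]# mkfs.ext3 /dev/sdb1          # sdb2 is the meta data partition and does not need to be formatted

[root@master1 ~]# tune2fs -c -1 /dev/sdb1      # set the maximum mount count to -1 (no fsck triggered by mount count)

(2) Install DRBD

[root@master1 ~]# yum install kmod-drbd83 drbd83 -y

[root@master1 ~]# modprobe drbd

# Note: do not add echo 'modprobe drbd' >> /etc/rc.local to load the drbd module at boot. If the drbd service is also enabled at boot, the service starts before rc.local loads the module, and DRBD fails to start.

(3) Configure DRBD

The configuration file is identical on the primary and standby nodes.

[root@master1 ~]# cat /etc/drbd.conf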

global {

# minor-count 64;

# dialog-refresh 5; # 5 seconds

# disable-ip-verification;

usage-count no;

}

common {

protocol C;

disk {

on-io-error   detach;

#size 454G;

no-disk-flushes;

no-md-flushes;

}

net {

sndbuf-size 512k;

# timeout       60;    #  6 seconds  (unit = 0.1 seconds)

# connect-int   10;    # 10 seconds  (unit = 1 second)

# ping-int      10;    # 10 seconds  (unit = 1 second)

# ping-timeout   5;    # 500 ms (unit = 0.1 seconds)

max-buffers     8000;

unplug-watermark   1024;

max-epoch-size  8000;

# ko-count 4;

# allow-two-primaries;

cram-hmac-alg "sha1";

shared-secret "hdhwXes23sYEhart8t";

after-sb-0pri disconnect;

after-sb-1pri disconnect;

after-sb-2pri disconnect;

rr-conflict disconnect;

# data-integrity-alg "md5";

# no-tcp-cork;

}

syncer {

rate 120M;

al-extents 517;

}

}

resource data {

on master1 {

device     /dev/drbd1;

disk       /dev/sdb1;

address    192.168.4.2:7788;

meta-disk  /dev/sdb2 [0];

}

on master2 {

device     /dev/drbd1;

disk       /dev/sdb1;

address    192.168.4.3:7788;

meta-disk  /dev/sdb2 [0];

}

}

(4) Initialize the meta data partition

[root@master1 ~]# drbdadm create-md data

Writing meta data...

initializing activity log

NOT initialized bitmap

New drbd meta data block successfully created.

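(5) Initial device synchronization (overwrites the standby node so that both nodes hold the same data)

[root@master1 ~]# drbdadm -- --overwrite-data-of-peer primary data

Note: run this on master1 only, and only after the resource has been brought up on both nodes (step 6); a resource that is not up cannot be promoted.

(6) Start DRBD

[root@master1 ~]# drbdadm up all

[root@master1 ~]# chkconfig drbd off

(7) Mount the DRBD device on the /data data directory

[root@master1 ~]# drbdadm primary all

[root@master1 ~]# mount /dev/drbd1 /data          # note: /data is the database data directory

(8) Test DRBD

Normal state

[root@master1 ~]# cat /proc/drbd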

version: 8.3.13 (api:88/proto:86-96)

GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by [email protected], 2012-05-07 11:56:36

1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

ns:497984 nr:0 dw:1 dr:498116 al:1 bm:31 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

[root@master2 ~]# cat /proc/drbd

version: 8.3.13 (api:88/proto:86-96)

GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by [email protected], 2012-05-07 11:56:36

1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----

ns:0 nr:497984 dw:497984 dr:0 al:0 bm:30 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

# Note: master1 is the primary node and master2 is the standby node.

Simulate a master1 failure

[root@master1 ~]# umount /dev/drbd1

[root@master1 ~]# drbdadm down all

[root@master2 ~]# cat /proc/drbd

version: 8.3.13 (api:88/proto:86-96)

GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by [email protected], 2012-05-07 11:56:36

1: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----

ns:0 nr:497985 dw:497985 dr:0 al:0 bm:30 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

[root@master2 ~]# drbdadm primary all

[root@master2 ~]# mount /dev/drbd1 /data

[root@master2 ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/sda3 19G 5.1G 13G 29% /

/dev/sda1 190M 18M 163M 10% /boot

tmpfs 60M 0 60M 0% /dev/shm

/dev/drbd1 471M 11M 437M 3% /data

# Note: after master1 goes down, master2 can be promoted to primary and can mount the DRBD device to keep serving.
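The fields worth watching in /proc/drbd are cs (connection state), ro (local/peer role) and ds (local/peer disk state). A minimal sketch that pulls one of them out of the resource line, assuming the DRBD 8.3 layout shown above; the function name drbd_field is hypothetical:

```shell
# drbd_field: print the value of one field (cs, ro or ds) from the first
# resource line of /proc/drbd-style text on stdin. Hypothetical helper
# written against the 8.3 output format shown in this article.
drbd_field() {
    awk -v key="$1" '
        $1 ~ /^[0-9]+:$/ {                 # resource lines start with "1:" etc.
            for (n = 2; n <= NF; n++) {
                split($n, kv, ":")
                if (kv[1] == key) { print kv[2]; exit }
            }
        }'
}
```

On a healthy primary, drbd_field ro < /proc/drbd would print Primary/Secondary and drbd_field ds < /proc/drbd would print UpToDate/UpToDate.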


4. MySQL installation and deployment


Note: install MySQL on all three database servers. On master2, stop after make install. Do not enable the mysqld service at boot.

(1) Fix the Perl problem during compilation

echo 'export LC_ALL=C' >> /etc/profile

source /etc/profile

(2) Install CMake

cd /home/xu/tools

wget http://www.cmake.org/files/v2.8/cmake-2.8.4.tar.gz

tar zxf cmake-2.8.4.tar.gz

cd cmake-2.8.4

./configure

make && make install

(3) Create the user

groupadd mysql

useradd -g mysql mysql

(4) Compile and install MySQL

wget http://mysql.ntu.edu.tw/Downloads/MySQL-5.5/mysql-5.5.27.tar.gz

tar zxf mysql-5.5.27.tar.gz

cd mysql-5.5.27

cmake -DCMAKE_INSTALL_PREFIX=/usr/local/mysql \
-DMYSQL_UNIX_ADDR=/tmp/mysql.sock \
-DDEFAULT_CHARSET=utf8 \
-DDEFAULT_COLLATION=utf8_general_ci \
-DWITH_EXTRA_CHARSETS=complex \
-DWITH_READLINE=1 \
-DENABLED_LOCAL_INFILE=1

make -j 4

make install

(5) Set the MySQL environment variable

[root@master1 ~]# echo 'PATH=$PATH:/usr/local/mysql/bin' >>/etc/profile

[root@master1 ~]# source /etc/profile

(6) Initialize the database

[root@master1 ~]# mount /dev/drbd1 /data                 # note: the database data directory lives on the DRBD device

[root@master1 ~]# cd /usr/local/mysql/

[root@master1 ~]# ./scripts/mysql_install_db --datadir=/data/ --user=mysql

(7) Start the database

[root@master1 ~]# vim /etc/init.d/mysqld

datadir=/data                                            # note: edit the MySQL init script and set the data directory to /data

[root@master1 ~]# /etc/init.d/mysqld start

[root@master1 ~]# chkconfig mysqld off

(8) Test the database

[root@master1 ~]# mysql -uroot -e "show databases;"

+--------------------+

| Database |

+--------------------+

| information_schema |

| mysql |

| performance_schema |

+--------------------+


5. Failover test


(1) Normal state of the architecture

Normal state of the primary node master1

[root@master1 ~]# ip addr|grep eth1

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

inet 192.168.4.2/16 brd 192.168.255.255 scope global eth1

inet 192.168.4.1/16 brd 192.168.255.255 scope global secondary eth1:0

[root@master1 ~]# cat /proc/drbd

version: 8.3.13 (api:88/proto:86-96)

GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by [email protected], 2012-05-07 11:56:36

1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

ns:39558 nr:12 dw:39570 dr:151 al:16 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

[root@master1 ~]# mysql -uroot -e "create database coral;"

[root@master1 ~]# mysql -uroot -e "show databases like 'coral';"

+------------------+

| Database (coral) |

+------------------+

| coral |

+------------------+

# Note: master1 is the primary node; it holds the VIP and is the DRBD primary.

Normal state of the standby node master2

[root@master2 ~]# ip addr|grep eth1

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

inet 192.168.4.3/16 brd 192.168.255.255 scope global eth1

[root@master2 ~]# cat /proc/drbd

version: 8.3.13 (api:88/proto:86-96)

GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by [email protected], 2012-05-07 11:56:36

1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----

ns:0 nr:48 dw:48 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

# Note: the standby node master2 has no VIP and is the DRBD secondary.

(2) State after a simulated master1 failure

[root@master1 ~]# /etc/init.d/heartbeat stop <== simulate master1 going down

[root@master2 ~]# tailf /var/log/ha-log <== watch the takeover log on the standby node

heartbeat[13209]: 2013/01/23_04:09:36 info: Received shutdown notice from 'master1'.

heartbeat[13209]: 2013/01/23_04:09:36 info: Resources being acquired from master1.

heartbeat[15293]: 2013/01/23_04:09:36 info: acquire local HA resources (standby).

heartbeat[15294]: 2013/01/23_04:09:37 info: No local resources [/usr/share/heartbeat/ResourceManager listkeys master2] to acquire.

heartbeat[15293]: 2013/01/23_04:09:37 info: local HA resource acquisition completed (standby).

heartbeat[13209]: 2013/01/23_04:09:37 info: Standby resource acquisition done [foreign].

harc[15319]: 2013/01/23_04:09:37 info: Running /etc/ha.d/rc.d/status status

mach_down[15335]: 2013/01/23_04:09:37 info: Taking over resource group IPaddr::192.168.4.1/16/eth1

ResourceManager[15361]: 2013/01/23_04:09:37 info: Acquiring resource group: master1 IPaddr::192.168.4.1/16/eth1 drbddisk::data Filesystem::/dev/drbd1::/data::ext3 mysqld

IPaddr[15388]: 2013/01/23_04:09:37 INFO: Resource is stopped

ResourceManager[15361]: 2013/01/23_04:09:37 info: Running /etc/ha.d/resource.d/IPaddr 192.168.4.1/16/eth1 start

IPaddr[15486]: 2013/01/23_04:09:38 INFO: Using calculated netmask for 192.168.4.1: 255.255.0.0

IPaddr[15486]: 2013/01/23_04:09:38 INFO: eval ifconfig eth1:0 192.168.4.1 netmask 255.255.0.0 broadcast 192.168.255.255

IPaddr[15457]: 2013/01/23_04:09:38 INFO: Success

ResourceManager[15361]: 2013/01/23_04:09:38 info: Running /etc/ha.d/resource.d/drbddisk data start

Filesystem[15636]: 2013/01/23_04:09:39 INFO: Resource is stopped

ResourceManager[15361]: 2013/01/23_04:09:39 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext3 start

Filesystem[15717]: 2013/01/23_04:09:39 INFO: Running start for /dev/drbd1 on /data

Filesystem[15706]: 2013/01/23_04:09:39 INFO: Success

ResourceManager[15361]: 2013/01/23_04:09:40 info: Running /etc/init.d/mysqld start

mach_down[15335]: 2013/01/23_04:09:44 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired

mach_down[15335]: 2013/01/23_04:09:44 info: mach_down takeover complete for node master1.

heartbeat[13209]: 2013/01/23_04:09:44 info: mach_down takeover complete.

heartbeat[13209]: 2013/01/23_04:10:09 WARN: node master1: is dead

heartbeat[13209]: 2013/01/23_04:10:09 info: Dead node master1 gave up resources.

heartbeat[13209]: 2013/01/23_04:10:09 info: Link master1:eth2 dead.

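# Note: when the standby node can no longer detect the primary node's heartbeat, it takes over the resources automatically: it brings up the VIP, promotes the DRBD resource, mounts the DRBD device, and starts mysqld. The data is still there after the takeover. The services it started can be checked as follows:

[root@master2 ~]# ip addr|grep eth1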

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

inet 192.168.4.3/16 brd 192.168.255.255 scope global eth1

inet 192.168.4.1/16 brd 192.168.255.255 scope global secondary eth1:0

[root@master2 ~]# cat /proc/drbd

version: 8.3.13 (api:88/proto:86-96)

GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by [email protected], 2012-05-07 11:56:36

1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

ns:3 nr:95 dw:98 dr:10 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

[root@master2 ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/sda3 19G 4.7G 14G 26% /

/dev/sda1 190M 18M 163M 10% /boot

tmpfs 60M 0 60M 0% /dev/shm

/dev/drbd1 471M 40M 408M 9% /data

[root@master2 ~]# mysql -uroot -e "show databases like 'coral';"

+------------------+

| Database (coral) |

+------------------+

| coral |

+------------------+

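(3) State after master1 recovers from the simulated failure

On master1 the resources start in this order: bring up the VIP, start the DRBD resource, mount the DRBD device, start mysqld. The log is as follows:

[root@master1 ~]# /etc/init.d/heartbeat start

[root@master1 ~]# tailf /var/log/ha-log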

heartbeat[27970]: 2013/01/09_17:34:14 info: Version 2 support: no

heartbeat[27970]: 2013/01/09_17:34:14 WARN: Logging daemon is disabled --enabling logging daemon is recommended

heartbeat[27970]: 2013/01/09_17:34:14 info: **************************

heartbeat[27970]: 2013/01/09_17:34:14 info: Configuration validated. Starting heartbeat 2.1.3

heartbeat[27971]: 2013/01/09_17:34:14 info: heartbeat: version 2.1.3

heartbeat[27971]: 2013/01/09_17:34:14 info: Heartbeat generation: 1351554533

heartbeat[27971]: 2013/01/09_17:34:14 info: glib: UDP multicast heartbeat started for group 225.0.0.7 port 694 interface eth2 (ttl=1 loop=0)

heartbeat[27971]: 2013/01/09_17:34:14 info: G_main_add_TriggerHandler: Added signal manual handler

heartbeat[27971]: 2013/01/09_17:34:14 info: G_main_add_TriggerHandler: Added signal manual handler

heartbeat[27971]: 2013/01/09_17:34:14 info: G_main_add_SignalHandler: Added signal handler for signal 17

heartbeat[27971]: 2013/01/09_17:34:14 info: Local status now set to: 'up'

heartbeat[27971]: 2013/01/09_17:34:16 info: Link master2:eth2 up.

heartbeat[27971]: 2013/01/09_17:34:16 info: Status update for node master2: status active

harc[27978]: 2013/01/09_17:34:16 info: Running /etc/ha.d/rc.d/status status

heartbeat[27971]: 2013/01/09_17:34:17 info: Comm_now_up(): updating status to active

heartbeat[27971]: 2013/01/09_17:34:17 info: Local status now set to: 'active'

heartbeat[27971]: 2013/01/09_17:34:17 info: remote resource transition completed.

heartbeat[27971]: 2013/01/09_17:34:17 info: remote resource transition completed.

heartbeat[27971]: 2013/01/09_17:34:17 info: Local Resource acquisition completed. (none)

heartbeat[27971]: 2013/01/09_17:34:18 info: master2 wants to go standby [foreign]

heartbeat[27971]: 2013/01/09_17:34:20 info: standby: acquire [foreign] resources from master2

heartbeat[27997]: 2013/01/09_17:34:20 info: acquire local HA resources (standby).

ResourceManager[28010]: 2013/01/09_17:34:20 info: Acquiring resource group: master1 IPaddr::192.168.4.1/16/eth1 drbddisk::data Filesystem::/dev/drbd1::/data::ext3 mysqld

IPaddr[28037]: 2013/01/09_17:34:21 INFO: Resource is stopped

ResourceManager[28010]: 2013/01/09_17:34:21 info: Running /etc/ha.d/resource.d/IPaddr 192.168.4.1/16/eth1 start

IPaddr[28135]: 2013/01/09_17:34:21 INFO: Using calculated netmask for 192.168.4.1: 255.255.0.0

IPaddr[28135]: 2013/01/09_17:34:21 INFO: eval ifconfig eth1:0 192.168.4.1 netmask 255.255.0.0 broadcast 192.168.255.255

IPaddr[28106]: 2013/01/09_17:34:21 INFO: Success

ResourceManager[28010]: 2013/01/09_17:34:21 info: Running /etc/ha.d/resource.d/drbddisk data start

Filesystem[28286]: 2013/01/09_17:34:21 INFO: Resource is stopped

ResourceManager[28010]: 2013/01/09_17:34:21 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext3 start

Filesystem[28367]: 2013/01/09_17:34:21 INFO: Running start for /dev/drbd1 on /data

Filesystem[28356]: 2013/01/09_17:34:21 INFO: Success

ResourceManager[28010]: 2013/01/09_17:34:22 info: Running /etc/init.d/mysqld start

heartbeat[27997]: 2013/01/09_17:34:25 info: local HA resource acquisition completed (standby).

heartbeat[27971]: 2013/01/09_17:34:25 info: Standby resource acquisition done [foreign].

heartbeat[27971]: 2013/01/09_17:34:25 info: Initial resource acquisition complete (auto_failback)

heartbeat[27971]: 2013/01/09_17:34:25 info: remote resource transition completed.

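The standby node releases the resources in the reverse order: stop mysqld, unmount the drbd1 device, demote DRBD to secondary, take down the VIP. The log is as follows:

[root@master2 ~]# tailf /var/log/ha-log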

heartbeat[13209]: 2013/01/23_04:26:53 info: Heartbeat restart on node master1

heartbeat[13209]: 2013/01/23_04:26:53 info: Link master1:eth2 up.

heartbeat[13209]: 2013/01/23_04:26:53 info: Status update for node master1: status init

heartbeat[13209]: 2013/01/23_04:26:53 info: Status update for node master1: status up

harc[16151]: 2013/01/23_04:26:53 info: Running /etc/ha.d/rc.d/status status

harc[16167]: 2013/01/23_04:26:53 info: Running /etc/ha.d/rc.d/status status

heartbeat[13209]: 2013/01/23_04:26:53 info: all clients are now paused

heartbeat[13209]: 2013/01/23_04:26:55 info: Status update for node master1: status active

harc[16183]: 2013/01/23_04:26:55 info: Running /etc/ha.d/rc.d/status status

heartbeat[13209]: 2013/01/23_04:26:55 info: all clients are now resumed

heartbeat[13209]: 2013/01/23_04:26:55 info: remote resource transition completed.

heartbeat[13209]: 2013/01/23_04:26:55 info: master2 wants to go standby [foreign]

heartbeat[13209]: 2013/01/23_04:26:55 info: standby: master1 can take our foreign resources

heartbeat[16199]: 2013/01/23_04:26:55 info: give up foreign HA resources (standby).

ResourceManager[16212]: 2013/01/23_04:26:55 info: Releasing resource group: master1 IPaddr::192.168.4.1/16/eth1 drbddisk::data Filesystem::/dev/drbd1::/data::ext3 mysqld

ResourceManager[16212]: 2013/01/23_04:26:55 info: Running /etc/init.d/mysqld stop

ResourceManager[16212]: 2013/01/23_04:26:57 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext3 stop

Filesystem[16297]: 2013/01/23_04:26:57 INFO: Running stop for /dev/drbd1 on /data

Filesystem[16297]: 2013/01/23_04:26:57 INFO: Trying to unmount /data

Filesystem[16297]: 2013/01/23_04:26:57 INFO: unmounted /data successfully

Filesystem[16286]: 2013/01/23_04:26:57 INFO: Success

ResourceManager[16212]: 2013/01/23_04:26:57 info: Running /etc/ha.d/resource.d/drbddisk data stop

ResourceManager[16212]: 2013/01/23_04:26:57 info: Running /etc/ha.d/resource.d/IPaddr 192.168.4.1/16/eth1 stop

IPaddr[16445]: 2013/01/23_04:26:58 INFO: ifconfig eth1:0 down

IPaddr[16416]: 2013/01/23_04:26:58 INFO: Success

heartbeat[16199]: 2013/01/23_04:26:58 info: foreign HA resource release completed (standby).

heartbeat[13209]: 2013/01/23_04:26:58 info: Local standby process completed [foreign].

heartbeat[13209]: 2013/01/23_04:27:02 WARN: 1 lost packet(s) for [master1] [15:17]

heartbeat[13209]: 2013/01/23_04:27:02 info: remote resource transition completed.

heartbeat[13209]: 2013/01/23_04:27:02 info: No pkts missing from master1!

heartbeat[13209]: 2013/01/23_04:27:02 info: Other node completed standby takeover of foreign resources.
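The acquire and release order described above can be read straight from the ResourceManager lines in ha-log. A small sketch that filters those lines out of the surrounding noise, assuming the log format shown in this article; the function name ha_steps is made up:

```shell
# ha_steps: from ha-log text on stdin, print the scripts that ResourceManager
# ran, in order, with their arguments. Hypothetical helper; lines from other
# components (harc, IPaddr, Filesystem, heartbeat) are skipped.
ha_steps() {
    sed -n 's/^ResourceManager\[[0-9]*]: .* info: Running \(.*\)$/\1/p'
}
```

Running ha_steps < /var/log/ha-log on the standby after a failback should list the mysqld, Filesystem, drbddisk and IPaddr stop calls in that order, matching the excerpt above.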


6. Replicating the slaves through the VIP


(1) Master configuration

1) Set server-id and enable the binlog

[root@master1 ~]# vim /etc/my.cnf

log-bin=/usr/local/mysql/mysql-bin

server-id = 3

[root@master1 ~]# /etc/init.d/mysqld restart

# Note: only master1 is restarted. master2 needs no restart because MySQL is not running on the standby node; there, only Heartbeat starts MySQL.

2) Create the replication account rep and grant it privileges

[root@master1 ~]# mysql -uroot -p

mysql> GRANT REPLICATION SLAVE ON *.* TO 'rep'@'192.168.4.%' IDENTIFIED BY 'rep';

(2) Slave configuration

1) Set server-id and disable the binlog

[root@slave ~]# vim /etc/my.cnf

#log-bin=mysql-bin

server-id = 4

[root@slave ~]# /etc/init.d/mysqld restart

# Note: a slave does not need the binlog unless it feeds a cascading replication setup or is used for incremental MySQL backups.

2) Configure the replication parameters

[root@slave ~]# mysql -uroot

CHANGE MASTER TO

MASTER_HOST='192.168.4.1',

MASTER_PORT=3306,

MASTER_USER='rep',

MASTER_PASSWORD='rep',

MASTER_LOG_FILE='mysql-bin.000001',

MASTER_LOG_POS=0;

START SLAVE;
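The statement above hard-codes mysql-bin.000001 at position 0, which fits a freshly initialized master. On a master that has already written binlogs, the coordinates would normally come from SHOW MASTER STATUS instead. A sketch of building the statement from that output, assuming the tab-separated form produced by mysql -N -e "show master status"; the function name change_master_sql is hypothetical and the file name and position in the usage line are sample values:

```shell
# change_master_sql: print a CHANGE MASTER TO statement.
#   $1 master host (the VIP), $2 replication user, $3 password
#   stdin: one line "File<TAB>Position[<TAB>...]" as printed by
#          mysql -N -e "show master status"
# Hypothetical helper for illustration.
change_master_sql() {
    read -r file pos _rest
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=3306, MASTER_USER='%s', MASTER_PASSWORD='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
        "$1" "$2" "$3" "$file" "$pos"
}
```

For instance, printf 'mysql-bin.000003\t107\n' | change_master_sql 192.168.4.1 rep rep prints a statement pointing at that file and position. Pointing MASTER_HOST at the VIP rather than at master1 is what lets the slaves follow whichever node is currently primary.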

3) Check that master/slave replication is running

[root@slave ~]# mysql -uroot

mysql> show slave status\G

...

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

...
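Both threads have to report Yes for replication to be healthy; checking only one of them can hide a broken SQL thread. A minimal sketch that turns the output above into an exit status for use in scripts; the function name slave_ok is an assumption for illustration:

```shell
# slave_ok: succeed only if both Slave_IO_Running and Slave_SQL_Running
# are "Yes" in `show slave status\G` output read from stdin.
# Hypothetical helper.
slave_ok() {
    [ "$(grep -Ec '^ *Slave_(IO|SQL)_Running: Yes *$')" -eq 2 ]
}
```

It could be used as mysql -uroot -e "show slave status\G" | slave_ok || echo "replication problem". A state such as Connecting on the I/O thread counts as a failure here.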

(3) Check whether an HA failover affects slave replication

1) Normal master/slave state

[root@master1 ~]# mysql -uroot

mysql> create database coral1;

Query OK, 1 row affected (0.02 sec)

[root@slave ~]# mysql -uroot -e "show slave status\G"|egrep "Slave_IO_Running|Slave_SQL_Running"

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

[root@slave ~]# mysql -uroot -e "show databases like 'coral%';"

+-------------------+

| Database (coral%) |

+-------------------+

| coral1 |

+-------------------+

2) Simulate a failure of the HA primary node

[root@master1 ~]# /etc/init.d/heartbeat stop             # note: simulates the primary node going down

[root@master2 ~]# ip addr|grep eth1

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

inet 192.168.4.3/16 brd 192.168.255.255 scope global eth1

inet 192.168.4.1/16 brd 192.168.255.255 scope global secondary eth1:0

[root@master2 ~]# mysql -uroot

mysql> create database coral2;

Query OK, 1 row affected (0.08 sec)                      # note: the VIP has already moved to master2

[root@slave ~]# mysql -uroot -e "show slave status\G"|egrep "Slave_IO_Running|Slave_SQL_Running"

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

[root@slave ~]# mysql -uroot -e "show databases like 'coral%'"

+-------------------+

| Database (coral%) |

+-------------------+

| coral1 |

| coral2 |

+-------------------+

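# Note: during a primary/standby switchover it takes a while before the slave can reconnect, roughly up to 60 seconds (the slave I/O thread retries the connection every MASTER_CONNECT_RETRY seconds, 60 by default).

# Note: replication is working normally at this point.

3) Simulate recovery of the HA primary node

[root@master1 ~]# /etc/init.d/heartbeat start

[root@master1 ~]# mysql -uroot

mysql> create database coral3;

[root@slave ~]# mysql -uroot -e "show slave status\G"|egrep "Slave_IO_Running|Slave_SQL_Running"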

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

[root@slave ~]# mysql -uroot -e "show databases like 'coral%'"

+-------------------+

| Database (coral%) |

+-------------------+

| coral1 |

| coral2 |

| coral3 |

+-------------------+

# Note: recovery of the HA primary node does not affect master/slave replication either.


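7. High-availability split-brain: causes and solutions


(1) Causes of split-brain

1. The heartbeat link between the HA servers fails, so the nodes can no longer check each other's heartbeat.

2. A firewall enabled on the HA servers blocks the heartbeat traffic.

3. The NIC address or other network settings on an HA server are misconfigured, so heartbeats fail to be sent.

4. Other misconfiguration, such as mismatched heartbeat methods, conflicting heartbeat broadcasts, or software bugs.

(2) Ways to prevent split-brain

1. Add redundant heartbeat links.

2. When split-brain is detected, forcibly take one node out of service (remotely power off the primary node through a power-control circuit, i.e. fencing).

3. Set up monitoring and alerting for split-brain.

4. After an alert, have the standby node wait a fairly long time before taking over, so that operations staff have enough time to intervene manually.

5. Enable a disk lock: the node currently serving locks the disk, so that when split-brain occurs the other side cannot grab the "shared disk resource".

Problems with disk locks:

Disk locking can deadlock. If the node holding the shared disk does not actively release the lock, the other node can never get the disk. If that node suddenly hangs or crashes, it cannot run the unlock command, and the standby node can no longer take over the resources and services. Some HA designs therefore use a "smart" lock: the serving node enables the disk lock only when it finds that all heartbeat links are down, and leaves the disk unlocked the rest of the time.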
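For the monitoring point in the list above, the DRBD side gives a usable signal: with the after-sb-* policies set to disconnect as in the drbd.conf earlier, a detected DRBD split-brain drops the replication link, and at least one node is then expected to sit in the StandAlone connection state. A minimal sketch of a check a monitoring job could call, assuming the /proc/drbd format shown in this article; the function name drbd_health and the wording of its messages are made up, and StandAlone can also result from a manual disconnect, so it is treated as a hint rather than proof:

```shell
# drbd_health: classify the connection state (cs) of the first resource
# in /proc/drbd-style text read from stdin. Hypothetical helper.
drbd_health() {
    cs=$(sed -n 's/^ *[0-9][0-9]*: cs:\([A-Za-z]*\).*/\1/p' | head -n 1)
    case "$cs" in
        Connected)  echo "ok" ;;
        StandAlone) echo "standalone: possible split-brain, check the kernel log" ;;
        "")         echo "unknown: no resource line found" ;;
        *)          echo "degraded: $cs" ;;
    esac
}
```

A cron job or monitoring agent could run drbd_health < /proc/drbd on both nodes and alert on anything other than ok. This covers only the storage layer; a Heartbeat-level split-brain (both nodes holding the VIP) needs its own check.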

Time: 2024-10-10 22:29:36
