MySQL High Availability Series: MHA (Part 2) / 憋错料

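I. Parameter Reference

MHA exposes a set of configuration parameters. Understanding exactly what each one does matters for tuning the configuration and using MHA properly; much of its high-availability behavior comes from setting these parameters sensibly.

The parameters are described below.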

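hostname/ip/port (Local Only)

hostname is the IP address or host name of the MySQL server.

ip is the IP address of the MySQL server; by default it is resolved from $hostname. port is the port number of the MySQL server; the default is 3306.

ssh_host/ssh_ip/ssh_port (Local Only)

These three parameters were introduced in version 0.53. ssh_host defaults to $hostname and ssh_ip defaults to $ip; ssh_port is the port used for SSH communication, 22 by default.

ssh_connection_timeout (Local/App/Global): SSH connection timeout threshold, 5 seconds by default. Introduced in version 0.54.

ssh_options (Local/App/Global): additional SSH command-line options. Introduced in version 0.53.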

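candidate_master (Local Only)

This parameter controls whether a given slave is preferred as the new master.

If set to 1, that slave is preferred as the new master. If several slaves have it set to 1, their priority follows the order of the [server_1]/[server_2]/… blocks.

The default is 0, meaning no slave is preferred: all slaves have the same priority, and MHA picks the slave with the least replication lag as the new master.

no_master (Local Only)

Controls whether a slave is barred from becoming master. The default is 0, meaning every slave may become the new master.

If set to 1, that slave never becomes master.

ignore_fail (Local Only)

By default, MHA does not start a failover if any slave has failed (for example, it cannot be reached over MySQL or SSH, or its SQL thread has stopped with an error). If ignore_fail is set to 1 on a slave, MHA goes ahead with the master failover even when that particular slave has failed.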

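user/password (Local/App/Global)

The MySQL administrative account and its password. MHA runs administrative commands such as STOP SLAVE, CHANGE MASTER and RESET SLAVE, so the account should be root, which is also the default.

repl_user/repl_password (Local/App/Global): the MySQL replication account and its password.

disable_log_bin (Local/App/Global)

If set, a slave does not write to its own binary log while applying the differential logs. Internally MHA does this by passing --disable-log-bin to the mysqlbinlog tool. The default is 0.

master_pid_file (Local/App/Global): the pid file of the master instance. Useful when several MySQL instances run on one server.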

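ssh_user (Local/App/Global)

The OS user that MHA Manager and MHA Node use to reach the MySQL server hosts. It is used in many situations, such as running commands remotely and copying differential relay logs between slaves.

The user needs at least read permission on the MySQL binary/relay log files and the relay_log.info file, plus write permission on the working directory (the path given by remote_workdir).

The user must be able to connect to the other servers without interaction, so SSH public key authentication is recommended. By default, ssh_user is the system administrator account, root.

remote_workdir (Local/App/Global)

Full path of the working directory on each MHA Node (each server running a MySQL instance), where log files are generated. The default is /var/tmp. If the path does not exist, MHA Node creates it, provided it has sufficient permissions.

Note: both Manager and Node check the free disk space of this directory.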

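master_binlog_dir (Local/App/Global)

Full path of the directory holding the master's binlog files. The default is /var/lib/mysql; set it to the real path for your installation.

The parameter matters in this case: the master instance has failed but its OS is still running, so MHA can log in over SSH and read and copy the binary log events it needs (the differential logs).

It is therefore both necessary and useful, because once the master is dead MHA cannot discover the binlog directory by itself.

Several paths can be given, separated by commas.

log_level (App/Global): log level of MHA Manager. Valid values are debug/info/warning/error; the default is info.

manager_workdir (App): full path of the working directory on the MHA Manager node, where its status files are generated. If not set, the default is /var/tmp.

manager_log (App)

Full path of the MHA Manager log file. If not set, output goes to STDOUT/STDERR.

Note that during a manual failover MHA ignores this setting and writes directly to STDOUT/STDERR.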

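check_repl_delay (App/Global)

By default, if a slave is behind by more than 100MB of relay logs, MHA does not choose it as the new master, because recovery would take too long.

If this parameter is set to 0, MHA ignores replication delay when choosing the new master.

This is very useful when candidate_master=1 is set on a slave so that it is preferred as the new master.

check_repl_filter (App/Global)

By default, if the master and slaves have different binlog/replication filtering rules, MHA reports an error and does not start monitoring or failover. This avoids unexpected recovery errors such as "Table not exists".

If you are one hundred percent sure that the differing filters cannot cause recovery problems, set this parameter to 0.

Note that MHA does not check filtering rules when applying differential logs, so with this parameter set to 0 you may hit errors such as "Table not exists". Change it with care. The default is 1.

latest_priority (App/Global)

By default, MHA prefers the latest slave (the one with the least lag) as the new master. To fully control the order in which slaves are considered, set this parameter to 0; priority is then decided by candidate_master and by the order of the [server_xxx] blocks.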

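multi_tier_slave (App/Global)

By default, MHA does not allow multi-tier replication (three tiers or more), for example host1 -> host2 -> host3; it reports an error and stops.

Starting with version 0.52, MHA has the multi_tier_slave parameter to support multi-tier replication.

With the parameter set, MHA no longer aborts on a three-tier layout; it simply ignores the third tier. If host1 (the master) crashes, host2 is chosen as the new master and host3 keeps replicating from host2, as if the third tier did not exist.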

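ping_interval (App/Global)

How often MHA Manager pings the master (by running a ping SQL statement), that is, the interval between pings of the master instance. The default is 3 seconds.

After three connection intervals in a row are missed, that is, three consecutive pings fail, MHA Manager considers the master dead. The longest time to detect a failure through this ping mechanism is therefore four times ping_interval, or 12 seconds with the default.

Note: connection failures caused by authentication errors or by too many connections on the MySQL instance are not counted toward declaring the master dead.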
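The arithmetic behind the detection window can be sketched as follows. This is only an illustration of the "four times ping_interval" rule quoted above; the function name max_detection_seconds is hypothetical and is not part of MHA.

```shell
#!/bin/bash
# Worst-case time, in seconds, for the ping mechanism to declare the master
# dead, using the "four times ping_interval" rule from the text.
# max_detection_seconds is a hypothetical helper, not an MHA command.
max_detection_seconds() {
    local ping_interval=$1
    echo $(( ping_interval * 4 ))
}

# Default ping_interval of 3 seconds
max_detection_seconds 3
```

Lowering ping_interval shortens this window at the cost of more frequent checks against the master.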

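ping_type (App/Global)

By default, MHA opens one persistent connection to the master and periodically (as set by ping_interval) runs "SELECT 1" on it (ping_type=SELECT) to check that the master is available.

In some cases it works better to connect and disconnect on every check, because that is stricter and detects TCP-level failures sooner. To use this mode, set ping_type=CONNECT.

The parameter was introduced in version 0.53. Valid values are CONNECT and SELECT; the default is SELECT.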

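secondary_check_script (App/Global)

By default, MHA checks master availability over a single route, from the Manager to the Master. That is not robust enough; checking over two or more network routes is strongly recommended.

MHA implements multi-route checking by calling the external script named by secondary_check_script, for example:

secondary_check_script=masterha_secondary_check -s remote_host1 -s remote_host2

masterha_secondary_check ships with the MHA Manager package, and the built-in script works well in most cases. If you need more features, write your own network check script and call it through this parameter.

In the example above, MHA monitors the MySQL master over the following two routes, where A is the hop from the Manager to the remote host and B is the hop from the remote host to the master:

Manager-(A)->remote_host1-(B)->master_host

Manager-(A)->remote_host2-(B)->master_host

On these routes, if connection A succeeds and connection B fails, masterha_secondary_check exits with code 0; MHA Manager concludes that the master is really dead and starts the failover. If A fails, masterha_secondary_check exits with code 2; MHA Manager suspects a network problem and does not start the failover. If B succeeds, it exits with code 3; MHA Manager then treats the master as alive and does not start the failover either.

In general, remote_host1 and remote_host2 should sit on different network segments.

When MHA calls the script defined by this parameter, it automatically passes user, master_host, master_ip and master_port, so they need not be specified again.

A few points about masterha_secondary_check:

- The built-in script depends on the IO::Socket::INET Perl module, which has been included by default since Perl v5.6.0.

- The built-in script connects to the remote servers over SSH, so SSH public key authentication must be set up.

- The built-in script tries to open a TCP connection from the remote server to the MySQL master. This means the max_connections setting in the MySQL configuration file has no effect on the check; if the TCP connection succeeds, the Aborted_connects status value in MySQL increases by 1.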
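The exit codes described above can be summarized as a small lookup. This is a sketch for illustration only: interpret_secondary_check is a hypothetical helper, and the handling of codes other than 0, 2 and 3 is an assumption (treated conservatively as "no failover"), not documented MHA behavior.

```shell
#!/bin/bash
# Map an exit code of the secondary check to the decision described in the text.
# interpret_secondary_check is a hypothetical helper, not part of MHA.
interpret_secondary_check() {
    case "$1" in
        0) echo "master dead: start failover" ;;             # A succeeded, B failed
        2) echo "network problem suspected: no failover" ;;  # A failed
        3) echo "master alive: no failover" ;;               # B succeeded
        *) echo "unexpected exit code: no failover" ;;       # assumption: be conservative
    esac
}
```

A wrapper around a custom check script could log this string next to the raw exit code, which makes it easier to see afterwards why a failover did or did not start.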

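master_ip_failover_script (App/Global)

A common HA setup relies on a VIP: the VIP is bound on the master and, when the master crashes, the HA software moves the VIP to the standby.

Another common approach is a global catalog database that stores the mapping between applications and writer/reader IP addresses and is used in place of a VIP. In that case the mapping has to be updated when the master fails.

Both approaches have advantages and drawbacks. MHA does not force either one and lets you use any IP address failover scheme. That is what this parameter is for: you write a script that lets applications connect transparently to the new master, and MHA calls it through this parameter, for example:

master_ip_failover_script=/usr/local/sample/bin/master_ip_failover

A sample script is available at (MHA Manager package)/samples/scripts/master_ip_failover.

MHA Manager calls the script three times during a run: first before monitoring starts, to check that the script is usable; second just before calling shutdown_script; third after the new master has applied all relay logs. It passes the following arguments: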

Checking phase

--command=status

--ssh_user=(current master's ssh username)

--orig_master_host=(current master's hostname)

--orig_master_ip=(current master's ip address)

--orig_master_port=(current master's port number)

Current master shutdown phase

--command=stop or stopssh

--ssh_user=(dead master's ssh username, if reachable via ssh)

--orig_master_host=(current(dead) master's hostname)

--orig_master_ip=(current(dead) master's ip address)

--orig_master_port=(current(dead) master's port number)

New master activation phase

--command=start

--ssh_user=(new master's ssh username)

--orig_master_host=(dead master's hostname)

--orig_master_ip=(dead master's ip address)

--orig_master_port=(dead master's port number)

--new_master_host=(new master's hostname)

--new_master_ip=(new master's ip address)

--new_master_port=(new master's port number)

--new_master_user=(new master's user)

--new_master_password=(new master's password)

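With a shared VIP bound on the master, nothing else needs to be done in the master shutdown phase as long as shutdown_script powers the machine off afterwards; in the new master activation phase, assign the VIP to the new master.

With a catalog database, delete or update the record of the dead master in the master shutdown phase, and insert or update the record of the new master in the activation phase.

You may also need to do other things so that applications can write to the new master, such as running SET GLOBAL read_only=0 or creating users with write privileges.

MHA Manager checks the exit code of the script and acts on it. If the script exits with 0 or 10, MHA Manager continues; if it exits with any other code, MHA Manager aborts and does not continue the failover. The parameter is empty by default, so MHA Manager calls nothing.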
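The calling convention described above can be sketched as a bash skeleton. This is not the sample script shipped with MHA (that one is written in Perl); failover_main and the echoed placeholder actions are hypothetical, and only three of the arguments are parsed to keep the sketch short.

```shell
#!/bin/bash
# Hypothetical skeleton of a master_ip_failover_script in bash.
# It parses the --name=value arguments listed above and dispatches on --command.
failover_main() {
    local cmd="" orig_master_host="" new_master_host="" arg
    for arg in "$@"; do
        case "$arg" in
            --command=*)          cmd="${arg#*=}" ;;
            --orig_master_host=*) orig_master_host="${arg#*=}" ;;
            --new_master_host=*)  new_master_host="${arg#*=}" ;;
            *) ;;  # ip, port, ssh_user and the rest are ignored in this sketch
        esac
    done

    case "$cmd" in
        status)
            # Checking phase: only confirm that the script can run.
            return 0 ;;
        stop|stopssh)
            # Shutdown phase: placeholder for releasing the VIP or catalog entry.
            echo "would release the write address from ${orig_master_host}"
            return 0 ;;
        start)
            # Activation phase: placeholder for assigning the VIP or updating the catalog.
            echo "would move the write address to ${new_master_host}"
            return 0 ;;
        *)
            echo "unknown command: ${cmd}" >&2
            return 1 ;;  # a code other than 0 or 10 makes MHA Manager abort
    esac
}
```

A real script would end with `failover_main "$@"; exit $?` and replace the echo lines with the actual VIP or catalog operations for the environment.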

master_ip_online_change_script (App/Global)

This parameter is similar to master_ip_failover_script, but it is used by the online master switch command (masterha_master_switch --master_state=alive) rather than by master failover. The arguments passed also differ by phase:

Current master write freezing phase

--command=stop or stopssh

--orig_master_host=(current master's hostname)

--orig_master_ip=(current master's ip address)

--orig_master_port=(current master's port number)

--orig_master_user=(current master's user)

--orig_master_password=(current master's password)

New master granting write phase

--command=start

--orig_master_host=(orig master's hostname)

--orig_master_ip=(orig master's ip address)

--orig_master_port=(orig master's port number)

--new_master_host=(new master's hostname)

--new_master_ip=(new master's ip address)

--new_master_port=(new master's port number)

--new_master_user=(new master's user)

--new_master_password=(new master's password)

Right after the write freezing phase, MHA runs "FLUSH TABLES WITH READ LOCK" on the current master; at this point you can add whatever logic is needed to switch the master gracefully. In the phase that grants writes on the new master, you can do much the same as in master_ip_failover_script: create a privileged user, run "SET GLOBAL read_only=0", update the catalog database, and so on. If the script exits with a code other than 0 or 10, MHA aborts and does not switch the master.

The parameter is empty by default, so MHA Manager calls nothing.

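shutdown_script (App/Global)

In some cases, to avoid split brain, you may want to force the master to shut down so that the node is fenced off and cannot restart its service.

This parameter serves that purpose: it names a script that forcibly shuts down the master, for example:

shutdown_script=/usr/local/sample/bin/power_manager

(The sample script is in the MHA Manager source package.)

Before calling shutdown_script, MHA Manager runs an internal check to see whether the master's server can still be reached over SSH. If it can (for example, the OS is healthy but mysqld has failed), MHA Manager passes the following arguments:

--command=stopssh

--ssh_user=(ssh username so that you can connect to the master)

--host=(master's hostname)

--ip=(master's ip address)

--port=(master's port number)

--pid_file=(master's pid file)

If the master's server is down and cannot be reached over SSH, the following arguments are passed:

--command=stop

--host=(master's hostname)

--ip=(master's ip address)

The sample script works roughly as follows.

If --command=stopssh is passed, the master's operating system is running normally, so the script connects to the server over SSH and kills all mysqld and mysqld_safe processes with "kill -9".

If --pid_file is passed as well, the script tries to kill only the process it identifies rather than every mysqld process, which suits servers running several MySQL instances. If the mysqld process is killed successfully, the script exits with code 10; MHA Manager then connects to the master again and saves the necessary binary logs.

If the master server cannot be reached over SSH, or the command passed is --command=stop, the script tries to power the server off. The power-off command depends on the hardware and differs between servers: for HP (iLO) it uses ipmitool or an SSL command; for Dell (DRAC) it uses the dracadm command. If the server is powered off successfully, the script exits with code 0, otherwise with code 1.

If the exit code is 0, MHA Manager starts the failover; if the exit code is anything other than 0 or 10, MHA Manager aborts the failover.

The parameter is empty by default, so nothing is executed.

In addition, when monitoring starts, MHA Manager calls the shutdown_script with the following arguments:

--command=status

--host=(master's hostname)

--ip=(master's ip address)

This is the place to verify the script's setup. Power control depends on the hardware, so checking the power status here is strongly recommended; any misconfiguration is then noticed before monitoring starts.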

report_script (App/Global)

When a failover completes successfully, or ends with an error, this parameter can be used to send a report. The following arguments are passed:

--orig_master_host=(dead master's hostname)

--new_master_host=(new master's hostname)

--new_slave_hosts=(new slaves' hostnames, delimited by commas)

--subject=(mail subject)

--body=(body)

The parameter is empty by default. A sample script is available at (MHA Manager package)/samples/send_report.

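init_conf_load_script (App/Global)

Use this parameter if you do not want to keep plain-text values (such as password and repl_password) in the configuration files. The script returns "name=value" pairs, which override the parameters in the global configuration file, for example:

#!/usr/bin/perl

# $ROOT_PASS and $REPL_PASS are placeholders for values fetched from a protected source

print "password=$ROOT_PASS\n";

print "repl_password=$REPL_PASS\n";

The parameter is empty by default.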

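Notes on scope:

Local Scope: applies to a single server; set under a [server_xxx] block in the application configuration file (app1.conf).

App Scope: applies to one MySQL replication group; set under the [server_default] block in the application configuration file (app1.conf).

Global Scope: global parameters, for the case where one Manager node manages several MySQL replication groups; set in the global configuration file (masterha_default.cnf).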
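To show how the scopes fit together, here is a minimal sketch of an application configuration, written through a shell heredoc so that it can be pasted on a manager node. Everything in it is an assumption for illustration: the file location, the passwords, the work directories and the third host 192.168.3.29 are hypothetical, and the section header is spelled `[server default]` as in the sample configurations shipped with MHA.

```shell
#!/bin/bash
# Write a hypothetical application config to a scratch location.
# Paths, passwords and the 192.168.3.29 host are placeholders.
conf_file="${TMPDIR:-/tmp}/app1.cnf.example"
cat > "$conf_file" <<'EOF'
[server default]
# App scope: applies to every server in this replication group
user=root
password=CHANGE_ME
ssh_user=root
repl_user=repl
repl_password=CHANGE_ME
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
remote_workdir=/var/tmp
ping_interval=3

[server1]
# Local scope: applies to this server only
hostname=192.168.3.27
candidate_master=1

[server2]
hostname=192.168.3.28
candidate_master=1

[server3]
hostname=192.168.3.29
no_master=1
EOF
echo "wrote $conf_file"
```

Parameters that would be shared by several replication groups would move from this file into the global configuration file instead.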

With the parameters understood, let's run the following experiment to try out a MySQL high-availability setup in practice.

II. MHA+Keepalived

Install keepalived on both mastermysql and backupmysql (see http://blog.csdn.net/dbaxiaosa/article/details/22940483).

(1) Install the dependency packages

(2) Compile and install

# tar zxvf keepalived-1.1.19.tar.gz

# cd keepalived-1.1.19

# ./configure --sysconf=/etc/ --with-kernel-dir=/usr/src/kernels/2.6.18-308.el5-x86_64/

(3) Configure keepalived on mastermysql

[[email protected] ~]# more /etc/keepalived/keepalived.conf

#writed by test 20140722

#global define

global_defs {

router_id mysqlmha

}

vrrp_script check_run {

script "/etc/keepalived/check_mysql.sh"

interval 1

}

############################################################

# internet

############################################################

vrrp_instance VI_1 {

state MASTER

interface eth0

virtual_router_id 51

priority 100 #master>slave slave90

advert_int 1

authentication {

auth_type PASS

auth_pass 1111

}

track_script {

check_run

}

virtual_ipaddress {

192.168.3.33

}

}

(4) Configure keepalived on backupmysql

[[email protected] keepalived]# more /etc/keepalived/keepalived.conf

#writed by test 20140722

#global define

global_defs {

router_id mysqlmha

}

vrrp_script check_run {

script "/etc/keepalived/check_mysql.sh"

interval 1

}

############################################################

vrrp_instance VI_1 {

state BACKUP

interface eth0

virtual_router_id 51

priority 90 #master>slave slave90

advert_int 1

authentication {

auth_type PASS

auth_pass 1111

}

track_script {

check_run

}

virtual_ipaddress {

192.168.3.33

}

}

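(5) Write the check script

The idea is that as soon as the script detects that the MySQL service has stopped, it stops keepalived as well. keepalived announces that it is alive to its network segment by multicast and keeps running after MySQL stops, so keepalived has to be stopped to let the other host take over the virtual IP. The script can be run in the background or referenced from the keepalived configuration file.

On mastermysql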

[[email protected] ~]# more /etc/keepalived/check_mysql.sh

#!/bin/bash
# 20140722
# Stops keepalived when MySQL stays unreachable, so the peer can take over the VIP.

MYSQL=/usr/bin/mysql
MYSQL_HOST=192.168.3.27
MYSQL_USER=root
MYSQL_PASSWORD=mysql
CHECK_TIME=3    # number of attempts before giving up

# MYSQL_OK is 1 while MySQL is working, 0 when it is down
MYSQL_OK=1

function check_mysql_health () {
    "$MYSQL" -h "$MYSQL_HOST" -u "$MYSQL_USER" -p"$MYSQL_PASSWORD" -e "show status;" >/dev/null 2>&1
    if [ $? = 0 ]; then
        MYSQL_OK=1
    else
        MYSQL_OK=0
    fi
    return $MYSQL_OK
}

while [ $CHECK_TIME -ne 0 ]
do
    let "CHECK_TIME -= 1"
    check_mysql_health
    if [ $MYSQL_OK = 1 ]; then
        # MySQL answered: report success to keepalived
        CHECK_TIME=0
        exit 0
    fi
    if [ $MYSQL_OK -eq 0 ] && [ $CHECK_TIME -eq 0 ]
    then
        # All attempts failed: stop keepalived so the VIP moves to the peer
        pkill keepalived
        exit 1
    fi
    sleep 1
done

The script must be executable; add the execute permission with the following command.

[[email protected] keepalived]# chmod +x check_mysql.sh

[[email protected] keepalived]# ll

total 8

-rwxr-xr-x 1 root root 654 Jul 24 17:15 check_mysql.sh

-rw-r--r-- 1 root root 634 Jul 29 16:16 keepalived.conf

On backupmysql (the script must be executable here too)

[[email protected] keepalived]# more check_mysql.sh

#!/bin/bash
# 20140722
# Stops keepalived when MySQL stays unreachable, so the peer can take over the VIP.

MYSQL=/usr/bin/mysql
MYSQL_HOST=192.168.3.28
MYSQL_USER=root
MYSQL_PASSWORD=mysql
CHECK_TIME=3    # number of attempts before giving up

# MYSQL_OK is 1 while MySQL is working, 0 when it is down
MYSQL_OK=1

function check_mysql_health () {
    "$MYSQL" -h "$MYSQL_HOST" -u "$MYSQL_USER" -p"$MYSQL_PASSWORD" -e "show status;" >/dev/null 2>&1
    if [ $? = 0 ]; then
        MYSQL_OK=1
    else
        MYSQL_OK=0
    fi
    return $MYSQL_OK
}

while [ $CHECK_TIME -ne 0 ]
do
    let "CHECK_TIME -= 1"
    check_mysql_health
    if [ $MYSQL_OK = 1 ]; then
        # MySQL answered: report success to keepalived
        CHECK_TIME=0
        exit 0
    fi
    if [ $MYSQL_OK -eq 0 ] && [ $CHECK_TIME -eq 0 ]
    then
        # All attempts failed: stop keepalived so the VIP moves to the peer
        pkill keepalived
        exit 1
    fi
    sleep 1
done

(6) Start keepalived and check that the virtual IP is bound

mastermysql

[[email protected] ~]# service keepalived start

Check the keepalived status

[[email protected] ~]# service keepalived status

keepalived (pid 15712) is running...

Check whether the virtual IP is bound

[[email protected] ~]# ip addr

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 00:0c:29:19:27:ad brd ff:ff:ff:ff:ff:ff

inet 192.168.3.27/24 brd 192.168.3.255 scope global eth0

inet 192.168.3.33/32 scope global eth0

backupMySQL

[[email protected] keepalived]# service keepalived status

keepalived (pid 13808) is running...

[[email protected] keepalived]# ip addr

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000

link/ether 00:0c:29:de:b3:a1 brd ff:ff:ff:ff:ff:ff

inet 192.168.3.28/24 brd 192.168.3.255 scope global eth0

The output shows that the virtual IP is bound on masterMySQL, while on backupMySQL keepalived is running normally without the virtual IP (which is expected).
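The manual check above can also be scripted. The sketch below reads `ip addr` output on stdin and reports whether a given address is bound; has_vip is a hypothetical helper, and the interface name eth0 in the usage note is taken from the transcripts above.

```shell
#!/bin/bash
# Succeeds if the given address appears as an "inet <addr>/<prefix>" entry
# in `ip addr` output read from stdin. has_vip is a hypothetical helper.
has_vip() {
    local vip=$1
    # -F matches the string literally, so the dots in the address are not wildcards
    grep -Fq "inet ${vip}/"
}
```

On either host it could be used as `ip addr show eth0 | has_vip 192.168.3.33 && echo "VIP is bound here"`.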

III. Test and Verify

Log in to MySQL through the VIP 192.168.3.33; it works normally.

1. Stop the MySQL service on the master

[[email protected] ~]# service mysql stop

Shutting down MySQL... [ OK ]

[[email protected] ~]# service mysql status

MySQL is not running [FAILED]

After MySQL stops, check whether keepalived is still running

[[email protected] ~]# service keepalived status

keepalived dead but subsys locked

keepalived has stopped as well.

2. Check backupMySQL

[[email protected] keepalived]# service mysql status

MySQL running (13740) [ OK ]

[[email protected] keepalived]# service keepalived status

keepalived (pid 13808) is running...

mysql> show variables like 'read_only';

+---------------+-------+

| Variable_name | Value |

+---------------+-------+

| read_only | OFF |

+---------------+-------+

1 row in set (0.00 sec)

MySQL and keepalived are both running normally, and the database is writable.

3. Check the slave status on the read-only replica

mysql> show slave status \G;

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.3.28

Master_User: repl

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000011

Read_Master_Log_Pos: 107

Relay_Log_File: mysql-relay-bin.000002

Relay_Log_Pos: 253

Relay_Master_Log_File: mysql-bin.000011

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 107

Relay_Log_Space: 409

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 28

1 row in set (0.00 sec)

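The master has switched from 192.168.3.27 to 192.168.3.28.

4. Continue testing by accessing MySQL through the VIP

Drop the student table in the test database

DROP TABLE student;

On the replica, the student table no longer exists in the test database, so replication is working.

5. Check the MHA status on the manager node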

[[email protected] appl]# /usr/bin/masterha_check_status --conf=/etc/appl.cnf

appl is stopped(2:NOT_RUNNING).

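Notes:

a. After a switchover, /masterha/app1/app1.failover.complete must be deleted by hand before a second test can run.

b. Once a switchover happens the manager process exits, so no further test is possible until the failed database has been added back into the MHA environment.

c. When the original master rejoins MHA, it can only be set up as a slave.

Delete appl.failover.complete by hand before starting MHA

[[email protected] appl]# rm -f appl.failover.complete

Restart MHA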

[[email protected] appl]# nohup /usr/bin/masterha_manager --conf=/etc/appl.cnf &

[1] 9659

[[email protected] appl]# nohup: appending output to `nohup.out'

Check the MHA status

[[email protected] appl]# /usr/bin/masterha_check_status --conf=/etc/appl.cnf

appl (pid:9659) is running(0:PING_OK), master:192.168.3.28

6. On the original master, start mysql and keepalived, and configure it as a replica of the new master (formerly backupMySQL)

[[email protected] ~]# service mysql start

Starting MySQL. [ OK ]

[[email protected] ~]# service keepalived start

Starting keepalived: [ OK ]

[[email protected] ~]# mysql -uroot -p

Enter password:

Welcome to the MySQL monitor. Commands end with ; or \g.

Your MySQL connection id is 25

Server version: 5.5.17-log MySQL Community Server (GPL)

Oracle is a registered trademark of Oracle Corporation and/or its

affiliates. Other names may be trademarks of their respective

owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

The binary log file and position of the new master at the moment of the switch can be found in the error log of the read-only replica:

140729 17:03:29 [Note] 'CHANGE MASTER TO executed'. Previous state master_host='192.168.3.27', master_port='3306', master_log_file='', master_log_pos='4'. New state master_host='192.168.3.28', master_port='3306', master_log_file='mysql-bin.000011', master_log_pos='107'.
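Rather than reading the coordinates off the log line by eye, they can be pulled out with sed. This is a sketch under two assumptions: the log entry is on a single line with straight single quotes, as MySQL writes it, and extract_new_coordinates is a hypothetical helper name.

```shell
#!/bin/bash
# Print "<master_log_file> <master_log_pos>" from the "New state" part of a
# 'CHANGE MASTER TO executed' error-log line read on stdin.
# extract_new_coordinates is a hypothetical helper.
extract_new_coordinates() {
    sed -n "s/.*New state.*master_log_file='\([^']*\)', master_log_pos='\([0-9]*\)'.*/\1 \2/p"
}
```

It could be fed with something like `grep 'CHANGE MASTER TO executed' <error-log-file> | tail -n 1 | extract_new_coordinates`, where the error log path depends on the installation; the two values are then used as master_log_file and master_log_pos in the CHANGE MASTER TO statement.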

mysql> change master to

-> master_host='192.168.3.28',

-> master_user='repl',

-> master_password='repl_pwd',

-> master_log_file='mysql-bin.000011',

-> master_log_pos=107;

Query OK, 0 rows affected (0.01 sec)

mysql> slave start;

Query OK, 0 rows affected (0.00 sec)

mysql> show slave status \G;

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.3.28

Master_User: repl

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000011

Read_Master_Log_Pos: 216

Relay_Log_File: pid-relay-bin.000002

Relay_Log_Pos: 362

Relay_Master_Log_File: mysql-bin.000011

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 216

Relay_Log_Space: 516

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 28

1 row in set (0.00 sec)

7. Stop the mysql service on the new master (formerly backupMySQL)

[[email protected] keepalived]# service mysql stop

Shutting down MySQL... [ OK ]

[[email protected] keepalived]# service keepalived status

keepalived dead but subsys locked

After the mysql service stops, keepalived stops too.

-- Query MasterMySQL

Log in to mysql: the student table does not exist in the test database, so the data is in sync.

mysql> use test;

Database changed

mysql> show tables;

+----------------+

| Tables_in_test |

+----------------+

| deadlocks |

| test1 |

+----------------+

2 rows in set (0.00 sec)

mysql> show variables like 'read_only';

+---------------+-------+

| Variable_name | Value |

+---------------+-------+

| read_only | OFF |

+---------------+-------+

1 row in set (0.00 sec)

8. Check the slave status on the read-only replica

mysql> show slave status \G;

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.3.27

Master_User: repl

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000013

Read_Master_Log_Pos: 107

Relay_Log_File: mysql-relay-bin.000002

Relay_Log_Pos: 253

Relay_Master_Log_File: mysql-bin.000013

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 107

Relay_Log_Space: 409

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 27

1 row in set (0.00 sec)

The master has switched back to MasterMySQL.

9. Start mysql and keepalived on backupMySQL and configure it as a read-only replica

[[email protected] keepalived]# service mysql start

Starting MySQL.. [ OK ]

[[email protected] keepalived]# service keepalived start

Starting keepalived: [ OK ]

[[email protected] keepalived]# mysql -uroot -p

Enter password:

Welcome to the MySQL monitor. Commands end with ; or \g.

Your MySQL connection id is 9

Server version: 5.5.17-log MySQL Community Server (GPL)

Oracle is a registered trademark of Oracle Corporation and/or its

affiliates. Other names may be trademarks of their respective

owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> change master to

-> master_host='192.168.3.27',

-> master_user='repl',

-> master_password='repl_pwd',

-> master_log_file='mysql-bin.000013',

-> master_log_pos=107;

Query OK, 0 rows affected (0.02 sec)

mysql> slave start;

Query OK, 0 rows affected (0.00 sec)

mysql> show slave status \G;

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.3.27

Master_User: repl

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000013

Read_Master_Log_Pos: 107

Relay_Log_File: mysql-relay-bin.000002

Relay_Log_Pos: 253

Relay_Master_Log_File: mysql-bin.000013

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 107

Relay_Log_Space: 409

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 27

1 row in set (0.00 sec)

10. Restart MHA on the manager node

[[email protected] appl]# rm appl.failover.complete

rm: remove regular empty file `appl.failover.complete'? yes

[[email protected] appl]# nohup /usr/bin/masterha_manager --conf=/etc/appl.cnf &

[1] 9890

[[email protected] appl]# nohup: appending output to `nohup.out'

[[email protected] appl]# /usr/bin/masterha_check_status --conf=/etc/appl.cnf

appl (pid:9890) is running(0:PING_OK), master:192.168.3.27

That completes the test: the MySQL high-availability setup works as intended.


Date: 2024-10-17 21:49:51
