MySQL MHA High-Availability Test

【Environment】

System environment: Red Hat Enterprise Linux 7 + MySQL 5.7.18 + MHA version 0.57

Current database state:


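| System | IP | Hostname | Remarks | Version |
|---|---|---|---|---|
| xx system | 192.168.142.111 | mysqlmha1 | Master | 5.7.18-log MySQL Community Server (GPL) |
| | 192.168.142.112 | mysqlmha2 | Slave (candidate master) | |
| | 192.168.142.113 | mysqlmha3 | Slave & MHA MGM | |
| | 192.168.142.111 | mysqlmha1 | VIP | |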

Database state after failover:


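| System | IP | Hostname | Remarks | Version |
|---|---|---|---|---|
| xx system | 192.168.142.111 | mysqlmha1 | Slave (after repair) | 5.7.18-log MySQL Community Server (GPL) |
| | 192.168.142.112 | mysqlmha2 | Master | |
| | 192.168.142.113 | mysqlmha3 | Slave & MHA MGM | |
| | 192.168.142.112 | mysqlmha2 | VIP | |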

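【Test Procedure: Automatic Failover】

Confirm the current database state. This must be done before every check or operation:

Check that SSH trust between the nodes is working. If errors are reported, verify the user, the key-based trust set up for that user, and the password.

$ masterha_check_ssh --conf=/etc/masterha/app1.cnf

Check that MySQL master-slave replication is healthy. If errors are reported, verify that the script file permissions and the configuration file contents are correct.

$ masterha_check_repl --conf=/etc/masterha/app1.cnf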
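
Both checks can be chained in a small pre-flight helper. The following is a minimal sketch, not part of the original procedure: the function name `preflight` is hypothetical, and it assumes each tool exits non-zero when its check fails, which is consistent with the exit codes that `masterha_check_repl` prints in the outputs later in this article.

```shell
# Hypothetical helper: run both MHA pre-flight checks in order and
# stop at the first one that fails.
# Assumption: each tool exits non-zero when its check fails.
preflight() {
    mha_conf="${1:-/etc/masterha/app1.cnf}"
    for mha_check in masterha_check_ssh masterha_check_repl; do
        if "$mha_check" --conf="$mha_conf" >/dev/null 2>&1; then
            echo "OK: $mha_check"
        else
            echo "FAILED: $mha_check"
            return 1
        fi
    done
}
```

Calling `preflight` with no argument checks `/etc/masterha/app1.cnf`. When a check fails, rerun that tool by hand to see its full output.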

On the management node mysqlmha3, start the MHA automatic-failover script. After one failover has completed, the script stops its own monitoring process:

[mha@mysqlmha3 shell]$ sh masterha_manager.sh start
[1]+  Done                    sh masterha_manager.sh start
[mha@mysqlmha3 shell]$
[mha@mysqlmha3 shell]$ ps -ef |grep masterha_manager
mha       3154     1  2 23:34 pts/2    00:00:00 perl /usr/bin/masterha_manager --conf=/etc/masterha/app1.cnf --ignore_last_failover
mha       3169  2977  0 23:34 pts/2    00:00:00 grep --color=auto masterha_manager
[mha@mysqlmha3 shell]$
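
The article does not show `masterha_manager.sh` itself. The following is a minimal sketch of what its `start` action could look like, assuming it simply backgrounds the `masterha_manager` command line visible in the `ps` output above. The function names and the `MHA_OUT` path are hypothetical.

```shell
# Hypothetical wrapper; the real masterha_manager.sh is not shown in the article.
MHA_CONF="${MHA_CONF:-/etc/masterha/app1.cnf}"
MHA_OUT="${MHA_OUT:-/tmp/masterha_manager.out}"   # assumed path for stdout/stderr

# Succeeds if a manager for this config is already running.
# pgrep -f matches the full command line and never reports itself,
# unlike "ps -ef | grep", which also lists the grep process.
manager_running() {
    pgrep -f "masterha_manager --conf=$MHA_CONF" >/dev/null 2>&1
}

# Start the manager in the background with the flags seen in the ps output.
manager_start() {
    if manager_running; then
        echo "masterha_manager is already running"
        return 0
    fi
    nohup masterha_manager --conf="$MHA_CONF" --ignore_last_failover \
        </dev/null >>"$MHA_OUT" 2>&1 &
    echo "masterha_manager started (pid $!)"
}
```

Guarding the start with `manager_running` avoids launching a second monitor against the same configuration file.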

On node mysqlmha1, simulate the database going down:

[root@mysqlmha1 ~]# ps -ef|grep mysqld|awk '{print "kill -9 "$2}'|sh
sh: line 2: kill: (3602) - No such process
[1]+  Killed                  mysqld --defaults-file=/etc/mymha.cnf
[root@mysqlmha1 ~]#
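
The "No such process" message above most likely comes from the pipeline matching its own short-lived `grep mysqld` process, which has already exited by the time `kill` runs. The following is a minimal sketch of a narrower variant that matches on the exact command name instead. The function names are hypothetical.

```shell
# Print the PIDs whose command name is exactly "mysqld".
# Reads "PID COMMAND" pairs on stdin, so helper processes such as
# "grep mysqld" or "mysqld_safe" are not matched.
mysqld_pids() {
    awk '$2 == "mysqld" { print $1 }'
}

# Simulate a crash by sending SIGKILL to every mysqld process.
crash_mysqld() {
    ps -e -o pid= -o comm= | mysqld_pids | while read -r pid; do
        kill -9 "$pid"
    done
}
```

Only run `crash_mysqld` on a test node: like the original one-liner, it kills the server without a clean shutdown.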

On node mysqlmha3, review the automatic-failover log:

[root@mysqlmha3 app1]# cat  manager.log
Wed May 16 23:34:34 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed May 16 23:34:34 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Wed May 16 23:34:34 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Wed May 16 23:45:49 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed May 16 23:45:49 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Wed May 16 23:45:49 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Wed May 16 23:34:35 2018 - [info]   192.168.142.112(192.168.142.112:3306)
Wed May 16 23:34:35 2018 - [info]   192.168.142.113(192.168.142.113:3306)
Wed May 16 23:34:35 2018 - [info] Alive Slaves:
Wed May 16 23:34:35 2018 - [info]   192.168.142.112(192.168.142.112:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 16 23:34:35 2018 - [info]     GTID ON
Wed May 16 23:34:35 2018 - [info]     Replicating from 192.168.142.111(192.168.142.111:3306)
Wed May 16 23:34:35 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed May 16 23:34:35 2018 - [info]   192.168.142.113(192.168.142.113:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 16 23:34:35 2018 - [info]     GTID ON
Wed May 16 23:34:35 2018 - [info]     Replicating from 192.168.142.111(192.168.142.111:3306)
Wed May 16 23:34:35 2018 - [info]     Not candidate for the new Master (no_master is set)
Wed May 16 23:34:35 2018 - [info] Current Alive Master: 192.168.142.111(192.168.142.111:3306)
Wed May 16 23:34:35 2018 - [info] Checking slave configurations..
Wed May 16 23:34:35 2018 - [info]  read_only=1 is not set on slave 192.168.142.112(192.168.142.112:3306).
Wed May 16 23:34:35 2018 - [info]  read_only=1 is not set on slave 192.168.142.113(192.168.142.113:3306).
Wed May 16 23:34:35 2018 - [info] Checking replication filtering settings..
Wed May 16 23:34:35 2018 - [info]  binlog_do_db= , binlog_ignore_db=
Wed May 16 23:34:35 2018 - [info]  Replication filtering check ok.
Wed May 16 23:34:35 2018 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Wed May 16 23:34:35 2018 - [info] Checking SSH publickey authentication settings on the current master..
Wed May 16 23:34:35 2018 - [info] HealthCheck: SSH to 192.168.142.111 is reachable.
Wed May 16 23:34:35 2018 - [info]
192.168.142.111(192.168.142.111:3306) (current master)  ### current database topology
 +--192.168.142.112(192.168.142.112:3306)
 +--192.168.142.113(192.168.142.113:3306)

Wed May 16 23:34:35 2018 - [info] Checking master_ip_failover_script status:
Wed May 16 23:34:35 2018 - [info]   /usr/bin/master_ip_failover --command=status --ssh_user=mha --orig_master_host=192.168.142.111 --orig_master_ip=192.168.142.111 --orig_master_port=3306

IN SCRIPT TEST====sudo /sbin/ifconfig eno16777736:2 down==sudo /sbin/ifconfig eno16777736:2 192.168.142.114 netmask 255.255.255.0;/sbin/arping -I eno16777736 -c 3 -s 192.168.142.114 192.168.142.2 >/dev/null 2>&1===

Checking the Status of the script.. OK
Wed May 16 23:34:38 2018 - [info]  OK.
Wed May 16 23:34:38 2018 - [warning] shutdown_script is not defined.
Wed May 16 23:34:38 2018 - [info] Set master ping interval 5 seconds.
Wed May 16 23:34:38 2018 - [info] Set secondary check script: /usr/bin/masterha_secondary_check -s 192.168.142.112 -s 192.168.142.113  --user=root --master_host=mysqlmha1 --master_ip=192.168.142.111 --master_port=3306
Wed May 16 23:34:38 2018 - [info] Starting ping health check on 192.168.142.111(192.168.142.111:3306)..
Wed May 16 23:34:38 2018 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
Wed May 16 23:43:53 2018 - [warning] Got timeout on MySQL Ping(SELECT) child process and killed it! at /usr/share/perl5/vendor_perl/MHA/HealthCheck.pm line 431.
Wed May 16 23:43:53 2018 - [info] Executing secondary network check script: /usr/bin/masterha_secondary_check -s 192.168.142.112 -s 192.168.142.113  --user=root --master_host=mysqlmha1 --master_ip=192.168.142.111 --master_port=3306  --user=mha  --master_host=192.168.142.111  --master_ip=192.168.142.111  --master_port=3306 --master_user=mha --master_password=Mha_ahm%0118 --ping_type=SELECT
Wed May 16 23:43:53 2018 - [info] Executing SSH check script: exit 0
Wed May 16 23:43:53 2018 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
Wed May 16 23:43:53 2018 - [info] HealthCheck: SSH to 192.168.142.111 is reachable.
Master is reachable from 192.168.142.112!
Wed May 16 23:43:54 2018 - [warning] Master is reachable from at least one of other monitoring servers. Failover should not happen.
Wed May 16 23:45:33 2018 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Wed May 16 23:45:33 2018 - [info] Executing SSH check script: exit 0
Wed May 16 23:45:33 2018 - [info] Executing secondary network check script: /usr/bin/masterha_secondary_check -s 192.168.142.112 -s 192.168.142.113  --user=root --master_host=mysqlmha1 --master_ip=192.168.142.111 --master_port=3306  --user=mha  --master_host=192.168.142.111  --master_ip=192.168.142.111  --master_port=3306 --master_user=mha --master_password=Mha_ahm%0118 --ping_type=SELECT
Wed May 16 23:45:33 2018 - [info] HealthCheck: SSH to 192.168.142.111 is reachable.
Monitoring server 192.168.142.112 is reachable, Master is not reachable from 192.168.142.112. OK.
Monitoring server 192.168.142.113 is reachable, Master is not reachable from 192.168.142.113. OK.
Wed May 16 23:45:34 2018 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Wed May 16 23:45:38 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.142.111' (111))
Wed May 16 23:45:38 2018 - [warning] Connection failed 2 time(s)..
Wed May 16 23:45:43 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.142.111' (111))
Wed May 16 23:45:43 2018 - [warning] Connection failed 3 time(s)..
Wed May 16 23:45:48 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.142.111' (111))
Wed May 16 23:45:48 2018 - [warning] Connection failed 4 time(s)..
Wed May 16 23:45:48 2018 - [warning] Master is not reachable from health checker!
Wed May 16 23:45:48 2018 - [warning] Master 192.168.142.111(192.168.142.111:3306) is not reachable!
Wed May 16 23:45:48 2018 - [warning] SSH is reachable.
Wed May 16 23:45:48 2018 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Wed May 16 23:45:48 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed May 16 23:45:48 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Wed May 16 23:45:48 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Wed May 16 23:45:49 2018 - [info] GTID failover mode = 1
Wed May 16 23:45:49 2018 - [info] Dead Servers:
Wed May 16 23:45:49 2018 - [info]   192.168.142.111(192.168.142.111:3306)
Wed May 16 23:45:49 2018 - [info] Alive Servers:
Wed May 16 23:45:49 2018 - [info]   192.168.142.112(192.168.142.112:3306)
Wed May 16 23:45:49 2018 - [info]   192.168.142.113(192.168.142.113:3306)
Wed May 16 23:45:49 2018 - [info] Alive Slaves:
Wed May 16 23:45:49 2018 - [info]   192.168.142.112(192.168.142.112:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 16 23:45:49 2018 - [info]     GTID ON
Wed May 16 23:45:49 2018 - [info]     Replicating from 192.168.142.111(192.168.142.111:3306)
Wed May 16 23:45:49 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed May 16 23:45:49 2018 - [info]   192.168.142.113(192.168.142.113:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 16 23:45:49 2018 - [info]     GTID ON
Wed May 16 23:45:49 2018 - [info]     Replicating from 192.168.142.111(192.168.142.111:3306)
Wed May 16 23:45:49 2018 - [info]     Not candidate for the new Master (no_master is set)
Wed May 16 23:45:49 2018 - [info] Checking slave configurations..
Wed May 16 23:45:49 2018 - [info]  read_only=1 is not set on slave 192.168.142.112(192.168.142.112:3306).
Wed May 16 23:45:49 2018 - [info]  read_only=1 is not set on slave 192.168.142.113(192.168.142.113:3306).
Wed May 16 23:45:49 2018 - [info] Checking replication filtering settings..
Wed May 16 23:45:49 2018 - [info]  Replication filtering check ok.
Wed May 16 23:45:49 2018 - [info] Master is down!
Wed May 16 23:45:49 2018 - [info] Terminating monitoring script.
Wed May 16 23:45:49 2018 - [info] Got exit code 20 (Master dead).
Wed May 16 23:45:49 2018 - [info] MHA::MasterFailover version 0.57.
Wed May 16 23:45:49 2018 - [info] Starting master failover.
Wed May 16 23:45:49 2018 - [info]
Wed May 16 23:45:49 2018 - [info] * Phase 1: Configuration Check Phase..
Wed May 16 23:45:49 2018 - [info]
Wed May 16 23:45:50 2018 - [info] GTID failover mode = 1
Wed May 16 23:45:50 2018 - [info] Dead Servers:
Wed May 16 23:45:50 2018 - [info]   192.168.142.111(192.168.142.111:3306)
Wed May 16 23:45:50 2018 - [info] Checking master reachability via MySQL(double check)...
Wed May 16 23:45:50 2018 - [info]  ok.
Wed May 16 23:45:50 2018 - [info] Alive Servers:
Wed May 16 23:45:50 2018 - [info]   192.168.142.112(192.168.142.112:3306)
Wed May 16 23:45:50 2018 - [info]   192.168.142.113(192.168.142.113:3306)
Wed May 16 23:45:50 2018 - [info] Alive Slaves:
Wed May 16 23:45:50 2018 - [info]   192.168.142.112(192.168.142.112:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 16 23:45:50 2018 - [info]     GTID ON
Wed May 16 23:45:50 2018 - [info]     Replicating from 192.168.142.111(192.168.142.111:3306)
Wed May 16 23:45:50 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed May 16 23:45:50 2018 - [info]   192.168.142.113(192.168.142.113:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 16 23:45:50 2018 - [info]     GTID ON
Wed May 16 23:45:50 2018 - [info]     Replicating from 192.168.142.111(192.168.142.111:3306)
Wed May 16 23:45:50 2018 - [info]     Not candidate for the new Master (no_master is set)
Wed May 16 23:45:50 2018 - [info] Starting GTID based failover.
Wed May 16 23:45:50 2018 - [info]
Wed May 16 23:45:50 2018 - [info] ** Phase 1: Configuration Check Phase completed.
Wed May 16 23:45:50 2018 - [info]
Wed May 16 23:45:50 2018 - [info] * Phase 2: Dead Master Shutdown Phase..
Wed May 16 23:45:50 2018 - [info]
Wed May 16 23:45:50 2018 - [info] Forcing shutdown so that applications never connect to the current master..
Wed May 16 23:45:50 2018 - [info] Executing master IP deactivation script:
Wed May 16 23:45:50 2018 - [info]   /usr/bin/master_ip_failover --orig_master_host=192.168.142.111 --orig_master_ip=192.168.142.111 --orig_master_port=3306 --command=stopssh --ssh_user=mha

IN SCRIPT TEST====sudo /sbin/ifconfig eno16777736:2 down==sudo /sbin/ifconfig eno16777736:2 192.168.142.114 netmask 255.255.255.0;/sbin/arping -I eno16777736 -c 3 -s 192.168.142.114 192.168.142.2 >/dev/null 2>&1===

Disabling the VIP on old master: 192.168.142.111
Wed May 16 23:45:50 2018 - [info]  done.
Wed May 16 23:45:50 2018 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Wed May 16 23:45:50 2018 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Wed May 16 23:45:50 2018 - [info]
Wed May 16 23:45:50 2018 - [info] * Phase 3: Master Recovery Phase..
Wed May 16 23:45:50 2018 - [info]
Wed May 16 23:45:50 2018 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Wed May 16 23:45:50 2018 - [info]
Wed May 16 23:45:50 2018 - [info] The latest binary log file/position on all slaves is binlog01.000009:246
Wed May 16 23:45:50 2018 - [info] Retrieved Gtid Set: 8d7abed9-d4cd-11e7-a165-000c29c913a2:6-7
Wed May 16 23:45:50 2018 - [info] Latest slaves (Slaves that received relay log files to the latest):
Wed May 16 23:45:50 2018 - [info]   192.168.142.112(192.168.142.112:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 16 23:45:50 2018 - [info]     GTID ON
Wed May 16 23:45:50 2018 - [info]     Replicating from 192.168.142.111(192.168.142.111:3306)
Wed May 16 23:45:50 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed May 16 23:45:50 2018 - [info]   192.168.142.113(192.168.142.113:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 16 23:45:50 2018 - [info]     GTID ON
Wed May 16 23:45:50 2018 - [info]     Replicating from 192.168.142.111(192.168.142.111:3306)
Wed May 16 23:45:50 2018 - [info]     Not candidate for the new Master (no_master is set)
Wed May 16 23:45:50 2018 - [info] The oldest binary log file/position on all slaves is binlog01.000009:246
Wed May 16 23:45:50 2018 - [info] Retrieved Gtid Set: 8d7abed9-d4cd-11e7-a165-000c29c913a2:6-7
Wed May 16 23:45:50 2018 - [info] Oldest slaves:
Wed May 16 23:45:50 2018 - [info]   192.168.142.112(192.168.142.112:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 16 23:45:50 2018 - [info]     GTID ON
Wed May 16 23:45:50 2018 - [info]     Replicating from 192.168.142.111(192.168.142.111:3306)
Wed May 16 23:45:50 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed May 16 23:45:50 2018 - [info]   192.168.142.113(192.168.142.113:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 16 23:45:50 2018 - [info]     GTID ON
Wed May 16 23:45:50 2018 - [info]     Replicating from 192.168.142.111(192.168.142.111:3306)
Wed May 16 23:45:50 2018 - [info]     Not candidate for the new Master (no_master is set)
Wed May 16 23:45:50 2018 - [info]
Wed May 16 23:45:50 2018 - [info] * Phase 3.3: Determining New Master Phase..
Wed May 16 23:45:50 2018 - [info]
Wed May 16 23:45:50 2018 - [info] Searching new master from slaves..
Wed May 16 23:45:50 2018 - [info]  Candidate masters from the configuration file:
Wed May 16 23:45:50 2018 - [info]   192.168.142.112(192.168.142.112:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 16 23:45:50 2018 - [info]     GTID ON
Wed May 16 23:45:50 2018 - [info]     Replicating from 192.168.142.111(192.168.142.111:3306)
Wed May 16 23:45:50 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed May 16 23:45:50 2018 - [info]  Non-candidate masters:
Wed May 16 23:45:50 2018 - [info]   192.168.142.113(192.168.142.113:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 16 23:45:50 2018 - [info]     GTID ON
Wed May 16 23:45:50 2018 - [info]     Replicating from 192.168.142.111(192.168.142.111:3306)
Wed May 16 23:45:50 2018 - [info]     Not candidate for the new Master (no_master is set)
Wed May 16 23:45:50 2018 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Wed May 16 23:45:50 2018 - [info] New master is 192.168.142.112(192.168.142.112:3306)
Wed May 16 23:45:50 2018 - [info] Starting master failover..
Wed May 16 23:45:50 2018 - [info]
From:
192.168.142.111(192.168.142.111:3306) (current master)
 +--192.168.142.112(192.168.142.112:3306)
 +--192.168.142.113(192.168.142.113:3306)

To:
192.168.142.112(192.168.142.112:3306) (new master)
 +--192.168.142.113(192.168.142.113:3306)
Wed May 16 23:45:50 2018 - [info]
Wed May 16 23:45:50 2018 - [info] * Phase 3.3: New Master Recovery Phase..
Wed May 16 23:45:50 2018 - [info]
Wed May 16 23:45:50 2018 - [info]  Waiting all logs to be applied..
Wed May 16 23:45:50 2018 - [info]   done.
Wed May 16 23:45:50 2018 - [info] Getting new master's binlog name and position..
Wed May 16 23:45:50 2018 - [info]  binlog01.000008:286
Wed May 16 23:45:50 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.142.112', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Wed May 16 23:45:50 2018 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: binlog01.000008, 286, 42f239e7-5908-11e8-8214-000c2926d694:1,
8d7abed9-d4cd-11e7-a165-000c29c913a2:1-7,
aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-15:1000002-1000075
Wed May 16 23:45:50 2018 - [info] Executing master IP activate script:
Wed May 16 23:45:50 2018 - [info]   /usr/bin/master_ip_failover --command=start --ssh_user=mha --orig_master_host=192.168.142.111 --orig_master_ip=192.168.142.111 --orig_master_port=3306 --new_master_host=192.168.142.112 --new_master_ip=192.168.142.112 --new_master_port=3306 --new_master_user='mha'   --new_master_password=xxx
Unknown option: new_master_user
Unknown option: new_master_password

IN SCRIPT TEST====sudo /sbin/ifconfig eno16777736:2 down==sudo /sbin/ifconfig eno16777736:2 192.168.142.114 netmask 255.255.255.0;/sbin/arping -I eno16777736 -c 3 -s 192.168.142.114 192.168.142.2 >/dev/null 2>&1===

Enabling the VIP - 192.168.142.114 on the new master - 192.168.142.112
Wed May 16 23:45:54 2018 - [info]  OK.
Wed May 16 23:45:54 2018 - [info] ** Finished master recovery successfully.
Wed May 16 23:45:54 2018 - [info] * Phase 3: Master Recovery Phase completed.
Wed May 16 23:45:54 2018 - [info]
Wed May 16 23:45:54 2018 - [info] * Phase 4: Slaves Recovery Phase..
Wed May 16 23:45:54 2018 - [info]
Wed May 16 23:45:54 2018 - [info]
Wed May 16 23:45:54 2018 - [info] * Phase 4.1: Starting Slaves in parallel..
Wed May 16 23:45:54 2018 - [info]
Wed May 16 23:45:54 2018 - [info] -- Slave recovery on host 192.168.142.113(192.168.142.113:3306) started, pid: 3382. Check tmp log /var/log/masterha/app1/192.168.142.113_3306_20180516234549.log if it takes time..
Wed May 16 23:45:55 2018 - [info]
Wed May 16 23:45:55 2018 - [info] Log messages from 192.168.142.113 ...
Wed May 16 23:45:55 2018 - [info]
Wed May 16 23:45:54 2018 - [info]  Resetting slave 192.168.142.113(192.168.142.113:3306) and starting replication from the new master 192.168.142.112(192.168.142.112:3306)..
Wed May 16 23:45:54 2018 - [info]  Executed CHANGE MASTER.
Wed May 16 23:45:54 2018 - [info]  Slave started.
Wed May 16 23:45:54 2018 - [info]  gtid_wait(42f239e7-5908-11e8-8214-000c2926d694:1,
8d7abed9-d4cd-11e7-a165-000c29c913a2:1-7,
aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-15:1000002-1000075) completed on 192.168.142.113(192.168.142.113:3306). Executed 2 events.
Wed May 16 23:45:55 2018 - [info] End of log messages from 192.168.142.113.
Wed May 16 23:45:55 2018 - [info] -- Slave on host 192.168.142.113(192.168.142.113:3306) started.
Wed May 16 23:45:55 2018 - [info] All new slave servers recovered successfully.
Wed May 16 23:45:55 2018 - [info]
Wed May 16 23:45:55 2018 - [info] * Phase 5: New master cleanup phase..
Wed May 16 23:45:55 2018 - [info]
Wed May 16 23:45:55 2018 - [info] Resetting slave info on the new master..
Wed May 16 23:45:55 2018 - [info]  192.168.142.112: Resetting slave info succeeded.
Wed May 16 23:45:55 2018 - [info] Master failover to 192.168.142.112(192.168.142.112:3306) completed successfully.
Wed May 16 23:45:55 2018 - [info]

----- Failover Report -----

app1: MySQL Master failover 192.168.142.111(192.168.142.111:3306) to 192.168.142.112(192.168.142.112:3306) succeeded

Master 192.168.142.111(192.168.142.111:3306) is down!

Check MHA Manager logs at mysqlmha3:/var/log/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.142.111(192.168.142.111:3306)
Selected 192.168.142.112(192.168.142.112:3306) as a new master.
192.168.142.112(192.168.142.112:3306): OK: Applying all logs succeeded.
192.168.142.112(192.168.142.112:3306): OK: Activated master IP address.
192.168.142.113(192.168.142.113:3306): OK: Slave started, replicating from 192.168.142.112(192.168.142.112:3306)
192.168.142.112(192.168.142.112:3306): Resetting slave info succeeded.
Master failover to 192.168.142.112(192.168.142.112:3306) completed successfully.
[root@mysqlmha3 app1]#
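
The log is long, so it can help to pull out only the phase markers and the chosen master. The following is a minimal sketch using standard `grep` and `sed`. The function names are hypothetical, and the patterns are based only on the log lines shown above.

```shell
# List the failover phase markers ("* Phase N: ...") in log order,
# with the timestamp and "[info]" prefix stripped.
failover_phases() {
    grep -E '\* Phase [0-9.]+: ' "$1" | sed 's/.*\[info] //'
}

# Print the host chosen as new master, taken from the
# "New master is HOST(HOST:PORT)" line.
new_master_from_log() {
    sed -n 's/.*\[info] New master is \([^(]*\)(.*/\1/p' "$1" | tail -n 1
}
```

For example, `new_master_from_log /var/log/masterha/app1/manager.log` would print the address that appears in the "New master is" line of that file.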

On node 2, check whether the VIP has moved over:

[root@mysqlmha2 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno16777736: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:26:d6:94 brd ff:ff:ff:ff:ff:ff
    inet 192.168.142.112/24 brd 192.168.142.255 scope global eno16777736
       valid_lft forever preferred_lft forever
    inet 192.168.142.114/24 brd 192.168.142.255 scope global secondary eno16777736:2
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe26:d694/64 scope link
       valid_lft forever preferred_lft forever
[root@mysqlmha2 ~]#
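
The same check can be scripted. The following is a minimal sketch, assuming the VIP is 192.168.142.114 as in the failover log above. `has_addr` is a hypothetical helper that reads the one-line-per-address output of `ip -o -4 addr show` on stdin.

```shell
# Succeeds if the given IPv4 address appears in `ip -o -4 addr show` output.
# In that format the fourth field is ADDRESS/PREFIX, e.g. 192.168.142.114/24.
has_addr() {
    awk -v want="$1" '
        { split($4, parts, "/"); if (parts[1] == want) found = 1 }
        END { exit !found }'
}
```

A typical call would be `ip -o -4 addr show | has_addr 192.168.142.114 && echo "VIP is on this node"`. Comparing the whole address avoids false matches on prefixes such as 192.168.142.11.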

On node mysqlmha3, check the current cluster state:

[mha@mysqlmha3 shell]$ masterha_check_repl --conf=/etc/masterha/app1.cnf
Wed May 16 23:55:37 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed May 16 23:55:37 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Wed May 16 23:55:37 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Wed May 16 23:55:37 2018 - [info] MHA::MasterMonitor version 0.57.
Wed May 16 23:55:38 2018 - [info] GTID failover mode = 1
Wed May 16 23:55:38 2018 - [info] Dead Servers:
Wed May 16 23:55:38 2018 - [info]   192.168.142.111(192.168.142.111:3306)
Wed May 16 23:55:38 2018 - [info] Alive Servers:
Wed May 16 23:55:38 2018 - [info]   192.168.142.112(192.168.142.112:3306)
Wed May 16 23:55:38 2018 - [info]   192.168.142.113(192.168.142.113:3306)
Wed May 16 23:55:38 2018 - [info] Alive Slaves:
Wed May 16 23:55:38 2018 - [info]   192.168.142.113(192.168.142.113:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Wed May 16 23:55:38 2018 - [info]     GTID ON
Wed May 16 23:55:38 2018 - [info]     Replicating from 192.168.142.112(192.168.142.112:3306)
Wed May 16 23:55:38 2018 - [info]     Not candidate for the new Master (no_master is set)
Wed May 16 23:55:38 2018 - [info] Current Alive Master: 192.168.142.112(192.168.142.112:3306)
Wed May 16 23:55:38 2018 - [info] Checking slave configurations..
Wed May 16 23:55:38 2018 - [info]  read_only=1 is not set on slave 192.168.142.113(192.168.142.113:3306).
Wed May 16 23:55:38 2018 - [info] Checking replication filtering settings..
Wed May 16 23:55:38 2018 - [info]  binlog_do_db= , binlog_ignore_db=
Wed May 16 23:55:38 2018 - [info]  Replication filtering check ok.
Wed May 16 23:55:38 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln364] None of slaves can be master. Check failover configuration file or log-bin settings in my.cnf
Wed May 16 23:55:38 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /bin/masterha_check_repl line 48.
Wed May 16 23:55:38 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Wed May 16 23:55:38 2018 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!  ### this output is expected, because node 1 has not been repaired yet
[mha@mysqlmha3 shell]$

Repair the database on node mysqlmha1 and rejoin it as a slave:

The manager log recorded the CHANGE MASTER statement during the automatic failover. The password is masked there, so supply the real one when running the statement on node 1.

[root@mysqlmha3 app1]# cd /var/log/masterha/app1
[root@mysqlmha3 app1]# grep -i change manager.log
Wed May 16 23:45:50 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.142.112', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Wed May 16 23:45:54 2018 - [info]  Executed CHANGE MASTER.
[root@mysqlmha3 app1]#

Start the database on node mysqlmha1, and continue only after confirming that it started without errors:
[root@mysqlmha1 ~]# mysqld --defaults-file=/etc/mymha.cnf &
[1] 3636
[root@mysqlmha1 ~]#

Make node mysqlmha1 a slave:
[root@mysqlmha1 ~]# mysql -uroot -p -P3306 --protocol=tcp
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.7.18-log MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CHANGE MASTER TO MASTER_HOST='192.168.142.112', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='repl';
Query OK, 0 rows affected, 2 warnings (0.36 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.142.112
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog01.000008
          Read_Master_Log_Pos: 286
               Relay_Log_File: relaylog01.000004
                Relay_Log_Pos: 393
        Relay_Master_Log_File: binlog01.000008
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 286
              Relay_Log_Space: 669
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 112
                  Master_UUID: 42f239e7-5908-11e8-8214-000c2926d694
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 42f239e7-5908-11e8-8214-000c2926d694:1
            Executed_Gtid_Set: 42f239e7-5908-11e8-8214-000c2926d694:1,
8d7abed9-d4cd-11e7-a165-000c29c913a2:1-7,
aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-15:1000002-1000075
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

mysql> exit
Bye
[root@mysqlmha1 ~]#
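
For a repeatable check, the two thread states can be read out of the `SHOW SLAVE STATUS\G` output instead of scanning it by eye. The following is a minimal sketch. `slave_healthy` is a hypothetical helper that reads that output on stdin.

```shell
# Succeeds only if both replication threads report "Yes".
# The anchored field names keep Slave_SQL_Running_State from matching.
slave_healthy() {
    awk -F': *' '
        $1 ~ /^ *Slave_IO_Running$/  { io  = $2 }
        $1 ~ /^ *Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }'
}
```

A typical call would be `mysql -uroot -p -P3306 --protocol=tcp -e 'SHOW SLAVE STATUS\G' | slave_healthy && echo "replication OK"`.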

On the MHA management node, check the current database state:

[mha@mysqlmha3 shell]$ masterha_check_repl --conf=/etc/masterha/app1.cnf
Thu May 17 00:02:39 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu May 17 00:02:39 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu May 17 00:02:39 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu May 17 00:02:39 2018 - [info] MHA::MasterMonitor version 0.57.
Thu May 17 00:02:40 2018 - [info] GTID failover mode = 1
Thu May 17 00:02:40 2018 - [info] Dead Servers:
Thu May 17 00:02:40 2018 - [info] Alive Servers:
Thu May 17 00:02:40 2018 - [info]   192.168.142.111(192.168.142.111:3306)  ### the recovered node is detected again
Thu May 17 00:02:40 2018 - [info]   192.168.142.112(192.168.142.112:3306)
Thu May 17 00:02:40 2018 - [info]   192.168.142.113(192.168.142.113:3306)
Thu May 17 00:02:40 2018 - [info] Alive Slaves:
Thu May 17 00:02:40 2018 - [info]   192.168.142.111(192.168.142.111:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Thu May 17 00:02:40 2018 - [info]     GTID ON
Thu May 17 00:02:40 2018 - [info]     Replicating from 192.168.142.112(192.168.142.112:3306)
Thu May 17 00:02:40 2018 - [info]   192.168.142.113(192.168.142.113:3306)  Version=5.7.18-log (oldest major version between slaves) log-bin:enabled
Thu May 17 00:02:40 2018 - [info]     GTID ON
Thu May 17 00:02:40 2018 - [info]     Replicating from 192.168.142.112(192.168.142.112:3306)
Thu May 17 00:02:40 2018 - [info]     Not candidate for the new Master (no_master is set)
Thu May 17 00:02:40 2018 - [info] Current Alive Master: 192.168.142.112(192.168.142.112:3306)  ### the master role has moved to node 2
Thu May 17 00:02:40 2018 - [info] Checking slave configurations..
Thu May 17 00:02:40 2018 - [info]  read_only=1 is not set on slave 192.168.142.111(192.168.142.111:3306).  ### node 1 is now a slave
Thu May 17 00:02:40 2018 - [info]  read_only=1 is not set on slave 192.168.142.113(192.168.142.113:3306).  ### node 3 is unchanged and is still a slave
Thu May 17 00:02:40 2018 - [info] Checking replication filtering settings..
Thu May 17 00:02:40 2018 - [info]  binlog_do_db= , binlog_ignore_db=
Thu May 17 00:02:40 2018 - [info]  Replication filtering check ok.
Thu May 17 00:02:40 2018 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Thu May 17 00:02:40 2018 - [info] Checking SSH publickey authentication settings on the current master..
Thu May 17 00:02:40 2018 - [info] HealthCheck: SSH to 192.168.142.112 is reachable.
Thu May 17 00:02:40 2018 - [info]
192.168.142.112(192.168.142.112:3306) (current master)
 +--192.168.142.111(192.168.142.111:3306)
 +--192.168.142.113(192.168.142.113:3306)

Thu May 17 00:02:40 2018 - [info] Checking replication health on 192.168.142.111..
Thu May 17 00:02:40 2018 - [info]  ok.
Thu May 17 00:02:40 2018 - [info] Checking replication health on 192.168.142.113..
Thu May 17 00:02:40 2018 - [info]  ok.
Thu May 17 00:02:40 2018 - [info] Checking master_ip_failover_script status:
Thu May 17 00:02:40 2018 - [info]   /usr/bin/master_ip_failover --command=status --ssh_user=mha --orig_master_host=192.168.142.112 --orig_master_ip=192.168.142.112 --orig_master_port=3306

IN SCRIPT TEST====sudo /sbin/ifconfig eno16777736:2 down==sudo /sbin/ifconfig eno16777736:2 192.168.142.114 netmask 255.255.255.0;/sbin/arping -I eno16777736 -c 3 -s 192.168.142.114 192.168.142.2 >/dev/null 2>&1===

Checking the Status of the script.. OK
Thu May 17 00:02:44 2018 - [info]  OK.
Thu May 17 00:02:44 2018 - [warning] shutdown_script is not defined.
Thu May 17 00:02:44 2018 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
[mha@mysqlmha3 shell]$

【Test Procedure: Manual Scripted Switchover】

Original article: https://www.cnblogs.com/zetanchen/p/9059959.html

Time: 2024-10-13 19:38:43
