Official introduction:
MHA performs automating master failover and slave promotion with minimal downtime, usually within 10-30 seconds. MHA prevents replication consistency problems and saves on expenses of having to acquire additional servers. All this with zero performance degradation, no complexity (easy-to-install) and requiring no change to existing deployments.
MHA also provides scheduled online master switching, safely changing the currently running master to a new master, within mere seconds (0.5-2 seconds) of downtime (blocking writes only).
MHA provides the following functionality, and can be useful in many deployments in which high availability, data integrity and near non-stop master maintenance are required.
1. Automated master monitoring and failover
2. Interactive (manually initiated) master failover
3. Non-interactive master failover
4. Online switching master to a different host
For details, see the official documentation:
https://code.google.com/p/mysql-master-ha/wiki/Overview
Architecture:
Installation environment:
Role      MySQL role   OS           Name     IP
Manager   -            redhat 6.3   zbdba1   192.168.56.160
NODE1     master       redhat 6.3   zbdba2   192.168.56.161
NODE2     slave        redhat 6.3   zbdba3   192.168.56.162
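1. Install and configure the manager node
2. Install MySQL on the node hosts
3. Configure MySQL master-slave replication
4. Install MHA on the node hosts
5. Configure SSH trust between the nodes
6. Start MHA
7. Test MHA
1. Install and configure the manager node
Download the MHA manager and node packages: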
wget -O mha4mysql-manager-0.56-0.el6.noarch.rpm "http://www.mysql.gr.jp/frame/modules/bwiki/index.php?plugin=attach&pcmd=open&file=mha4mysql-manager-0.56-0.el6.noarch.rpm&refer=matsunobu"
wget -O mha4mysql-node-0.56-0.el6.noarch.rpm "http://www.mysql.gr.jp/frame/modules/bwiki/index.php?plugin=attach&pcmd=open&file=mha4mysql-node-0.56-0.el6.noarch.rpm&refer=matsunobu"
Install the dependency packages:
yum install perl-DBD-MySQL
yum install perl-Config-Tiny
yum install perl-Log-Dispatch
yum install perl-Parallel-ForkManager
The RepoForge repository is recommended as the source for these packages:
wget http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.3-1.el6.rf.x86_64.rpm
Install the MHA packages:
rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm
Configure the manager:
mkdir -p /etc/masterha/app1
[root@zbdba1 masterha]# cat app1.cnf
[server default]
manager_workdir=/masterha/app1
manager_log=/masterha/app1/manager.log
user=root
password=mysql
ssh_user=root
repl_user=root
repl_password=mysql
ping_interval=1
shutdown_script=""
#master_ip_failover_script="/usr/local/bin/master_ip_failover"
master_ip_online_change_script=""
report_script=""

[server1]
hostname=192.168.56.161
master_binlog_dir="/var/lib/mysql"
candidate_master=1

[server2]
hostname=192.168.56.162
master_binlog_dir="/var/lib/mysql"
candidate_master=1
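With more than two nodes, writing one [serverN] section per host by hand gets repetitive. Below is a minimal sketch of generating the same file from a host list. The helper name write_mha_conf is hypothetical (it is not part of MHA), and the credentials are simply the ones from the config above.

```shell
#!/usr/bin/env bash
# Sketch: generate an MHA application config from a list of hosts.
# write_mha_conf is a hypothetical helper, not part of MHA itself.
write_mha_conf() {
    local conf_file=$1
    shift
    {
        # Defaults copied from the app1.cnf shown above.
        cat <<'EOF'
[server default]
manager_workdir=/masterha/app1
manager_log=/masterha/app1/manager.log
user=root
password=mysql
ssh_user=root
repl_user=root
repl_password=mysql
ping_interval=1
EOF
        # One [serverN] section per host, numbered from 1.
        local i=1 host
        for host in "$@"; do
            printf '\n[server%d]\nhostname=%s\nmaster_binlog_dir="/var/lib/mysql"\ncandidate_master=1\n' "$i" "$host"
            i=$((i + 1))
        done
    } > "$conf_file"
}

# Write to a scratch file first so the result can be reviewed.
conf=$(mktemp)
write_mha_conf "$conf" 192.168.56.161 192.168.56.162
cat "$conf"
```

After reviewing the scratch file, it can be copied to /etc/masterha/app1.cnf on the manager.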
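Steps 2 (installing MySQL on the node hosts) and 3 (configuring MySQL master-slave replication) are not covered in detail here.
4. Install MHA on the node hosts
Install on both node hosts:
rpm -ivh perl-DBD-MySQL-4.022-1.el6.rfx.x86_64.rpm
rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
5. Configure SSH trust between the nodes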
[[email protected] ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
/root/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
f6:86:4a:38:7f:43:ec:d4:1f:d7:04:2e:48:5f:be:c9 [email protected]
The key's randomart image is:
(randomart image omitted)
[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
[email protected]'s password:
Now try logging into the machine, with "ssh '[email protected]'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
[email protected]'s password:
Now try logging into the machine, with "ssh '[email protected]'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.

[[email protected] ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
/root/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
19:00:7f:cb:7d:5f:cd:48:3b:08:5a:7f:28:2e:9e:f8 [email protected]
The key's randomart image is:
(randomart image omitted)
[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
[email protected]'s password:
Now try logging into the machine, with "ssh '[email protected]'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.

[[email protected] ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
9f:2f:5f:0c:fb:dc:f7:f9:4d:b2:de:48:01:ae:51:d1 [email protected]
The key's randomart image is:
(randomart image omitted)
[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
[email protected]'s password:
Now try logging into the machine, with "ssh '[email protected]'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.

Test the SSH setup from the manager:
[root@zbdba1 ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Tue Feb 10 21:59:19 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Feb 10 21:59:19 2015 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Tue Feb 10 21:59:19 2015 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Tue Feb 10 21:59:19 2015 - [info] Starting SSH connection tests..
Tue Feb 10 21:59:20 2015 - [debug]
Tue Feb 10 21:59:19 2015 - [debug] Connecting via SSH from root@192.168.56.161(192.168.56.161:22) to root@192.168.56.162(192.168.56.162:22)..
Tue Feb 10 21:59:19 2015 - [debug] ok.
Tue Feb 10 21:59:20 2015 - [debug]
Tue Feb 10 21:59:20 2015 - [debug] Connecting via SSH from root@192.168.56.162(192.168.56.162:22) to root@192.168.56.161(192.168.56.161:22)..
Tue Feb 10 21:59:20 2015 - [debug] ok.
Tue Feb 10 21:59:20 2015 - [info] All SSH connection tests passed successfully.
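Running ssh-copy-id by hand on each host is easy to get wrong as the host list grows. The sketch below only prints the command to run on each host, one line per ordered pair of hosts; it does not open any SSH connection. The helper name print_trust_commands is hypothetical, and covering every ordered pair is an assumption that gives a superset of the copies shown in the transcript above.

```shell
#!/usr/bin/env bash
# Sketch: list the ssh-copy-id command for every ordered pair of hosts.
# print_trust_commands is a hypothetical helper; it only prints, it does
# not connect anywhere.
print_trust_commands() {
    local src dst
    for src in "$@"; do
        for dst in "$@"; do
            # A host does not need to copy its key to itself.
            [ "$src" = "$dst" ] && continue
            printf 'on %s: ssh-copy-id -i /root/.ssh/id_rsa.pub root@%s\n' "$src" "$dst"
        done
    done
}

# Manager plus the two nodes from the environment table.
print_trust_commands 192.168.56.160 192.168.56.161 192.168.56.162
```

For three hosts this prints six lines. Once the keys are in place, masterha_check_ssh remains the authoritative check.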
6. Start MHA
Before starting the manager, check replication:
[root@zbdba1 masterha]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Wed Feb 11 01:20:18 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Feb 11 01:20:18 2015 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Wed Feb 11 01:20:18 2015 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Wed Feb 11 01:20:18 2015 - [info] MHA::MasterMonitor version 0.56.
Wed Feb 11 01:20:18 2015 - [info] GTID failover mode = 0
Wed Feb 11 01:20:18 2015 - [info] Dead Servers:
Wed Feb 11 01:20:18 2015 - [info] Alive Servers:
Wed Feb 11 01:20:18 2015 - [info]   192.168.56.161(192.168.56.161:3306)
Wed Feb 11 01:20:18 2015 - [info]   192.168.56.162(192.168.56.162:3306)
Wed Feb 11 01:20:18 2015 - [info] Alive Slaves:
Wed Feb 11 01:20:18 2015 - [info]   192.168.56.162(192.168.56.162:3306) Version=5.5.35-log (oldest major version between slaves) log-bin:enabled
Wed Feb 11 01:20:18 2015 - [info]     Replicating from 192.168.56.161(192.168.56.161:3306)
Wed Feb 11 01:20:18 2015 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Feb 11 01:20:18 2015 - [info] Current Alive Master: 192.168.56.161(192.168.56.161:3306)
Wed Feb 11 01:20:18 2015 - [info] Checking slave configurations..
Wed Feb 11 01:20:18 2015 - [info] read_only=1 is not set on slave 192.168.56.162(192.168.56.162:3306).
Wed Feb 11 01:20:18 2015 - [warning] relay_log_purge=0 is not set on slave 192.168.56.162(192.168.56.162:3306).
Wed Feb 11 01:20:18 2015 - [info] Checking replication filtering settings..
Wed Feb 11 01:20:18 2015 - [info] binlog_do_db= , binlog_ignore_db=
Wed Feb 11 01:20:18 2015 - [info] Replication filtering check ok.
Wed Feb 11 01:20:18 2015 - [info] GTID (with auto-pos) is not supported
Wed Feb 11 01:20:18 2015 - [info] Starting SSH connection tests..
Wed Feb 11 01:20:19 2015 - [info] All SSH connection tests passed successfully.
Wed Feb 11 01:20:19 2015 - [info] Checking MHA Node version..
Wed Feb 11 01:20:19 2015 - [info] Version check ok.
Wed Feb 11 01:20:19 2015 - [info] Checking SSH publickey authentication settings on the current master..
Wed Feb 11 01:20:19 2015 - [info] HealthCheck: SSH to 192.168.56.161 is reachable.
Wed Feb 11 01:20:19 2015 - [info] Master MHA Node version is 0.56.
Wed Feb 11 01:20:19 2015 - [info] Checking recovery script configurations on 192.168.56.161(192.168.56.161:3306)..
Wed Feb 11 01:20:19 2015 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000002
Wed Feb 11 01:20:19 2015 - [info] Connecting to root@192.168.56.161(192.168.56.161:22)..
  Creating /var/tmp if not exists.. ok.
  Checking output directory is accessible or not.. ok.
  Binlog found at /var/lib/mysql, up to mysql-bin.000002
Wed Feb 11 01:20:19 2015 - [info] Binlog setting check done.
Wed Feb 11 01:20:19 2015 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Wed Feb 11 01:20:19 2015 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.56.162 --slave_ip=192.168.56.162 --slave_port=3306 --workdir=/var/tmp --target_version=5.5.35-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Wed Feb 11 01:20:19 2015 - [info] Connecting to root@192.168.56.162(192.168.56.162:22)..
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to zbdba3-relay-bin.000007
    Temporary relay log file is /var/lib/mysql/zbdba3-relay-bin.000007
    Testing mysql connection and privileges.. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Wed Feb 11 01:20:20 2015 - [info] Slaves settings check done.
Wed Feb 11 01:20:20 2015 - [info]
192.168.56.161(192.168.56.161:3306) (current master)
 +--192.168.56.162(192.168.56.162:3306)
Wed Feb 11 01:20:20 2015 - [info] Checking replication health on 192.168.56.162..
Wed Feb 11 01:20:20 2015 - [info] ok.
Wed Feb 11 01:20:20 2015 - [warning] master_ip_failover_script is not defined.
Wed Feb 11 01:20:20 2015 - [warning] shutdown_script is not defined.
Wed Feb 11 01:20:20 2015 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
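The check passes, but the warnings are easy to miss in that much output. As a rough sketch, a small filter can pull out just the [warning] messages. The function name mha_warnings is hypothetical, and the sample lines are copied from the output above.

```shell
#!/usr/bin/env bash
# Sketch: keep only the [warning] messages from MHA output read on stdin.
# mha_warnings is a hypothetical helper, not an MHA tool.
mha_warnings() {
    # Select warning lines, then strip the timestamp and level prefix.
    grep -F -- '- [warning]' | sed 's/^.* - \[warning\] //'
}

# Sample lines taken from the masterha_check_repl run above.
mha_warnings <<'EOF'
Wed Feb 11 01:20:18 2015 - [info] read_only=1 is not set on slave 192.168.56.162(192.168.56.162:3306).
Wed Feb 11 01:20:18 2015 - [warning] relay_log_purge=0 is not set on slave 192.168.56.162(192.168.56.162:3306).
Wed Feb 11 01:20:20 2015 - [warning] shutdown_script is not defined.
EOF
```

In practice the input would be piped in, for example masterha_check_repl --conf=/etc/masterha/app1.cnf 2>&1 | mha_warnings. The two slave settings mentioned in the output can usually be set at runtime on the slave with SET GLOBAL read_only=1 and SET GLOBAL relay_log_purge=0.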
The check passed, so start the manager:
[root@zbdba1 masterha]# nohup masterha_manager --conf=/etc/masterha/app1.cnf &
Watch the manager log file:
Wed Feb 11 02:12:02 2015 - [info] MHA::MasterMonitor version 0.56.
Wed Feb 11 02:12:02 2015 - [warning] /masterha/app1/app1.master_status.health already exists. You might have killed manager with SIGKILL(-9), may run two or more monitoring process for the same application, or use the same working directory. Check for details, and consider setting --workdir separately.
Wed Feb 11 02:12:02 2015 - [info] GTID failover mode = 0
Wed Feb 11 02:12:02 2015 - [info] Dead Servers:
Wed Feb 11 02:12:02 2015 - [info] Alive Servers:
Wed Feb 11 02:12:02 2015 - [info]   192.168.56.161(192.168.56.161:3306)
Wed Feb 11 02:12:02 2015 - [info]   192.168.56.162(192.168.56.162:3306)
Wed Feb 11 02:12:02 2015 - [info] Alive Slaves:
Wed Feb 11 02:12:02 2015 - [info]   192.168.56.162(192.168.56.162:3306) Version=5.5.35-log (oldest major version between slaves) log-bin:enabled
Wed Feb 11 02:12:02 2015 - [info]     Replicating from 192.168.56.161(192.168.56.161:3306)
Wed Feb 11 02:12:02 2015 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Feb 11 02:12:02 2015 - [info] Current Alive Master: 192.168.56.161(192.168.56.161:3306)
Wed Feb 11 02:12:02 2015 - [info] Checking slave configurations..
Wed Feb 11 02:12:02 2015 - [info] read_only=1 is not set on slave 192.168.56.162(192.168.56.162:3306).
Wed Feb 11 02:12:02 2015 - [warning] relay_log_purge=0 is not set on slave 192.168.56.162(192.168.56.162:3306).
Wed Feb 11 02:12:02 2015 - [info] Checking replication filtering settings..
Wed Feb 11 02:12:02 2015 - [info] binlog_do_db= , binlog_ignore_db=
Wed Feb 11 02:12:02 2015 - [info] Replication filtering check ok.
Wed Feb 11 02:12:02 2015 - [info] GTID (with auto-pos) is not supported
Wed Feb 11 02:12:02 2015 - [info] Starting SSH connection tests..
Wed Feb 11 02:12:03 2015 - [info] All SSH connection tests passed successfully.
Wed Feb 11 02:12:03 2015 - [info] Checking MHA Node version..
Wed Feb 11 02:12:03 2015 - [info] Version check ok.
Wed Feb 11 02:12:03 2015 - [info] Checking SSH publickey authentication settings on the current master..
Wed Feb 11 02:12:03 2015 - [info] HealthCheck: SSH to 192.168.56.161 is reachable.
Wed Feb 11 02:12:04 2015 - [info] Master MHA Node version is 0.56.
Wed Feb 11 02:12:04 2015 - [info] Checking recovery script configurations on 192.168.56.161(192.168.56.161:3306)..
Wed Feb 11 02:12:04 2015 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000002
Wed Feb 11 02:12:04 2015 - [info] Connecting to root@192.168.56.161(192.168.56.161:22)..
  Creating /var/tmp if not exists.. ok.
  Checking output directory is accessible or not.. ok.
  Binlog found at /var/lib/mysql, up to mysql-bin.000002
Wed Feb 11 02:12:04 2015 - [info] Binlog setting check done.
Wed Feb 11 02:12:04 2015 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Wed Feb 11 02:12:04 2015 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.56.162 --slave_ip=192.168.56.162 --slave_port=3306 --workdir=/var/tmp --target_version=5.5.35-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Wed Feb 11 02:12:04 2015 - [info] Connecting to root@192.168.56.162(192.168.56.162:22)..
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to zbdba3-relay-bin.000007
    Temporary relay log file is /var/lib/mysql/zbdba3-relay-bin.000007
    Testing mysql connection and privileges.. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Wed Feb 11 02:12:04 2015 - [info] Slaves settings check done.
Wed Feb 11 02:12:04 2015 - [info]
192.168.56.161(192.168.56.161:3306) (current master)
 +--192.168.56.162(192.168.56.162:3306)
Wed Feb 11 02:12:04 2015 - [warning] master_ip_failover_script is not defined.
Wed Feb 11 02:12:04 2015 - [warning] shutdown_script is not defined.
Wed Feb 11 02:12:04 2015 - [info] Set master ping interval 1 seconds.
Wed Feb 11 02:12:04 2015 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Wed Feb 11 02:12:04 2015 - [info] Starting ping health check on 192.168.56.161(192.168.56.161:3306)..
Wed Feb 11 02:12:04 2015 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

Check the manager status:
[root@zbdba1 masterha]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:7632) is running(0:PING_OK), master:192.168.56.161
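For scripting around the manager, it can be handy to pull the current master out of that status line. A minimal sketch follows; it assumes the status line keeps exactly the format shown above, and mha_current_master is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: extract the master address from a masterha_check_status line.
# mha_current_master is a hypothetical helper; it assumes the line format
# shown above and prints nothing when the manager is not in PING_OK state.
mha_current_master() {
    sed -n 's/^.*is running(0:PING_OK), master:\(.*\)$/\1/p'
}

# Status line copied from the run above.
echo 'app1 (pid:7632) is running(0:PING_OK), master:192.168.56.161' | mha_current_master
```

On the manager this would be used as masterha_check_status --conf=/etc/masterha/app1.cnf | mha_current_master. An empty result means the manager is not reporting a healthy master and the log should be checked.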
7. Test MHA
Stop MySQL on node1:
[root@zbdba2 database]# service mysql stop
Shutting down MySQL... SUCCESS!
Follow the manager log:
Wed Feb 11 02:16:49 2015 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Wed Feb 11 02:16:49 2015 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --binlog_prefix=mysql-bin
Wed Feb 11 02:16:49 2015 - [info] HealthCheck: SSH to 192.168.56.161 is reachable.
Wed Feb 11 02:16:50 2015 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Wed Feb 11 02:16:50 2015 - [warning] Connection failed 2 time(s)..
Wed Feb 11 02:16:51 2015 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Wed Feb 11 02:16:51 2015 - [warning] Connection failed 3 time(s)..
Wed Feb 11 02:16:52 2015 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Wed Feb 11 02:16:52 2015 - [warning] Connection failed 4 time(s)..
Wed Feb 11 02:16:52 2015 - [warning] Master is not reachable from health checker!
Wed Feb 11 02:16:52 2015 - [warning] Master 192.168.56.161(192.168.56.161:3306) is not reachable!
Wed Feb 11 02:16:52 2015 - [warning] SSH is reachable.
Wed Feb 11 02:16:52 2015 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Wed Feb 11 02:16:52 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Feb 11 02:16:52 2015 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Wed Feb 11 02:16:52 2015 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Wed Feb 11 02:16:52 2015 - [info] GTID failover mode = 0
Wed Feb 11 02:16:52 2015 - [info] Dead Servers:
Wed Feb 11 02:16:52 2015 - [info]   192.168.56.161(192.168.56.161:3306)
Wed Feb 11 02:16:52 2015 - [info] Alive Servers:
Wed Feb 11 02:16:52 2015 - [info]   192.168.56.162(192.168.56.162:3306)
Wed Feb 11 02:16:52 2015 - [info] Alive Slaves:
Wed Feb 11 02:16:52 2015 - [info]   192.168.56.162(192.168.56.162:3306) Version=5.5.35-log (oldest major version between slaves) log-bin:enabled
Wed Feb 11 02:16:52 2015 - [info]     Replicating from 192.168.56.161(192.168.56.161:3306)
Wed Feb 11 02:16:52 2015 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Feb 11 02:16:52 2015 - [info] Checking slave configurations..
Wed Feb 11 02:16:52 2015 - [info] read_only=1 is not set on slave 192.168.56.162(192.168.56.162:3306).
Wed Feb 11 02:16:52 2015 - [warning] relay_log_purge=0 is not set on slave 192.168.56.162(192.168.56.162:3306).
Wed Feb 11 02:16:52 2015 - [info] Checking replication filtering settings..
Wed Feb 11 02:16:52 2015 - [info] Replication filtering check ok.
Wed Feb 11 02:16:52 2015 - [info] Master is down!
Wed Feb 11 02:16:52 2015 - [info] Terminating monitoring script.
Wed Feb 11 02:16:52 2015 - [info] Got exit code 20 (Master dead).
Wed Feb 11 02:16:52 2015 - [info] MHA::MasterFailover version 0.56.
Wed Feb 11 02:16:52 2015 - [info] Starting master failover.
Wed Feb 11 02:16:52 2015 - [info]
Wed Feb 11 02:16:52 2015 - [info] * Phase 1: Configuration Check Phase..
Wed Feb 11 02:16:52 2015 - [info]
Wed Feb 11 02:16:52 2015 - [info] GTID failover mode = 0
Wed Feb 11 02:16:52 2015 - [info] Dead Servers:
Wed Feb 11 02:16:52 2015 - [info]   192.168.56.161(192.168.56.161:3306)
Wed Feb 11 02:16:52 2015 - [info] Checking master reachability via MySQL(double check)...
Wed Feb 11 02:16:52 2015 - [info] ok.
Wed Feb 11 02:16:52 2015 - [info] Alive Servers:
Wed Feb 11 02:16:52 2015 - [info]   192.168.56.162(192.168.56.162:3306)
Wed Feb 11 02:16:52 2015 - [info] Alive Slaves:
Wed Feb 11 02:16:52 2015 - [info]   192.168.56.162(192.168.56.162:3306) Version=5.5.35-log (oldest major version between slaves) log-bin:enabled
Wed Feb 11 02:16:52 2015 - [info]     Replicating from 192.168.56.161(192.168.56.161:3306)
Wed Feb 11 02:16:52 2015 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Feb 11 02:16:52 2015 - [info] Starting Non-GTID based failover.
Wed Feb 11 02:16:52 2015 - [info]
Wed Feb 11 02:16:52 2015 - [info] ** Phase 1: Configuration Check Phase completed.
Wed Feb 11 02:16:52 2015 - [info]
Wed Feb 11 02:16:52 2015 - [info] * Phase 2: Dead Master Shutdown Phase..
Wed Feb 11 02:16:52 2015 - [info]
Wed Feb 11 02:16:52 2015 - [info] Forcing shutdown so that applications never connect to the current master..
Wed Feb 11 02:16:52 2015 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.
Wed Feb 11 02:16:52 2015 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Wed Feb 11 02:16:52 2015 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Wed Feb 11 02:16:52 2015 - [info]
Wed Feb 11 02:16:52 2015 - [info] * Phase 3: Master Recovery Phase..
Wed Feb 11 02:16:52 2015 - [info]
Wed Feb 11 02:16:52 2015 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Wed Feb 11 02:16:52 2015 - [info]
Wed Feb 11 02:16:52 2015 - [info] The latest binary log file/position on all slaves is mysql-bin.000002:1355
Wed Feb 11 02:16:52 2015 - [info] Latest slaves (Slaves that received relay log files to the latest):
Wed Feb 11 02:16:52 2015 - [info]   192.168.56.162(192.168.56.162:3306) Version=5.5.35-log (oldest major version between slaves) log-bin:enabled
Wed Feb 11 02:16:52 2015 - [info]     Replicating from 192.168.56.161(192.168.56.161:3306)
Wed Feb 11 02:16:52 2015 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Feb 11 02:16:52 2015 - [info] The oldest binary log file/position on all slaves is mysql-bin.000002:1355
Wed Feb 11 02:16:52 2015 - [info] Oldest slaves:
Wed Feb 11 02:16:52 2015 - [info]   192.168.56.162(192.168.56.162:3306) Version=5.5.35-log (oldest major version between slaves) log-bin:enabled
Wed Feb 11 02:16:52 2015 - [info]     Replicating from 192.168.56.161(192.168.56.161:3306)
Wed Feb 11 02:16:52 2015 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Feb 11 02:16:52 2015 - [info]
Wed Feb 11 02:16:52 2015 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Wed Feb 11 02:16:52 2015 - [info]
Wed Feb 11 02:16:52 2015 - [info] Fetching dead master's binary logs..
Wed Feb 11 02:16:52 2015 - [info] Executing command on the dead master 192.168.56.161(192.168.56.161:3306): save_binary_logs --command=save --start_file=mysql-bin.000002 --start_pos=1355 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/saved_master_binlog_from_192.168.56.161_3306_20150211021652.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.56
  Creating /var/tmp if not exists.. ok.
  Concat binary/relay logs from mysql-bin.000002 pos 1355 to mysql-bin.000002 EOF into /var/tmp/saved_master_binlog_from_192.168.56.161_3306_20150211021652.binlog ..
  Dumping binlog format description event, from position 0 to 107.. ok.
  Dumping effective binlog data from /var/lib/mysql/mysql-bin.000002 position 1355 to tail(1374).. ok.
  Concat succeeded.
Wed Feb 11 02:16:53 2015 - [info] scp from root@192.168.56.161:/var/tmp/saved_master_binlog_from_192.168.56.161_3306_20150211021652.binlog to local:/masterha/app1/saved_master_binlog_from_192.168.56.161_3306_20150211021652.binlog succeeded.
Wed Feb 11 02:16:53 2015 - [info] HealthCheck: SSH to 192.168.56.162 is reachable.
Wed Feb 11 02:16:53 2015 - [info]
Wed Feb 11 02:16:53 2015 - [info] * Phase 3.3: Determining New Master Phase..
Wed Feb 11 02:16:53 2015 - [info]
Wed Feb 11 02:16:53 2015 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Wed Feb 11 02:16:53 2015 - [info] All slaves received relay logs to the same position. No need to resync each other.
Wed Feb 11 02:16:53 2015 - [info] Searching new master from slaves..
Wed Feb 11 02:16:53 2015 - [info] Candidate masters from the configuration file:
Wed Feb 11 02:16:53 2015 - [info]   192.168.56.162(192.168.56.162:3306) Version=5.5.35-log (oldest major version between slaves) log-bin:enabled
Wed Feb 11 02:16:53 2015 - [info]     Replicating from 192.168.56.161(192.168.56.161:3306)
Wed Feb 11 02:16:53 2015 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Feb 11 02:16:53 2015 - [info] Non-candidate masters:
Wed Feb 11 02:16:53 2015 - [info] Searching from candidate_master slaves which have received the latest relay log events..
Wed Feb 11 02:16:53 2015 - [info] New master is 192.168.56.162(192.168.56.162:3306)
Wed Feb 11 02:16:53 2015 - [info] Starting master failover..
Wed Feb 11 02:16:53 2015 - [info]
From:
192.168.56.161(192.168.56.161:3306) (current master)
 +--192.168.56.162(192.168.56.162:3306)
To:
192.168.56.162(192.168.56.162:3306) (new master)
Wed Feb 11 02:16:53 2015 - [info]
Wed Feb 11 02:16:53 2015 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Wed Feb 11 02:16:53 2015 - [info]
Wed Feb 11 02:16:53 2015 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
Wed Feb 11 02:16:53 2015 - [info] Sending binlog..
Wed Feb 11 02:16:53 2015 - [info] scp from local:/masterha/app1/saved_master_binlog_from_192.168.56.161_3306_20150211021652.binlog to root@192.168.56.162:/var/tmp/saved_master_binlog_from_192.168.56.161_3306_20150211021652.binlog succeeded.
Wed Feb 11 02:16:53 2015 - [info]
Wed Feb 11 02:16:53 2015 - [info] * Phase 3.4: Master Log Apply Phase..
Wed Feb 11 02:16:53 2015 - [info]
Wed Feb 11 02:16:53 2015 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Wed Feb 11 02:16:53 2015 - [info] Starting recovery on 192.168.56.162(192.168.56.162:3306)..
Wed Feb 11 02:16:53 2015 - [info] Generating diffs succeeded.
Wed Feb 11 02:16:53 2015 - [info] Waiting until all relay logs are applied.
Wed Feb 11 02:16:53 2015 - [info] done.
Wed Feb 11 02:16:53 2015 - [info] Getting slave status..
Wed Feb 11 02:16:53 2015 - [info] This slave(192.168.56.162)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000002:1355). No need to recover from Exec_Master_Log_Pos.
Wed Feb 11 02:16:53 2015 - [info] Connecting to the target slave host 192.168.56.162, running recover script..
Wed Feb 11 02:16:53 2015 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user='root' --slave_host=192.168.56.162 --slave_ip=192.168.56.162 --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.56.161_3306_20150211021652.binlog --workdir=/var/tmp --target_version=5.5.35-log --timestamp=20150211021652 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.56 --slave_pass=xxx
Wed Feb 11 02:16:53 2015 - [info]
Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.56.161_3306_20150211021652.binlog on 192.168.56.162:3306. This may take long time...
Applying log files succeeded.
Wed Feb 11 02:16:53 2015 - [info] All relay logs were successfully applied.
Wed Feb 11 02:16:53 2015 - [info] Getting new master's binlog name and position..
Wed Feb 11 02:16:53 2015 - [info] mysql-bin.000003:321
Wed Feb 11 02:16:53 2015 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.56.162', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=321, MASTER_USER='root', MASTER_PASSWORD='xxx';
Wed Feb 11 02:16:53 2015 - [warning] master_ip_failover_script is not set. Skipping taking over new master IP address.
Wed Feb 11 02:16:53 2015 - [info] ** Finished master recovery successfully.
Wed Feb 11 02:16:53 2015 - [info] * Phase 3: Master Recovery Phase completed.
Wed Feb 11 02:16:53 2015 - [info]
Wed Feb 11 02:16:53 2015 - [info] * Phase 4: Slaves Recovery Phase..
Wed Feb 11 02:16:53 2015 - [info]
Wed Feb 11 02:16:53 2015 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Wed Feb 11 02:16:53 2015 - [info]
Wed Feb 11 02:16:53 2015 - [info] Generating relay diff files from the latest slave succeeded.
Wed Feb 11 02:16:53 2015 - [info]
Wed Feb 11 02:16:53 2015 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Wed Feb 11 02:16:53 2015 - [info]
Wed Feb 11 02:16:53 2015 - [info] All new slave servers recovered successfully.
Wed Feb 11 02:16:53 2015 - [info]
Wed Feb 11 02:16:53 2015 - [info] * Phase 5: New master cleanup phase..
Wed Feb 11 02:16:53 2015 - [info]
Wed Feb 11 02:16:53 2015 - [info] Resetting slave info on the new master..
Wed Feb 11 02:16:53 2015 - [info] 192.168.56.162: Resetting slave info succeeded.
Wed Feb 11 02:16:53 2015 - [info] Master failover to 192.168.56.162(192.168.56.162:3306) completed successfully.
Wed Feb 11 02:16:53 2015 - [info]
----- Failover Report -----
app1: MySQL Master failover 192.168.56.161(192.168.56.161:3306) to 192.168.56.162(192.168.56.162:3306) succeeded
Master 192.168.56.161(192.168.56.161:3306) is down!
Check MHA Manager logs at zbdba1:/masterha/app1/manager.log for details.
Started automated(non-interactive) failover.
The latest slave 192.168.56.162(192.168.56.162:3306) has all relay logs for recovery.
Selected 192.168.56.162(192.168.56.162:3306) as a new master.
192.168.56.162(192.168.56.162:3306): OK: Applying all logs succeeded.
Generating relay diff files from the latest slave succeeded.
192.168.56.162(192.168.56.162:3306): Resetting slave info succeeded.
Master failover to 192.168.56.162(192.168.56.162:3306) completed successfully.
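The log records the CHANGE MASTER TO statement that other slaves should use against the new master, and the same statement is a natural starting point for bringing the old master back as a slave. Below is a minimal sketch of pulling that statement out of the manager log. The helper name mha_change_master_stmt is hypothetical, and the sample line is copied from the log above.

```shell
#!/usr/bin/env bash
# Sketch: print the most recent CHANGE MASTER TO statement found in an
# MHA manager log. mha_change_master_stmt is a hypothetical helper.
mha_change_master_stmt() {
    # -o prints only the matching part; the statement ends at the first ';'.
    grep -o "CHANGE MASTER TO [^;]*;" "$1" | tail -n 1
}

# Demo against a scratch file holding one line from the log above.
log=$(mktemp)
cat > "$log" <<'EOF'
Wed Feb 11 02:16:53 2015 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.56.162', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=321, MASTER_USER='root', MASTER_PASSWORD='xxx';
EOF
mha_change_master_stmt "$log"
```

On the manager the argument would be /masterha/app1/manager.log. MHA masks the password as xxx in the log, so it has to be replaced with the real replication password before the statement is run on the recovered host, followed by START SLAVE.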
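The log shows that the master has been switched to 192.168.56.162.
This is only a simple demonstration of basic MHA usage; it can be combined with other techniques to get more out of it. For example, it can be used together with a keepalived VIP so that failover is transparent to applications.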