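If you are not yet familiar with MHA, I recommend reading the posts linked below first.
http://os.51cto.com/art/201307/401702.htm // This post clearly describes the preparation work needed before setting up MHA.
http://blog.itpub.net/26230597/viewspace-1570798/ // This post covers the installation in detail, and also notes that besides automatic failover MHA supports manual failover.
http://www.dataguru.cn/thread-457284-1-1.html // This post explains the MHA configuration parameters clearly.
http://467754239.blog.51cto.com/4878013/1695175 // This post describes the whole MHA failover process. It also covers the virtual IP failover script that ships with MHA, so as I understand it keepalived should not be needed. How to add the failed master back into MHA as a new slave, however, still seems to be somewhat problematic.
If you already know MHA, you can read on directly.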
Environment: CentOS 6.5
MySQL 5.7 (installed via yum)
MHA 0.56
master: 192.168.21.10
backup: 192.168.21.11
slave: 192.168.21.12
Installing MHA with yum
1 Install the EPEL repository
2 Download the rpm packages from the official site. Official project page: https://code.google.com/p/mysql-master-ha/
I followed the post linked below to run the MHA experiment:
http://blog.csdn.net/lichangzai/article/details/50470771
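Below are the problems I ran into during the experiment. All of them occurred while running
masterha_check_repl --conf=/etc/masterha/app1/app1.cnf
(in my setup the file is /etc/mha/app1.conf, as the output below shows). Solutions for some of them are hard to find online, so I am sharing them here.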
Problem 1
# masterha_check_repl --conf=/etc/mha/app1.conf
Fri Jul 22 09:08:54 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Jul 22 09:08:54 2016 - [info] Reading application default configuration from /etc/mha/app1.conf..
Fri Jul 22 09:08:54 2016 - [info] Reading server configuration from /etc/mha/app1.conf..
Fri Jul 22 09:08:54 2016 - [info] MHA::MasterMonitor version 0.56.
Fri Jul 22 09:08:54 2016 - [info] GTID failover mode = 0
Fri Jul 22 09:08:54 2016 - [info] Dead Servers:
Fri Jul 22 09:08:54 2016 - [info] Alive Servers:
Fri Jul 22 09:08:54 2016 - [info] 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:08:54 2016 - [info] 192.168.21.11(192.168.21.11:3306)
Fri Jul 22 09:08:54 2016 - [info] 192.168.21.12(192.168.21.12:3306)
Fri Jul 22 09:08:54 2016 - [info] Alive Slaves:
Fri Jul 22 09:08:54 2016 - [info] 192.168.21.11(192.168.21.11:3306) Version=5.7.16 (oldest major version between slaves) log-bin:disabled
Fri Jul 22 09:08:54 2016 - [info] Replicating from 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:08:54 2016 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Jul 22 09:08:54 2016 - [info] 192.168.21.12(192.168.21.12:3306) Version=5.7.16 (oldest major version between slaves) log-bin:disabled
Fri Jul 22 09:08:54 2016 - [info] Replicating from 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:08:54 2016 - [info] Current Alive Master: 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:08:54 2016 - [info] Checking slave configurations..
Fri Jul 22 09:08:54 2016 - [info] read_only=1 is not set on slave 192.168.21.11(192.168.21.11:3306).
Fri Jul 22 09:08:54 2016 - [warning] relay_log_purge=0 is not set on slave 192.168.21.11(192.168.21.11:3306).
Fri Jul 22 09:08:54 2016 - [warning] log-bin is not set on slave 192.168.21.11(192.168.21.11:3306). This host cannot be a master.
Fri Jul 22 09:08:54 2016 - [info] read_only=1 is not set on slave 192.168.21.12(192.168.21.12:3306).
Fri Jul 22 09:08:54 2016 - [warning] relay_log_purge=0 is not set on slave 192.168.21.12(192.168.21.12:3306).
Fri Jul 22 09:08:54 2016 - [warning] log-bin is not set on slave 192.168.21.12(192.168.21.12:3306). This host cannot be a master.
Fri Jul 22 09:08:54 2016 - [info] Checking replication filtering settings..
Fri Jul 22 09:08:54 2016 - [info] binlog_do_db= , binlog_ignore_db= mysql
Fri Jul 22 09:08:54 2016 - [info] Replication filtering check ok.
Fri Jul 22 09:08:54 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln361] None of slaves can be master. Check failover configuration file or log-bin settings in my.cnf
Fri Jul 22 09:08:54 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48
Fri Jul 22 09:08:54 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Fri Jul 22 09:08:54 2016 - [info] Got exit code 1 (Not master dead).
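Solution:
Enable binary logging on both slaves. (I spent a whole day without finding a solution online and finally solved it through my own understanding and testing. Proud!!) I won't paste my exact configuration here.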
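As a rough sketch of the change (not my exact file), each slave's my.cnf needs something like the following under [mysqld]. The basename mysql-bin and the server-id value are assumptions for illustration; MySQL 5.7 requires server-id to be set when log-bin is enabled, and it must be unique on every server.

```ini
# /etc/my.cnf on a slave (illustrative values, adjust to your setup)
[mysqld]
# Must be unique on every server in the replication topology (assumed value)
server-id = 11
# Enable binary logging so MHA can treat this slave as a candidate master
log-bin = mysql-bin
```

log-bin cannot be changed at runtime, so mysqld has to be restarted. Afterwards masterha_check_repl should report log-bin:enabled for the slave instead of log-bin:disabled.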
Problem 2
# masterha_check_repl --conf=/etc/mha/app1.conf
Fri Jul 22 09:26:48 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Jul 22 09:26:48 2016 - [info] Reading application default configuration from /etc/mha/app1.conf..
Fri Jul 22 09:26:48 2016 - [info] Reading server configuration from /etc/mha/app1.conf..
Fri Jul 22 09:26:48 2016 - [info] MHA::MasterMonitor version 0.56.
Fri Jul 22 09:26:48 2016 - [info] GTID failover mode = 0
Fri Jul 22 09:26:48 2016 - [info] Dead Servers:
Fri Jul 22 09:26:48 2016 - [info] Alive Servers:
Fri Jul 22 09:26:48 2016 - [info] 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:26:48 2016 - [info] 192.168.21.11(192.168.21.11:3306)
Fri Jul 22 09:26:48 2016 - [info] 192.168.21.12(192.168.21.12:3306)
Fri Jul 22 09:26:48 2016 - [info] Alive Slaves:
Fri Jul 22 09:26:48 2016 - [info] 192.168.21.11(192.168.21.11:3306) Version=5.7.16-log (oldest major version between slaves) log-bin:enabled
Fri Jul 22 09:26:48 2016 - [info] Replicating from 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:26:48 2016 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Jul 22 09:26:48 2016 - [info] 192.168.21.12(192.168.21.12:3306) Version=5.7.16-log (oldest major version between slaves) log-bin:enabled
Fri Jul 22 09:26:48 2016 - [info] Replicating from 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:26:48 2016 - [info] Current Alive Master: 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:26:48 2016 - [info] Checking slave configurations..
Fri Jul 22 09:26:48 2016 - [info] read_only=1 is not set on slave 192.168.21.11(192.168.21.11:3306).
Fri Jul 22 09:26:48 2016 - [warning] relay_log_purge=0 is not set on slave 192.168.21.11(192.168.21.11:3306).
Fri Jul 22 09:26:48 2016 - [info] read_only=1 is not set on slave 192.168.21.12(192.168.21.12:3306).
Fri Jul 22 09:26:48 2016 - [warning] relay_log_purge=0 is not set on slave 192.168.21.12(192.168.21.12:3306).
Fri Jul 22 09:26:48 2016 - [info] Checking replication filtering settings..
Fri Jul 22 09:26:48 2016 - [info] binlog_do_db= , binlog_ignore_db= mysql
Fri Jul 22 09:26:48 2016 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln443] Binlog filtering check failed on 192.168.21.11(192.168.21.11:3306)! All log-bin enabled servers must have same binlog filtering rules (same binlog-do-db and binlog-ignore-db). Check SHOW MASTER STATUS output and set my.cnf correctly.
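Solution:
I had enabled binlog filtering on the master, so the same filtering rules must also be configured on the slaves. After changing the configuration file a reload is not enough; MySQL has to be restarted.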
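For illustration, the check output above shows binlog_ignore_db= mysql on the master, so a minimal sketch of the matching setting, assuming that is the only filter rule in use, would be the same line in my.cnf on every server that has log-bin enabled:

```ini
# /etc/my.cnf on the master and on every slave with log-bin enabled
[mysqld]
# Must be identical on all log-bin enabled servers, otherwise the
# MHA binlog filtering check fails
binlog-ignore-db = mysql
```

After the restart, SHOW MASTER STATUS on each server should list the same Binlog_Do_DB and Binlog_Ignore_DB values, which is what the error message asks you to verify.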
Problem 3
# masterha_check_repl --conf=/etc/mha/app1.conf
Fri Jul 22 09:30:04 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Jul 22 09:30:04 2016 - [info] Reading application default configuration from /etc/mha/app1.conf..
Fri Jul 22 09:30:04 2016 - [info] Reading server configuration from /etc/mha/app1.conf..
Fri Jul 22 09:30:04 2016 - [info] MHA::MasterMonitor version 0.56.
Fri Jul 22 09:30:04 2016 - [info] GTID failover mode = 0
Fri Jul 22 09:30:04 2016 - [info] Dead Servers:
Fri Jul 22 09:30:04 2016 - [info] Alive Servers:
Fri Jul 22 09:30:04 2016 - [info] 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:30:04 2016 - [info] 192.168.21.11(192.168.21.11:3306)
Fri Jul 22 09:30:04 2016 - [info] 192.168.21.12(192.168.21.12:3306)
Fri Jul 22 09:30:04 2016 - [info] Alive Slaves:
Fri Jul 22 09:30:04 2016 - [info] 192.168.21.11(192.168.21.11:3306) Version=5.7.16-log (oldest major version between slaves) log-bin:enabled
Fri Jul 22 09:30:04 2016 - [info] Replicating from 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:30:04 2016 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Jul 22 09:30:04 2016 - [info] 192.168.21.12(192.168.21.12:3306) Version=5.7.16-log (oldest major version between slaves) log-bin:enabled
Fri Jul 22 09:30:04 2016 - [info] Replicating from 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:30:04 2016 - [info] Current Alive Master: 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:30:04 2016 - [info] Checking slave configurations..
Fri Jul 22 09:30:04 2016 - [info] read_only=1 is not set on slave 192.168.21.11(192.168.21.11:3306).
Fri Jul 22 09:30:04 2016 - [warning] relay_log_purge=0 is not set on slave 192.168.21.11(192.168.21.11:3306).
Fri Jul 22 09:30:04 2016 - [info] read_only=1 is not set on slave 192.168.21.12(192.168.21.12:3306).
Fri Jul 22 09:30:04 2016 - [warning] relay_log_purge=0 is not set on slave 192.168.21.12(192.168.21.12:3306).
Fri Jul 22 09:30:04 2016 - [info] Checking replication filtering settings..
Fri Jul 22 09:30:04 2016 - [info] binlog_do_db= , binlog_ignore_db= mysql
Fri Jul 22 09:30:04 2016 - [info] Replication filtering check ok.
Fri Jul 22 09:30:04 2016 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln393] 192.168.21.11(192.168.21.11:3306): User repl does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.
Fri Jul 22 09:30:04 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 1403
Fri Jul 22 09:30:04 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Fri Jul 22 09:30:04 2016 - [info] Got exit code 1 (Not master dead).
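Solution:
The user with replication privileges must be created on every node, and the same goes for the user with administrative privileges. Many posts online do not make these two points clear.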
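To make clear which two accounts are meant, here is a sketch of the relevant part of the MHA application configuration. The parameter names user, password, repl_user and repl_password are MHA's own; repl is the replication user named in the error above, while the administrative user name and both passwords are placeholders I made up for illustration.

```ini
# /etc/mha/app1.conf (excerpt, placeholder credentials)
[server default]
# Administrative MySQL account that MHA uses to log in to every node
# (hypothetical name and password)
user = mha_admin
password = change_me
# Replication account the slaves use to replicate from the new master;
# it needs the REPLICATION SLAVE privilege on every node
repl_user = repl
repl_password = change_me
```

Both accounts have to exist with the same credentials on the master and on all slaves, because any of them may end up as the new master after a failover.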
Problem 4
# masterha_check_repl --conf=/etc/mha/app1.conf
Fri Jul 22 09:42:46 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Jul 22 09:42:46 2016 - [info] Reading application default configuration from /etc/mha/app1.conf..
Fri Jul 22 09:42:46 2016 - [info] Reading server configuration from /etc/mha/app1.conf..
Fri Jul 22 09:42:46 2016 - [info] MHA::MasterMonitor version 0.56.
Fri Jul 22 09:42:46 2016 - [info] GTID failover mode = 0
Fri Jul 22 09:42:46 2016 - [info] Dead Servers:
Fri Jul 22 09:42:46 2016 - [info] Alive Servers:
Fri Jul 22 09:42:46 2016 - [info] 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:42:46 2016 - [info] 192.168.21.11(192.168.21.11:3306)
Fri Jul 22 09:42:46 2016 - [info] 192.168.21.12(192.168.21.12:3306)
Fri Jul 22 09:42:46 2016 - [info] Alive Slaves:
Fri Jul 22 09:42:46 2016 - [info] 192.168.21.11(192.168.21.11:3306) Version=5.7.16-log (oldest major version between slaves) log-bin:enabled
Fri Jul 22 09:42:46 2016 - [info] Replicating from 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:42:46 2016 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Jul 22 09:42:46 2016 - [info] 192.168.21.12(192.168.21.12:3306) Version=5.7.16-log (oldest major version between slaves) log-bin:enabled
Fri Jul 22 09:42:46 2016 - [info] Replicating from 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:42:46 2016 - [info] Current Alive Master: 192.168.21.10(192.168.21.10:3306)
Fri Jul 22 09:42:46 2016 - [info] Checking slave configurations..
Fri Jul 22 09:42:46 2016 - [info] read_only=1 is not set on slave 192.168.21.11(192.168.21.11:3306).
Fri Jul 22 09:42:46 2016 - [warning] relay_log_purge=0 is not set on slave 192.168.21.11(192.168.21.11:3306).
Fri Jul 22 09:42:46 2016 - [info] read_only=1 is not set on slave 192.168.21.12(192.168.21.12:3306).
Fri Jul 22 09:42:46 2016 - [warning] relay_log_purge=0 is not set on slave 192.168.21.12(192.168.21.12:3306).
Fri Jul 22 09:42:46 2016 - [info] Checking replication filtering settings..
Fri Jul 22 09:42:46 2016 - [info] binlog_do_db= , binlog_ignore_db= mysql
Fri Jul 22 09:42:46 2016 - [info] Replication filtering check ok.
Fri Jul 22 09:42:47 2016 - [info] GTID (with auto-pos) is not supported
Fri Jul 22 09:42:47 2016 - [info] Starting SSH connection tests..
Fri Jul 22 09:42:48 2016 - [info] All SSH connection tests passed successfully.
Fri Jul 22 09:42:48 2016 - [info] Checking MHA Node version..
Fri Jul 22 09:42:49 2016 - [info] Version check ok.
Fri Jul 22 09:42:49 2016 - [info] Checking SSH publickey authentication settings on the current master..
Fri Jul 22 09:42:49 2016 - [info] HealthCheck: SSH to 192.168.21.10 is reachable.
Fri Jul 22 09:42:49 2016 - [info] Master MHA Node version is 0.56.
Fri Jul 22 09:42:49 2016 - [info] Checking recovery script configurations on 192.168.21.10(192.168.21.10:3306)..
Fri Jul 22 09:42:49 2016 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/logs/mysqllog/mysql-bin --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000001
Fri Jul 22 09:42:49 2016 - [info] Connecting to [email protected](192.168.21.10:22)..
Failed to save binary log: Binlog not found from /logs/mysqllog/mysql-bin! If you got this error at MHA Manager, please set "master_binlog_dir=/path/to/binlog_directory_of_the_master" correctly in the MHA Manager's configuration file and try again.
at /usr/bin/save_binary_logs line 123
eval {...} called at /usr/bin/save_binary_logs line 70
main::main() called at /usr/bin/save_binary_logs line 66
Fri Jul 22 09:42:49 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln158] Binlog setting check failed!
Fri Jul 22 09:42:49 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln405] Master configuration failed.
Fri Jul 22 09:42:49 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48
Fri Jul 22 09:42:49 2016 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Fri Jul 22 09:42:49 2016 - [info] Got exit code 1 (Not master dead).
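Solution:
If you have set a custom path for the binary logs, you must specify master_binlog_dir='directory containing the binary log files' in the MHA configuration file.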
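A minimal sketch of that setting follows. The path is hypothetical; the point is that the value is the directory that holds the binary log files (mysql-bin.000001 and so on), which is what the error message means by binlog_directory_of_the_master.

```ini
# /etc/mha/app1.conf (excerpt)
[server default]
# Directory on the master where the binary log files are stored
# (hypothetical path, use the directory part of your own log-bin setting)
master_binlog_dir = /data/mysql/binlog
```

The parameter can also be placed in an individual server section if the binlog directory differs between hosts.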
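Summary: with the MHA version described in this post, it appears that binary logging and relay logs need to be enabled on all databases, that the grants should be identical everywhere, and that the configuration files should be essentially the same. With those prerequisites in place, installing and running MHA should not run into many problems. I am just not yet sure whether this is the correct approach.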