Two MHA switchover errors (masterha_master_switch line 53)

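While testing manual failover and online master switching with MHA, I ran into two rather puzzling problems: whenever the hosts were given as IP addresses, neither test succeeded. The errors were "Detected dead master xxx does not match with specified dead master" and "xxx is not alive". Both errors and their fix are described below.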

1. MHA configuration file
# more /etc/masterha/app1.cnf
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log

user=mha
password=xxx
ssh_user=root
repl_user=repl  
repl_password=repl  
ping_interval=1
shutdown_script=""
master_ip_online_change_script=""
report_script=""
#master_ip_failover_script=/usr/bin/master_ip_failover
master_ip_failover_script=/tmp/master_ip_failover
 
[server1]
hostname=vdbsrv1
master_binlog_dir=/data/mysqldata

[server2]
hostname=vdbsrv2
master_binlog_dir=/data/mysqldata

[server3]
hostname=vdbsrv3
master_binlog_dir=/data/mysqldata/
#candidate_master=1

2. Error output during manual failover
# masterha_master_switch --master_state=dead --conf=/etc/masterha/app1.cnf --dead_master_host=192.168.1.6 \
> --dead_master_port=3306 --new_master_host=192.168.1.8 --new_master_port=3306 --ignore_last_failover
--dead_master_ip=<dead_master_ip> is not set. Using 192.168.1.6.
Wed Apr 21 09:08:30 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Apr 21 09:08:30 2015 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Wed Apr 21 09:08:30 2015 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Wed Apr 21 09:08:30 2015 - [info] MHA::MasterFailover version 0.56.
Wed Apr 21 09:08:30 2015 - [info] Starting master failover.
Wed Apr 21 09:08:30 2015 - [info]
Wed Apr 21 09:08:30 2015 - [info] * Phase 1: Configuration Check Phase..
Wed Apr 21 09:08:30 2015 - [info]
Wed Apr 21 09:08:31 2015 - [info] GTID failover mode = 0
Wed Apr 21 09:08:31 2015 - [error][/usr/lib/perl5/site_perl/5.8.8/MHA/MasterFailover.pm, ln2083] Detected dead master vdbsrv1(192.168.1.6:3306)
   does not match with specified dead master 192.168.1.6(192.168.1.6:3306)!
Wed Apr 21 09:08:31 2015 - [error][/usr/lib/perl5/site_perl/5.8.8/MHA/MasterFailover.pm, ln2151]
   Got ERROR:  at /usr/bin/masterha_master_switch line 53
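
The log above reports the same IP address and port on both sides (192.168.1.6:3306) but two different names: vdbsrv1 from the configuration file and 192.168.1.6 from the command line. The following is a minimal shell sketch of that situation. It assumes that the check compares the host name strings literally; this is inferred from the log output, not taken from MHA's source, and the variable names are illustrative only.

```shell
# Illustrative only: mimics a literal host-name comparison. That MHA
# compares the names this way is an assumption inferred from the log.
configured_host="vdbsrv1"      # hostname= value of [server1] in app1.cnf
specified_host="192.168.1.6"   # value passed as --dead_master_host

if [ "$configured_host" = "$specified_host" ]; then
    result="match"
else
    result="mismatch"
fi
echo "$result"
```

Under that assumption, the two strings differ even though both refer to the same server, which is consistent with the "does not match" error.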

3. Error output during online switching
# masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=192.168.1.8 \
> --orig_master_is_new_slave --running_updates_limit=10000
Tue Apr 21 11:50:14 2015 - [info] MHA::MasterRotate version 0.56.
Tue Apr 21 11:50:14 2015 - [info] Starting online master switch..
Tue Apr 21 11:50:14 2015 - [info]
Tue Apr 21 11:50:14 2015 - [info] * Phase 1: Configuration Check Phase..
Tue Apr 21 11:50:14 2015 - [info]
Tue Apr 21 11:50:14 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Apr 21 11:50:14 2015 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Tue Apr 21 11:50:14 2015 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Tue Apr 21 11:50:14 2015 - [info] GTID failover mode = 0
Tue Apr 21 11:50:14 2015 - [info] Current Alive Master: vdbsrv1(192.168.1.6:3306)
Tue Apr 21 11:50:14 2015 - [info] Alive Slaves:
Tue Apr 21 11:50:14 2015 - [info]   vdbsrv2(192.168.1.7:3306)  Version=5.6.22-log (oldest major version between slaves) log-bin:enabled
Tue Apr 21 11:50:14 2015 - [info]     Replicating from 192.168.1.6(192.168.1.6:3306)
Tue Apr 21 11:50:14 2015 - [info]   vdbsrv3(192.168.1.8:3306)  Version=5.6.22-log (oldest major version between slaves) log-bin:enabled
Tue Apr 21 11:50:14 2015 - [info]     Replicating from 192.168.1.6(192.168.1.6:3306)

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on vdbsrv1(192.168.1.6:3306)? (YES/no): yes
Tue Apr 21 11:50:41 2015 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Tue Apr 21 11:50:41 2015 - [info]  ok.
Tue Apr 21 11:50:41 2015 - [info] Checking MHA is not monitoring or doing failover..
Tue Apr 21 11:50:41 2015 - [info] Checking replication health on vdbsrv2..
Tue Apr 21 11:50:41 2015 - [info]  ok.
Tue Apr 21 11:50:41 2015 - [info] Checking replication health on vdbsrv3..
Tue Apr 21 11:50:41 2015 - [info]  ok.
Tue Apr 21 11:50:41 2015 - [error][/usr/lib/perl5/site_perl/5.8.8/MHA/MasterRotate.pm, ln228] 192.168.1.8 is not alive!
Tue Apr 21 11:50:41 2015 - [error][/usr/lib/perl5/site_perl/5.8.8/MHA/MasterRotate.pm, ln613] Failed to get new master!
Tue Apr 21 11:50:41 2015 - [error][/usr/lib/perl5/site_perl/5.8.8/MHA/MasterRotate.pm, ln652] Got ERROR:  at /usr/bin/masterha_master_switch line 53
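
In this run MHA lists vdbsrv3(192.168.1.8:3306) as an alive slave, yet reports "192.168.1.8 is not alive!". One plausible reading is that the requested new master is looked up by name among the alive slaves. The sketch below illustrates that reading only; the literal lookup is an assumption, and the helper name is_alive_slave is hypothetical.

```shell
# Illustrative only: looks up the requested new master by name among the
# alive slaves shown in the log. The literal lookup is an assumption.
alive_slaves="vdbsrv2 vdbsrv3"

# Hypothetical helper: succeeds if $1 equals one of the alive slave names.
is_alive_slave() {
    for host in $alive_slaves; do
        if [ "$host" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

if is_alive_slave "192.168.1.8"; then by_ip="found"; else by_ip="not found"; fi
if is_alive_slave "vdbsrv3"; then by_name="found"; else by_name="not found"; fi

echo "192.168.1.8: $by_ip"
echo "vdbsrv3: $by_name"
```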

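4. Solution
Replacing the IP addresses with the host names fixed both problems right away; the successful runs are not shown again here.

According to the official documentation, the parameter is --dead_master_host=(hostname), so it expects a host name rather than an IP address. In this setup the servers are defined in app1.cnf by host name (vdbsrv1, vdbsrv2, vdbsrv3), which is presumably why the IP address values were rejected. About the optional --dead_master_ip and --dead_master_port parameters, the documentation says: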

If these parameters are not set, --dead_master_ip will be the result of gethostbyname(dead_master_host), and --dead_master_port will be 3306.
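
The successful runs are not shown in the post, so the following is only a reconstruction of what the corrected invocations would look like. The IP-to-name mapping (192.168.1.6 is vdbsrv1, 192.168.1.8 is vdbsrv3) is taken from the log output above; the variable names are illustrative. The sketch prints the commands instead of running them.

```shell
# Host names as defined in /etc/masterha/app1.cnf; the mapping
# 192.168.1.6 -> vdbsrv1 and 192.168.1.8 -> vdbsrv3 comes from the logs.
dead_master=vdbsrv1
new_master=vdbsrv3
conf=/etc/masterha/app1.cnf

# Manual failover, with host names in place of the IP addresses.
failover_cmd="masterha_master_switch --master_state=dead --conf=$conf --dead_master_host=$dead_master --dead_master_port=3306 --new_master_host=$new_master --new_master_port=3306 --ignore_last_failover"

# Online switch, with the host name in place of the IP address.
online_cmd="masterha_master_switch --conf=$conf --master_state=alive --new_master_host=$new_master --orig_master_is_new_slave --running_updates_limit=10000"

# Dry run: print the commands for review instead of executing them.
echo "$failover_cmd"
echo "$online_cmd"
```

All other options are unchanged from the failing commands; only the host arguments differ.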

Time: 2024-10-09 05:36:11
