mysql的AB及读写和集群

Mysql的AB及读写

第1章 Mysql的AB配置

1.1 master配置

1.2 slave配置

1.2.1 192.168.13.190
1.2.2 192.168.13.191
1.2.3 192.168.13.192
1.2.4 192.168.13.193
1.2.4 192.168.13.189

第2章读写分离

2.1 安装mycat

2.1.1 server.xml
2.1.1 schema.xml

2.2 启动mycat

2.2.1 启动mycat
2.2.2 mycat日志

2.3 登录mycat相关问题

2.4 小结

第3章 mysql-proxy

3.1 安装mysql-proxy

3.2 配置mysql-proxy

3.3 启动mysql-proxy和查看

第4章集群

4.1 版本介绍

4.2 MHA

4.2.1 ssh互信
4.2.1 ABB（一主二从）/AABB（二主多从）架构搭建
4.2.1 MHA安装（管理节点MHA_manager）
4.2.2 MHA安装（其他节点安装MHA_node）
4.2.3 MHA配置
4.2.4 MHA互信和环境检查
4.2.5 启动及相关日志文件
4.2.6 MHA测试

4.3 集群小结

第5章错误集锦

5.1 主从报错

5.2 集群报错

第1章 Mysql的AB配置

当MySQL的版本是5.5或高于5.5的时候，其中大部分的内容相似

主要是5.5之后不再支持master打头的参数

主\|从	IP地址	允许同步的用户	读\|写	mysql版本	Mycat版本	JDK版本	操作系统版本
master	192.168.13.189	slave	write	mysql-8.0.13			CentOS 7.4.1708
master	192.168.13.192	copy	write	mysql-8.0.13	Mycat-server-1.6.6.1	1.8.0_191	CentOS 7.4.1708
slave	192.168.13.190	slave	read	mysql-8.0.13			CentOS 7.4.1708
	192.168.13.191	slave	read	mysql-8.0.13			CentOS 7.4.1708
	192.168.13.193	copy	read	mysql-8.0.13			CentOS 7.4.1708

1.1 master配置

Master：

在/etc/my.cnf 添加：

server-id = 1

log-bin=/home/mysql/logs/binlog/bin-log（若报错则不加路径，如下）

log-bin=bin-log

max_binlog_size = 500M

binlog_cache_size = 128K

binlog-do-db = adb

binlog-ignore-db = mysql

log-slave-updates

expire_logs_day=2（在centos7.3上易报错，若报错，可不加）

binlog_format="MIXED"

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

server-id = 1 #服务器标志号，注意在配置文件中不能出现多个这样的标识，如果出现多个的话mysql以第一个为准，一组主从中此标识号不能重复。

log-bin=/home/mysql/logs/binlog/bin-log #开启bin-log，并指定文件目录和文件名前缀。

max_binlog_size = 500M #每个bin-log最大大小，当此大小等于500M时会自动生成一个新的日志文件。一条记录不会写在2个日志文件中，所以有时日志文件会超过此大小。

binlog_cache_size = 128K #日志缓存大小

binlog-do-db = adb #需要同步的数据库名字，如果是多个，就以此格式在写一行即可。

binlog-ignore-db = mysql #不需要同步的数据库名字，如果是多个，就以此格式在写一行即可。

log-slave-updates #当Slave从Master数据库读取日志时更新新写入日志中，如果只启动log-bin 而没有启动log-slave-updates则Slave只记录针对自己数据库操作的更新。

expire_logs_day=2 #设置bin-log日志文件保存的天数，此参数mysql5.0以下版本不支持。

binlog_format=”MIXED” #设置bin-log日志文件格式为：MIXED，可以防止主键重复。

接着重启MySQL：/etc/init.d/mysqld restart

再进入MySQL：mysql -u root -p123456

创建backup用户：

创建一个用于让从数据库连接的用户（以前那种方式不能用了）

mysql> CREATE USER ‘copy‘@‘%‘ IDENTIFIED WITH mysql_native_password BY ‘tqw961110‘;（密码强度可能不够会报错，[email protected]@）或下面一个

mysql>CREATE USER ‘slave‘@‘%‘ IDENTIFIED BY ‘[email protected]@‘;

mysql>grant all on *.* to ‘copy‘@‘%‘ identified by "[email protected]" #这个方式也可以

mysql>flush privileges #刷新授权表

mysql> show master status;

Master的操作就完了

【server_id千万不要一样，否则会报错】

mysql> show master status;【192.168.13.192】

+------------------+----------+--------------+------------------+-------------------+

+------------------+----------+--------------+------------------+-------------------+

+------------------+----------+--------------+------------------+-------------------+

1 row in set (0.00 sec)

mysql> show master status;【192.168.13.189】

+------------------+----------+--------------+------------------+-------------------+

+------------------+----------+--------------+------------------+-------------------+

| mysql-bin.000006 | 3490 | | | |

+------------------+----------+--------------+------------------+-------------------+

1 row in set (0.00 sec)

1.2 slave配置步骤

在/etc/my.cnf 添加：

server_id=2

replicate-do-db=database1 　　　　 #要同步的数据库的名字

replicate-ignore-db=mysql 　　　　　　#被忽略的数据库

接着重启MySQL：/etc/init.d/mysqld restart

再进入MySQL：mysql -u root -p123456

同步执行此操作：

mysql> change master to

-> master_host=‘2.2.2.2‘,

-> master_user=‘backup‘,

-> master_password=‘123456‘,

-> master_log_file=‘bin_log.000003‘,　　　　　　　　　　# master上记录的日志文件

-> master_log_pos=120; 　　　　　　　　　　　　　　 # master上记录的日志文件位置

Query OK, 0 rows affected, 2 warnings (0.06 sec)

同步开启：start slave；

同步关闭：stop slave；

最后开启同步 start slave即可

1.2.1 192.168.13.190

mysql> change master to

-> master_host=‘192.168.13.189‘,

-> master_user=‘slave‘,

-> master_password=‘XINyang3009‘,

-> master_log_file=‘mysql-bin.000006‘,

-> master_log_pos=3490;

Query OK, 0 rows affected, 2 warnings (0.06 sec)

mysql> start slave;

mysql> show slave status\G;

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.13.189

Master_User: slave

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000006

Read_Master_Log_Pos: 3490

Relay_Log_File: ybb-test-mysql-2-relay-bin.000012

Relay_Log_Pos: 3704

Relay_Master_Log_File: mysql-bin.000006

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB: mysql

Slave_IO_Running: Yes（必须是yes）

Slave_SQL_Running: Yes（必须是yes）

Replicate_Do_DB: mysql

1.2.2 192.168.13.191

mysql> show slave status\G;

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.13.189

Master_User: slave

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000006

Read_Master_Log_Pos: 3490

Relay_Log_File: ybb-test-mysql-2-relay-bin.000012

Relay_Log_Pos: 3704

Relay_Master_Log_File: mysql-bin.000006

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB: mysql

Slave_IO_Running: Yes（必须是yes）

Slave_SQL_Running: Yes（必须是yes）

Replicate_Do_DB: mysql

1.2.3 192.168.13.192

1.2.4 192.168.13.193

1.2.4 192.168.13.189

至此，二主三从搭建完成。

#Slave_IO_Running：连接到主库，并读取主库的日志到本地，生成本地日志文件
#Slave_SQL_Running:读取本地日志文件，并执行日志里的SQL命令。

第2章 读写分离

在这里读写分离中间件用的是mycat，部署在192.168.13.192上

2.1 安装mycat

1、安装JDK，mycat依赖Java环境

2、解压Mycat-server-1.6.6.1-release-20181031195535-linux.tar.gz，同时修改环境变量 /etc/profile

[[email protected] conf]# cat /etc/profile

3、进入配置文件目录cd ./mycat/conf

（1）server.xml是登陆mycat的用户进行配置的配置文件

（2）schema.xml 对真实数据库进行配置的配置文件

2.1.1 server.xml

[[email protected] conf]# cat server.xml

97 <user name="root" defaultAccount="true">

99 <property name="password">[email protected]@</property>

100 <property name="schemas">TESTDB</property>

101

102

103 <!--

104 <privileges check="false">

105 <schema name="TESTDB" dml="0110" >

106 <table name="tb01" dml="0000"></table>

107 <table name="tb02" dml="1111"></table>

108 </schema>

109 </privileges>

110 -->

111 </user>

112

113 <user name="user">

114 <property name="password">user</property>

115 <property name="schemas">TESTDB</property>

116

117 </user>

这里配置了两个可以来连接的用户

用户1 test 密码test 给予了此用户TESTDB数据库的权限

用户2 user 密码user 给予了此用户TESTDB数据库的只读权限

注意这里的TESTDB 不一定是你数据库上的真实库名.可以任意指定.只要和接下来的schema.xml的配置文件中的库名统一即可

2.1.1 schema.xml

[[email protected] conf]# cat schema.xml

<?xml version="1.0"?>

<!DOCTYPE mycat:schema SYSTEM "schema.dtd">

<mycat:schema xmlns:mycat="http://io.mycat/">

<dataHost name="write_master_1" maxCon="1000" minCon="10" balance="0"

writeType="0" dbType="mysql" dbDriver="native" switchType="1" slaveThreshold="100">

<heartbeat>select user()</heartbeat>

</writeHost>

</dataHost>

</mycat:schema>

1、<schema name="TESTDB" checkSQLschema="false" sqlMaxLimit="100" dataNode="dn1">

这里TESTDB 就是我们对外声称的我们有数据库的名称必须和server.xml中的用户指定的数据库名称一致，添加一个dataNode="dn1" 是指定了我们这个库只在dn1上.没有进行分库

2、<dataNode name="dn1" dataHost="localhost1" database="db1" />

这里只需要改database的名字 db1 就是你真实数据库服务上的数据库名，根据你自己的数据库名进行修改.

3、<dataHost name="localhost1" maxCon="1000" minCon="10" balance="0" writeType="0" dbType="mysql" dbDriver="native" switchType="1" slaveThreshold="100">

这里只需要配置三个地方 balance="1"与writeType="0" ,switchType=”1”

a. balance 属性负载均衡类型，目前的取值有 4 种：

　　1. balance="0", 不开启读写分离机制，所有读操作都发送到当前可用的 writeHost 上。

　　2. balance="1"，全部的 readHost 与 stand by writeHost 参与 select 语句的负载均衡，简单的说，当双主双从模式(M1 ->S1 ， M2->S2，并且 M1 与 M2 互为主备)，正常情况下， M2,S1,S2 都参与 select 语句的负载均衡。

　　3. balance="2"，所有读操作都随机的在 writeHost、 readhost 上分发。

　　4. balance="3"，所有读请求随机的分发到 wiriterHost 对应的 readhost 执行,writerHost 不负担读压力，注意 balance=3 只在 1.4 及其以后版本有， 1.3 没有。

b. writeType 属性

　　负载均衡类型，目前的取值有 3 种：

　　1. writeType="0", 所有写操作发送到配置的第一个 writeHost，第一个挂了切到还生存的第二个

writeHost，重新启动后已切换后的为准，切换记录在配置文件中:dnindex.properties .

　　2. writeType="1"，所有写操作都随机的发送到配置的 writeHost。

　　3. writeType="2"，没实现。

c. switchType 属性

　　　 1：表示不自动切换

　　　　1 ：默认值，自动切换

　　　　2 ：基于MySQL 主从同步的状态决定是否切换

4、<writeHost host="hostM1" url="192.168.1.100:3306" user="root" password="123456">

<readHost host="hostS1" url="192.168.1.101:3306" user="root" password="123456" />

</writeHost>

这里是配置的我们的两台读写服务器IP地址访问端口和访问用户的用户名和密码

2.2 启动mycat

2.2.1 启动mycat

到bin目录下，执行./mycat restart

[[email protected] bin]# ps -aux|grep mycat

root 30319 0.0 0.0 17816 744 ? Sl 15:51 0:00 /usr/local/mysql/mycat/bin/./wrapper-linux-x86-64 /usr/local/mysql/mycat/conf/wrapper.conf wrapper.syslog.ident=mycat wrapper.pidfile=/usr/local/mysql/mycat/logs/mycat.pid wrapper.daemonize=TRUE wrapper.lockfile=/var/lock/subsys/mycat

root 30321 3.5 6.5 6927816 253020 ? Sl 15:51 0:03 java -DMYCAT_HOME=. -server -XX:MaxPermSize=64M -XX:+AggressiveOpts -XX:MaxDirectMemorySize=2G -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=1984 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Xmx4G -Xms1G -Djava.library.path=lib -classpath lib/wrapper.jar:conf:lib/asm-4.0.jar:lib/commons-collections-3.2.1.jar:lib/commons-lang-2.6.jar:lib/curator-client-2.11.0.jar:lib/curator-framework-2.11.0.jar:lib/curator-recipes-2.11.0.jar:lib/disruptor-3.3.4.jar:lib/dom4j-1.6.1.jar:lib/druid-1.0.26.jar:lib/ehcache-core-2.6.11.jar:lib/fastjson-1.2.12.jar:lib/guava-19.0.jar:lib/hamcrest-core-1.3.jar:lib/hamcrest-library-1.3.jar:lib/jline-0.9.94.jar:lib/joda-time-2.9.3.jar:lib/jsr305-2.0.3.jar:lib/kryo-2.10.jar:lib/leveldb-0.7.jar:lib/leveldb-api-0.7.jar:lib/libwrapper-linux-ppc-64.so:lib/libwrapper-linux-x86-32.so:lib/libwrapper-linux-x86-64.so:lib/log4j-1.2-api-2.5.jar:lib/log4j-1.2.17.jar:lib/log4j-api-2.5.jar:lib/log4j-core-2.5.jar:lib/log4j-slf4j-impl-2.5.jar:lib/mapdb-1.0.7.jar:lib/minlog-1.2.jar:lib/mongo-java-driver-2.11.4.jar:lib/Mycat-server-1.6.6.1-release.jar:lib/mysql-binlog-connector-java-0.16.1.jar:lib/mysql-connector-java-5.1.35.jar:lib/netty-3.7.0.Final.jar:lib/netty-buffer-4.1.9.Final.jar:lib/netty-common-4.1.9.Final.jar:lib/objenesis-1.2.jar:lib/reflectasm-1.03.jar:lib/sequoiadb-driver-1.12.jar:lib/slf4j-api-1.6.1.jar:lib/univocity-parsers-2.2.1.jar:lib/velocity-1.7.jar:lib/wrapper.jar:lib/zookeeper-3.4.6.jar -Dwrapper.key=aBVwhEZHoe1K2Vpw -Dwrapper.port=32000 -Dwrapper.jvm.port.min=31000 -Dwrapper.jvm.port.max=31999 -Dwrapper.pid=30319 -Dwrapper.version=3.2.3 -Dwrapper.native_library=wrapper -Dwrapper.service=TRUE -Dwrapper.cpu.timeout=10 -Dwrapper.jvmid=1 org.tanukisoftware.wrapper.WrapperSimpleApp io.mycat.MycatStartup start

root 30458 0.0 0.0 112676 956 pts/2 S+ 15:53 0:00 grep --color=auto mycat

[[email protected] bin]# netstat -anlpt|grep :*066

tcp6 0 0 :::8066 :::* LISTEN 30321/java

tcp6 0 0 :::9066 :::* LISTEN 30321/java

[[email protected] bin]#

8066：数据端口

9066：管理端口

命令行的登陆是通过 9066 管理端口来操作。

2.2.2 mycat日志

[[email protected] logs]# cd /usr/local/mysql/mycat/logs

warpper 日志：mycat启动，停止，添加为服务等都会记录到此日志文件，如果系统环境配置错误或缺少配置时，导致 Mycat 无法启动，可以通过查看 warrpper.log 定位具体错误原因。

mycat.log：为mycat主要日志文件，记录了启动时分配的相关buffer信息，数据源连接信息，连接池，动态类加载信息等等在log4j.xml文件中进行相关配置，如保留个数，大小，字符集，日志文件大小等。非启动状态下可以删除，启动后会自动生成该日志文件

2.3 登录mycat相关问题

命令：mysql -uroot [email protected]@ -h192.168.13.192 -P9066

[[email protected] conf]# mysql -uroot [email protected]@ -h192.168.13.192 -P9066

mysql: [Warning] Using a password on the command line interface can be insecure.

ERROR 1045 (HY000): Access denied for user ‘root‘, because password is error

特别注意：本次实验的mysql版本均为mysql-8.0.13，此故障折腾了我两三天，以上的配置查看了无数遍，后来查阅相关资料才知道目前MyCat仍主要面对MySql 5.5, 5.6, 5.7版，对最新的MySql 8尚未完全支持，需要用户对MySql 8和MyCat的配置进行一系列的修改。其中对mycat的登陆方式有改变，Mycat登录逻辑库的传统方式是：mysql -uroot -p -h127.0.0.1 -P8066 -DTESTDB，对于MySql 8，会报密码错误方式，这是由于Mysql 8的缺省加密方式已经改为caching_sha2_password，而MyCat对此尚不支持。当然可以通过修改MySQL8的加密方式使其通过，但未对MySQL8完全支持，所以会在功能上也会大打折扣，目前mycat官网并没有对MySQL8版本的详细文档，官网的文档目前对MySQL5.X依旧适用。

之前做实验之所以成功是因为我在云上也装了一台mysql 5.6，并且也是用5.6做的主从，当时做完之后没有用用mysql 8把版本做连接测试，用的是5.6版本，而此次用mysql8版本连接测试显示始终失败，还折腾两天左右，我才想到可能是版本的问题，为此我又在mysql5.6版本上重新做了一遍验证一下是否是由于版本太新导致mycat不支持从而出现的错误，于是又开始折腾很久。

1、版本简介：

OS版本	IP	读\|写	MySQL版本	mycat版本
CentOS 7.2	2.2.2.10	读\|写	5.6.19	Mycat-server-1.6.6.1

为尽快达到实验目的，只在一台虚机安装MySQL（时长约3h），并读写也在该主机上。

2、安装MySQL 5.6（详细过程略）

3、配置文件server.xml和schema.xml

4、mycat启动

到bin目录下，执行./mycat restart

5、登录

[[email protected] conf]# mysql -uroot [email protected]@ -h2.2.2.10 -P9066

Warning: Using a password on the command line interface can be insecure.

Welcome to the MySQL monitor. Commands end with ; or \g.

Your MySQL connection id is 2

Server version: 5.6.29-mycat-1.6.6.1-release-20181031195535 MyCat Server (monitor)

Oracle is a registered trademark of Oracle Corporation and/or its

affiliates. Other names may be trademarks of their respective owners.

Type ‘help;‘ or ‘\h‘ for help. Type ‘\c‘ to clear the current input statement.

mysql>

当出现mycat版本就说明登录成功。

6、查看当前可读写的入口

mysql> show @@datasource;

+----------+--------+-------+----------+------+------+--------+------+------+---------+-----------+---

+----------+--------+-------+----------+------+------+--------+------+------+---------+-----------+---| dn1 | write1 | mysql | 2.2.2.10 | 3306 | W | 0 | 10 | 1000 | 442 | 2 | 1 |

| dn1 | write1 | mysql | 2.2.2.10 | 3306 | R | 0 | 8 | 1000 | 436 | 0 | 0 |

+----------+--------+-------+----------+------+------+--------+------+------+---------+-----------+---

2 rows in set (0.02 sec)

7、知识点总结

8、总结

由上可知，最新的mycat并不支持最新版本的mysql，也许能够连上去但在并未完全开放之前，相信很多mycat的功能会受限。

经过第二次试验可知，在mysql 8上线的配置并没有错，只不过是由于mycat的不支持最新的mysql版本导致，所以等mycat支持mysql 8的时候就可以生效了

2.4 小结

由于mycat目前并不全面支持mysql最新版本，所以目前读写分离我选的是mysql官方推荐的mysql-proxy，功能虽小但基于mysql 8版本的过度期还是一款不错的产品。

第3章 mysql-proxy

3.1 安装mysql-proxy

1、使用本地镜像安装相关环境：

yum install -y gcc* gcc-c++* autoconf* automake* zlib* libxml* ncurses-devel* libmcrypt* libtool* flex* pkgconfig* libevent* glib*

2、下载mysql-proxy-0.8.5-linux-glibc2.3-x86-64bit.tar

3、解压安装

4、环境变量

[[email protected] ~]# vi .bash_profile

PATH=$PATH:$HOME/bin:/mysqlprox/proxy/bin

3.2 配置mysql-proxy

mkdir /home/mysql-proxy/logs 【日志文件位置】
mkdir /home/mysql-proxy/lua 【脚本位置】

cd /usr/local/mysql-proxy 【mysql-proxy安装位置】
cp share/doc/mysql-proxy/rw-splitting.lua ./lua 【复制读写分离配置文件】

vi /etc/mysql-proxy.cnf 【创建配置文件】

[mysql-proxy]

user=root 【运行mysql-proxy用户】

admin-username=proxy 【主从mysql共有的用户】

admin-password=123456 【用户的密码】

proxy-address=192.168.11.31:4040【 mysql-proxy运行ip和端口，不加端口，默认4040】

proxy-read-only-backend-addresses=192.168.11.31 【指定后端从slave读取数据】

proxy-backend-addresses=192.168.11.34 【指定后端主master写入数据】

proxy-lua-script=/usr/local/mysql-proxy/lua/rw-splitting.lua 【指定读写分离配置文件位置】

admin-lua-script=/usr/local/mysql-proxy/lua/admin.lua 【指定管理脚本】

log-file=/var/log/mysql-proxy.log

log-level=info 【定义log日志级别】

daemon=true【以守护进程方式运行】

keepalive=true 【mysql-proxy崩溃时，尝试重启】

（1）(critical) Key file contains key “keepalive” which has a value that cannot be interpreted.

daemon=1

keepalive=1

<1> 给配置文件执行权限
chmod 660 /etc/mysql-porxy.cnf

<2> 修改读写分离配置文件

vim /usr/local/mysql-proxy/lua/rw-splitting.luaif not proxy.global.config.rwsplit

min_idle_connections = 1, 【默认超过4个连接数时，才开始读写分离，改为1】

max_idle_connections = 1, 【默认8，改为1】

3.3 启动mysql-proxy和查看

启动：/usr/local/mysql-proxy/bin/mysql-proxy --defaults-file=/etc/mysql-proxy.cnf

查看：netstat -tupln | grep 4000

这样一个读写分离就完全了

第4章集群

目前Mysql集群用的MHA或keepalive的方式做的，但都通过VIP进行管理，若是在云上做集群会有一些隐患。和H3C云工程师进行交流了解，他们不建议我们在云上做任何的集群，因为集群有心跳检测，会与虚机的备份有冲突，当虚机进行备份时会造成心跳停止，导致集群报错，解决办法就是不使用虚机备份，使用其他方式进行备份方式。中国经济网就是采用虚机做集群但不采用虚机备份的方式。

其他备份方式与虚机备份又有区别：

（1）其他备份方式：只能备份指定的数据，不能备份虚机的相关信息。

（2）虚机备份：不仅能备份指定的数据还能备份虚机的相关配置信息，在虚机崩溃时还可以进行虚机恢复。

4.1 版本介绍

属性	IP地址	允许同步的用户	读\|写	mysql版本	MHA版本	操作系统版本
master	192.168.13.189	slave	write	mysql-8.0.13	0.54	CentOS 7.4
	192.168.13.190	slave	read	mysql-8.0.13	0.54	CentOS 7.4
slave	192.168.13.192	slave	read	mysql-8.0.13	0.58	CentOS 7.4
MHA_manager	192.168.13.171			mysql-8.0.13	0.58	CentOS 7.4

MHA实现机制：

（1）监控AB的状态

（2）完整的选举机制（看谁的数据跟master最接近）

（3）让一个B切换到新A

（4）保证数据的完整性（通过差异还原）

因为AB复制是异步复制，所以可能有一些数据尚没有被B拉到其relay_log中,即AB数据不一致，MHA是怎样解决这种情况的呢？

（1）mha_manager 使用scp命令将A当前binlog拷贝到mha_manager

（2）待新A（选举：依据谁的relay_log新）产生后，mha_manager再拿着旧A的binlog和新的relay_log做比对，并进行差异还原以保证新A和旧A数据的一致性

（3）mha_manager最后拿着老A的binlog去找复制组中其他B 做差异还原，保证数据的一致性

4.2 MHA

实验步骤：

1）ssh证书户信任（ssh的密钥验证）

2）ABB/AABB架构

3）安装mha_manager、mha_node

4）测试

4.2.1 ssh互信

（1） ssh-keygen -t rsa

ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.13.189

ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.13.190

ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.13.192

ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.13.171

（2）在192.168.13.189、192.168.13.190、192.168.13.192、192.168.13.171主机上将（1）都执行一遍即可，实现这四台主机ssh连接不需要密码

4.2.1 ABB（一主二从）/AABB（二主多从）架构搭建

略

4.2.1 MHA安装（管理节点MHA_manager）

（1）安装依赖包

[[email protected] mha]# ls dependent1

perl-Config-Tiny-2.14-7.el7.noarch.rpm perl-Mail-Sender-0.8.23-1.el7.noarch.rpm perl-MIME-Types-1.38-2.el7.noarch.rpm

perl-Email-Date-Format-1.002-15.el7.noarch.rpm perl-Mail-Sendmail-0.79-21.el7.noarch.rpm perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm

perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm perl-MIME-Lite-3.030-1.el7.noarch.rpm perl-Params-Validate-1.08-4.el7.x86_64.rpm

[[email protected] dependent1]# yum -y localinstall ./*

已加载插件：fastestmirror

正在检查 ./perl-Config-Tiny-2.14-7.el7.noarch.rpm: perl-Config-Tiny-2.14-7.el7.noarch

./perl-Config-Tiny-2.14-7.el7.noarch.rpm：不更新已安装的软件包。

正在检查 ./perl-Email-Date-Format-1.002-15.el7.noarch.rpm: perl-Email-Date-Format-1.002-15.el7.noarch

./perl-Email-Date-Format-1.002-15.el7.noarch.rpm：不更新已安装的软件包。

正在检查 ./perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm: perl-Log-Dispatch-2.41-1.el7.1.noarch

./perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm：不更新已安装的软件包。

正在检查 ./perl-Mail-Sender-0.8.23-1.el7.noarch.rpm: perl-Mail-Sender-0.8.23-1.el7.noarch

./perl-Mail-Sender-0.8.23-1.el7.noarch.rpm：不更新已安装的软件包。

正在检查 ./perl-Mail-Sendmail-0.79-21.el7.noarch.rpm: perl-Mail-Sendmail-0.79-21.el7.noarch

./perl-Mail-Sendmail-0.79-21.el7.noarch.rpm：不更新已安装的软件包。

正在检查 ./perl-MIME-Lite-3.030-1.el7.noarch.rpm: perl-MIME-Lite-3.030-1.el7.noarch

./perl-MIME-Lite-3.030-1.el7.noarch.rpm：不更新已安装的软件包。

正在检查 ./perl-MIME-Types-1.38-2.el7.noarch.rpm: perl-MIME-Types-1.38-2.el7.noarch

./perl-MIME-Types-1.38-2.el7.noarch.rpm：不更新已安装的软件包。

正在检查 ./perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm: perl-Parallel-ForkManager-1.18-2.el7.noarch

./perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm：不更新已安装的软件包。

正在检查 ./perl-Params-Validate-1.08-4.el7.x86_64.rpm: perl-Params-Validate-1.08-4.el7.x86_64

./perl-Params-Validate-1.08-4.el7.x86_64.rpm：不更新已安装的软件包。

无须任何处理

以上这些包均已安装过所以显示不更新已安装的软件包

（2）安装MHA相关组件包

[[email protected] mha]# rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm

准备中... ################################# [100%]

软件包 mha4mysql-node-0.58-0.el7.centos.noarch 已经安装

[[email protected] mha]# rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

准备中... ################################# [100%]

软件包 mha4mysql-manager-0.58-0.el7.centos.noarch 已经安装

[[email protected] mha]#

4.2.2 MHA安装（其他节点安装MHA_node）

192.168.13.189（mha_node）

192.168.13.190（mha_node）

192.168.13.192（mha_node）

在mha_node安装mha4mysql-node-0.58-0.el7.centos.noarch.rpm，【注意：MHA_node版本不论6或7都可以使用低版本】

至此，MHA_node安装完成

4.2.3 MHA配置

（1）mkdir /etc/mha ——创建目录

（2）vi /etc/mha/mha.cnf ——编写MHA配置文件

[[email protected] mha]# cat mha.cnf

[server default]

#mysql_admin and password

user=root

[email protected]@

#mha work_dir and mha_log

manager_workdir=/etc/mha

manager_log=/etc/mha/manager.log

#ssh connection account

ssh_user=root

#AB copy account and password

repl_user=slave

[email protected]@

ping_interval=1

#mysql server

[server1]

hostname=192.168.13.189

ssh_port=22

master_binlog_dir=/var/lib/mysql

candidate_master=1

[server2]

hostname=192.168.13.190

ssh_port=22

master_binlog_dir=/var/lib/mysql

candidate_master=1

[server4]

hostname=192.168.13.192

ssh_port=22

master_binlog_dir=/var/lib/mysql

candidate_master=1

[[email protected] mha]#

4.2.4 MHA互信和环境检查

（1）互信检查：masterha_check_ssh --conf=/etc/mha/mha.cnf

[[email protected] mha]# masterha_check_ssh --conf=/etc/mha/mha.cnf

Thu Feb 28 09:50:57 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu Feb 28 09:50:57 2019 - [info] Reading application default configuration from /etc/mha/mha.cnf..

Thu Feb 28 09:50:57 2019 - [info] Reading server configuration from /etc/mha/mha.cnf..

Thu Feb 28 09:50:57 2019 - [info] Starting SSH connection tests..

Thu Feb 28 09:50:58 2019 - [debug]

Thu Feb 28 09:50:57 2019 - [debug] Connecting via SSH from [email protected](192.168.13.189:22) to [email protected](192.168.13.190:22)..

Thu Feb 28 09:50:58 2019 - [debug] ok.

Thu Feb 28 09:50:58 2019 - [debug] Connecting via SSH from [email protected](192.168.13.189:22) to [email protected](192.168.13.192:22)..

Thu Feb 28 09:50:58 2019 - [debug] ok.

Thu Feb 28 09:50:59 2019 - [debug]

Thu Feb 28 09:50:57 2019 - [debug] Connecting via SSH from [email protected](192.168.13.190:22) to [email protected](192.168.13.189:22)..

Thu Feb 28 09:50:58 2019 - [debug] ok.

Thu Feb 28 09:50:58 2019 - [debug] Connecting via SSH from [email protected](192.168.13.190:22) to [email protected](192.168.13.192:22)..

Thu Feb 28 09:50:59 2019 - [debug] ok.

Thu Feb 28 09:51:00 2019 - [debug]

Thu Feb 28 09:50:58 2019 - [debug] Connecting via SSH from [email protected](192.168.13.192:22) to [email protected](192.168.13.189:22)..

Thu Feb 28 09:50:59 2019 - [debug] ok.

Thu Feb 28 09:50:59 2019 - [debug] Connecting via SSH from [email protected](192.168.13.192:22) to [email protected](192.168.13.190:22)..

Thu Feb 28 09:51:00 2019 - [debug] ok.

Thu Feb 28 09:51:00 2019 - [info] All SSH connection tests passed successfully.

[[email protected] mha]#

（2）环境健康检查：masterha_check_repl --conf=/etc/mha/mha.cnf

[[email protected] mha]# masterha_check_repl --conf=/etc/mha/mha.cnf

Thu Feb 28 09:53:58 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu Feb 28 09:53:58 2019 - [info] Reading application default configuration from /etc/mha/mha.cnf..

Thu Feb 28 09:53:58 2019 - [info] Reading server configuration from /etc/mha/mha.cnf..

Thu Feb 28 09:53:58 2019 - [info] MHA::MasterMonitor version 0.58.

Thu Feb 28 09:53:59 2019 - [info] Multi-master configuration is detected. Current primary(writable) master is 192.168.13.190(192.168.13.190:3306)——可写

Thu Feb 28 09:53:59 2019 - [info] Master configurations are as below:

Master 192.168.13.190(192.168.13.190:3306), replicating from 192.168.13.189(192.168.13.189:3306)

Master 192.168.13.189(192.168.13.189:3306), replicating from 192.168.13.190(192.168.13.190:3306), read-only——可读（双主模式，只能有一个可写）

Thu Feb 28 09:53:59 2019 - [info] GTID failover mode = 0

Thu Feb 28 09:53:59 2019 - [info] Dead Servers:

Thu Feb 28 09:53:59 2019 - [info] Alive Servers:

Thu Feb 28 09:53:59 2019 - [info] 192.168.13.189(192.168.13.189:3306)

Thu Feb 28 09:53:59 2019 - [info] 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 09:53:59 2019 - [info] 192.168.13.192(192.168.13.192:3306)

Thu Feb 28 09:53:59 2019 - [info] Alive Slaves:

Thu Feb 28 09:53:59 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 09:53:59 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 09:53:59 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 09:53:59 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 09:53:59 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 09:53:59 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 09:53:59 2019 - [info] Current Alive Master: 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 09:53:59 2019 - [info] Checking slave configurations..

Thu Feb 28 09:53:59 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.189(192.168.13.189:3306).

Thu Feb 28 09:53:59 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.192(192.168.13.192:3306).

Thu Feb 28 09:53:59 2019 - [info] Checking replication filtering settings..

Thu Feb 28 09:53:59 2019 - [info] binlog_do_db= , binlog_ignore_db=

Thu Feb 28 09:53:59 2019 - [info] Replication filtering check ok.

Thu Feb 28 09:53:59 2019 - [info] GTID (with auto-pos) is not supported

Thu Feb 28 09:53:59 2019 - [info] Starting SSH connection tests..

Thu Feb 28 09:54:02 2019 - [info] All SSH connection tests passed successfully.

Thu Feb 28 09:54:02 2019 - [info] Checking MHA Node version..

Thu Feb 28 09:54:03 2019 - [info] Version check ok.

Thu Feb 28 09:54:03 2019 - [info] Checking SSH publickey authentication settings on the current master..

Thu Feb 28 09:54:04 2019 - [info] HealthCheck: SSH to 192.168.13.190 is reachable.

Thu Feb 28 09:54:04 2019 - [info] Master MHA Node version is 0.54.

Thu Feb 28 09:54:04 2019 - [info] Checking recovery script configurations on 192.168.13.190(192.168.13.190:3306)..

Thu Feb 28 09:54:04 2019 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=on.000001

Thu Feb 28 09:54:04 2019 - [info] Connecting to [email protected](192.168.13.190:22)..

Creating /var/tmp if not exists.. ok.

Checking output directory is accessible or not..

ok.

Binlog found at /var/lib/mysql, up to on.000001

Thu Feb 28 09:54:05 2019 - [info] Binlog setting check done.

Thu Feb 28 09:54:05 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..

Thu Feb 28 09:54:05 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=‘root‘ --slave_host=192.168.13.189 --slave_ip=192.168.13.189 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-1-relay-bin.000002 --slave_pass=xxx

Thu Feb 28 09:54:05 2019 - [info] Connecting to [email protected](192.168.13.189:22)..

Checking slave recovery environment settings..

Relay log found at /var/lib/mysql, up to ybb-test-mysql-1-relay-bin.000002

Temporary relay log file is /var/lib/mysql/ybb-test-mysql-1-relay-bin.000002

Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.

done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Thu Feb 28 09:54:05 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=‘root‘ --slave_host=192.168.13.192 --slave_ip=192.168.13.192 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-4-relay-bin.000002 --slave_pass=xxx

Thu Feb 28 09:54:05 2019 - [info] Connecting to [email protected](192.168.13.192:22)..

Checking slave recovery environment settings..

Relay log found at /var/lib/mysql, up to ybb-test-mysql-4-relay-bin.000002

Temporary relay log file is /var/lib/mysql/ybb-test-mysql-4-relay-bin.000002

Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.

Testing mysql connection and privileges..

mysql: [Warning] Using a password on the command line interface can be insecure.

done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Thu Feb 28 09:54:06 2019 - [info] Slaves settings check done.

Thu Feb 28 09:54:06 2019 - [info]

192.168.13.190(192.168.13.190:3306) (current master)

+--192.168.13.189(192.168.13.189:3306)

+--192.168.13.192(192.168.13.192:3306)

Thu Feb 28 09:54:06 2019 - [info] Checking replication health on 192.168.13.189..

Thu Feb 28 09:54:06 2019 - [info] ok.

Thu Feb 28 09:54:06 2019 - [info] Checking replication health on 192.168.13.192..

Thu Feb 28 09:54:06 2019 - [info] ok.

Thu Feb 28 09:54:06 2019 - [warning] master_ip_failover_script is not defined.

Thu Feb 28 09:54:06 2019 - [warning] shutdown_script is not defined.

Thu Feb 28 09:54:06 2019 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

[[email protected] mha]#

4.2.5 启动及相关日志文件

（1）启动：

nohup masterha_manager --conf=/etc/mha/mha.cnf > /tmp/mha_manager.log < /dev/null 2>&1 &

[[email protected] mha]#nohup masterha_manager --conf=/etc/mha/mha.cnf > /tmp/mha_manager.log </dev/null 2>&1 &

[1] 6492

[[email protected] mha]# jobs

[1]+ 运行中 nohup masterha_manager --conf=/etc/mha/mha.cnf > /tmp/mha_manager.log < /dev/null 2>&1 &

[[email protected] mha]#

（2）日志文件

manager.log文件是MHA的启动日志文件，当启动成功会自动生成一个这样的文件并且记载了启动过程

[[email protected] mha]# tail -f manager.log

Thu Feb 28 10:05:57 2019 - [warning] master_ip_failover_script is not defined.

Thu Feb 28 10:05:57 2019 - [warning] shutdown_script is not defined.

Thu Feb 28 10:05:57 2019 - [info] Set master ping interval 1 seconds.

Thu Feb 28 10:05:57 2019 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.

Thu Feb 28 10:05:57 2019 - [info] Starting ping health check on 192.168.13.190(192.168.13.190:3306)..

Thu Feb 28 10:06:01 2019 - [warning] Got error when monitoring master: at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 489.

Thu Feb 28 10:06:01 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln491] Target master‘s advisory lock is already held by someone. Please check whether you monitor the same master from multiple monitoring processes.

Thu Feb 28 10:06:01 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln511] Error happened on health checking. at /usr/bin/masterha_manager line 50.

Thu Feb 28 10:06:01 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.

Thu Feb 28 10:06:01 2019 - [info] Got exit code 1 (Not master dead).——这些是默认自带的

下面是完整的启动流程：

Thu Feb 28 10:06:41 2019 - [info] MHA::MasterMonitor version 0.58.

Thu Feb 28 10:06:41 2019 - [warning] /etc/mha/mha.master_status.health already exists. You might have killed manager with SIGKILL(-9), may run two or more monitoring process for the same application, or use the same working directory. Check for details, and consider setting --workdir separately.

Thu Feb 28 10:06:43 2019 - [info] Multi-master configuration is detected. Current primary(writable) master is 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info] Master configurations are as below:

Master 192.168.13.190(192.168.13.190:3306), replicating from 192.168.13.189(192.168.13.189:3306)

Master 192.168.13.189(192.168.13.189:3306), replicating from 192.168.13.190(192.168.13.190:3306), read-only

Thu Feb 28 10:06:43 2019 - [info] GTID failover mode = 0

Thu Feb 28 10:06:43 2019 - [info] Dead Servers:

Thu Feb 28 10:06:43 2019 - [info] Alive Servers:

Thu Feb 28 10:06:43 2019 - [info] 192.168.13.189(192.168.13.189:3306)

Thu Feb 28 10:06:43 2019 - [info] 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info] 192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:06:43 2019 - [info] Alive Slaves:

Thu Feb 28 10:06:43 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:06:43 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:06:43 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:06:43 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:06:43 2019 - [info] Current Alive Master: 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info] Checking slave configurations..

Thu Feb 28 10:06:43 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.189(192.168.13.189:3306).

Thu Feb 28 10:06:43 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.192(192.168.13.192:3306).

Thu Feb 28 10:06:43 2019 - [info] Checking replication filtering settings..

Thu Feb 28 10:06:43 2019 - [info] binlog_do_db= , binlog_ignore_db=

Thu Feb 28 10:06:43 2019 - [info] Replication filtering check ok.

Thu Feb 28 10:06:43 2019 - [info] GTID (with auto-pos) is not supported

Thu Feb 28 10:06:43 2019 - [info] Starting SSH connection tests..

Thu Feb 28 10:06:46 2019 - [info] All SSH connection tests passed successfully.

Thu Feb 28 10:06:46 2019 - [info] Checking MHA Node version..

Thu Feb 28 10:06:47 2019 - [info] Version check ok.

Thu Feb 28 10:06:47 2019 - [info] Checking SSH publickey authentication settings on the current master..

Thu Feb 28 10:06:47 2019 - [info] HealthCheck: SSH to 192.168.13.190 is reachable.

Thu Feb 28 10:06:48 2019 - [info] Master MHA Node version is 0.54.

Thu Feb 28 10:06:48 2019 - [info] Checking recovery script configurations on 192.168.13.190(192.168.13.190:3306)..

Thu Feb 28 10:06:48 2019 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=on.000001

Thu Feb 28 10:06:48 2019 - [info] Connecting to [email protected](192.168.13.190:22)..

Creating /var/tmp if not exists.. ok.

Checking output directory is accessible or not..

ok.

Binlog found at /var/lib/mysql, up to on.000001

Thu Feb 28 10:06:48 2019 - [info] Binlog setting check done.

Thu Feb 28 10:06:48 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..

Thu Feb 28 10:06:48 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=‘root‘ --slave_host=192.168.13.189 --slave_ip=192.168.13.189 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-1-relay-bin.000002 --slave_pass=xxx

Thu Feb 28 10:06:48 2019 - [info] Connecting to [email protected](192.168.13.189:22)..

Checking slave recovery environment settings..

Relay log found at /var/lib/mysql, up to ybb-test-mysql-1-relay-bin.000002

Temporary relay log file is /var/lib/mysql/ybb-test-mysql-1-relay-bin.000002

Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.

done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Thu Feb 28 10:06:49 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=‘root‘ --slave_host=192.168.13.192 --slave_ip=192.168.13.192 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-4-relay-bin.000002 --slave_pass=xxx

Thu Feb 28 10:06:49 2019 - [info] Connecting to [email protected](192.168.13.192:22)..

Checking slave recovery environment settings..

Relay log found at /var/lib/mysql, up to ybb-test-mysql-4-relay-bin.000002

Temporary relay log file is /var/lib/mysql/ybb-test-mysql-4-relay-bin.000002

Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.

Testing mysql connection and privileges..

mysql: [Warning] Using a password on the command line interface can be insecure.

done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Thu Feb 28 10:06:50 2019 - [info] Slaves settings check done.

Thu Feb 28 10:06:50 2019 - [info]

192.168.13.190(192.168.13.190:3306) (current master)

+--192.168.13.189(192.168.13.189:3306)

+--192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:06:50 2019 - [warning] master_ip_failover_script is not defined.

Thu Feb 28 10:06:50 2019 - [warning] shutdown_script is not defined.

Thu Feb 28 10:06:50 2019 - [info] Set master ping interval 1 seconds.

Thu Feb 28 10:06:50 2019 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.

Thu Feb 28 10:06:50 2019 - [info] Starting ping health check on 192.168.13.190(192.168.13.190:3306)..

Thu Feb 28 10:06:54 2019 - [warning] Got error when monitoring master: at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 489.

Thu Feb 28 10:06:54 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln491] Target master‘s advisory lock is already held by someone. Please check whether you monitor the same master from multiple monitoring processes.

Thu Feb 28 10:06:54 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln511] Error happened on health checking. at /usr/bin/masterha_manager line 50.

Thu Feb 28 10:06:54 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.

Thu Feb 28 10:06:54 2019 - [info] Got exit code 1 (Not master dead).

启动成功。

4.2.6 MHA测试

MHA其目的就是当master故障时能及时剔除掉故障并能推选出new_master从而使其他slave的master指向new_master，使业务不中断。

（1）停掉master（192.168.13.190）上的mysql

（2）查看MHA的进程日志manager.log

查看红色标记部分即可

[[email protected] mha]# tail -f manager.log

Thu Feb 28 10:14:54 2019 - [warning] Got error on MySQL select ping: 2013 (Lost connection to MySQL server during query)

Thu Feb 28 10:14:54 2019 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --binlog_prefix=on

Thu Feb 28 10:14:55 2019 - [info] HealthCheck: SSH to 192.168.13.190 is reachable.

Thu Feb 28 10:14:55 2019 - [warning] Got error on MySQL connect: 2003 (Can‘t connect to MySQL server on ‘192.168.13.190‘ (111))

Thu Feb 28 10:14:55 2019 - [warning] Connection failed 2 time(s)..

Thu Feb 28 10:14:56 2019 - [warning] Got error on MySQL connect: 2003 (Can‘t connect to MySQL server on ‘192.168.13.190‘ (111))

Thu Feb 28 10:14:56 2019 - [warning] Connection failed 3 time(s)..

Thu Feb 28 10:14:57 2019 - [warning] Got error on MySQL connect: 2003 (Can‘t connect to MySQL server on ‘192.168.13.190‘ (111))

Thu Feb 28 10:14:57 2019 - [warning] Connection failed 4 time(s)..

Thu Feb 28 10:14:57 2019 - [warning] Master is not reachable from health checker!

Thu Feb 28 10:14:57 2019 - [warning] Master 192.168.13.190(192.168.13.190:3306) is not reachable!

Thu Feb 28 10:14:57 2019 - [warning] SSH is reachable.

Thu Feb 28 10:14:57 2019 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/mha.cnf again, and trying to connect to all servers to check server status..

Thu Feb 28 10:14:57 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu Feb 28 10:14:57 2019 - [info] Reading application default configuration from /etc/mha/mha.cnf..

Thu Feb 28 10:14:57 2019 - [info] Reading server configuration from /etc/mha/mha.cnf..

Thu Feb 28 10:14:58 2019 - [info] GTID failover mode = 0

Thu Feb 28 10:14:58 2019 - [info] Dead Servers:

Thu Feb 28 10:14:58 2019 - [info] 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:14:58 2019 - [info] Alive Servers:

Thu Feb 28 10:14:58 2019 - [info] 192.168.13.189(192.168.13.189:3306)

Thu Feb 28 10:14:58 2019 - [info] 192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:14:58 2019 - [info] Alive Slaves:

Thu Feb 28 10:14:58 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:14:58 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:14:58 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:14:58 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:14:58 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:14:58 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:14:58 2019 - [info] Checking slave configurations..

Thu Feb 28 10:14:58 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.189(192.168.13.189:3306).

Thu Feb 28 10:14:58 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.192(192.168.13.192:3306).

Thu Feb 28 10:14:58 2019 - [info] Checking replication filtering settings..

Thu Feb 28 10:14:58 2019 - [info] Replication filtering check ok.

Thu Feb 28 10:14:58 2019 - [info] Master is down!

Thu Feb 28 10:14:58 2019 - [info] Terminating monitoring script.

Thu Feb 28 10:14:58 2019 - [info] Got exit code 20 (Master dead).

Thu Feb 28 10:14:58 2019 - [info] MHA::MasterFailover version 0.58.

Thu Feb 28 10:14:58 2019 - [info] Starting master failover.

Thu Feb 28 10:14:58 2019 - [info]

Thu Feb 28 10:14:58 2019 - [info] * Phase 1: Configuration Check Phase..

Thu Feb 28 10:14:58 2019 - [info]

Thu Feb 28 10:15:00 2019 - [info] GTID failover mode = 0

Thu Feb 28 10:15:00 2019 - [info] Dead Servers:

Thu Feb 28 10:15:00 2019 - [info] 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:00 2019 - [info] Checking master reachability via MySQL(double check)...

Thu Feb 28 10:15:00 2019 - [info] ok.

Thu Feb 28 10:15:00 2019 - [info] Alive Servers:

Thu Feb 28 10:15:00 2019 - [info] 192.168.13.189(192.168.13.189:3306)

Thu Feb 28 10:15:00 2019 - [info] 192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:15:00 2019 - [info] Alive Slaves:

Thu Feb 28 10:15:00 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:00 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:00 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:00 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:00 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:00 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:00 2019 - [info] Starting Non-GTID based failover.

Thu Feb 28 10:15:00 2019 - [info]

Thu Feb 28 10:15:00 2019 - [info] ** Phase 1: Configuration Check Phase completed.

Thu Feb 28 10:15:00 2019 - [info]

Thu Feb 28 10:15:00 2019 - [info] * Phase 2: Dead Master Shutdown Phase..

Thu Feb 28 10:15:00 2019 - [info]

Thu Feb 28 10:15:00 2019 - [info] Forcing shutdown so that applications never connect to the current master..

Thu Feb 28 10:15:00 2019 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.

Thu Feb 28 10:15:00 2019 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.

Thu Feb 28 10:15:01 2019 - [info] * Phase 2: Dead Master Shutdown Phase completed.

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] * Phase 3: Master Recovery Phase..

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] * Phase 3.1: Getting Latest Slaves Phase..

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] The latest binary log file/position on all slaves is on.000001:661

Thu Feb 28 10:15:01 2019 - [info] Latest slaves (Slaves that received relay log files to the latest):

Thu Feb 28 10:15:01 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:01 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:01 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:01 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:01 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:01 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:01 2019 - [info] The oldest binary log file/position on all slaves is on.000001:661

Thu Feb 28 10:15:01 2019 - [info] Oldest slaves:

Thu Feb 28 10:15:01 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:01 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:01 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:01 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:01 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:01 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] * Phase 3.2: Saving Dead Master‘s Binlog Phase..

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] Fetching dead master‘s binary logs..

Thu Feb 28 10:15:01 2019 - [info] Executing command on the dead master 192.168.13.190(192.168.13.190:3306): save_binary_logs --command=save --start_file=on.000001 --start_pos=661 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58

Creating /var/tmp if not exists.. ok.

Concat binary/relay logs from on.000001 pos 661 to on.000001 EOF into /var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog ..

Dumping binlog format description event, from position 0 to 124.. ok.

Dumping effective binlog data from /var/lib/mysql/on.000001 position 661 to tail(684).. ok.

Concat succeeded.

Thu Feb 28 10:15:03 2019 - [info] scp from [email protected]:/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog to local:/etc/mha/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog succeeded.

Thu Feb 28 10:15:03 2019 - [info] HealthCheck: SSH to 192.168.13.189 is reachable.

Thu Feb 28 10:15:04 2019 - [info] HealthCheck: SSH to 192.168.13.192 is reachable.

Thu Feb 28 10:15:04 2019 - [info]

Thu Feb 28 10:15:04 2019 - [info] * Phase 3.3: Determining New Master Phase..

Thu Feb 28 10:15:04 2019 - [info]

Thu Feb 28 10:15:04 2019 - [info] Finding the latest slave that has all relay logs for recovering other slaves..

Thu Feb 28 10:15:04 2019 - [info] All slaves received relay logs to the same position. No need to resync each other.

Thu Feb 28 10:15:04 2019 - [info] Searching new master from slaves..

Thu Feb 28 10:15:04 2019 - [info] Candidate masters from the configuration file:

Thu Feb 28 10:15:04 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:04 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:04 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:04 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:04 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:04 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:04 2019 - [info] Non-candidate masters:

Thu Feb 28 10:15:04 2019 - [info] Searching from candidate_master slaves which have received the latest relay log events..

Thu Feb 28 10:15:04 2019 - [info] New master is 192.168.13.189(192.168.13.189:3306)

Thu Feb 28 10:15:04 2019 - [info] Starting master failover..

Thu Feb 28 10:15:04 2019 - [info]

From:

192.168.13.190(192.168.13.190:3306) (current master)

+--192.168.13.189(192.168.13.189:3306)

+--192.168.13.192(192.168.13.192:3306)

To:

192.168.13.189(192.168.13.189:3306) (new master)

+--192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:15:04 2019 - [info]

Thu Feb 28 10:15:04 2019 - [info] * Phase 3.4: New Master Diff Log Generation Phase..

Thu Feb 28 10:15:04 2019 - [info]

Thu Feb 28 10:15:04 2019 - [info] This server has all relay logs. No need to generate diff files from the latest slave.

Thu Feb 28 10:15:04 2019 - [info] Sending binlog..

Thu Feb 28 10:15:05 2019 - [info] scp from local:/etc/mha/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog to [email protected]:/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog succeeded.

Thu Feb 28 10:15:05 2019 - [info]

Thu Feb 28 10:15:05 2019 - [info] * Phase 3.5: Master Log Apply Phase..

Thu Feb 28 10:15:05 2019 - [info]

Thu Feb 28 10:15:05 2019 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.

Thu Feb 28 10:15:05 2019 - [info] Starting recovery on 192.168.13.189(192.168.13.189:3306)..

Thu Feb 28 10:15:05 2019 - [info] Generating diffs succeeded.

Thu Feb 28 10:15:05 2019 - [info] Waiting until all relay logs are applied.

Thu Feb 28 10:15:05 2019 - [info] done.

Thu Feb 28 10:15:05 2019 - [info] Getting slave status..

Thu Feb 28 10:15:05 2019 - [info] This slave(192.168.13.189)‘s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(on.000001:661). No need to recover from Exec_Master_Log_Pos.

Thu Feb 28 10:15:05 2019 - [info] Connecting to the target slave host 192.168.13.189, running recover script..

Thu Feb 28 10:15:05 2019 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=‘root‘ --slave_host=192.168.13.189 --slave_ip=192.168.13.189 --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog --workdir=/var/tmp --target_version=8.0.13 --timestamp=20190228101458 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58 --slave_pass=xxx

Thu Feb 28 10:15:05 2019 - [info]

Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog on 192.168.13.189:3306. This may take long time...

Applying log files succeeded.

Thu Feb 28 10:15:05 2019 - [info] All relay logs were successfully applied.

Thu Feb 28 10:15:05 2019 - [info] Getting new master‘s binlog name and position..

Thu Feb 28 10:15:06 2019 - [info] mysql-bin.000009:654

Thu Feb 28 10:15:06 2019 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST=‘192.168.13.189‘, MASTER_PORT=3306, MASTER_LOG_FILE=‘mysql-bin.000009‘, MASTER_LOG_POS=654, MASTER_USER=‘slave‘, MASTER_PASSWORD=‘xxx‘;

Thu Feb 28 10:15:06 2019 - [warning] master_ip_failover_script is not set. Skipping taking over new master IP address.

Thu Feb 28 10:15:06 2019 - [info] Setting read_only=0 on 192.168.13.189(192.168.13.189:3306)..

Thu Feb 28 10:15:06 2019 - [info] ok.

Thu Feb 28 10:15:06 2019 - [info] ** Finished master recovery successfully.

Thu Feb 28 10:15:06 2019 - [info] * Phase 3: Master Recovery Phase completed.

Thu Feb 28 10:15:06 2019 - [info]

Thu Feb 28 10:15:06 2019 - [info] * Phase 4: Slaves Recovery Phase..

Thu Feb 28 10:15:06 2019 - [info]

Thu Feb 28 10:15:06 2019 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..

Thu Feb 28 10:15:06 2019 - [info]

Thu Feb 28 10:15:06 2019 - [info] -- Slave diff file generation on host 192.168.13.192(192.168.13.192:3306) started, pid: 11154. Check tmp log /etc/mha/192.168.13.192_3306_20190228101458.log if it takes time..

Thu Feb 28 10:15:07 2019 - [info]

Thu Feb 28 10:15:07 2019 - [info] Log messages from 192.168.13.192 ...

Thu Feb 28 10:15:07 2019 - [info]

Thu Feb 28 10:15:06 2019 - [info] This server has all relay logs. No need to generate diff files from the latest slave.

Thu Feb 28 10:15:07 2019 - [info] End of log messages from 192.168.13.192.

Thu Feb 28 10:15:07 2019 - [info] -- 192.168.13.192(192.168.13.192:3306) has the latest relay log events.

Thu Feb 28 10:15:07 2019 - [info] Generating relay diff files from the latest slave succeeded.

Thu Feb 28 10:15:07 2019 - [info]

Thu Feb 28 10:15:07 2019 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..

Thu Feb 28 10:15:07 2019 - [info]

Thu Feb 28 10:15:07 2019 - [info] -- Slave recovery on host 192.168.13.192(192.168.13.192:3306) started, pid: 11157. Check tmp log /etc/mha/192.168.13.192_3306_20190228101458.log if it takes time..

Thu Feb 28 10:15:09 2019 - [info]

Thu Feb 28 10:15:09 2019 - [info] Log messages from 192.168.13.192 ...

Thu Feb 28 10:15:09 2019 - [info]

Thu Feb 28 10:15:07 2019 - [info] Sending binlog..

Thu Feb 28 10:15:07 2019 - [info] scp from local:/etc/mha/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog to [email protected]:/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog succeeded.

Thu Feb 28 10:15:07 2019 - [info] Starting recovery on 192.168.13.192(192.168.13.192:3306)..

Thu Feb 28 10:15:07 2019 - [info] Generating diffs succeeded.

Thu Feb 28 10:15:07 2019 - [info] Waiting until all relay logs are applied.

Thu Feb 28 10:15:07 2019 - [info] done.

Thu Feb 28 10:15:07 2019 - [info] Getting slave status..

Thu Feb 28 10:15:07 2019 - [info] This slave(192.168.13.192)‘s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(on.000001:661). No need to recover from Exec_Master_Log_Pos.

Thu Feb 28 10:15:07 2019 - [info] Connecting to the target slave host 192.168.13.192, running recover script..

Thu Feb 28 10:15:07 2019 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=‘root‘ --slave_host=192.168.13.192 --slave_ip=192.168.13.192 --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog --workdir=/var/tmp --target_version=8.0.13 --timestamp=20190228101458 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58 --slave_pass=xxx

Thu Feb 28 10:15:08 2019 - [info]

Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog on 192.168.13.192:3306. This may take long time...

Applying log files succeeded.

Thu Feb 28 10:15:08 2019 - [info] All relay logs were successfully applied.

Thu Feb 28 10:15:08 2019 - [info] Resetting slave 192.168.13.192(192.168.13.192:3306) and starting replication from the new master 192.168.13.189(192.168.13.189:3306)..

Thu Feb 28 10:15:08 2019 - [info] Executed CHANGE MASTER.

Thu Feb 28 10:15:08 2019 - [info] Slave started.

Thu Feb 28 10:15:09 2019 - [info] End of log messages from 192.168.13.192.

Thu Feb 28 10:15:09 2019 - [info] -- Slave recovery on host 192.168.13.192(192.168.13.192:3306) succeeded.

Thu Feb 28 10:15:09 2019 - [info] All new slave servers recovered successfully.

Thu Feb 28 10:15:09 2019 - [info]

Thu Feb 28 10:15:09 2019 - [info] * Phase 5: New master cleanup phase..

Thu Feb 28 10:15:09 2019 - [info]

Thu Feb 28 10:15:09 2019 - [info] Resetting slave info on the new master..

Thu Feb 28 10:15:09 2019 - [info] 192.168.13.189: Resetting slave info succeeded.

Thu Feb 28 10:15:09 2019 - [info] Master failover to 192.168.13.189(192.168.13.189:3306) completed successfully.

Thu Feb 28 10:15:09 2019 - [info]

----- Failover Report -----

mha: MySQL Master failover 192.168.13.190(192.168.13.190:3306) to 192.168.13.189(192.168.13.189:3306) succeeded

Master 192.168.13.190(192.168.13.190:3306) is down!

Check MHA Manager logs at ybb-test-02-6:/etc/mha/manager.log for details.

Started automated(non-interactive) failover.

The latest slave 192.168.13.189(192.168.13.189:3306) has all relay logs for recovery.

Selected 192.168.13.189(192.168.13.189:3306) as a new master.

192.168.13.189(192.168.13.189:3306): OK: Applying all logs succeeded.

192.168.13.192(192.168.13.192:3306): This host has the latest relay log events.

Generating relay diff files from the latest slave succeeded.

192.168.13.192(192.168.13.192:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.13.189(192.168.13.189:3306)

192.168.13.189(192.168.13.189:3306): Resetting slave info succeeded.

Master failover to 192.168.13.189(192.168.13.189:3306) completed successfully.

查看192.168.13.192上mysql的master是不是已经成功转变了：

由此可见，MHA已经起到高可用的目的了

4.3 集群小结

一、或者，其他方案？不管哪种方案都是有其场景限制或说规模限制，以及优缺点的。

1、首先反对大家做读写分离，关于这方面的原因解释太多次数（增加技术复杂度、可能导致读到落后的数据等），只说一点：99.8%的业务场景没有必要做读写分离，只要做好数据库设计优化和配置合适正确的主机即可。

2、Keepalived+MySQL --确实有脑裂的问题，还无法做到准确判断mysqld是否HANG的情况；

3、DRBD+Heartbeat+MySQL --同样有脑裂的问题，还无法做到准确判断mysqld是否HANG的情况，且DRDB是不需要的，增加反而会出问题；

3、MySQL Proxy -- 不错的项目，可惜官方半途夭折了，不建议用，无法高可用，是一个写分离；

4、MySQL Cluster -- 社区版本不支持NDB是错误的言论，商用案例确实不多，主要是跟其业务场景要求有关系、这几年发展有点乱不过现在已经上正规了、对网络要求高；

5、MySQL + MHA -- 可以解决脑裂的问题，需要的IP多，小集群是可以的，但是管理大的就麻烦，其次MySQL + MMM 的话且坑很多，有MHA就没必要采用MMM建议：

（1）若是双主复制的模式，不用做数据拆分，那么就可以选择MHA或 Keepalive 或 heartbeat

（2）若是双主复制，还做了数据的拆分，则可以考虑采用Cobar；

二、 MHA是自动的master故障转移和Slave提升的软件包.它是基于标准的MySQL复制(异步/半同步)。该软件由两部分组成：MHA Manager（管理节点）和MHA Node（数据节点）。

1）MHA Manager可以单独部署在一台独立的机器上管理多个master-slave集群，也可以部署在一台slave节点上。MHA Manager会定时探测集群中的node节点,当发现master出现故障的时候,它可以自动将具有最新数据的slave提升为新的master,然后将所有其它的slave导向新的master上.整个故障转移过程对应用程序是透明的。

2）MHA Node运行在每台MySQL服务器上，它通过监控具备解析和清理logs功能的脚本来加快故障转移的。

第5章 错误集锦

5.1 主从报错

1、 Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: ‘Could not find first log file name in binary log index file‘

解决办法：（1）stop slave;（2）reset slave; （3）start slave;若一遍不行，隔一会儿再试一下

reset slave 将使slave 忘记主从复制关系的位置信息。该语句将被用于干净的启动, 它删除master.info文件和relay-log.info 文件以及所有的relay log 文件并重新启用一个新的relaylog文件。使用reset slave之前必须使用stop slave 命令将复制进程停止。

2、若执行此3条命令不行，则使用以下方法

（1）删除所有bin-log日志【find / -name “*bin-log*” -exec ls {} \;】

（2）重新生成bin-log日志文件

change master to

-> master_log_file=‘bin_log.000001‘,

-> master_log_pos=155;

Start slave；

（3）若还是报和之前一样的错误，则：

① stop slave;

② reset slave;

③ start slave;

（4）完成

3、若提示连接超时或用户名、密码错误：

mysql -ubackup -h 2.2.2.2 -p123456，看看在从上能不能连上主mysql。

能：用户名+密码没问题

不能：用户名或密码错误

4、找不到mysql临时密码

[[email protected] mysql]# cat /var/log/mysqld.log |grep password

[[email protected] mysql]#

（1）删除原来安装过的mysql残留的数据（这一步非常重要，问题就出在这）

rm -rf /var/lib/mysql

（3）重启mysqld服务 systemctl restart mysqld

（4）查看mysql日志文件

[[email protected] mysql]# cat /var/log/mysqld.log |grep password

2019-02-25T08:50:58.053021Z 5 [Note] [MY-010454] [Server] A temporary password is generated for [email protected]: X=0gw#AAQdx*

5、User slave does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.

在master创建slave同步用户时未赋予权限，解决办法：

GRANT REPLICATION SLAVE ON *.* TO ‘slave‘@‘%‘;

flush privileges;

6、

这个时候先等一会儿或查看日志，show slave status\G;若没有错误等一会儿错误就会报出来（几分钟），报错如下：

Slave SQL for channel ‘‘: Could not execute Update_rows event on table mysql.user; Can‘t find record in ‘user‘, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event‘s master log mysql-bin.000006, end_log_pos 1768, Error_code: MY-001032

解决办法：

stop slave;

set global sql_slave_skip_counter=1;

start slave;

5.2 集群报错

1、[error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln781] Multi-master configuration is detected, but two or more masters are either writable (read-only is not set) or dead! Check configurations for details. Master configurations are as below:

由于MHA目前支持一主多从模式也就是只支持一个master写，其他master只能读的模式，所以当多个master并存时，只能设置其他master为read-only模式即可

解决办法：在其他master执行mysql -uroot -p -e "set global read_only=1"即可

2、

Mon Mar 4 15:59:33 2019 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln398] 192.168.13.190(192.168.13.190:3306): User slave does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.

Mon Mar 4 15:59:33 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 1403.

Mon Mar 4 15:59:33 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.

这是因为从192.168.13.190上面同步用户slave不存在导致，因为当master挂了之后，任何从都有可能成为master，所以若从没有同步用户slave会导致同步失败。

原文地址：https://www.cnblogs.com/zhangan/p/10898322.html

时间： 2024-10-11 20:41:26

mysql的AB及读写和集群

mysql的AB及读写和集群的相关文章

MyCAT+MySQL 搭建高可用企业级数据库集群

mysql+mycat搭建稳定高可用集群，负载均衡，主备复制，读写分离

mysql+myca搭建稳定高可用集群，负载均衡，主备复制，读写分离

DB层面上的设计分库分表读写分离集群化负载均衡

MHA-结合MySQL半同步复制高可用集群(Centos7)

MyCAT+MySQL 搭建高可用企业级数据库集群——第1章课程介绍

mysql系列之--------mmm高可用集群

使用hive客户端java api读写hive集群上的信息

MyCAT+MySQL 搭建高可用企业级数据库集群——第2章 MyCat入门