mysql的AB及读写和集群

Mysql的AB及读写 

第1章 Mysql的AB配置

1.1 master配置

1.2 slave配置

  • 1.2.1 192.168.13.190
  • 1.2.2 192.168.13.191
  • 1.2.3 192.168.13.192
  • 1.2.4 192.168.13.193
  • 1.2.4 192.168.13.189

第2章 读写分离

2.1 安装mycat

  • 2.1.1 server.xml
  • 2.1.1 schema.xml

2.2 启动mycat

  • 2.2.1 启动mycat
  • 2.2.2 mycat日志

2.3 登录mycat相关问题

2.4 小结

第3章 mysql-proxy

3.1 安装mysql-proxy

3.2 配置mysql-proxy

3.3 启动mysql-proxy和查看

第4章 集群

4.1 版本介绍

4.2 MHA

  • 4.2.1 ssh互信
  • 4.2.1 ABB(一主二从)/AABB(二主多从)架构搭建
  • 4.2.1 MHA安装(管理节点MHA_manager)
  • 4.2.2 MHA安装(其他节点安装MHA_node)
  • 4.2.3 MHA配置
  • 4.2.4 MHA互信和环境检查
  • 4.2.5 启动及相关日志文件
  • 4.2.6 MHA测试

4.3 集群小结

第5章 错误集锦

5.1 主从报错

5.2 集群报错

第1章 Mysql的AB配置

当MySQL的版本是5.5或高于5.5的时候,其中大部分的内容相似

主要是5.5之后不再支持master打头的参数


主|从


IP地址


允许同步的用户


读|写


mysql版本


Mycat版本


JDK版本


操作系统版本


master


192.168.13.189


slave


write


mysql-8.0.13


CentOS  7.4.1708


192.168.13.192


copy


write


mysql-8.0.13


Mycat-server-1.6.6.1


1.8.0_191


CentOS  7.4.1708


slave


192.168.13.190


slave


read


mysql-8.0.13


CentOS  7.4.1708


192.168.13.191


slave


read


mysql-8.0.13


CentOS  7.4.1708


192.168.13.193


copy


read


mysql-8.0.13


CentOS  7.4.1708

1.1 master配置

Master:

在/etc/my.cnf 添加:

server-id = 1

log-bin=/home/mysql/logs/binlog/bin-log(若报错则不加路径,如下)

log-bin=bin-log

max_binlog_size = 500M

binlog_cache_size = 128K

binlog-do-db = adb

binlog-ignore-db = mysql

log-slave-updates

expire_logs_day=2(在centos7.3上易报错,若报错,可不加

binlog_format="MIXED"

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

server-id = 1 #服务器标志号,注意在配置文件中不能出现多个这样的标识,如果出现多个的话mysql以第一个为准,一组主从中此标识号不能重复。

log-bin=/home/mysql/logs/binlog/bin-log #开启bin-log,并指定文件目录和文件名前缀。

max_binlog_size = 500M #每个bin-log最大大小,当此大小等于500M时会自动生成一个新的日志文件。一条记录不会写在2个日志文件中,所以有时日志文件会超过此大小。

binlog_cache_size = 128K #日志缓存大小

binlog-do-db = adb #需要同步的数据库名字,如果是多个,就以此格式在写一行即可。

binlog-ignore-db = mysql  #不需要同步的数据库名字,如果是多个,就以此格式在写一行即可。

log-slave-updates  #当Slave从Master数据库读取日志时更新新写入日志中,如果只启动log-bin 而没有启动log-slave-updates则Slave只记录针对自己数据库操作的更新。

expire_logs_day=2 #设置bin-log日志文件保存的天数,此参数mysql5.0以下版本不支持。

binlog_format=”MIXED”   #设置bin-log日志文件格式为:MIXED,可以防止主键重复。

接着重启MySQL:/etc/init.d/mysqld restart

再进入MySQL:mysql -u root -p123456

创建backup用户:

创建一个用于让从数据库连接的用户(以前那种方式不能用了)

mysql> CREATE USER ‘copy‘@‘%‘ IDENTIFIED WITH mysql_native_password BY ‘tqw961110‘;(密码强度可能不够会报错,[email protected]@)或下面一个

mysql>CREATE USER ‘slave‘@‘%‘ IDENTIFIED BY ‘[email protected]@‘;

mysql>grant all on *.* to ‘copy‘@‘%‘ identified by "[email protected]"     #这个方式也可以

mysql>flush privileges #刷新授权表

mysql> show master status;

Master的操作就完了

【server_id千万不要一样,否则会报错】

mysql> show master status;【192.168.13.192

+------------------+----------+--------------+------------------+-------------------+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+------------------+----------+--------------+------------------+-------------------+

| mysql-bin.000005 |      610 | mysql        |                  |                   |

+------------------+----------+--------------+------------------+-------------------+

1 row in set (0.00 sec)

mysql> show master status;【192.168.13.189

+------------------+----------+--------------+------------------+-------------------+

| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |

+------------------+----------+--------------+------------------+-------------------+

| mysql-bin.000006 |     3490 |              |                  |                   |

+------------------+----------+--------------+------------------+-------------------+

1 row in set (0.00 sec)

1.2 slave配置步骤

在/etc/my.cnf 添加:

server_id=2

replicate-do-db=database1           #要同步的数据库的名字

replicate-ignore-db=mysql       #被忽略的数据库

接着重启MySQL:/etc/init.d/mysqld restart

再进入MySQL:mysql -u root -p123456

同步执行此操作:

mysql> change master to

-> master_host=‘2.2.2.2‘,

-> master_user=‘backup‘,

-> master_password=‘123456‘,

-> master_log_file=‘bin_log.000003‘,          # master上记录的日志文件

-> master_log_pos=120;                  # master上记录的日志文件位置

Query OK, 0 rows affected, 2 warnings (0.06 sec)

同步开启:start slave;

同步关闭:stop slave;

最后开启同步 start slave即可

1.2.1 192.168.13.190

mysql> change master to

-> master_host=‘192.168.13.189‘,

-> master_user=‘slave‘,

-> master_password=‘XINyang3009‘,

-> master_log_file=‘mysql-bin.000006‘,

-> master_log_pos=3490;

Query OK, 0 rows affected, 2 warnings (0.06 sec)

mysql> start slave;

mysql> show slave status\G;

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.13.189

Master_User: slave

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000006

Read_Master_Log_Pos: 3490

Relay_Log_File: ybb-test-mysql-2-relay-bin.000012

Relay_Log_Pos: 3704

Relay_Master_Log_File: mysql-bin.000006

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB: mysql

  Slave_IO_Running: Yes(必须是yes)

            Slave_SQL_Running: Yes(必须是yes)

Replicate_Do_DB: mysql

1.2.2 192.168.13.191

mysql> show slave status\G;

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.13.189

Master_User: slave

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000006

Read_Master_Log_Pos: 3490

Relay_Log_File: ybb-test-mysql-2-relay-bin.000012

Relay_Log_Pos: 3704

Relay_Master_Log_File: mysql-bin.000006

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB: mysql

  Slave_IO_Running: Yes(必须是yes)

            Slave_SQL_Running: Yes(必须是yes)

Replicate_Do_DB: mysql

1.2.3 192.168.13.192

1.2.4 192.168.13.193

1.2.4 192.168.13.189

至此,二主三从搭建完成。

#Slave_IO_Running:连接到主库,并读取主库的日志到本地,生成本地日志文件
#Slave_SQL_Running:读取本地日志文件,并执行日志里的SQL命令。

2读写分离

在这里读写分离中间件用的是mycat,部署在192.168.13.192上

2.1 安装mycat

1、安装JDK,mycat依赖Java环境

2、解压Mycat-server-1.6.6.1-release-20181031195535-linux.tar.gz,同时修改环境变量 /etc/profile

[[email protected] conf]# cat /etc/profile

3、进入配置文件目录cd ./mycat/conf

(1)server.xml是登陆mycat的用户进行配置的配置文件

(2)schema.xml 对真实数据库进行配置的配置文件

2.1.1 server.xml

[[email protected] conf]# cat server.xml

97          <user name="root" defaultAccount="true">

99                  <property name="password">[email protected]@</property>

100                  <property name="schemas">TESTDB</property>

101

102                  <!-- 表级 DML 权限设置 -->

103                  <!--

104                  <privileges check="false">

105                          <schema name="TESTDB" dml="0110" >

106                                  <table name="tb01" dml="0000"></table>

107                                  <table name="tb02" dml="1111"></table>

108                          </schema>

109                  </privileges>

110                   -->

111          </user>

112

113       <user name="user">

114                  <property name="password">user</property>

115                  <property name="schemas">TESTDB</property>

116

117          </user>

这里配置了两个可以来连接的用户

用户1  test   密码test   给予了此用户TESTDB数据库的权限

用户2  user   密码user   给予了此用户TESTDB数据库的只读权限

注意这里的TESTDB 不一定是你数据库上的真实库名.可以任意指定.只要和接下来的schema.xml的配置文件中的库名统一即可

2.1.1 schema.xml

[[email protected] conf]# cat schema.xml

<?xml version="1.0"?>

<!DOCTYPE mycat:schema SYSTEM "schema.dtd">

<mycat:schema xmlns:mycat="http://io.mycat/">

<schema name="TESTDB" checkSQLschema="false" sqlMaxLimit="100" dataNode="dn1"></schema>

<dataNode name="dn1" dataHost="write_master_1" database="mysql" />

<dataHost name="write_master_1" maxCon="1000" minCon="10" balance="0"

writeType="0" dbType="mysql" dbDriver="native" switchType="1"  slaveThreshold="100">

<heartbeat>select user()</heartbeat>

<writeHost host="write_master_1" url="192.168.13.189:3306" user="root" password="[email protected]@">

<writeHost host="write_master_1" url="192.168.13.192:3306" user="root" password="[email protected]@">

<readHost host="read_slave_1" url="192.168.13.190:3306" user="root" password="[email protected]@" />

<readHost host="read_slave_1" url="192.168.13.191:3306" user="root" password="[email protected]@" />

<readHost host="read_slave_2" url="192.168.13.193:3306" user="root" password="[email protected]@" />

</writeHost>

</dataHost>

</mycat:schema>

1、<schema name="TESTDB" checkSQLschema="false" sqlMaxLimit="100" dataNode="dn1">

这里TESTDB 就是我们对外声称的我们有数据库的名称  必须和server.xml中的用户指定的数据库名称一致,添加一个dataNode="dn1" 是指定了我们这个库只在dn1上.没有进行分库

2、<dataNode name="dn1" dataHost="localhost1" database="db1" />

这里只需要改database的名字 db1 就是你真实数据库服务上的数据库名,根据你自己的数据库名进行修改.

3、<dataHost name="localhost1" maxCon="1000" minCon="10" balance="0" writeType="0" dbType="mysql" dbDriver="native" switchType="1"  slaveThreshold="100">

这里只需要配置三个地方    balance="1"与writeType="0" ,switchType=”1”

a. balance 属性负载均衡类型,目前的取值有 4 种:

  1. balance="0", 不开启读写分离机制,所有读操作都发送到当前可用的 writeHost 上。

  2. balance="1",全部的 readHost 与 stand by writeHost 参与 select 语句的负载均衡,简单的说,当双主双从模式(M1 ->S1 , M2->S2,并且 M1 与 M2 互为主备),正常情况下, M2,S1,S2 都参与 select 语句的负载均衡。

  3. balance="2",所有读操作都随机的在 writeHost、 readhost 上分发。

  4. balance="3", 所有读请求随机的分发到 wiriterHost 对应的 readhost 执行,writerHost 不负担读压力,注意 balance=3 只在 1.4 及其以后版本有, 1.3 没有。

b. writeType 属性

  负载均衡类型,目前的取值有 3 种:

  1. writeType="0", 所有写操作发送到配置的第一个 writeHost,第一个挂了切到还生存的第二个

writeHost,重新启动后已切换后的为准,切换记录在配置文件中:dnindex.properties .

  2. writeType="1",所有写操作都随机的发送到配置的 writeHost。

  3. writeType="2",没实现。

c. switchType 属性

      1: 表示不自动切换

    1 :默认值,自动切换

    2 :基于MySQL 主从同步的状态决定是否切换

4、<writeHost host="hostM1" url="192.168.1.100:3306" user="root" password="123456">

<readHost host="hostS1" url="192.168.1.101:3306" user="root" password="123456" />

</writeHost>

这里是配置的我们的两台读写服务器IP地址访问端口和 访问用户的用户名和密码

2.2 启动mycat

2.2.1 启动mycat

到bin目录下,执行./mycat  restart

[[email protected] bin]# ps -aux|grep mycat

root     30319  0.0  0.0  17816   744 ?        Sl   15:51   0:00 /usr/local/mysql/mycat/bin/./wrapper-linux-x86-64 /usr/local/mysql/mycat/conf/wrapper.conf wrapper.syslog.ident=mycat wrapper.pidfile=/usr/local/mysql/mycat/logs/mycat.pid wrapper.daemonize=TRUE wrapper.lockfile=/var/lock/subsys/mycat

root     30321  3.5  6.5 6927816 253020 ?      Sl   15:51   0:03 java -DMYCAT_HOME=. -server -XX:MaxPermSize=64M -XX:+AggressiveOpts -XX:MaxDirectMemorySize=2G -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=1984 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Xmx4G -Xms1G -Djava.library.path=lib -classpath lib/wrapper.jar:conf:lib/asm-4.0.jar:lib/commons-collections-3.2.1.jar:lib/commons-lang-2.6.jar:lib/curator-client-2.11.0.jar:lib/curator-framework-2.11.0.jar:lib/curator-recipes-2.11.0.jar:lib/disruptor-3.3.4.jar:lib/dom4j-1.6.1.jar:lib/druid-1.0.26.jar:lib/ehcache-core-2.6.11.jar:lib/fastjson-1.2.12.jar:lib/guava-19.0.jar:lib/hamcrest-core-1.3.jar:lib/hamcrest-library-1.3.jar:lib/jline-0.9.94.jar:lib/joda-time-2.9.3.jar:lib/jsr305-2.0.3.jar:lib/kryo-2.10.jar:lib/leveldb-0.7.jar:lib/leveldb-api-0.7.jar:lib/libwrapper-linux-ppc-64.so:lib/libwrapper-linux-x86-32.so:lib/libwrapper-linux-x86-64.so:lib/log4j-1.2-api-2.5.jar:lib/log4j-1.2.17.jar:lib/log4j-api-2.5.jar:lib/log4j-core-2.5.jar:lib/log4j-slf4j-impl-2.5.jar:lib/mapdb-1.0.7.jar:lib/minlog-1.2.jar:lib/mongo-java-driver-2.11.4.jar:lib/Mycat-server-1.6.6.1-release.jar:lib/mysql-binlog-connector-java-0.16.1.jar:lib/mysql-connector-java-5.1.35.jar:lib/netty-3.7.0.Final.jar:lib/netty-buffer-4.1.9.Final.jar:lib/netty-common-4.1.9.Final.jar:lib/objenesis-1.2.jar:lib/reflectasm-1.03.jar:lib/sequoiadb-driver-1.12.jar:lib/slf4j-api-1.6.1.jar:lib/univocity-parsers-2.2.1.jar:lib/velocity-1.7.jar:lib/wrapper.jar:lib/zookeeper-3.4.6.jar -Dwrapper.key=aBVwhEZHoe1K2Vpw -Dwrapper.port=32000 -Dwrapper.jvm.port.min=31000 -Dwrapper.jvm.port.max=31999 -Dwrapper.pid=30319 -Dwrapper.version=3.2.3 -Dwrapper.native_library=wrapper -Dwrapper.service=TRUE -Dwrapper.cpu.timeout=10 -Dwrapper.jvmid=1 org.tanukisoftware.wrapper.WrapperSimpleApp io.mycat.MycatStartup start

root     30458  0.0  0.0 112676   956 pts/2    S+   15:53   0:00 grep --color=auto mycat

[[email protected] bin]# netstat -anlpt|grep :*066

tcp6    0    0 :::8066        :::*                    LISTEN      30321/java

tcp6    0    0 :::9066        :::*                    LISTEN      30321/java

[[email protected] bin]#

8066:数据端口

9066:管理端口

命令行的登陆是通过 9066 管理端口来操作。

2.2.2 mycat日志

[[email protected] logs]# cd /usr/local/mysql/mycat/logs

warpper 日志:mycat启动,停止,添加为服务等都会记录到此日志文件,如果系统环境配置错误或缺少配置时,导致 Mycat 无法启动,可以通过查看 warrpper.log 定位具体错误原因。

mycat.log:为mycat主要日志文件,记录了启动时分配的相关buffer信息,数据源连接信息,连接池,动态类加载信息等等在log4j.xml文件中进行相关配置,如保留个数,大小,字符集,日志文件大小等。非启动状态下可以删除,启动后会自动生成该日志文件

2.3 登录mycat相关问题

命令:mysql -uroot [email protected]@ -h192.168.13.192 -P9066

[[email protected] conf]# mysql -uroot [email protected]@ -h192.168.13.192 -P9066

mysql: [Warning] Using a password on the command line interface can be insecure.

ERROR 1045 (HY000): Access denied for user ‘root‘, because password is error

特别注意:本次实验的mysql版本均为mysql-8.0.13,此故障折腾了我两三天,以上的配置查看了无数遍,后来查阅相关资料才知道目前MyCat仍主要面对MySql 5.5, 5.6, 5.7版,对最新的MySql 8尚未完全支持,需要用户对MySql 8和MyCat的配置进行一系列的修改。其中对mycat的登陆方式有改变,Mycat登录逻辑库的传统方式是:mysql -uroot -p -h127.0.0.1 -P8066 -DTESTDB,对于MySql 8,会报密码错误方式,这是由于Mysql 8的缺省加密方式已经改为caching_sha2_password,而MyCat对此尚不支持。当然可以通过修改MySQL8的加密方式使其通过,但未对MySQL8完全支持,所以会在功能上也会大打折扣,目前mycat官网并没有对MySQL8版本的详细文档,官网的文档目前对MySQL5.X依旧适用。

之前做实验之所以成功是因为我在云上也装了一台mysql 5.6,并且也是用5.6做的主从,当时做完之后没有用用mysql 8把版本做连接测试,用的是5.6版本,而此次用mysql8版本连接测试显示始终失败,还折腾两天左右,我才想到可能是版本的问题,为此我又在mysql5.6版本上重新做了一遍验证一下是否是由于版本太新导致mycat不支持从而出现的错误,于是又开始折腾很久。

1、版本简介:


OS版本


IP


读|写


MySQL版本


mycat版本


CentOS  7.2


2.2.2.10


读|写


5.6.19


Mycat-server-1.6.6.1

为尽快达到实验目的,只在一台虚机安装MySQL(时长约3h),并读写也在该主机上。

2、安装MySQL 5.6(详细过程略)

3、配置文件server.xml和schema.xml

4、mycat启动

到bin目录下,执行./mycat  restart

5、登录

[[email protected] conf]# mysql -uroot [email protected]@ -h2.2.2.10 -P9066

Warning: Using a password on the command line interface can be insecure.

Welcome to the MySQL monitor.  Commands end with ; or \g.

Your MySQL connection id is 2

Server version: 5.6.29-mycat-1.6.6.1-release-20181031195535 MyCat Server (monitor)

Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its

affiliates. Other names may be trademarks of their respective owners.

Type ‘help;‘ or ‘\h‘ for help. Type ‘\c‘ to clear the current input statement.

mysql>

当出现mycat版本就说明登录成功。

6、查看当前可读写的入口

mysql> show @@datasource;

+----------+--------+-------+----------+------+------+--------+------+------+---------+-----------+---

| DATANODE | NAME   | TYPE  | HOST     | PORT | W/R  | ACTIVE | IDLE | SIZE | EXECUTE | READ_LOAD | WRITE_LOAD |

+----------+--------+-------+----------+------+------+--------+------+------+---------+-----------+---| dn1      | write1 | mysql | 2.2.2.10 | 3306 | W    |      0 |   10 | 1000 |     442 |         2 |          1 |

| dn1      | write1 | mysql | 2.2.2.10 | 3306 | R    |      0 |    8 | 1000 |     436 |         0 |          0 |

+----------+--------+-------+----------+------+------+--------+------+------+---------+-----------+---

2 rows in set (0.02 sec)

7、知识点总结

8、总结

由上可知,最新的mycat并不支持最新版本的mysql,也许能够连上去但在并未完全开放之前,相信很多mycat的功能会受限。

经过第二次试验可知,在mysql 8上线的配置并没有错,只不过是由于mycat的不支持最新的mysql版本导致,所以等mycat支持mysql 8的时候就可以生效了

2.4 小结

由于mycat目前并不全面支持mysql最新版本,所以目前读写分离我选的是mysql官方推荐的mysql-proxy,功能虽小但基于mysql 8版本的过度期还是一款不错的产品。

3mysql-proxy

3.1 安装mysql-proxy

1、使用本地镜像安装相关环境:

yum install -y gcc* gcc-c++* autoconf* automake* zlib* libxml* ncurses-devel* libmcrypt* libtool* flex* pkgconfig* libevent* glib*

2、下载mysql-proxy-0.8.5-linux-glibc2.3-x86-64bit.tar

3、解压安装

4、环境变量

[[email protected] ~]# vi .bash_profile

PATH=$PATH:$HOME/bin:/mysqlprox/proxy/bin

3.2 配置mysql-proxy

mkdir /home/mysql-proxy/logs 【日志文件位置】
mkdir /home/mysql-proxy/lua  【脚本位置】

cd /usr/local/mysql-proxy 【mysql-proxy安装位置】
cp share/doc/mysql-proxy/rw-splitting.lua  ./lua 【复制读写分离配置文件】

vi /etc/mysql-proxy.cnf 【创建配置文件】

[mysql-proxy]

user=root 【运行mysql-proxy用户】

admin-username=proxy   【主从mysql共有的用户】

admin-password=123456 【用户的密码】

proxy-address=192.168.11.31:4040【 mysql-proxy运行ip和端口,不加端口,默认4040】

proxy-read-only-backend-addresses=192.168.11.31 【指定后端从slave读取数据】

proxy-backend-addresses=192.168.11.34 【指定后端主master写入数据】

proxy-lua-script=/usr/local/mysql-proxy/lua/rw-splitting.lua 【指定读写分离配置文件位置】

admin-lua-script=/usr/local/mysql-proxy/lua/admin.lua 【指定管理脚本】

log-file=/var/log/mysql-proxy.log

log-level=info 【定义log日志级别】

daemon=true【以守护进程方式运行】

keepalive=true 【mysql-proxy崩溃时,尝试重启】

(1)(critical) Key file contains key “keepalive” which has a value that cannot be interpreted.

daemon=1

keepalive=1

<1> 给配置文件执行权限
chmod 660 /etc/mysql-porxy.cnf

<2> 修改读写分离配置文件

vim /usr/local/mysql-proxy/lua/rw-splitting.luaif not proxy.global.config.rwsplit

min_idle_connections = 1, 【默认超过4个连接数时,才开始读写分离,改为1】

max_idle_connections = 1, 【默认8,改为1】

3.3 启动mysql-proxy和查看

启动:/usr/local/mysql-proxy/bin/mysql-proxy --defaults-file=/etc/mysql-proxy.cnf

查看:netstat -tupln | grep 4000

这样一个读写分离就完全了

第4章 集群

目前Mysql集群用的MHA或keepalive的方式做的,但都通过VIP进行管理,若是在云上做集群会有一些隐患。和H3C云工程师进行交流了解,他们不建议我们在云上做任何的集群,因为集群有心跳检测,会与虚机的备份有冲突,当虚机进行备份时会造成心跳停止,导致集群报错,解决办法就是不使用虚机备份,使用其他方式进行备份方式。中国经济网就是采用虚机做集群但不采用虚机备份的方式。

其他备份方式与虚机备份又有区别:

(1)其他备份方式:只能备份指定的数据,不能备份虚机的相关信息。

(2)虚机备份:不仅能备份指定的数据还能备份虚机的相关配置信息,在虚机崩溃时还可以进行虚机恢复。

4.1 版本介绍


属性


IP地址


允许同步的用户


读|写


mysql版本


MHA版本


操作系统版本


master


192.168.13.189


slave


write


mysql-8.0.13


0.54


CentOS  7.4


 


192.168.13.190


slave


read


mysql-8.0.13


0.54


CentOS  7.4


slave


192.168.13.192


slave


read


mysql-8.0.13


0.58


CentOS  7.4


MHA_manager


192.168.13.171


 


 


mysql-8.0.13


0.58


CentOS  7.4

MHA实现机制:

(1)监控AB的状态

(2)完整的选举机制(看谁的数据跟master最接近)

(3)让一个B切换到新A

(4)保证数据的完整性(通过差异还原)

因为AB复制是异步复制,所以可能有一些数据尚没有被B拉到其relay_log中,即AB数据不一致,MHA是怎样解决这种情况的呢?

(1)mha_manager 使用scp命令将A当前binlog拷贝到mha_manager

(2)待新A(选举:依据谁的relay_log新)产生后,mha_manager再拿着旧A的binlog和新的relay_log做比对,并进行差异还原以保证新A和旧A数据的一致性

(3)mha_manager最后拿着老A的binlog去找复制组中其他B 做差异还原,保证数据的一致性

4.2 MHA

实验步骤:

1)ssh证书户信任(ssh的密钥验证)

2)ABB/AABB架构

3)安装mha_manager、mha_node

4)测试

4.2.1 ssh互信

(1) ssh-keygen -t rsa

ssh-copy-id -i  ~/.ssh/id_rsa.pub root@192.168.13.189

ssh-copy-id -i  ~/.ssh/id_rsa.pub root@192.168.13.190

ssh-copy-id -i  ~/.ssh/id_rsa.pub root@192.168.13.192

ssh-copy-id -i  ~/.ssh/id_rsa.pub root@192.168.13.171

(2)在192.168.13.189、192.168.13.190、192.168.13.192、192.168.13.171主机上将(1)都执行一遍即可,实现这四台主机ssh连接不需要密码

4.2.1 ABB(一主二从)/AABB(二主多从)架构搭建

4.2.1 MHA安装(管理节点MHA_manager)

(1)安装依赖包

[[email protected] mha]# ls dependent1

perl-Config-Tiny-2.14-7.el7.noarch.rpm          perl-Mail-Sender-0.8.23-1.el7.noarch.rpm   perl-MIME-Types-1.38-2.el7.noarch.rpm

perl-Email-Date-Format-1.002-15.el7.noarch.rpm  perl-Mail-Sendmail-0.79-21.el7.noarch.rpm  perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm

perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm       perl-MIME-Lite-3.030-1.el7.noarch.rpm      perl-Params-Validate-1.08-4.el7.x86_64.rpm

[[email protected] dependent1]# yum -y localinstall ./*

已加载插件:fastestmirror

正在检查 ./perl-Config-Tiny-2.14-7.el7.noarch.rpm: perl-Config-Tiny-2.14-7.el7.noarch

./perl-Config-Tiny-2.14-7.el7.noarch.rpm:不更新已安装的软件包。

正在检查 ./perl-Email-Date-Format-1.002-15.el7.noarch.rpm: perl-Email-Date-Format-1.002-15.el7.noarch

./perl-Email-Date-Format-1.002-15.el7.noarch.rpm:不更新已安装的软件包。

正在检查 ./perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm: perl-Log-Dispatch-2.41-1.el7.1.noarch

./perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm:不更新已安装的软件包。

正在检查 ./perl-Mail-Sender-0.8.23-1.el7.noarch.rpm: perl-Mail-Sender-0.8.23-1.el7.noarch

./perl-Mail-Sender-0.8.23-1.el7.noarch.rpm:不更新已安装的软件包。

正在检查 ./perl-Mail-Sendmail-0.79-21.el7.noarch.rpm: perl-Mail-Sendmail-0.79-21.el7.noarch

./perl-Mail-Sendmail-0.79-21.el7.noarch.rpm:不更新已安装的软件包。

正在检查 ./perl-MIME-Lite-3.030-1.el7.noarch.rpm: perl-MIME-Lite-3.030-1.el7.noarch

./perl-MIME-Lite-3.030-1.el7.noarch.rpm:不更新已安装的软件包。

正在检查 ./perl-MIME-Types-1.38-2.el7.noarch.rpm: perl-MIME-Types-1.38-2.el7.noarch

./perl-MIME-Types-1.38-2.el7.noarch.rpm:不更新已安装的软件包。

正在检查 ./perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm: perl-Parallel-ForkManager-1.18-2.el7.noarch

./perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm:不更新已安装的软件包。

正在检查 ./perl-Params-Validate-1.08-4.el7.x86_64.rpm: perl-Params-Validate-1.08-4.el7.x86_64

./perl-Params-Validate-1.08-4.el7.x86_64.rpm:不更新已安装的软件包。

无须任何处理

以上这些包均已安装过所以显示不更新已安装的软件包

(2)安装MHA相关组件包

[[email protected] mha]# rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm

准备中...                          ################################# [100%]

软件包 mha4mysql-node-0.58-0.el7.centos.noarch 已经安装

[[email protected] mha]# rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

准备中...                          ################################# [100%]

软件包 mha4mysql-manager-0.58-0.el7.centos.noarch 已经安装

[[email protected] mha]#

4.2.2 MHA安装(其他节点安装MHA_node)

192.168.13.189(mha_node)

192.168.13.190(mha_node)

192.168.13.192(mha_node)

在mha_node安装mha4mysql-node-0.58-0.el7.centos.noarch.rpm,【注意:MHA_node版本不论6或7都可以使用低版本】

至此,MHA_node安装完成

4.2.3 MHA配置

(1)mkdir /etc/mha ——创建目录

(2)vi /etc/mha/mha.cnf ——编写MHA配置文件

[[email protected] mha]# cat mha.cnf

[server default]

#mysql_admin and password

user=root

[email protected]@

#mha work_dir and mha_log

manager_workdir=/etc/mha

manager_log=/etc/mha/manager.log

#ssh connection account

ssh_user=root

#AB copy account and password

repl_user=slave

[email protected]@

ping_interval=1

#mysql server

[server1]

hostname=192.168.13.189

ssh_port=22

master_binlog_dir=/var/lib/mysql

candidate_master=1

[server2]

hostname=192.168.13.190

ssh_port=22

master_binlog_dir=/var/lib/mysql

candidate_master=1

[server4]

hostname=192.168.13.192

ssh_port=22

master_binlog_dir=/var/lib/mysql

candidate_master=1

[[email protected] mha]#

4.2.4 MHA互信和环境检查

(1)互信检查:masterha_check_ssh --conf=/etc/mha/mha.cnf

[[email protected] mha]# masterha_check_ssh --conf=/etc/mha/mha.cnf

Thu Feb 28 09:50:57 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu Feb 28 09:50:57 2019 - [info] Reading application default configuration from /etc/mha/mha.cnf..

Thu Feb 28 09:50:57 2019 - [info] Reading server configuration from /etc/mha/mha.cnf..

Thu Feb 28 09:50:57 2019 - [info] Starting SSH connection tests..

Thu Feb 28 09:50:58 2019 - [debug]

Thu Feb 28 09:50:57 2019 - [debug]  Connecting via SSH from [email protected](192.168.13.189:22) to [email protected](192.168.13.190:22)..

Thu Feb 28 09:50:58 2019 - [debug]   ok.

Thu Feb 28 09:50:58 2019 - [debug]  Connecting via SSH from [email protected](192.168.13.189:22) to [email protected](192.168.13.192:22)..

Thu Feb 28 09:50:58 2019 - [debug]   ok.

Thu Feb 28 09:50:59 2019 - [debug]

Thu Feb 28 09:50:57 2019 - [debug]  Connecting via SSH from [email protected](192.168.13.190:22) to [email protected](192.168.13.189:22)..

Thu Feb 28 09:50:58 2019 - [debug]   ok.

Thu Feb 28 09:50:58 2019 - [debug]  Connecting via SSH from [email protected](192.168.13.190:22) to [email protected](192.168.13.192:22)..

Thu Feb 28 09:50:59 2019 - [debug]   ok.

Thu Feb 28 09:51:00 2019 - [debug]

Thu Feb 28 09:50:58 2019 - [debug]  Connecting via SSH from [email protected](192.168.13.192:22) to [email protected](192.168.13.189:22)..

Thu Feb 28 09:50:59 2019 - [debug]   ok.

Thu Feb 28 09:50:59 2019 - [debug]  Connecting via SSH from [email protected](192.168.13.192:22) to [email protected](192.168.13.190:22)..

Thu Feb 28 09:51:00 2019 - [debug]   ok.

Thu Feb 28 09:51:00 2019 - [info] All SSH connection tests passed successfully.

[[email protected] mha]#

(2)环境健康检查:masterha_check_repl --conf=/etc/mha/mha.cnf

[[email protected] mha]# masterha_check_repl --conf=/etc/mha/mha.cnf

Thu Feb 28 09:53:58 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu Feb 28 09:53:58 2019 - [info] Reading application default configuration from /etc/mha/mha.cnf..

Thu Feb 28 09:53:58 2019 - [info] Reading server configuration from /etc/mha/mha.cnf..

Thu Feb 28 09:53:58 2019 - [info] MHA::MasterMonitor version 0.58.

Thu Feb 28 09:53:59 2019 - [info] Multi-master configuration is detected. Current primary(writable) master is 192.168.13.190(192.168.13.190:3306)——可写

Thu Feb 28 09:53:59 2019 - [info] Master configurations are as below:

Master 192.168.13.190(192.168.13.190:3306), replicating from 192.168.13.189(192.168.13.189:3306)

Master 192.168.13.189(192.168.13.189:3306), replicating from 192.168.13.190(192.168.13.190:3306), read-only——可读(双主模式,只能有一个可写)

Thu Feb 28 09:53:59 2019 - [info] GTID failover mode = 0

Thu Feb 28 09:53:59 2019 - [info] Dead Servers:

Thu Feb 28 09:53:59 2019 - [info] Alive Servers:

Thu Feb 28 09:53:59 2019 - [info]   192.168.13.189(192.168.13.189:3306)

Thu Feb 28 09:53:59 2019 - [info]   192.168.13.190(192.168.13.190:3306)

Thu Feb 28 09:53:59 2019 - [info]   192.168.13.192(192.168.13.192:3306)

Thu Feb 28 09:53:59 2019 - [info] Alive Slaves:

Thu Feb 28 09:53:59 2019 - [info]   192.168.13.189(192.168.13.189:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 09:53:59 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 09:53:59 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 09:53:59 2019 - [info]   192.168.13.192(192.168.13.192:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 09:53:59 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 09:53:59 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 09:53:59 2019 - [info] Current Alive Master: 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 09:53:59 2019 - [info] Checking slave configurations..

Thu Feb 28 09:53:59 2019 - [warning]  relay_log_purge=0 is not set on slave 192.168.13.189(192.168.13.189:3306).

Thu Feb 28 09:53:59 2019 - [warning]  relay_log_purge=0 is not set on slave 192.168.13.192(192.168.13.192:3306).

Thu Feb 28 09:53:59 2019 - [info] Checking replication filtering settings..

Thu Feb 28 09:53:59 2019 - [info]  binlog_do_db= , binlog_ignore_db=

Thu Feb 28 09:53:59 2019 - [info]  Replication filtering check ok.

Thu Feb 28 09:53:59 2019 - [info] GTID (with auto-pos) is not supported

Thu Feb 28 09:53:59 2019 - [info] Starting SSH connection tests..

Thu Feb 28 09:54:02 2019 - [info] All SSH connection tests passed successfully.

Thu Feb 28 09:54:02 2019 - [info] Checking MHA Node version..

Thu Feb 28 09:54:03 2019 - [info]  Version check ok.

Thu Feb 28 09:54:03 2019 - [info] Checking SSH publickey authentication settings on the current master..

Thu Feb 28 09:54:04 2019 - [info] HealthCheck: SSH to 192.168.13.190 is reachable.

Thu Feb 28 09:54:04 2019 - [info] Master MHA Node version is 0.54.

Thu Feb 28 09:54:04 2019 - [info] Checking recovery script configurations on 192.168.13.190(192.168.13.190:3306)..

Thu Feb 28 09:54:04 2019 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=on.000001

Thu Feb 28 09:54:04 2019 - [info]   Connecting to [email protected](192.168.13.190:22)..

Creating /var/tmp if not exists..    ok.

Checking output directory is accessible or not..

ok.

Binlog found at /var/lib/mysql, up to on.000001

Thu Feb 28 09:54:05 2019 - [info] Binlog setting check done.

Thu Feb 28 09:54:05 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..

Thu Feb 28 09:54:05 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user=‘root‘ --slave_host=192.168.13.189 --slave_ip=192.168.13.189 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-1-relay-bin.000002  --slave_pass=xxx

Thu Feb 28 09:54:05 2019 - [info]   Connecting to [email protected](192.168.13.189:22)..

Checking slave recovery environment settings..

Relay log found at /var/lib/mysql, up to ybb-test-mysql-1-relay-bin.000002

Temporary relay log file is /var/lib/mysql/ybb-test-mysql-1-relay-bin.000002

Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.

done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Thu Feb 28 09:54:05 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user=‘root‘ --slave_host=192.168.13.192 --slave_ip=192.168.13.192 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-4-relay-bin.000002  --slave_pass=xxx

Thu Feb 28 09:54:05 2019 - [info]   Connecting to [email protected](192.168.13.192:22)..

Checking slave recovery environment settings..

Relay log found at /var/lib/mysql, up to ybb-test-mysql-4-relay-bin.000002

Temporary relay log file is /var/lib/mysql/ybb-test-mysql-4-relay-bin.000002

Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.

Testing mysql connection and privileges..

mysql: [Warning] Using a password on the command line interface can be insecure.

done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Thu Feb 28 09:54:06 2019 - [info] Slaves settings check done.

Thu Feb 28 09:54:06 2019 - [info]

192.168.13.190(192.168.13.190:3306) (current master)

 +--192.168.13.189(192.168.13.189:3306)

 +--192.168.13.192(192.168.13.192:3306)

Thu Feb 28 09:54:06 2019 - [info] Checking replication health on 192.168.13.189..

Thu Feb 28 09:54:06 2019 - [info]  ok.

Thu Feb 28 09:54:06 2019 - [info] Checking replication health on 192.168.13.192..

Thu Feb 28 09:54:06 2019 - [info]  ok.

Thu Feb 28 09:54:06 2019 - [warning] master_ip_failover_script is not defined.

Thu Feb 28 09:54:06 2019 - [warning] shutdown_script is not defined.

Thu Feb 28 09:54:06 2019 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

[[email protected] mha]#

4.2.5 启动及相关日志文件

(1)启动:

nohup masterha_manager --conf=/etc/mha/mha.cnf > /tmp/mha_manager.log < /dev/null 2>&1 &

[[email protected] mha]#nohup masterha_manager --conf=/etc/mha/mha.cnf > /tmp/mha_manager.log </dev/null 2>&1 &

[1] 6492

[[email protected] mha]# jobs

[1]+  运行中               nohup masterha_manager --conf=/etc/mha/mha.cnf > /tmp/mha_manager.log < /dev/null 2>&1 &

[[email protected] mha]#

(2)日志文件

manager.log文件是MHA的启动日志文件,当启动成功会自动生成一个这样的文件并且记载了启动过程

[[email protected] mha]# tail -f manager.log

Thu Feb 28 10:05:57 2019 - [warning] master_ip_failover_script is not defined.

Thu Feb 28 10:05:57 2019 - [warning] shutdown_script is not defined.

Thu Feb 28 10:05:57 2019 - [info] Set master ping interval 1 seconds.

Thu Feb 28 10:05:57 2019 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.

Thu Feb 28 10:05:57 2019 - [info] Starting ping health check on 192.168.13.190(192.168.13.190:3306)..

Thu Feb 28 10:06:01 2019 - [warning] Got error when monitoring master:  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 489.

Thu Feb 28 10:06:01 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln491] Target master‘s advisory lock is already held by someone. Please check whether you monitor the same master from multiple monitoring processes.

Thu Feb 28 10:06:01 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln511] Error happened on health checking.  at /usr/bin/masterha_manager line 50.

Thu Feb 28 10:06:01 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.

Thu Feb 28 10:06:01 2019 - [info] Got exit code 1 (Not master dead).——这些是默认自带的

下面是完整的启动流程:

Thu Feb 28 10:06:41 2019 - [info] MHA::MasterMonitor version 0.58.

Thu Feb 28 10:06:41 2019 - [warning] /etc/mha/mha.master_status.health already exists. You might have killed manager with SIGKILL(-9), may run two or more monitoring process for the same application, or use the same working directory. Check for details, and consider setting --workdir separately.

Thu Feb 28 10:06:43 2019 - [info] Multi-master configuration is detected. Current primary(writable) master is 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info] Master configurations are as below:

Master 192.168.13.190(192.168.13.190:3306), replicating from 192.168.13.189(192.168.13.189:3306)

Master 192.168.13.189(192.168.13.189:3306), replicating from 192.168.13.190(192.168.13.190:3306), read-only

Thu Feb 28 10:06:43 2019 - [info] GTID failover mode = 0

Thu Feb 28 10:06:43 2019 - [info] Dead Servers:

Thu Feb 28 10:06:43 2019 - [info] Alive Servers:

Thu Feb 28 10:06:43 2019 - [info]   192.168.13.189(192.168.13.189:3306)

Thu Feb 28 10:06:43 2019 - [info]   192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info]   192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:06:43 2019 - [info] Alive Slaves:

Thu Feb 28 10:06:43 2019 - [info]   192.168.13.189(192.168.13.189:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:06:43 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:06:43 2019 - [info]   192.168.13.192(192.168.13.192:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:06:43 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:06:43 2019 - [info] Current Alive Master: 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info] Checking slave configurations..

Thu Feb 28 10:06:43 2019 - [warning]  relay_log_purge=0 is not set on slave 192.168.13.189(192.168.13.189:3306).

Thu Feb 28 10:06:43 2019 - [warning]  relay_log_purge=0 is not set on slave 192.168.13.192(192.168.13.192:3306).

Thu Feb 28 10:06:43 2019 - [info] Checking replication filtering settings..

Thu Feb 28 10:06:43 2019 - [info]  binlog_do_db= , binlog_ignore_db=

Thu Feb 28 10:06:43 2019 - [info]  Replication filtering check ok.

Thu Feb 28 10:06:43 2019 - [info] GTID (with auto-pos) is not supported

Thu Feb 28 10:06:43 2019 - [info] Starting SSH connection tests..

Thu Feb 28 10:06:46 2019 - [info] All SSH connection tests passed successfully.

Thu Feb 28 10:06:46 2019 - [info] Checking MHA Node version..

Thu Feb 28 10:06:47 2019 - [info]  Version check ok.

Thu Feb 28 10:06:47 2019 - [info] Checking SSH publickey authentication settings on the current master..

Thu Feb 28 10:06:47 2019 - [info] HealthCheck: SSH to 192.168.13.190 is reachable.

Thu Feb 28 10:06:48 2019 - [info] Master MHA Node version is 0.54.

Thu Feb 28 10:06:48 2019 - [info] Checking recovery script configurations on 192.168.13.190(192.168.13.190:3306)..

Thu Feb 28 10:06:48 2019 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=on.000001

Thu Feb 28 10:06:48 2019 - [info]   Connecting to [email protected](192.168.13.190:22)..

Creating /var/tmp if not exists..    ok.

Checking output directory is accessible or not..

ok.

Binlog found at /var/lib/mysql, up to on.000001

Thu Feb 28 10:06:48 2019 - [info] Binlog setting check done.

Thu Feb 28 10:06:48 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..

Thu Feb 28 10:06:48 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user=‘root‘ --slave_host=192.168.13.189 --slave_ip=192.168.13.189 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-1-relay-bin.000002  --slave_pass=xxx

Thu Feb 28 10:06:48 2019 - [info]   Connecting to [email protected](192.168.13.189:22)..

Checking slave recovery environment settings..

Relay log found at /var/lib/mysql, up to ybb-test-mysql-1-relay-bin.000002

Temporary relay log file is /var/lib/mysql/ybb-test-mysql-1-relay-bin.000002

Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.

done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Thu Feb 28 10:06:49 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user=‘root‘ --slave_host=192.168.13.192 --slave_ip=192.168.13.192 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-4-relay-bin.000002  --slave_pass=xxx

Thu Feb 28 10:06:49 2019 - [info]   Connecting to [email protected](192.168.13.192:22)..

Checking slave recovery environment settings..

Relay log found at /var/lib/mysql, up to ybb-test-mysql-4-relay-bin.000002

Temporary relay log file is /var/lib/mysql/ybb-test-mysql-4-relay-bin.000002

Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.

Testing mysql connection and privileges..

mysql: [Warning] Using a password on the command line interface can be insecure.

done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Thu Feb 28 10:06:50 2019 - [info] Slaves settings check done.

Thu Feb 28 10:06:50 2019 - [info]

192.168.13.190(192.168.13.190:3306) (current master)

 +--192.168.13.189(192.168.13.189:3306)

 +--192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:06:50 2019 - [warning] master_ip_failover_script is not defined.

Thu Feb 28 10:06:50 2019 - [warning] shutdown_script is not defined.

Thu Feb 28 10:06:50 2019 - [info] Set master ping interval 1 seconds.

Thu Feb 28 10:06:50 2019 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.

Thu Feb 28 10:06:50 2019 - [info] Starting ping health check on 192.168.13.190(192.168.13.190:3306)..

Thu Feb 28 10:06:54 2019 - [warning] Got error when monitoring master:  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 489.

Thu Feb 28 10:06:54 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln491] Target master‘s advisory lock is already held by someone. Please check whether you monitor the same master from multiple monitoring processes.

Thu Feb 28 10:06:54 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln511] Error happened on health checking.  at /usr/bin/masterha_manager line 50.

Thu Feb 28 10:06:54 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.

Thu Feb 28 10:06:54 2019 - [info] Got exit code 1 (Not master dead).

启动成功。

4.2.6 MHA测试

MHA其目的就是当master故障时能及时剔除掉故障并能推选出new_master从而使其他slave的master指向new_master,使业务不中断。

(1)停掉master(192.168.13.190)上的mysql

(2)查看MHA的进程日志manager.log

查看红色标记部分即可

[[email protected] mha]# tail -f manager.log

Thu Feb 28 10:14:54 2019 - [warning] Got error on MySQL select ping: 2013 (Lost connection to MySQL server during query)

Thu Feb 28 10:14:54 2019 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --binlog_prefix=on

Thu Feb 28 10:14:55 2019 - [info] HealthCheck: SSH to 192.168.13.190 is reachable.

Thu Feb 28 10:14:55 2019 - [warning] Got error on MySQL connect: 2003 (Can‘t connect to MySQL server on ‘192.168.13.190‘ (111))

Thu Feb 28 10:14:55 2019 - [warning] Connection failed 2 time(s)..

Thu Feb 28 10:14:56 2019 - [warning] Got error on MySQL connect: 2003 (Can‘t connect to MySQL server on ‘192.168.13.190‘ (111))

Thu Feb 28 10:14:56 2019 - [warning] Connection failed 3 time(s)..

Thu Feb 28 10:14:57 2019 - [warning] Got error on MySQL connect: 2003 (Can‘t connect to MySQL server on ‘192.168.13.190‘ (111))

Thu Feb 28 10:14:57 2019 - [warning] Connection failed 4 time(s)..

Thu Feb 28 10:14:57 2019 - [warning] Master is not reachable from health checker!

Thu Feb 28 10:14:57 2019 - [warning] Master 192.168.13.190(192.168.13.190:3306) is not reachable!

Thu Feb 28 10:14:57 2019 - [warning] SSH is reachable.

Thu Feb 28 10:14:57 2019 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/mha.cnf again, and trying to connect to all servers to check server status..

Thu Feb 28 10:14:57 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu Feb 28 10:14:57 2019 - [info] Reading application default configuration from /etc/mha/mha.cnf..

Thu Feb 28 10:14:57 2019 - [info] Reading server configuration from /etc/mha/mha.cnf..

Thu Feb 28 10:14:58 2019 - [info] GTID failover mode = 0

Thu Feb 28 10:14:58 2019 - [info] Dead Servers:

Thu Feb 28 10:14:58 2019 - [info]   192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:14:58 2019 - [info] Alive Servers:

Thu Feb 28 10:14:58 2019 - [info]   192.168.13.189(192.168.13.189:3306)

Thu Feb 28 10:14:58 2019 - [info]   192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:14:58 2019 - [info] Alive Slaves:

Thu Feb 28 10:14:58 2019 - [info]   192.168.13.189(192.168.13.189:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:14:58 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:14:58 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:14:58 2019 - [info]   192.168.13.192(192.168.13.192:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:14:58 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:14:58 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:14:58 2019 - [info] Checking slave configurations..

Thu Feb 28 10:14:58 2019 - [warning]  relay_log_purge=0 is not set on slave 192.168.13.189(192.168.13.189:3306).

Thu Feb 28 10:14:58 2019 - [warning]  relay_log_purge=0 is not set on slave 192.168.13.192(192.168.13.192:3306).

Thu Feb 28 10:14:58 2019 - [info] Checking replication filtering settings..

Thu Feb 28 10:14:58 2019 - [info]  Replication filtering check ok.

Thu Feb 28 10:14:58 2019 - [info] Master is down!

Thu Feb 28 10:14:58 2019 - [info] Terminating monitoring script.

Thu Feb 28 10:14:58 2019 - [info] Got exit code 20 (Master dead).

Thu Feb 28 10:14:58 2019 - [info] MHA::MasterFailover version 0.58.

Thu Feb 28 10:14:58 2019 - [info] Starting master failover.

Thu Feb 28 10:14:58 2019 - [info]

Thu Feb 28 10:14:58 2019 - [info] * Phase 1: Configuration Check Phase..

Thu Feb 28 10:14:58 2019 - [info]

Thu Feb 28 10:15:00 2019 - [info] GTID failover mode = 0

Thu Feb 28 10:15:00 2019 - [info] Dead Servers:

Thu Feb 28 10:15:00 2019 - [info]   192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:00 2019 - [info] Checking master reachability via MySQL(double check)...

Thu Feb 28 10:15:00 2019 - [info]  ok.

Thu Feb 28 10:15:00 2019 - [info] Alive Servers:

Thu Feb 28 10:15:00 2019 - [info]   192.168.13.189(192.168.13.189:3306)

Thu Feb 28 10:15:00 2019 - [info]   192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:15:00 2019 - [info] Alive Slaves:

Thu Feb 28 10:15:00 2019 - [info]   192.168.13.189(192.168.13.189:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:00 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:00 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:00 2019 - [info]   192.168.13.192(192.168.13.192:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:00 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:00 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:00 2019 - [info] Starting Non-GTID based failover.

Thu Feb 28 10:15:00 2019 - [info]

Thu Feb 28 10:15:00 2019 - [info] ** Phase 1: Configuration Check Phase completed.

Thu Feb 28 10:15:00 2019 - [info]

Thu Feb 28 10:15:00 2019 - [info] * Phase 2: Dead Master Shutdown Phase..

Thu Feb 28 10:15:00 2019 - [info]

Thu Feb 28 10:15:00 2019 - [info] Forcing shutdown so that applications never connect to the current master..

Thu Feb 28 10:15:00 2019 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.

Thu Feb 28 10:15:00 2019 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.

Thu Feb 28 10:15:01 2019 - [info] * Phase 2: Dead Master Shutdown Phase completed.

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] * Phase 3: Master Recovery Phase..

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] * Phase 3.1: Getting Latest Slaves Phase..

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] The latest binary log file/position on all slaves is on.000001:661

Thu Feb 28 10:15:01 2019 - [info] Latest slaves (Slaves that received relay log files to the latest):

Thu Feb 28 10:15:01 2019 - [info]   192.168.13.189(192.168.13.189:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:01 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:01 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:01 2019 - [info]   192.168.13.192(192.168.13.192:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:01 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:01 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:01 2019 - [info] The oldest binary log file/position on all slaves is on.000001:661

Thu Feb 28 10:15:01 2019 - [info] Oldest slaves:

Thu Feb 28 10:15:01 2019 - [info]   192.168.13.189(192.168.13.189:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:01 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:01 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:01 2019 - [info]   192.168.13.192(192.168.13.192:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:01 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:01 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] * Phase 3.2: Saving Dead Master‘s Binlog Phase..

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] Fetching dead master‘s binary logs..

Thu Feb 28 10:15:01 2019 - [info] Executing command on the dead master 192.168.13.190(192.168.13.190:3306): save_binary_logs --command=save --start_file=on.000001  --start_pos=661 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58

Creating /var/tmp if not exists..    ok.

Concat binary/relay logs from on.000001 pos 661 to on.000001 EOF into /var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog ..

Dumping binlog format description event, from position 0 to 124.. ok.

Dumping effective binlog data from /var/lib/mysql/on.000001 position 661 to tail(684).. ok.

Concat succeeded.

Thu Feb 28 10:15:03 2019 - [info] scp from [email protected]:/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog to local:/etc/mha/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog succeeded.

Thu Feb 28 10:15:03 2019 - [info] HealthCheck: SSH to 192.168.13.189 is reachable.

Thu Feb 28 10:15:04 2019 - [info] HealthCheck: SSH to 192.168.13.192 is reachable.

Thu Feb 28 10:15:04 2019 - [info]

Thu Feb 28 10:15:04 2019 - [info] * Phase 3.3: Determining New Master Phase..

Thu Feb 28 10:15:04 2019 - [info]

Thu Feb 28 10:15:04 2019 - [info] Finding the latest slave that has all relay logs for recovering other slaves..

Thu Feb 28 10:15:04 2019 - [info] All slaves received relay logs to the same position. No need to resync each other.

Thu Feb 28 10:15:04 2019 - [info] Searching new master from slaves..

Thu Feb 28 10:15:04 2019 - [info]  Candidate masters from the configuration file:

Thu Feb 28 10:15:04 2019 - [info]   192.168.13.189(192.168.13.189:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:04 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:04 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:04 2019 - [info]   192.168.13.192(192.168.13.192:3306)  Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:04 2019 - [info]     Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:04 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:04 2019 - [info]  Non-candidate masters:

Thu Feb 28 10:15:04 2019 - [info]  Searching from candidate_master slaves which have received the latest relay log events..

Thu Feb 28 10:15:04 2019 - [info] New master is 192.168.13.189(192.168.13.189:3306)

Thu Feb 28 10:15:04 2019 - [info] Starting master failover..

Thu Feb 28 10:15:04 2019 - [info]

From:

192.168.13.190(192.168.13.190:3306) (current master)

+--192.168.13.189(192.168.13.189:3306)

+--192.168.13.192(192.168.13.192:3306)

To:

192.168.13.189(192.168.13.189:3306) (new master)

 +--192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:15:04 2019 - [info]

Thu Feb 28 10:15:04 2019 - [info] * Phase 3.4: New Master Diff Log Generation Phase..

Thu Feb 28 10:15:04 2019 - [info]

Thu Feb 28 10:15:04 2019 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.

Thu Feb 28 10:15:04 2019 - [info] Sending binlog..

Thu Feb 28 10:15:05 2019 - [info] scp from local:/etc/mha/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog to [email protected]:/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog succeeded.

Thu Feb 28 10:15:05 2019 - [info]

Thu Feb 28 10:15:05 2019 - [info] * Phase 3.5: Master Log Apply Phase..

Thu Feb 28 10:15:05 2019 - [info]

Thu Feb 28 10:15:05 2019 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.

Thu Feb 28 10:15:05 2019 - [info] Starting recovery on 192.168.13.189(192.168.13.189:3306)..

Thu Feb 28 10:15:05 2019 - [info]  Generating diffs succeeded.

Thu Feb 28 10:15:05 2019 - [info] Waiting until all relay logs are applied.

Thu Feb 28 10:15:05 2019 - [info]  done.

Thu Feb 28 10:15:05 2019 - [info] Getting slave status..

Thu Feb 28 10:15:05 2019 - [info] This slave(192.168.13.189)‘s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(on.000001:661). No need to recover from Exec_Master_Log_Pos.

Thu Feb 28 10:15:05 2019 - [info] Connecting to the target slave host 192.168.13.189, running recover script..

Thu Feb 28 10:15:05 2019 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=‘root‘ --slave_host=192.168.13.189 --slave_ip=192.168.13.189  --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog --workdir=/var/tmp --target_version=8.0.13 --timestamp=20190228101458 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58 --slave_pass=xxx

Thu Feb 28 10:15:05 2019 - [info]

Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog on 192.168.13.189:3306. This may take long time...

Applying log files succeeded.

Thu Feb 28 10:15:05 2019 - [info]  All relay logs were successfully applied.

Thu Feb 28 10:15:05 2019 - [info] Getting new master‘s binlog name and position..

Thu Feb 28 10:15:06 2019 - [info]  mysql-bin.000009:654

Thu Feb 28 10:15:06 2019 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST=‘192.168.13.189‘, MASTER_PORT=3306, MASTER_LOG_FILE=‘mysql-bin.000009‘, MASTER_LOG_POS=654, MASTER_USER=‘slave‘, MASTER_PASSWORD=‘xxx‘;

Thu Feb 28 10:15:06 2019 - [warning] master_ip_failover_script is not set. Skipping taking over new master IP address.

Thu Feb 28 10:15:06 2019 - [info] Setting read_only=0 on 192.168.13.189(192.168.13.189:3306)..

Thu Feb 28 10:15:06 2019 - [info]  ok.

Thu Feb 28 10:15:06 2019 - [info] ** Finished master recovery successfully.

Thu Feb 28 10:15:06 2019 - [info] * Phase 3: Master Recovery Phase completed.

Thu Feb 28 10:15:06 2019 - [info]

Thu Feb 28 10:15:06 2019 - [info] * Phase 4: Slaves Recovery Phase..

Thu Feb 28 10:15:06 2019 - [info]

Thu Feb 28 10:15:06 2019 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..

Thu Feb 28 10:15:06 2019 - [info]

Thu Feb 28 10:15:06 2019 - [info] -- Slave diff file generation on host 192.168.13.192(192.168.13.192:3306) started, pid: 11154. Check tmp log /etc/mha/192.168.13.192_3306_20190228101458.log if it takes time..

Thu Feb 28 10:15:07 2019 - [info]

Thu Feb 28 10:15:07 2019 - [info] Log messages from 192.168.13.192 ...

Thu Feb 28 10:15:07 2019 - [info]

Thu Feb 28 10:15:06 2019 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.

Thu Feb 28 10:15:07 2019 - [info] End of log messages from 192.168.13.192.

Thu Feb 28 10:15:07 2019 - [info] -- 192.168.13.192(192.168.13.192:3306) has the latest relay log events.

Thu Feb 28 10:15:07 2019 - [info] Generating relay diff files from the latest slave succeeded.

Thu Feb 28 10:15:07 2019 - [info]

Thu Feb 28 10:15:07 2019 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..

Thu Feb 28 10:15:07 2019 - [info]

Thu Feb 28 10:15:07 2019 - [info] -- Slave recovery on host 192.168.13.192(192.168.13.192:3306) started, pid: 11157. Check tmp log /etc/mha/192.168.13.192_3306_20190228101458.log if it takes time..

Thu Feb 28 10:15:09 2019 - [info]

Thu Feb 28 10:15:09 2019 - [info] Log messages from 192.168.13.192 ...

Thu Feb 28 10:15:09 2019 - [info]

Thu Feb 28 10:15:07 2019 - [info] Sending binlog..

Thu Feb 28 10:15:07 2019 - [info] scp from local:/etc/mha/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog to [email protected]:/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog succeeded.

Thu Feb 28 10:15:07 2019 - [info] Starting recovery on 192.168.13.192(192.168.13.192:3306)..

Thu Feb 28 10:15:07 2019 - [info]  Generating diffs succeeded.

Thu Feb 28 10:15:07 2019 - [info] Waiting until all relay logs are applied.

Thu Feb 28 10:15:07 2019 - [info]  done.

Thu Feb 28 10:15:07 2019 - [info] Getting slave status..

Thu Feb 28 10:15:07 2019 - [info] This slave(192.168.13.192)‘s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(on.000001:661). No need to recover from Exec_Master_Log_Pos.

Thu Feb 28 10:15:07 2019 - [info] Connecting to the target slave host 192.168.13.192, running recover script..

Thu Feb 28 10:15:07 2019 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=‘root‘ --slave_host=192.168.13.192 --slave_ip=192.168.13.192  --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog --workdir=/var/tmp --target_version=8.0.13 --timestamp=20190228101458 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58 --slave_pass=xxx

Thu Feb 28 10:15:08 2019 - [info]

Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog on 192.168.13.192:3306. This may take long time...

Applying log files succeeded.

Thu Feb 28 10:15:08 2019 - [info]  All relay logs were successfully applied.

Thu Feb 28 10:15:08 2019 - [info]  Resetting slave 192.168.13.192(192.168.13.192:3306) and starting replication from the new master 192.168.13.189(192.168.13.189:3306)..

Thu Feb 28 10:15:08 2019 - [info]  Executed CHANGE MASTER.

Thu Feb 28 10:15:08 2019 - [info]  Slave started.

Thu Feb 28 10:15:09 2019 - [info] End of log messages from 192.168.13.192.

Thu Feb 28 10:15:09 2019 - [info] -- Slave recovery on host 192.168.13.192(192.168.13.192:3306) succeeded.

Thu Feb 28 10:15:09 2019 - [info] All new slave servers recovered successfully.

Thu Feb 28 10:15:09 2019 - [info]

Thu Feb 28 10:15:09 2019 - [info] * Phase 5: New master cleanup phase..

Thu Feb 28 10:15:09 2019 - [info]

Thu Feb 28 10:15:09 2019 - [info] Resetting slave info on the new master..

Thu Feb 28 10:15:09 2019 - [info]  192.168.13.189: Resetting slave info succeeded.

Thu Feb 28 10:15:09 2019 - [info] Master failover to 192.168.13.189(192.168.13.189:3306) completed successfully.

Thu Feb 28 10:15:09 2019 - [info]

----- Failover Report -----

mha: MySQL Master failover 192.168.13.190(192.168.13.190:3306) to 192.168.13.189(192.168.13.189:3306) succeeded

Master 192.168.13.190(192.168.13.190:3306) is down!

Check MHA Manager logs at ybb-test-02-6:/etc/mha/manager.log for details.

Started automated(non-interactive) failover.

The latest slave 192.168.13.189(192.168.13.189:3306) has all relay logs for recovery.

Selected 192.168.13.189(192.168.13.189:3306) as a new master.

192.168.13.189(192.168.13.189:3306): OK: Applying all logs succeeded.

192.168.13.192(192.168.13.192:3306): This host has the latest relay log events.

Generating relay diff files from the latest slave succeeded.

192.168.13.192(192.168.13.192:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.13.189(192.168.13.189:3306)

192.168.13.189(192.168.13.189:3306): Resetting slave info succeeded.

Master failover to 192.168.13.189(192.168.13.189:3306) completed successfully.

查看192.168.13.192上mysql的master是不是已经成功转变了:

由此可见,MHA已经起到高可用的目的了

4.3 集群小结

一、或者,其他方案? 不管哪种方案都是有其场景限制 或说 规模限制,以及优缺点的。

1、首先反对大家做读写分离,关于这方面的原因解释太多次数(增加技术复杂度、可能导致读到落后的数据等),只说一点:99.8%的业务场景没有必要做读写分离,只要做好数据库设计优化 和配置合适正确的主机即可。

2、Keepalived+MySQL --确实有脑裂的问题,还无法做到准确判断mysqld是否HANG的情况;

3、DRBD+Heartbeat+MySQL --同样有脑裂的问题,还无法做到准确判断mysqld是否HANG的情况,且DRDB是不需要的,增加反而会出问题;

3、MySQL Proxy -- 不错的项目,可惜官方半途夭折了,不建议用,无法高可用,是一个写分离;

4、MySQL Cluster -- 社区版本不支持NDB是错误的言论,商用案例确实不多,主要是跟其业务场景要求有关系、这几年发展有点乱不过现在已经上正规了、对网络要求高;

5、MySQL + MHA -- 可以解决脑裂的问题,需要的IP多,小集群是可以的,但是管理大的就麻烦,其次MySQL + MMM 的话且坑很多,有MHA就没必要采用MMM建议:

(1)若是双主复制的模式,不用做数据拆分,那么就可以选择MHA或 Keepalive 或 heartbeat

(2)若是双主复制,还做了数据的拆分,则可以考虑采用Cobar;

二、 MHA是自动的master故障转移和Slave提升的软件包.它是基于标准的MySQL复制(异步/半同步)。该软件由两部分组成:MHA Manager(管理节点)和MHA Node(数据节点)。

1)MHA Manager可以单独部署在一台独立的机器上管理多个master-slave集群,也可以部署在一台slave节点上。MHA Manager会定时探测集群中的node节点,当发现master出现故障的时候,它可以自动将具有最新数据的slave提升为新的master,然后将所有其它的slave导向新的master上.整个故障转移过程对应用程序是透明的。

2)MHA Node运行在每台MySQL服务器上,它通过监控具备解析和清理logs功能的脚本来加快故障转移的。

5错误集锦

5.1 主从报错

1、 Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: ‘Could not find first log file name in binary log index file‘

解决办法:(1)stop slave;(2)reset slave; (3)start slave;若一遍不行,隔一会儿再试一下

reset slave 将使slave 忘记主从复制关系的位置信息。该语句将被用于干净的启动, 它删除master.info文件和relay-log.info 文件以及所有的relay log 文件并重新启用一个新的relaylog文件。使用reset slave之前必须使用stop slave 命令将复制进程停止。

2、若执行此3条命令不行,则使用以下方法

(1)删除所有bin-log日志【find / -name “*bin-log*” -exec ls {} \;】

(2)重新生成bin-log日志文件

change master to

-> master_log_file=‘bin_log.000001‘,

-> master_log_pos=155;

Start slave;

(3)若还是报和之前一样的错误,则:

① stop slave;

② reset slave;

③ start slave;

(4)完成

3、若提示连接超时或用户名、密码错误:

mysql -ubackup -h  2.2.2.2 -p123456,看看在从上能不能连上主mysql。

能:用户名+密码没问题

不能:用户名或密码错误

4、找不到mysql临时密码

[[email protected] mysql]# cat /var/log/mysqld.log |grep password

[[email protected] mysql]#

(1)删除原来安装过的mysql残留的数据(这一步非常重要,问题就出在这)

rm -rf /var/lib/mysql

(3)重启mysqld服务    systemctl restart mysqld

(4)查看mysql日志文件

[[email protected] mysql]# cat /var/log/mysqld.log |grep password

2019-02-25T08:50:58.053021Z 5 [Note] [MY-010454] [Server] A temporary password is generated for [email protected]: X=0gw#AAQdx*

5、User slave does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.

在master创建slave同步用户时未赋予权限,解决办法:

GRANT REPLICATION SLAVE ON *.* TO ‘slave‘@‘%‘;

flush privileges;

6、

这个时候先等一会儿或查看日志,show slave status\G;若没有错误等一会儿错误就会报出来(几分钟),报错如下:

Slave SQL for channel ‘‘: Could not execute Update_rows event on table mysql.user; Can‘t find record in ‘user‘, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event‘s master log mysql-bin.000006, end_log_pos 1768, Error_code: MY-001032

解决办法:

stop slave;

set global sql_slave_skip_counter=1;

start slave;

5.2 集群报错

1、[error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln781] Multi-master configuration is detected, but two or more masters are either writable (read-only is not set) or dead! Check configurations for details. Master configurations are as below:

由于MHA目前支持一主多从模式也就是只支持一个master写,其他master只能读的模式,所以当多个master并存时,只能设置其他master为read-only模式即可

解决办法:在其他master执行mysql -uroot -p -e "set global read_only=1"即可

2、

Mon Mar  4 15:59:33 2019 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln398] 192.168.13.190(192.168.13.190:3306): User slave does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.

Mon Mar  4 15:59:33 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 1403.

Mon Mar  4 15:59:33 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.

这是因为从192.168.13.190上面同步用户slave不存在导致,因为当master挂了之后,任何从都有可能成为master,所以若从没有同步用户slave会导致同步失败。

原文地址:https://www.cnblogs.com/zhangan/p/10898322.html

时间: 2024-10-11 20:41:26

mysql的AB及读写和集群的相关文章

MyCAT+MySQL 搭建高可用企业级数据库集群

第1章 课程介绍课程介绍1-1 MyCAT导学 试看1-2 课程介绍 第2章 MyCAT入门这一章中,我们将回顾了垂直切分,水平切分,分库分表等基础概念,然后快速回如何安装和启动MyCAT的,介绍如何以打包好的可执行程序的方式来启动MyCAT.以及如何对其相关的启动配置文件进行配置.2-1 章节综述2-2 什么是MyCAT2-3 什么是数据库中间层2-4 MyCAT的主要作用2-5 MyCAT基本元素2-6 MyCAT安装 第3章 MYCAT核心配置详解本章将对MyCAT的常用核心配置文件ser

mysql+mycat搭建稳定高可用集群,负载均衡,主备复制,读写分离

主要思路 测试环境 实现mysql主备复制 配置A主mysql 配置B备mysql 验证同步配置结果 验证是否同步 关闭B备mysql的同步,验证读写分离 实现读写分离 安装mycat 配置mycat 启动mycat 测试读写分离 验证是否同步 关闭B备mysql的同步,验证读写分离 数据库性能优化普遍采用集群方式,oracle集群软硬件投入昂贵,今天花了一天时间搭建基于mysql的集群环境. 主要思路 简单说,实现mysql主备复制-->利用mycat实现负载均衡. 比较了常用的读写分离方式,

mysql+myca搭建稳定高可用集群,负载均衡,主备复制,读写分离

数据库性能优化普遍采用集群方式,oracle集群软硬件投入昂贵,今天花了一天时间搭建基于mysql的集群环境. 主要思路 简单说,实现mysql主备复制-->利用mycat实现负载均衡. 比较了常用的读写分离方式,推荐mycat,社区活跃,性能稳定. 测试环境 MYSQL版本:Server version: 5.5.53,到官网可以下载WINDWOS安装包. 注意:确保mysql版本为5.5以后,以前版本主备同步配置方式不同. linux实现思路类似,修改my.cnf即可. A主mysql.19

DB层面上的设计 分库分表 读写分离 集群化 负载均衡

第1章  引言 随着互联网应用的广泛普及,海量数据的存储和访问成为了系统设计的瓶颈问题.对于一个大型的 互联网应用,每天几十亿的PV无疑对数据库造成了相当高的负载.对于系统的稳定性和扩展性造成了极大的问题.通过数据切分来提高网站性能,横向扩展数据层 已经成为架构研发人员首选的方式.水平切分数据库,可以降低单台机器的负载,同时最大限度的降低了了宕机造成的损失.通过负载均衡策略,有效的降低了单台 机器的访问负载,降低了宕机的可能性:通过集群方案,解决了数据库宕机带来的单点数据库不能访问的问题:通过读

MHA-结合MySQL半同步复制高可用集群(Centos7)

目录 一.理论概述 本案例部署思路 二.环境 三.部署 部署MHA 部署二进制包MySQL及部署主从复制 部署半同步复制 配置MHA MHA测试 部署lvs+keepalived(lvs1,lvs2) 四.总结 一.理论概述 在MySQL故障切换过程中,MHA能做到在0~30秒之内自动完成数据库的故障切换操作,并且在进行故障切换的过程中,MHA能在最大程度上保证数据的一致性,以达到真正意义上的高可用. 优点 缺点 由perl语言开发的开源工具 需要编写脚本或利用第三方工具来实现Vip的配置 支持

MyCAT+MySQL 搭建高可用企业级数据库集群——第1章 课程介绍

1-1 Mycat导学 1-2 课程介绍 1-1 Mycat导学 1.Mycat导学内容介绍: PS:来吧,让我们一起努力,变成更好的自己吧! 1-2 课程介绍 1.课程介绍: 目标:通过Mycat实现MySQL的高可用集群架构: 原文地址:https://www.cnblogs.com/tqtl911/p/9099913.html

mysql系列之--------mmm高可用集群

mmm高可用集群介绍: 高可用集群介绍:主备模式,当主不能提供服务的时候备用主机接替它提供服务. 这个过程对于客户端是透明的. 一.前期准备工作 1.准备五台linux服务器:主(192.168.4.122),主备(192.168.4.123) 从(192.168.4.126),从(192.168.4.127),管理主机(192.168.4.128) 四台主机做主从的用户名:tongbu,密码:123456 2.主和主备做成相互的主从模式,两台从服务器做成主备的从 3.IP规划,write-vi

使用hive客户端java api读写hive集群上的信息

上文介绍了hdfs集群信息的读取方式,本文说hive 1.先解决依赖 <properties> <hive.version>1.2.1</hive.version> </properties> <dependency> <groupId>org.apache.hive</groupId> <artifactId>hive-exec</artifactId> <version>${hive

MyCAT+MySQL 搭建高可用企业级数据库集群——第2章 MyCat入门

2-1 章节综述 2-2 什么是MyCat 2-3 什么是数据库中间层 2-4 MyCat的主要作用 2-5 MyCat基本元素 2-6 MyCat的安装 2-1 章节综述 1.掌握Mycat的基础概念.功能及适用场景: 2.掌握Mycat的安装和启动: 2-2 什么是MyCat 1.不同的工种对Mycat的不同理解: 研发人员: 架构师: 2.实现"读写分离"的两种方式: 从程序的角度实现,配置两套数据源: 通过中间件的角度实现,比如Mycat: 2-3 什么是数据库中间层 1.数据