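Introduction:
MaxScale is a database middleware that acts as a high-performance DB router, providing read/write splitting and load balancing. Because MariaDB moved to a delayed open-source policy after version 1.4.4, I had not planned to deploy or test it in production. I recently found that our production Atlas has a problem with PHP connections, so I installed and tested MaxScale.
For an introduction to MaxScale, see the official wiki.
Note: Meituan recently fully open-sourced its internal DBProxy (based on Atlas); I will test it later.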
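Environment (test):
OS: CentOS 6, Percona Server 5.6, MaxScale 1.4.4
Hardware:
CPU: Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz, 24 cores
Disk: SSD (data files are stored on SSD)
M-S: 192.168.1.21:3306 (M) ----> 192.168.1.22:3306 (S)
MaxScale: 192.168.1.23
Note: setting up master-slave replication is not covered in this post. MaxScale can be deployed together with MHA.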
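Installation:
1. Installation methods:
RPM package (the method used in this post)
Build from source
Binary tarball
2. Installation steps
(1) Get the RPM package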
wget https://downloads.mariadb.com/MaxScale/1.4.4/centos/6Server/x86_64/maxscale-1.4.4-1.centos.6.x86_64.rpm
rpm -ivh maxscale-1.4.4-1.centos.6.x86_64.rpm
(2) Create the maxscale user (run on the master)
CREATE USER maxscale@'192.168.1.%' IDENTIFIED BY "maxscaletest";
GRANT replication slave, replication client ON *.* TO maxscale@'192.168.1.%';
GRANT SELECT ON mysql.* TO maxscale@'192.168.1.%';
GRANT ALL ON maxscale_schema.* TO maxscale@'192.168.1.%';
GRANT SHOW DATABASES ON *.* TO maxscale@'192.168.1.%';
Note: MaxScale uses this account for monitoring and for internal operations such as loading the user list (get users).
(3) Create the directories MaxScale needs
mkdir -p /data/maxscale/{data,cache,logs,tmp}
mkdir -p /data/maxscale/logs/{trace,binlog}
(4) Generate the encrypted password
maxkeys /data/maxscale/data/
maxpasswd /data/maxscale/data/.secrets maxscaletest
49066584626E94EA24A963164E5AA5F6
(5) Create the configuration file
cat /etc/maxscale.cnf
# MaxScale documentation on GitHub:
# https://github.com/mariadb-corporation/MaxScale/blob/master/Documentation/Documentation-Contents.md
# Global parameters
#
# Complete list of configuration options:
# https://github.com/mariadb-corporation/MaxScale/blob/master/Documentation/Getting-Started/Configuration-Guide.md
[maxscale]
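# Number of worker threads; the default is 1. "auto" uses one thread per CPU core
threads=auto
# Millisecond precision for log timestamps
ms_timestamp=1
# Write logs to syslog
syslog=1
# Write logs to MaxScale's own log file
maxlog=1
# Do not write logs to shared memory; enabling it can speed up logging in debug mode
log_to_shm=0
# Log warning messages
log_warning=1
# Log notice messages
log_notice=1
# Log info messages
log_info=1
# Keep debug logging off
log_debug=0
# Augment each log message with the name of the function that logged it
log_augmentation=1
# Directory settings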
basedir=/usr/bin/maxscale/
logdir=/data/maxscale/logs/trace/
datadir=/data/maxscale/data/
cachedir=/data/maxscale/cache/
piddir=/data/maxscale/tmp/
[server1]
type=server
address=192.168.1.21
port=3306
protocol=MySQLBackend
serv_weight=1
[server2]
type=server
address=192.168.1.22
port=3306
protocol=MySQLBackend
serv_weight=3
[MySQL Monitor]
type=monitor
module=mysqlmon
servers=server1,server2
user=maxscale
passwd=49066584626E94EA24A963164E5AA5F6
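# Monitor heartbeat interval: 10 s (the value is in milliseconds)
monitor_interval=10000
# If replication breaks on every slave, MaxScale stays available and routes all traffic to the master
detect_stale_master=true
# Monitor replication lag so that router services can use it later (note: with this parameter set, requests always land on the master)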
# detect_replication_lag=true
[Read-Only Service]
type=service
router=readconnroute
servers=server1,server2
user=maxscale
passwd=49066584626E94EA24A963164E5AA5F6
router_options=slave
# Allow the root user to log in through this service
enable_root_user=1
# Weight connections by this server parameter
weightby=serv_weight
[Read-Write Service]
type=service
router=readwritesplit
servers=server1,server2
user=maxscale
passwd=49066584626E94EA24A963164E5AA5F6
max_slave_connections=100%
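# Statements that use SQL variables are executed only on the master
use_sql_variables_in=master
# Allow the root user to log in through this service
enable_root_user=1
# Maximum allowed replication lag between master and slave (seconds)
max_slave_replication_lag=3600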
[MaxAdmin Service]
type=service
router=cli
[Read-Only Listener]
type=listener
service=Read-Only Service
protocol=MySQLClient
port=4008
[Read-Write Listener]
type=listener
service=Read-Write Service
protocol=MySQLClient
port=4006
[MaxAdmin Listener]
type=listener
service=MaxAdmin Service
protocol=maxscaled
socket=/data/maxscale/tmp/maxadmin.sock
port=6603
Note: as the configuration shows, MaxScale supports two modes, read-write and read-only, each listening on its own port.
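A configuration of this size is easy to get subtly wrong, for example a service or monitor that lists a server with no matching section. The following is a minimal sketch, not part of MaxScale, that uses Python's standard configparser to flag such references. The function name find_undefined_servers and the trimmed SAMPLE fragment (including its deliberately undefined server3) are assumptions made for illustration.

```python
import configparser


def find_undefined_servers(cfg_text):
    """Return {section: [names]} for 'servers=' entries with no type=server section."""
    # interpolation=None keeps values such as "100%" from being treated as templates.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    defined = {
        name for name in parser.sections()
        if parser.get(name, "type", fallback="") == "server"
    }
    problems = {}
    for section in parser.sections():
        raw = parser.get(section, "servers", fallback="")
        names = [n.strip() for n in raw.split(",") if n.strip()]
        missing = [n for n in names if n not in defined]
        if missing:
            problems[section] = missing
    return problems


# Hypothetical, trimmed-down fragment; server3 is intentionally left undefined.
SAMPLE = """
[server1]
type=server
address=192.168.1.21
[server2]
type=server
address=192.168.1.22
[Read-Only Service]
type=service
router=readconnroute
servers=server1,server2,server3
[Read-Write Service]
type=service
router=readwritesplit
servers=server1,server2
max_slave_connections=100%
"""

print(find_undefined_servers(SAMPLE))  # {'Read-Only Service': ['server3']}
```

To check a real file, read its contents and pass the text to the same function before starting MaxScale.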
(6) Start:
maxscale -f /etc/maxscale.cnf && tailf /data/maxscale/logs/trace/maxscale1.log
Any errors are printed to the log. The usual causes are missing privileges for the maxscale user or ports blocked by a firewall.
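Since blocked ports are one of the usual causes, a quick reachability check can narrow things down before digging into the log. This is a small sketch using only the Python standard library. The hosts and ports in TARGETS are the ones from this post; the helper names port_is_open and check_targets are made up for illustration.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Backends and MaxScale listeners as configured in this post.
TARGETS = [
    ("192.168.1.21", 3306),  # master
    ("192.168.1.22", 3306),  # slave
    ("192.168.1.23", 4006),  # Read-Write Listener
    ("192.168.1.23", 4008),  # Read-Only Listener
]


def check_targets(targets, timeout=2.0):
    """Map 'host:port' to True/False for each target."""
    return {f"{h}:{p}": port_is_open(h, p, timeout) for h, p in targets}
```

Calling check_targets(TARGETS) from the MaxScale host tests the backends, and calling it from an application host tests the listeners. A False result points to a firewall or a service that is not listening, not to a privilege problem.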
(7) Administration
Check the runtime status
maxadmin -uadmin -pmariadb
MaxScale> list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
server1 | 192.168.1.21 | 3306 | 0 | Master, Running
server2 | 192.168.1.22 | 3306 | 0 | Slave, Running
MaxScale> list services
Services.
--------------------------+----------------------+--------+---------------
Service Name | Router Module | #Users | Total Sessions
--------------------------+----------------------+--------+---------------
Read-Only Service | readconnroute | 1 | 1
Read-Write Service | readwritesplit | 1 | 1
MaxAdmin Service | cli | 2 | 2
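If you want to watch this status from a script (for example alongside MHA), the table printed by list servers is simple enough to parse. The sketch below assumes the five-column layout shown above; parse_server_list and has_running_master are illustrative names, not part of maxadmin.

```python
def parse_server_list(text):
    """Turn the rows of a `list servers` table into dicts; other lines are skipped."""
    servers = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Data rows have exactly five cells; the header row starts with "Server".
        if len(cells) != 5 or cells[0] == "Server":
            continue
        name, address, port, connections, status = cells
        servers.append({
            "name": name,
            "address": address,
            "port": int(port),
            "connections": int(connections),
            "status": [s.strip() for s in status.split(",")],
        })
    return servers


def has_running_master(servers):
    """True if at least one server is reported as both Master and Running."""
    return any(
        "Master" in s["status"] and "Running" in s["status"] for s in servers
    )
```

The text can come from any capture of the maxadmin output; how you capture it is up to your environment.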
(8) Testing read/write splitting
Log in as the application user proin
mysql -uproin -p -P4006 -h192.168.1.23
Enter password:
mysql>use test; create table test_maxscale(id int);insert into test_maxscale values(87);select * from test_maxscale;
Check the MaxScale log (log_info must be enabled)
tailf /data/maxscale/logs/trace/maxscale1.log
2017-01-03 23:17:02.516 [9] info : (route_single_stmt): > Autocommit: [enabled], trx is [not open], cmd: COM_QUERY, type: QUERY_TYPE_WRITE, stmt: create table test_maxscale(id int)
2017-01-03 23:17:02.516 [9] info : (route_single_stmt): Route query to master 192.168.1.21:3306 <
2017-01-03 23:17:06.517 [9] info : (route_single_stmt): > Autocommit: [enabled], trx is [not open], cmd: COM_QUERY, type: QUERY_TYPE_WRITE, stmt: insert into test_maxscale values(87)
2017-01-03 23:17:06.517 [9] info : (route_single_stmt): Route query to master 192.168.1.21:3306 <
2017-01-03 23:17:08.518 [9] info : (route_single_stmt): > Autocommit: [enabled], trx is [not open], cmd: COM_QUERY, type: QUERY_TYPE_WRITE, stmt: select * from test_maxscale
2017-01-03 23:17:08.518 [9] info : (route_single_stmt): Route query to slave 192.168.1.22:3306 <
Read/write splitting works as expected.
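Reading the log by eye works for three statements, but not for a longer test. Below is a rough sketch that pairs each statement line with the "Route query to ..." line that follows it. It assumes the log format shown above and a single session (interleaved sessions would need the session id as well). The names summarize_routing, STMT_RE and ROUTE_RE are made up for this example.

```python
import re
from collections import Counter

# Matches the statement line, e.g. "... type: QUERY_TYPE_WRITE, stmt: insert ..."
STMT_RE = re.compile(r"type: (?P<type>\w+), stmt: (?P<stmt>.*)$")
# Matches the routing line, e.g. "Route query to master 192.168.1.21:3306 <"
ROUTE_RE = re.compile(r"Route query to (?P<role>master|slave)\s+(?P<backend>\S+)")


def summarize_routing(lines):
    """Return a list of (statement, role, backend) tuples in log order."""
    routed = []
    pending = None  # statement waiting for its routing line
    for line in lines:
        stmt = STMT_RE.search(line.rstrip("\n"))
        if stmt:
            pending = stmt.group("stmt").strip()
            continue
        route = ROUTE_RE.search(line)
        if route and pending is not None:
            routed.append((pending, route.group("role"), route.group("backend")))
            pending = None
    return routed


def count_by_role(routed):
    """Count how many statements went to the master and how many to a slave."""
    return Counter(role for _stmt, role, _backend in routed)
```

Feeding it the six log lines above gives two statements on the master and one on the slave, which matches what the test expects from a read/write split.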
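(9) Performance test
The performance test used our company's in-house automated testing tool.
Core idea: a Python interface executes SQL concurrently over short-lived connections, with configurable concurrency.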
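The in-house tool itself is not public, so the following is only a sketch of the same idea, not its actual code. Each worker thread keeps calling a run_query callable until the time is up; run_query is a placeholder that you would implement with your MySQL driver so that it opens a connection, runs one statement and closes the connection. The function name run_load_test and the shape of its result are assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(run_query, concurrency, duration_s):
    """Call run_query() from `concurrency` threads for about duration_s seconds."""
    deadline = time.monotonic() + duration_s

    def worker():
        ok = failed = 0
        while time.monotonic() < deadline:
            try:
                run_query()  # one short-lived connection per call
                ok += 1
            except Exception:
                failed += 1
        return ok, failed

    started = time.monotonic()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(worker) for _ in range(concurrency)]
        results = [f.result() for f in futures]
    elapsed = time.monotonic() - started

    ok = sum(r[0] for r in results)
    failed = sum(r[1] for r in results)
    total = ok + failed
    return {
        "requests": total,
        "success": ok,
        "failed": failed,
        "qps": total / elapsed if elapsed > 0 else 0.0,
    }
```

Running it once against the database directly and once against the MaxScale listener, with the same concurrency and duration, gives two comparable QPS figures. Threads are enough here because the work is network-bound.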
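QPS results:
Direct connection to 192.168.1.22:
Test date: 2017-01-04 12:56:40
Total concurrency: 200
Interface under test: testmysql
Concurrent users: 200
Test duration: 36s
Total requests sent: 64156
Successes: 64177
Failed: 0
Errors: 0
Total failures: 0
Failure rate: 0.000‰
Average QPS: 1982.69
Through MaxScale:
Test date: 2017-01-04 12:58:28
Total concurrency: 200
Interface under test: testmysql
Concurrent users: 200
Test duration: 36s
Total requests sent: 64599
Successes: 64650
Failed: 0
Errors: 0
Total failures: 0
Failure rate: 0.000‰
Average QPS: 1795.83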
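Note: with 200 concurrent short-lived connections, the performance drop through MaxScale is within an acceptable range, so it really is a high-performance DB router.
While watching the MaxScale host during the test, MaxScale itself mainly consumed CPU, and the real pressure still fell on the MySQL servers behind it.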
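To put a number on "acceptable", the relative drop can be computed from the two average QPS figures reported above. This is plain arithmetic; relative_drop is just an illustrative helper name.

```python
def relative_drop(baseline_qps, proxied_qps):
    """Fraction of throughput lost when going through the proxy."""
    return (baseline_qps - proxied_qps) / baseline_qps


direct_qps = 1982.69    # average QPS, direct connection to 192.168.1.22
maxscale_qps = 1795.83  # average QPS, through MaxScale

drop = relative_drop(direct_qps, maxscale_qps)
print(f"QPS drop through MaxScale: {drop:.1%}")  # QPS drop through MaxScale: 9.4%
```

This is a single 36-second run on each side, so treat the figure as an indication, not a benchmark.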
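Conclusion:
MaxScale performs well and can be deployed in production as a replacement for Atlas.
Appendix: the PHP connection problem mentioned at the start
Our production architecture is Atlas + MHA. The master reported errors because a data file was corrupted. After asking the application teams to restrict some write entry points, we removed the RW node from the backends in the Atlas admin interface to run a repair. A large number of PHP queries then failed, while Java reported no errors. We later reproduced the problem with a small PHP program that queries the hostname in a loop: whenever the master (RW) is offline, the queries fail with "mysql server gone away". We have not found the root cause so far. If you know it, please tell me in the comments. Thanks.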