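Production environments today almost always run in a master/backup arrangement, so how do you get that for redis? Basic redis master-slave replication is easy: add a "slaveof" line to the slave's configuration file. The catch is that when the master dies you can temporarily point the service address at the slave, but the slave cannot accept writes because it is read-only by default.
To solve this I looked through material online. The main options are: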
1. Use keepalived plus monitoring scripts to switch between master and backup.
2. Use Redis Cluster, which spreads the data over several masters and promotes a replica automatically when a master fails.
This article covers the first option.
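Design approach:
When both Master and Slave are healthy, the Master serves traffic and the Slave stands by.
When the Master is down and the Slave is healthy, the Slave takes over the service and turns replication off.
When the Master recovers, it first syncs data from the Slave; once the sync completes it turns replication off and resumes the Master role. Meanwhile the Slave waits for the Master to finish syncing and then returns to the Slave role.
The cycle then repeats.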
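The hand-over rules above can be summarised as a small decision function. This is only an illustrative sketch: the name `serving_node` and its `up`/`down` arguments are hypothetical, and in the real setup keepalived makes this decision through VRRP rather than through a script like this.

```shell
# serving_node MASTER_STATE SLAVE_STATE
# Prints which node should serve traffic, given each node's health ("up"/"down").
serving_node() {
    local master_state="$1"
    local slave_state="$2"
    if [ "$master_state" = "up" ]; then
        echo "master"   # normal operation, or the Master has recovered and resynced
    elif [ "$slave_state" = "up" ]; then
        echo "slave"    # Master is down: the Slave takes over and stops replicating
    else
        echo "none"     # both nodes are down, nobody can serve
    fi
}
```

For example, `serving_node down up` prints `slave`, which corresponds to the second rule.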
Environment:
Master: 192.168.2.23
Slave: 192.168.2.24
VIP: 192.168.2.40
Operating system:
CentOS 6.5 64bit
Software versions:
redis 3.2.0
keepalived 1.2.13
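Implementation steps:
I. Set up redis master-slave replication
1. Deploy redis to the master and slave nodes with saltstack (omitted).
2. Edit the configuration file on the redis slave node:
Add the following line:
slaveof 192.168.2.40 6379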
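Once the slave has been restarted, replication can be checked with `redis-cli info replication`, which reports `role` and `master_link_status`. Below is a minimal sketch of a parser for that output. The function name `replication_ok` is hypothetical, and it reads the INFO text from standard input so it can be tried without a running server.

```shell
# replication_ok: read `INFO replication` output on stdin and succeed only if
# this node is a slave whose link to the master is up.
replication_ok() {
    local info role link
    # INFO lines end in CRLF, so strip the carriage returns first.
    info=$(tr -d '\r')
    role=$(printf '%s\n' "$info" | sed -n 's/^role://p')
    link=$(printf '%s\n' "$info" | sed -n 's/^master_link_status://p')
    [ "$role" = "slave" ] && [ "$link" = "up" ]
}
```

A possible use on the slave would be `/usr/local/redis/bin/redis-cli -p 6379 -a 'abcd*123456' info replication | replication_ok && echo "replication healthy"`. If the master requires a password, as the `-a` option in the later scripts suggests, the slave's configuration file also needs a matching `masterauth` line.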
II. Configure keepalived
1. Install the keepalived service on both the Master and the Slave host:
yum -y install keepalived
2. Edit the keepalived configuration file
Master:
cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

vrrp_script chk_redis {
    script "/etc/keepalived/scripts/check_redis.sh"
    interval 5
    weight 20
}

vrrp_instance VI_REDIS_1 {
    #state BACKUP
    state MASTER
    interface eno16780032
    virtual_router_id 30
    mcast_src_ip 192.168.2.23
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 11111
    }
    track_script {
        chk_redis
    }
    notify_master /etc/keepalived/scripts/master.sh
    notify_backup /etc/keepalived/scripts/backup.sh
    notify_stop /etc/keepalived/scripts/stop.sh
    notify_fault /etc/keepalived/scripts/fault.sh
    virtual_ipaddress {
        192.168.2.40
    }
}
Slave:
cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

vrrp_script chk_redis {
    script "/etc/keepalived/scripts/check_redis.sh"
    interval 5
    weight 20
}

vrrp_instance VI_REDIS_1 {
    #state BACKUP
    state MASTER
    interface eno16780032
    virtual_router_id 30
    mcast_src_ip 192.168.2.24
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 11111
    }
    track_script {
        chk_redis
    }
    notify_master /etc/keepalived/scripts/master.sh
    notify_backup /etc/keepalived/scripts/backup.sh
    notify_stop /etc/keepalived/scripts/stop.sh
    notify_fault /etc/keepalived/scripts/fault.sh
    virtual_ipaddress {
        192.168.2.40
    }
}
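The `weight 20` in `vrrp_script` interacts with `priority 100`. As far as I understand keepalived's behaviour, a positive weight is added to the priority while the check script succeeds, and a negative weight is added (which lowers the priority) while it fails. The sketch below only models that arithmetic. The function name `effective_priority` is hypothetical, and the clamping keepalived applies to the valid priority range is left out.

```shell
# effective_priority BASE WEIGHT SCRIPT_STATUS
# SCRIPT_STATUS is the exit code of the tracked check script (0 = healthy).
effective_priority() {
    local base="$1" weight="$2" status="$3"
    local prio="$base"
    if [ "$weight" -gt 0 ] && [ "$status" -eq 0 ]; then
        prio=$((base + weight))   # healthy node gets a bonus
    elif [ "$weight" -lt 0 ] && [ "$status" -ne 0 ]; then
        prio=$((base + weight))   # failing node gets a penalty
    fi
    echo "$prio"
}
```

With the values used here, a healthy node ends up at `effective_priority 100 20 0`, that is 120, on both machines. Because the two configurations use the same base priority, VRRP then has to break the tie by comparing the nodes' IP addresses.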
3. Write the scripts
First, create the redis monitoring script on both servers:
cat /etc/keepalived/scripts/check_redis.sh
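#!/bin/bash
# Health check: exit 0 as soon as redis answers PING and port 6379 is listening.
count=1
LOGFILE="/var/log/keepalived-redis-state.log"
while true
do
    # The password is quoted so the shell cannot expand the * as a glob.
    /usr/local/redis/bin/redis-cli -p 6379 -a 'abcd*123456' ping >/dev/null 2>&1
    i=$?
    netstat -ntlup | grep 6379 >/dev/null 2>&1
    j=$?
    if [ "$i" -eq 0 ] && [ "$j" -eq 0 ]; then
        exit 0
    fi
    # Give up after repeated failures, one second apart.
    if [ "$count" -gt 10 ]; then
        echo "$(hostname) redis check failed, exiting check script ..." >> "$LOGFILE"
        break
    fi
    sleep 1
    count=$((count + 1))
done
# Redis is still unhealthy: stop keepalived so the VIP moves to the peer.
/etc/init.d/keepalived stop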
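The retry loop in the check script can be factored into a reusable helper. This is a sketch rather than part of the original setup. The function name `retry` and its argument order are assumptions made for illustration.

```shell
# retry MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds; fails after MAX_TRIES unsuccessful attempts.
retry() {
    local max_tries="$1" delay="$2"
    shift 2
    local try=1
    while ! "$@"; do
        if [ "$try" -ge "$max_tries" ]; then
            return 1              # every attempt failed
        fi
        try=$((try + 1))
        sleep "$delay"
    done
    return 0
}
```

The check script's loop would then read roughly as `retry 10 1 /usr/local/redis/bin/redis-cli -p 6379 -a 'abcd*123456' ping >/dev/null 2>&1 || /etc/init.d/keepalived stop`.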
Next, here is what the scripts referenced by the following directives are for:
notify_master /etc/keepalived/scripts/master.sh
notify_backup /etc/keepalived/scripts/backup.sh
notify_stop /etc/keepalived/scripts/stop.sh
notify_fault /etc/keepalived/scripts/fault.sh
When keepalived enters the Master state, it runs notify_master.
When keepalived enters the Backup state, it runs notify_backup.
When keepalived enters the fault state, it runs notify_fault.
When keepalived stops, it runs notify_stop.
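Besides the per-state directives, keepalived also has a generic `notify` directive whose script receives the type (`GROUP` or `INSTANCE`), the name and the new state (`MASTER`, `BACKUP` or `FAULT`) as arguments. The sketch below shows how one dispatcher could map those arguments to the per-state scripts. The function name `notify_action` is hypothetical, it is not used in this setup, and it only prints the script that would be run.

```shell
# notify_action TYPE NAME STATE
# Prints the script that handles the given keepalived state transition.
notify_action() {
    local state="$3"
    case "$state" in
        MASTER) echo "/etc/keepalived/scripts/master.sh" ;;
        BACKUP) echo "/etc/keepalived/scripts/backup.sh" ;;
        FAULT)  echo "/etc/keepalived/scripts/fault.sh" ;;
        *)      echo "unknown state: $state" >&2; return 1 ;;
    esac
}
```

For example, `notify_action INSTANCE VI_REDIS_1 MASTER` prints the path of master.sh. The stop case is not covered here and would still need `notify_stop`.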
Write the following 4 scripts on both the Master and the Slave server:
cat /etc/keepalived/scripts/master.sh
#!/bin/bash
REDISCLI="/usr/local/redis/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"
echo "[master]" >> "$LOGFILE"
date '+%Y-%m-%d %H:%M:%S' >> "$LOGFILE"
echo " change to master role ..." >> "$LOGFILE"
# Promote the local redis to master in the background.
nohup /bin/bash /usr/local/shell/change_master.sh &
# Keep retrying for as long as the node still reports the slave role.
while true
do
    /bin/bash /usr/local/shell/role_check.sh | grep slave >> "$LOGFILE"
    if [ $? -eq 0 ]; then
        nohup /bin/bash /usr/local/shell/change_master.sh &
    else
        exit 0
    fi
    sleep 5
done
cat /etc/keepalived/scripts/backup.sh
#!/bin/bash
REDISCLI="/usr/local/redis/bin/redis-cli"
LOGFILE="/var/log/keepalived-redis-state.log"
echo "[backup]" >> "$LOGFILE"
date '+%Y-%m-%d %H:%M:%S' >> "$LOGFILE"
echo " change to slave role ..." >> "$LOGFILE"
# Wait before demoting the local redis to a slave of the VIP.
sleep 20
nohup /bin/bash /usr/local/shell/change_slave.sh &
# Keep retrying for as long as the node still reports the master role.
while true
do
    /bin/bash /usr/local/shell/role_check.sh | grep master >> "$LOGFILE"
    if [ $? -eq 0 ]; then
        nohup /bin/bash /usr/local/shell/change_slave.sh &
    else
        exit 0
    fi
    sleep 5
done
cat /etc/keepalived/scripts/stop.sh
#!/bin/bash
log='/var/log/keepalived-redis-state.log'
Host=$(hostname)
echo "$(date '+%Y-%m-%d %H:%M:%S') : $Host keepalived service stop ... " >> "$log"
cat /etc/keepalived/scripts/fault.sh
#!/bin/bash
log='/var/log/keepalived-redis-state.log'
echo "$(date '+%Y-%m-%d %H:%M:%S') : fault ..." >> "$log"
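All four notify scripts append timestamped lines to the same log file, so the formatting could live in one shared helper. This is a minimal sketch and not part of the original scripts. The function name `log_state` is hypothetical, and `uname -n` is used here in place of `hostname`.

```shell
# log_state LOGFILE MESSAGE...
# Appends "<timestamp> [<host>] <message>" to LOGFILE.
log_state() {
    local logfile="$1"
    shift
    printf '%s [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(uname -n)" "$*" >> "$logfile"
}
```

A notify script could then call `log_state /var/log/keepalived-redis-state.log "keepalived service stop"`.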
Write the following failover helper scripts on both the Master and the Slave server:
cat /usr/local/shell/change_master.sh
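#!/bin/bash
# Stop replicating and promote every local redis instance to master.
# The password variable is not named PWD, which bash uses for the working directory.
REDIS_PASS='abcd*123456'
MASTER_IP=192.168.2.40
port=(6379)
for PORT in "${port[@]}"
do
    /usr/local/redis/bin/redis-cli -p "$PORT" -a "$REDIS_PASS" SLAVEOF NO ONE
done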
cat /usr/local/shell/change_slave.sh
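#!/bin/bash
# Make every local redis instance a slave of the VIP.
# The password variable is not named PWD, which bash uses for the working directory.
REDIS_PASS='abcd*123456'
MASTER_IP=192.168.2.40
port=(6379)
for PORT in "${port[@]}"
do
    /usr/local/redis/bin/redis-cli -p "$PORT" -a "$REDIS_PASS" SLAVEOF "$MASTER_IP" "$PORT"
done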
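The password `abcd*123456` contains a `*`, so passing it to `redis-cli` unquoted lets the shell treat it as a file name pattern. The sketch below demonstrates the effect in a scratch directory. The helper names `count_args` and `glob_demo` and the two file names are made up for the demonstration.

```shell
# count_args: print how many arguments were received.
count_args() { echo "$#"; }

glob_demo() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    touch abcd-x-123456 abcd-y-123456   # two file names the pattern matches
    pass='abcd*123456'
    count_args -a $pass      # unquoted: prints 3, the * was expanded to both files
    count_args -a "$pass"    # quoted:   prints 2, the password is passed intact
    cd / && rm -rf "$dir"
)
glob_demo
```

In a directory with no matching files the unquoted form happens to work, which is why this kind of bug tends to show up only later.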
cat /usr/local/shell/startredis.sh
#!/bin/bash
port=(6379)
for PORT in "${port[@]}"
do
    /etc/init.d/redis_$PORT start
done
cat /usr/local/shell/stopredis.sh
#!/bin/bash
port=(6379)
for PORT in "${port[@]}"
do
    /etc/init.d/redis_$PORT stop
done
cat /usr/local/shell/restartredis.sh
#!/bin/bash
port=(6379)
for PORT in "${port[@]}"
do
    /etc/init.d/redis_$PORT restart
done
cat /usr/local/shell/role_check.sh
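#!/bin/bash
# Print the role and port of every local redis instance.
# The password variable is not named PWD, which bash uses for the working directory.
REDIS_PASS='abcd*123456'
MASTER_IP=192.168.2.40
port=(6379)
for PORT in "${port[@]}"
do
    /usr/local/redis/bin/redis-cli -p "$PORT" -a "$REDIS_PASS" info | grep -E "role|tcp_port"
done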
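master.sh and backup.sh decide what to do by grepping the role check output for `slave` or `master`. A stricter alternative is to parse the two fields explicitly. The sketch below does that for INFO output supplied on standard input. The function name `port_role` is hypothetical.

```shell
# port_role: read redis INFO output on stdin and print "<tcp_port> <role>".
# Fails if either field is missing.
port_role() {
    tr -d '\r' | awk -F: '
        $1 == "tcp_port" { port = $2 }
        $1 == "role"     { role = $2 }
        END {
            if (port != "" && role != "") print port, role
            else exit 1
        }
    '
}
```

A caller could then compare exact values, for example `[ "$(redis-cli -p 6379 info | port_role)" = "6379 master" ]`, instead of matching a substring.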