高可用是个老生常谈的问题了,开源的高可用软件已经做的相当成熟了,之前也在debian下做过lvs+heartbeat的4层LB,一直很稳定(可惜流量不大啊),现在由于业务的需要,做一个基于keepalived+nginx的高可用7层负载均衡。
拓扑结构也比较简单,就不画拓扑图了:2个节点上分别安装配置keepalived和nginx,配置nginx反向代理后端的real server
比较关键的几个点:
1、为避免同一个局域网中有多个keepalived组中的多播相互响应,采用单播通信
2、状态切换的过程触发邮件通知、短信通知、web通知、log记录,便于通过各种途径了解主备工作状态
3、nginx的检测脚本采用了轻量级的方式:"killall -0 nginx",还可以使用 pidof nginx的方式或者调用其他自定义检测脚本的方式
4、特别要注意优先级的大小及检测到异常时权重的变化
5、了解免费ARP的工作机制
6、了解VRRP协议的适用范围:局域网,第一跳网关冗余
7、单个vrrp实例工作在主备模式,为最大程度的利用2个节点的资源,可以做多个vrrp实例,实现高可用和负载均衡
为便于软件包的管理,采用centos自带的keepalived,nginx1.8.0采用nginx官方源,整体安装也比较简单。
#yum install keepalived nginx -y
设置关键服务的开机启动
#chkconfig nginx on
#chkconfig keepalived on
查看keepalived包安装了那些文件(文档一定好好看):
#rpm -ql keepalived
/etc/keepalived
/etc/keepalived/keepalived.conf
/etc/rc.d/init.d/keepalived
/etc/sysconfig/keepalived
/usr/bin/genhash
/usr/libexec/keepalived
/usr/sbin/keepalived
/usr/share/doc/keepalived-1.2.13
/usr/share/doc/keepalived-1.2.13/AUTHOR
.........
keepalived的主配置文件,2台机器的配置文件略有区别,具体请看配置文件中注释
[[email protected] ~]# more /etc/keepalived/keepalived.conf
####Configuration File for keepalived
####内部API网关 keepalived HA配置
#### laijingli 20151213
global_defs {
notification_email {
[email protected]
}
notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id proxy101 ## proxy101 on master101 , proxy102 on backup102
}
###simple check with killall -0 which is less expensive than pidof to verify that nginx is running
vrrp_script chk_nginx {
script "killall -0 nginx"
interval 1
weight 2
fall 2
rise 1
}
vrrp_instance YN_API_GATEWAY {
state MASTER ## MASTER on master101 , BACKUP on backup102
interface em1
virtual_router_id 101
priority 200 ## 200 on master101 , 199 on backup102
advert_int 1
###采用单播通信,避免同一个局域网中多个keepalived组之间的相互影响
unicast_src_ip 192.168.0.101 ##本机ip
unicast_peer {
192.168.0.102 ##对端ip
}
authentication {
auth_type PASS
auth_pass testpass
}
virtual_ipaddress {
192.168.0.105 ## VIP
}
###如果只有一块网卡的话监控网络接口就没有必要了
#track_interface {
# em1
#}
track_script {
chk_nginx
}
###状态切换是发送邮件通知,本机记录log,后期会触发短信通知
notify_master /usr/local/bin/keepalived_notify.sh notify_master
notify_backup /usr/local/bin/keepalived_notify.sh notify_backup
notify_fault /usr/local/bin/keepalived_notify.sh notify_fault
notify /usr/local/bin/keepalived_notify.sh notify
smtp_alert
}
VRRP 实例组节点状态切换触发邮件通知、短信通知、log记录的脚本
# more /usr/local/bin/keepalived_notify.sh
#!/bin/bash
###keepalived notify script for record ha state transtion to log files
###将将状态转换过程记录到log,便于排错
logfile=/var/log/keepalived.notify.log
echo --------------- >> $logfile
echo `date` [`hostname`] keepalived HA role state transition: $1 $2 $3 $4 $5 $6 >> $logfile
###将状态转换记录到nginx的文件,便于通过web查看ha状态(一定注意不要开放到公网)
echo `date` `hostname` $1 $2 $3 $4 $5 $6 > /usr/share/nginx/html/index.html
nginx的部分配置文件,仅供参考
# more /etc/nginx/nginx.conf
###运维管理用途: 用于区别vip跑在那台服务器上
server {
listen 80;
server_name localhost;
location / {
root /usr/share/nginx/html;
index index.html index.htm;
}
## nginx monitor use only
###add by lai monitor nginx status
location /server-status {
stub_status on;
allow 127.0.0.1;
allow 192.168.0.0/24;
}
}
通过启停keepalived和nginx来模拟故障,测试vrrp 实例的状态切换过程(即VIP的漂移):
/etc/init.d/keepalived start/stop
/etc/init.d/nginx stop start/stop
查看VIP跑在那台服务器上:
# ip addr show|grep 192.168
inet 192.168.0.101/24 brd 192.168.0.255 scope global em1
inet 192.168.0.105/32 scope global em1
测试:
[[email protected] ~]$ curl 192.168.0.101
Mon Dec 14 16:27:10 CST 2015 proxy101 INSTANCE YN_API_GATEWAY MASTER 202
[[email protected] ~]$ curl 192.168.0.102
Mon Dec 14 16:34:40 CST 2015 proxy102 INSTANCE YN_API_GATEWAY BACKUP 199
[[email protected] ~]$ curl 192.168.0.105
Mon Dec 14 16:27:10 CST 2015 proxy101 INSTANCE YN_API_GATEWAY MASTER 202
# tail /var/log/keepalived.notify.log
Mon Dec 14 16:25:13 CST 2015 [proxy101] keepalived HA role state transition:
Mon Dec 14 16:25:13 CST 2015 [proxy101] keepalived HA role state transition: INSTANCE YN_API_GATEWAY MASTER 202
---------------
---------------
Mon Dec 14 16:26:34 CST 2015 [proxy101] keepalived HA role state transition:
Mon Dec 14 16:26:34 CST 2015 [proxy101] keepalived HA role state transition: INSTANCE YN_API_GATEWAY BACKUP 200
---------------
---------------
Mon Dec 14 16:27:10 CST 2015 [proxy101] keepalived HA role state transition:
Mon Dec 14 16:27:10 CST 2015 [proxy101] keepalived HA role state transition: INSTANCE YN_API_GATEWAY MASTER 202
# tail /var/log/messages
Dec 14 16:27:08 localhost Keepalived_vrrp[74308]: VRRP_Instance(YN_API_GATEWAY) forcing a new MASTER election
Dec 14 16:27:08 localhost Keepalived_vrrp[74308]: VRRP_Instance(YN_API_GATEWAY) forcing a new MASTER election
Dec 14 16:27:09 localhost Keepalived_vrrp[74308]: VRRP_Instance(YN_API_GATEWAY) Transition to MASTER STATE
Dec 14 16:27:10 localhost Keepalived_vrrp[74308]: VRRP_Instance(YN_API_GATEWAY) Entering MASTER STATE
Dec 14 16:27:10 localhost Keepalived_vrrp[74308]: VRRP_Instance(YN_API_GATEWAY) setting protocol VIPs.
Dec 14 16:27:10 localhost Keepalived_vrrp[74308]: VRRP_Instance(YN_API_GATEWAY) Sending gratuitous ARPs on em1 for 192.168.0.105
Dec 14 16:27:10 localhost Keepalived_healthcheckers[74307]: Netlink reflector reports IP 192.168.0.105 added
Dec 14 16:27:10 localhost Keepalived_vrrp[74308]: Remote SMTP server [127.0.0.1]:25 connected.
Dec 14 16:27:10 localhost Keepalived_vrrp[74308]: SMTP alert successfully sent.
Dec 14 16:27:15 localhost Keepalived_vrrp[74308]: VRRP_Instance(YN_API_GATEWAY) Sending gratuitous ARPs on em1 for 192.168.0.105
主备状态转换通知邮件:
通过抓包查看来了解详细的工作过程:
# tcpdump -ni em1 vrrp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on em1, link-type EN10MB (Ethernet), capture size 65535 bytes
17:36:47.098225 IP 192.168.0.101 > 192.168.0.102: VRRPv2, Advertisement, vrid 101, prio 202, authtype simple, intvl 1s, length 20
17:36:47.388540 IP 192.168.0.22 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 120, authtype simple, intvl 1s, length 20
17:36:48.099409 IP 192.168.0.101 > 192.168.0.102: VRRPv2, Advertisement, vrid 101, prio 202, authtype simple, intvl 1s, length 20
17:36:48.389504 IP 192.168.0.22 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 120, authtype simple, intvl 1s, length 20
17:36:49.100544 IP 192.168.0.101 > 192.168.0.102: VRRPv2, Advertisement, vrid 101, prio 202, authtype simple, intvl 1s, length 20
17:36:49.390487 IP 192.168.0.22 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 120, authtype simple, intvl 1s, length 20
17:36:50.101713 IP 192.168.0.101 > 192.168.0.102: VRRPv2, Advertisement, vrid 101, prio 202, authtype simple, intvl 1s, length 20
17:36:50.391453 IP 192.168.0.22 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 120, authtype simple, intvl 1s, length 20
参考:https://www.nginx.com/resources/admin-guide/nginx-ha-keepalived/