Overview:
Last week I covered installing and configuring DRBD. Today we look at DRBD's first practical application: using DRBD + HeartBeat + NFS to build a highly available NFS service that serves as the back-end shared storage for the cluster.
The NFS share mainly holds the web servers' application code and some image files.
References:
http://network.51cto.com/art/201010/230237_all.htm
http://showerlee.blog.51cto.com/2047005/1212185
Environment:
[root@dbm135 ~]# cat /etc/issue
CentOS release 6.4 (Final)
Kernel \r on an \m

[root@dbm135 ~]# uname -r
2.6.32-358.el6.i686
hostname | IP              | FQDN          | role      | services           |
dbm135   | 192.168.186.135 | dbm135.51.com | primary   | DRBD+HeartBeat+NFS |
dbm134   | 192.168.186.134 | dbm134.51.com | secondary | DRBD+HeartBeat+NFS |
VIP      | 192.168.186.150 |
Preparation and DRBD installation:
Reference: http://732233048.blog.51cto.com/9323668/1665979
Installing and configuring HeartBeat:
Install HeartBeat (on dbm135 and dbm134):
HeartBeat is installed here via yum (recommended).
CentOS 6.4 does not ship the HeartBeat package by default, so the EPEL repository must be installed first.
[root@dbm135 ~]# cd /usr/local/src/
[root@dbm135 src]# wget http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@dbm135 src]# rpm -ivh epel-release-6-8.noarch.rpm
[root@dbm135 src]# yum -y install heartbeat
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Error: Cannot retrieve metalink for repository: epel. Please verify its path and try again
Note: the yum install fails with the error above. Workaround:
[root@dbm135 ~]# vi /etc/yum.repos.d/epel.repo
#uncomment every baseurl line
#comment out every mirrorlist line
[root@dbm135 src]# yum -y install heartbeat
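If you prefer not to edit the file by hand, the same change can be scripted. This is a minimal sketch that assumes the stock epel-release-6-8 layout of epel.repo (baseurl lines commented out, mirrorlist lines active):
#back up, then switch epel.repo from mirrorlist to baseurl
[root@dbm135 ~]# cp /etc/yum.repos.d/epel.repo /etc/yum.repos.d/epel.repo.bak
[root@dbm135 ~]# sed -i -e 's/^#baseurl/baseurl/' -e 's/^mirrorlist/#mirrorlist/' /etc/yum.repos.d/epel.repo
[root@dbm135 ~]# yum clean all && yum -y install heartbeat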
Note: when installing HeartBeat with yum, the first attempt always runs into problems; running the install a second time succeeds (I am not sure why).
Installing HeartBeat via yum also pulls in the NFS-related packages and rpcbind as dependencies.
Configure HeartBeat (on dbm135 and dbm134):
The HeartBeat configuration involves the following files:
/etc/ha.d/ha.cf                    #main configuration file
/etc/ha.d/haresources              #resource file
/etc/ha.d/authkeys                 #authentication
/etc/ha.d/resource.d/killnfsd      #NFS start script, managed by HeartBeat
[root@dbm135 ~]# vi /etc/ha.d/ha.cf     (dbm135)
logfile /var/log/ha-log            #name and location of the HA log file
logfacility local0
keepalive 2                        #heartbeat (monitoring) interval: 2 seconds
deadtime 5                         #declare the peer dead after 5 seconds
ucast eth0 192.168.186.134         #unicast mode; the IP is the peer's address; if there is a dedicated internal NIC, use the internal IP
auto_failback off                  #when the failed node recovers it does NOT take the resources back; they stay on the node currently holding them
node dbm135.51.com dbm134.51.com   #define the nodes; the names must match each host's hostname output
[root@dbm134 ~]# vi /etc/ha.d/ha.cf     (dbm134)
logfile /var/log/ha-log            #name and location of the HA log file
logfacility local0
keepalive 2                        #heartbeat (monitoring) interval: 2 seconds
deadtime 5                         #declare the peer dead after 5 seconds
ucast eth0 192.168.186.135         #unicast mode; the IP is the peer's address; if there is a dedicated internal NIC, use the internal IP
auto_failback off                  #when the failed node recovers it does NOT take the resources back; they stay on the node currently holding them
node dbm135.51.com dbm134.51.com   #define the nodes; the names must match each host's hostname output
Edit the node authentication file authkeys (on dbm135 and dbm134):
[root@dbm135 ~]# vi /etc/ha.d/authkeys
auth 1
1 crc

#/etc/ha.d/authkeys must have mode 600
[root@dbm135 ~]# chmod 600 /etc/ha.d/authkeys
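Note that crc adds no real authentication of heartbeat packets; if the heartbeat link is not a dedicated back-to-back cable, a keyed hash is safer. A possible alternative (the secret string below is just a placeholder you would replace with your own):
[root@dbm135 ~]# vi /etc/ha.d/authkeys
auth 1
1 sha1 YourSharedSecretHere
[root@dbm135 ~]# chmod 600 /etc/ha.d/authkeys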
Edit the cluster resource file haresources (on dbm135 and dbm134):
[root@dbm135 ~]# vi /etc/ha.d/haresources
dbm135.51.com IPaddr::192.168.186.150/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/data::ext4 killnfsd
##this file must be exactly the same on both dbm135 and dbm134; do not change dbm135.51.com to dbm134.51.com on the second node
##the hostname is that of the current primary node, i.e. dbm135.51.com
##IPaddr: binds the virtual IP, on eth0
##drbddisk: specifies the DRBD resource r0
##Filesystem: specifies the DRBD device /dev/drbd0, mount point /data, filesystem ext4
##killnfsd: the NFS start script, managed by HeartBeat
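For reference, when HeartBeat takes over this resource group it runs the scripts in /etc/ha.d/resource.d/ from left to right, passing the ::-separated fields as arguments. Roughly, the takeover is equivalent to the following (releasing the group runs the same scripts with stop, in reverse order); this is a sketch of the mechanism, not something you run by hand:
/etc/ha.d/resource.d/IPaddr 192.168.186.150/24/eth0 start
/etc/ha.d/resource.d/drbddisk r0 start
/etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext4 start
/etc/ha.d/resource.d/killnfsd start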
Edit the NFS script killnfsd (on dbm135 and dbm134):
Its purpose is to restart the NFS service: after the NFS service switches over, the exported directory must effectively be remounted, otherwise clients get "stale NFS file handle" errors.
[root@dbm135 ~]# vi /etc/ha.d/resource.d/killnfsd
killall -9 nfsd; /etc/init.d/nfs restart; exit 0
[root@dbm135 ~]# chmod 755 /etc/ha.d/resource.d/killnfsd
[root@dbm135 ~]# cd /etc/ha.d/resource.d/
[root@dbm135 resource.d]# ll drbddisk Filesystem killnfsd IPaddr
-rwxr-xr-x 1 root root 3162 Sep 27  2013 drbddisk
-rwxr-xr-x 1 root root 1903 Dec  2  2013 Filesystem
-rwxr-xr-x 1 root root 2273 Dec  2  2013 IPaddr
-rwxr-xr-x 1 root root   49 Jun 30 12:02 killnfsd
##all four scripts are present
Configure NFS (on dbm135 and dbm134):
Note: the NFS-related packages were already installed as dependencies when HeartBeat was installed.
[root@dbm135 ~]# vi /etc/exports
/data 192.168.186.0/255.255.255.0(rw,no_root_squash,sync)
[root@dbm135 ~]# chkconfig rpcbind on
[root@dbm135 ~]# chkconfig nfs off     #nfs must not start at boot, because HeartBeat manages starting it
[root@dbm135 ~]# /etc/init.d/rpcbind start
Starting rpcbind:                                          [  OK  ]
##nfs does not need to be started here; HeartBeat will start it
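As a quick sanity check before starting HeartBeat, you can confirm that rpcbind is registered; once HeartBeat has started nfsd, the export should also be visible. A small verification sketch:
[root@dbm135 ~]# rpcinfo -p localhost     #the portmapper should be listed
[root@dbm135 ~]# exportfs -v              #after HeartBeat starts nfs, /data should appear here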
Start HeartBeat (on dbm135 and dbm134):
Note: start it on the primary node first (dbm135 is the primary).
[root@dbm135 ~]# /etc/init.d/heartbeat start
[root@dbm135 ~]# chkconfig heartbeat on
[root@dbm135 ~]# ps -ef | grep heartbeat
root   1854     1  0 12:33 ?      00:00:00 heartbeat: master control process
root   1858  1854  0 12:33 ?      00:00:00 heartbeat: FIFO reader
root   1859  1854  0 12:33 ?      00:00:00 heartbeat: write: ucast eth0
root   1860  1854  0 12:33 ?      00:00:00 heartbeat: read: ucast eth0
root   2057  2034  0 12:33 ?      00:00:00 /bin/sh /usr/share/heartbeat/ResourceManager takegroup IPaddr::192.168.186.150/24/eth0
root   2283     1  0 12:33 ?      00:00:00 /bin/sh /usr/lib/ocf/resource.d//heartbeat/IPaddr start
root   2286  2283  0 12:33 ?      00:00:00 /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.186.150 eth0 192.168.186.150 auto not_used not_used
root   2471  2057  0 12:33 ?      00:00:00 /bin/sh /usr/share/heartbeat/ResourceManager takegroup IPaddr::192.168.186.150/24/eth0
root   2566  1352  0 12:33 pts/1  00:00:00 grep heartbeat
[root@dbm135 ~]# ps -ef | grep nfs     #check whether NFS has started on the primary node
root   2493     2  0 17:59 ?      00:00:00 [nfsd4]
root   2494     2  0 17:59 ?      00:00:00 [nfsd4_callbacks]
root   2495     2  0 17:59 ?      00:00:00 [nfsd]
root   2496     2  0 17:59 ?      00:00:00 [nfsd]
root   2497     2  0 17:59 ?      00:00:00 [nfsd]
root   2498     2  0 17:59 ?      00:00:00 [nfsd]
root   2499     2  0 17:59 ?      00:00:00 [nfsd]
root   2500     2  0 17:59 ?      00:00:00 [nfsd]
root   2501     2  0 17:59 ?      00:00:00 [nfsd]
root   2502     2  0 17:59 ?      00:00:00 [nfsd]
root   2530  1528  0 17:59 pts/1  00:00:00 grep nfs
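Besides the process list, it is worth confirming that the VIP, the DRBD role, and the /data mount all ended up on the primary. A quick verification sketch:
[root@dbm135 ~]# ip addr show eth0 | grep 192.168.186.150   #the VIP should be bound on eth0
[root@dbm135 ~]# cat /proc/drbd                             #r0 should report Primary/Secondary here
[root@dbm135 ~]# df -h /data                                #/dev/drbd0 should be mounted on /data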
Testing:
Test 1:
Check whether the export can be mounted successfully.
##NFS server side: dbm135, primary
[root@dbm135 ~]# cd /data/
[root@dbm135 data]# ll
total 16
-rw-r--r-- 1 root root     0 Jun 30 10:15 file1
-rw-r--r-- 1 root root     0 Jun 30 10:15 file2
-rw-r--r-- 1 root root     0 Jun 30 10:15 file3
-rw-r--r-- 1 root root     0 Jun 30 10:15 file4
-rw-r--r-- 1 root root     0 Jun 30 10:15 file5
-rw-r--r-- 1 root root     0 Jun 30 18:01 file6
drwx------ 2 root root 16384 Jun 30 10:14 lost+found
On another host, 192.168.186.131, mount the directory exported by the NFS server.
[root@192.168.186.131 ~]# showmount -e 192.168.186.150
Export list for 192.168.186.150:
/data 192.168.186.0/255.255.255.0
[root@192.168.186.131 ~]# mount 192.168.186.150:/data /data
[root@192.168.186.131 ~]# cd /data/
[root@192.168.186.131 data]# ll
total 16
-rw-r--r-- 1 root root     0 Jun 30  2015 file1
-rw-r--r-- 1 root root     0 Jun 30  2015 file2
-rw-r--r-- 1 root root     0 Jun 30  2015 file3
-rw-r--r-- 1 root root     0 Jun 30  2015 file4
-rw-r--r-- 1 root root     0 Jun 30  2015 file5
-rw-r--r-- 1 root root     0 Jun 30  2015 file6
drwx------ 2 root root 16384 Jun 30  2015 lost+found
##mount succeeded
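For the failover tests below it can help to mount with explicit NFS options so the client retries instead of erroring out during the switchover. A possible mount line (the options are a suggestion, not part of the original setup):
[root@192.168.186.131 ~]# mount -t nfs -o hard,intr,timeo=10,retrans=3 192.168.186.150:/data /data
#or the equivalent /etc/fstab entry:
192.168.186.150:/data  /data  nfs  hard,intr,timeo=10,retrans=3  0 0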
Test 2: restart DRBD on the primary node
Restart DRBD on the primary node, observe what happens on the secondary node, and check whether the mount on 192.168.186.131 keeps working.
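One way to run this test (a sketch using the service names from the earlier steps): keep a write loop going on the client, restart DRBD on dbm135, and watch the secondary and the HA log.
#on the client 192.168.186.131: keep writing so any interruption is visible
[root@192.168.186.131 ~]# while true; do date >> /data/failover.log; sleep 1; done

#on the primary dbm135: restart DRBD
[root@dbm135 ~]# /etc/init.d/drbd restart

#on the secondary dbm134: watch the DRBD role and whether HeartBeat reacts
[root@dbm134 ~]# watch cat /proc/drbd
[root@dbm134 ~]# tail -f /var/log/ha-log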