System environment
OS: CentOS 6.6
Architecture: x86_64
Software environment
heartbeat-3.0.4-2
drbd-8.4.3
nfs-utils-1.2.3-26
Deployment layout
Role / IP
VIP 192.168.1.13 (service address on the internal network)
data-09.com br0: 192.168.1.9
data-11.com br0: 192.168.1.11
1. DRBD
Note: DRBD can use a whole disk, a partition, or a logical volume as its backing device, but no filesystem may be created on it (the filesystem goes on /dev/drbd0 instead).
1) Install dependency packages
# yum install gcc gcc-c++ make glibc flex kernel-devel kernel-headers
2) Install DRBD
# wget http://oss.linbit.com/drbd/8.4/drbd-8.4.3.tar.gz
# tar zxvf drbd-8.4.3.tar.gz
# cd drbd-8.4.3
# ./configure --prefix=/usr/local/tdoa/drbd --with-km
# make KDIR=/usr/src/kernels/2.6.32-279.el6.x86_64/
# make install
# mkdir -p /usr/local/tdoa/drbd/var/run/drbd
# cp /usr/local/tdoa/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d
Load the DRBD kernel module:
# modprobe drbd
3) Configure DRBD
The configuration file must be identical on the primary and the secondary node.
# cat /usr/local/tdoa/drbd/etc/drbd.conf
resource r0 {
    protocol C;
    startup { wfc-timeout 0; degr-wfc-timeout 120; }
    disk { on-io-error detach; }
    net {
        timeout 60;
        connect-int 10;
        ping-int 10;
        max-buffers 2048;
        max-epoch-size 2048;
    }
    syncer { rate 100M; }
    on data-09.com {
        device /dev/drbd0;
        disk /dev/data/data_lv;
        address 192.168.1.9:7788;
        meta-disk internal;
    }
    on data-11.com {
        device /dev/drbd0;
        disk /dev/data/data_lv;
        address 192.168.1.11:7788;
        meta-disk internal;
    }
}
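Since the file must be byte-identical on both nodes, comparing checksums after copying it over is a cheap sanity check. A minimal sketch, assuming the `same_config` helper name and the paths (they are illustrative, not part of DRBD):

```shell
# Compare checksums of two copies of drbd.conf; prints MATCH or DIFFER.
# same_config is a hypothetical helper.
same_config() {
    a=$(md5sum "$1" | awk '{print $1}')
    b=$(md5sum "$2" | awk '{print $1}')
    if [ "$a" = "$b" ]; then echo MATCH; else echo DIFFER; fi
}
```

For example, after scp-ing the peer's copy to /tmp, run `same_config /usr/local/tdoa/drbd/etc/drbd.conf /tmp/drbd.conf.peer`.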
4) Initialize the r0 resource and bring it up
Run the following on both nodes:
# drbdadm create-md r0
# drbdadm up r0
The status on data-09.com and data-11.com should now look similar to this:
# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by [email protected], 2014-02-26 07:26:07
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
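The `cs:` (connection state), `ro:` (roles) and `ds:` (disk states) fields are the ones to watch. A small sketch that pulls them out of `/proc/drbd`-style output; the `drbd_state` helper is invented here for illustration:

```shell
# Extract connection state, roles and disk states for device 0
# from /proc/drbd text read on stdin. drbd_state is a hypothetical helper.
drbd_state() {
    awk '/^ *0:/ {
        s = ""
        for (i = 1; i <= NF; i++)
            if ($i ~ /^(cs|ro|ds):/) s = s (s == "" ? "" : " ") $i
        print s
    }'
}
```

Usage: `drbd_state < /proc/drbd`.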
5) Promote data-09.com to primary
# drbdadm primary --force r0
The status on data-09.com should now look similar to this:
# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by [email protected], 2014-02-26 07:28:26
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:4 nr:0 dw:4 dr:681 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
Note: the DRBD service must be set to start automatically at boot (e.g. chkconfig drbd on).
2. NFS
Install the NFS services:
yum install nfs-utils portmap -y
vim /etc/exports
/usr/local/tdoa/data/attach 192.168.100.0/24(rw,no_root_squash)
/usr/local/tdoa/data/attachment 192.168.100.0/24(rw,no_root_squash)
Note: there must be no space between the client address and the option list; "host (options)" applies the options to the world, not to host.
service rpcbind restart
service nfs restart
chkconfig rpcbind on
chkconfig nfs off
service nfs stop
Verify that the front-end web servers can mount the export read-write, then stop the NFS service (heartbeat will start it on the active node).
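The /etc/exports spacing pitfall above is easy to miss by eye, so it is worth checking for mechanically. A sketch that flags any line with whitespace before the option list; the `bad_exports` helper is hypothetical:

```shell
# Print (with line numbers) any /etc/exports line that has a space
# before "(", e.g. "host (rw)". bad_exports is a hypothetical helper.
bad_exports() {
    grep -nE '[[:space:]]\(' "$1" || true
}
```

A clean file produces no output; any output is a line to fix.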
3. MySQL
1. Create the high-availability directory /usr/local/data
   The data5 subdirectory is used for the database files.
2. On the heartbeat master, move the MySQL data directory to /usr/local/data/data5.
3. Once MySQL is installed on both heartbeat nodes, switch the DRBD partition over to the standby and check that MySQL works there.
Demote the master to secondary:
# drbdadm secondary r0
# cat /proc/drbd
On the standby data-11.com, promote it to primary:
# drbdadm primary r0
4. Heartbeat
(1.1) Installing heartbeat via YUM
# wget http://mirrors.sohu.com/fedora-epel/6Server/x86_64/epel-release-6-8.noarch.rpm
# rpm -ivh epel-release-6-8.noarch.rpm
# yum install heartbeat -y
(1.2) Installing heartbeat via RPM
1. yum install "liblrm.so.2()(64bit)"
2. rpm -ivh PyXML-0.8.4-19.el6.x86_64.rpm
3. rpm -ivh perl-TimeDate-1.16-13.el6.noarch.rpm
4. rpm -ivh resource-agents-3.9.5-12.el6_6.1.x86_64.rpm
5. rpm -ivh cluster-glue-1.0.5-6.el6.x86_64.rpm
6. rpm -ivh cluster-glue-libs-1.0.5-6.el6.x86_64.rpm
7. rpm -ivh heartbeat-libs-3.0.4-2.el6.x86_64.rpm heartbeat-3.0.4-2.el6.x86_64.rpm
Note: heartbeat-libs and heartbeat must be installed together.
(2) Configure heartbeat
The three configuration files (ha.cf, authkeys, haresources) must be identical on both nodes.
cp /usr/share/doc/heartbeat-3.0.4/ha.cf /etc/ha.d/
cp /usr/share/doc/heartbeat-3.0.4/haresources /etc/ha.d/
cp /usr/share/doc/heartbeat-3.0.4/authkeys /etc/ha.d/
vim /etc/ha.d/ha.cf
#############################################
logfile /var/log/ha-log #log file
logfacility local0 #syslog facility
keepalive 2 #heartbeat interval in seconds
deadtime 5 #seconds before the peer is declared dead
ucast br0 192.168.1.11 #heartbeat NIC and the peer's IP (the only line that differs on the standby)
auto_failback off #do not move resources back automatically when the master recovers
node data-09.com data-11.com #the two node hostnames
###############################################################################
vim /etc/ha.d/authkeys #the heartbeat auth file must have mode 600
######################
auth 3 #use method 3, the MD5 algorithm
#1 crc
#2 sha1 HI!
3 md5 heartbeat
######################
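Heartbeat refuses to start when authkeys is readable by anyone else, so it is worth checking the mode explicitly. A sketch, with the `check_authkeys` helper name invented here:

```shell
# Verify that a file has mode 600; prints OK or the offending mode.
# check_authkeys is a hypothetical helper (uses GNU stat).
check_authkeys() {
    mode=$(stat -c %a "$1")
    if [ "$mode" = "600" ]; then echo OK; else echo "bad mode: $mode"; fi
}
```

Usage: `chmod 600 /etc/ha.d/authkeys && check_authkeys /etc/ha.d/authkeys`.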
vim /etc/ha.d/haresources
#########################################################################
data-09.com IPaddr::192.168.1.13/24/br0 drbddisk::r0 Filesystem::/dev/drbd0::/usr/local/data::ext4 mysql nfs
Explanation: master node hostname; VIP/prefix/NIC to bind; DRBD resource; DRBD device::mount point::filesystem; then the mysql and nfs resource scripts.
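A haresources line is positional: the preferred node comes first, then the resources, which heartbeat starts left to right and stops in reverse order. A sketch that splits such a line (the `parse_haresources` helper is invented for illustration):

```shell
# Split a haresources line into the node name and its resource list.
# parse_haresources is a hypothetical helper.
parse_haresources() {
    line="$1"
    node=${line%% *}        # first whitespace-separated token
    resources=${line#* }    # everything after it
    echo "node: $node"
    for r in $resources; do
        echo "resource: $r"
    done
}
```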
(5) Create the drbddisk, nfs and mysql scripts and make them executable (all three resource scripts must live in /etc/ha.d/resource.d)
# cat /etc/ha.d/resource.d/drbddisk
##################################################################
#!/bin/bash
#
# This script is intended to be used as resource script by heartbeat
#
# Copyright 2003-2008 LINBIT Information Technologies
# Philipp Reisner, Lars Ellenberg
#
###
DEFAULTFILE="/etc/default/drbd"
DRBDADM="/sbin/drbdadm"
if [ -f $DEFAULTFILE ]; then
. $DEFAULTFILE
fi
if [ "$#" -eq 2 ]; then
RES="$1"
CMD="$2"
else
RES="all"
CMD="$1"
fi
## EXIT CODES
# since this is a "legacy heartbeat R1 resource agent" script,
# exit codes actually do not matter that much as long as we conform to
# http://wiki.linux-ha.org/HeartbeatResourceAgent
# but it does not hurt to conform to lsb init-script exit codes,
# where we can.
# http://refspecs.linux-foundation.org/LSB_3.1.0/
#LSB-Core-generic/LSB-Core-generic/iniscrptact.html
####
drbd_set_role_from_proc_drbd()
{
local out
if ! test -e /proc/drbd; then
ROLE="Unconfigured"
return
fi
dev=$( $DRBDADM sh-dev $RES )
minor=${dev#/dev/drbd}
if [[ $minor = *[!0-9]* ]] ; then
# sh-minor is only supported since drbd 8.3.1
minor=$( $DRBDADM sh-minor $RES )
fi
if [[ -z $minor ]] || [[ $minor = *[!0-9]* ]] ; then
ROLE=Unknown
return
fi
if out=$(sed -ne "/^ *$minor: cs:/ { s/:/ /g; p; q; }" /proc/drbd); then
set -- $out
ROLE=${5%/**}
: ${ROLE:=Unconfigured} # if it does not show up
else
ROLE=Unknown
fi
}
case "$CMD" in
start)
# try several times, in case heartbeat deadtime
# was smaller than drbd ping time
try=6
while true; do
$DRBDADM primary $RES && break
let "--try" || exit 1 # LSB generic error
sleep 1
done
;;
stop)
# heartbeat (haresources mode) will retry failed stop
# for a number of times in addition to this internal retry.
try=3
while true; do
$DRBDADM secondary $RES && break
# We used to lie here, and pretend success for anything != 11,
# to avoid the reboot on failed stop recovery for "simple
# config errors" and such. But that is incorrect.
# Don't lie to your cluster manager.
# And don't do config errors...
let --try || exit 1 # LSB generic error
sleep 1
done
;;
status)
if [ "$RES" = "all" ]; then
echo "A resource name is required for status inquiries."
exit 10
fi
ST=$( $DRBDADM role $RES )
ROLE=${ST%/**}
case $ROLE in
Primary|Secondary|Unconfigured)
# expected
;;
*)
# unexpected. whatever...
# If we are unsure about the state of a resource, we need to
# report it as possibly running, so heartbeat can, after failed
# stop, do a recovery by reboot.
# drbdsetup may fail for obscure reasons, e.g. if /var/lock/ is
# suddenly readonly. So we retry by parsing /proc/drbd.
drbd_set_role_from_proc_drbd
esac
case $ROLE in
Primary)
echo "running (Primary)"
exit 0 # LSB status "service is OK"
;;
Secondary|Unconfigured)
echo "stopped ($ROLE)"
exit 3 # LSB status "service is not running"
;;
*)
# NOTE the "running" in below message.
# this is a "heartbeat" resource script,
# the exit code is _ignored_.
echo "cannot determine status, may be running ($ROLE)"
exit 4 # LSB status "service status is unknown"
;;
esac
;;
*)
echo "Usage: drbddisk [resource] {start|stop|status}"
exit 1
;;
esac
exit 0
##############################################################
# cat /etc/ha.d/resource.d/nfs
killall -9 nfsd; /etc/init.d/nfs restart; exit 0
For mysql, the startup script shipped with MySQL is sufficient:
cp /etc/init.d/mysql /etc/ha.d/resource.d/
Note: the nfs, mysql and drbddisk scripts all need execute permission (chmod +x).
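The one-line nfs script above simply restarts NFS whenever it is invoked, which works in haresources mode but is crude. The same idea can be written as a start/stop/status wrapper; this is only an illustrative alternative sketch, not the script used above:

```shell
# Hypothetical start/stop/status wrapper for NFS as a heartbeat R1 resource.
nfs_resource() {
    case "$1" in
        start)  killall -9 nfsd 2>/dev/null; /etc/init.d/nfs restart ;;
        stop)   /etc/init.d/nfs stop ;;
        status) pidof nfsd >/dev/null && echo running || echo stopped ;;
        *)      echo "Usage: nfs {start|stop|status}"; return 1 ;;
    esac
}
```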
(6) Start heartbeat
# service heartbeat start   (start on both nodes)
# chkconfig heartbeat off
Note: start-at-boot is deliberately disabled; after a server reboot, heartbeat must be started by hand.
5. Testing
From another Linux client, mount the VIP 192.168.1.13; a successful, writable mount means the NFS+DRBD+Heartbeat stack is working.
Testing DRBD+Heartbeat+NFS availability:
1. While copying files into the mounted /tmp directory, reboot the primary DRBD server. The transfer resumes where it left off, although the drbd+heartbeat failover takes some time.
2. With the primary's eth0 taken down via ifdown, the secondary was manually promoted and mounted, and the files copied in on the old primary had indeed been replicated. After bringing eth0 back up, however, the primary/secondary relationship did not recover on its own: DRBD had detected split brain and both nodes went StandAlone, logging "Split-Brain detected, dropping connection!". This is the infamous split brain; DRBD upstream recommends recovering from it manually (in production this is rare, since nobody deliberately cuts a live server's network).
Manual recovery from the split-brain condition:
i. On the secondary (the node whose local changes will be discarded):
1. drbdadm secondary r0
2. drbdadm disconnect all
3. drbdadm -- --discard-my-data connect r0
ii. On the primary:
1. drbdadm disconnect all
2. drbdadm connect r0
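Because running `--discard-my-data` on the wrong node destroys the good copy, it can help to have a helper that only prints the command sequence for the chosen role. This is a hypothetical dry-run aid; it never calls drbdadm itself:

```shell
# Print the split-brain recovery commands for the given role.
# splitbrain_steps is a hypothetical dry-run helper.
splitbrain_steps() {
    case "$1" in
        secondary)
            echo "drbdadm secondary r0"
            echo "drbdadm disconnect all"
            echo "drbdadm -- --discard-my-data connect r0"
            ;;
        primary)
            echo "drbdadm disconnect all"
            echo "drbdadm connect r0"
            ;;
        *)
            echo "usage: splitbrain_steps {primary|secondary}" >&2
            return 1
            ;;
    esac
}
```

Review the output, then paste the commands on the matching node.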
3. If the Primary suffers a hardware failure and the Secondary must be promoted to Primary, proceed as follows:
On the primary, first unmount the DRBD device:
umount /tmp
Demote the master to secondary:
# drbdadm secondary r0
# cat /proc/drbd
1: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r-----
Both nodes are now secondaries.
On the standby data-11.com, promote it to primary:
# drbdadm primary r0
# cat /proc/drbd
1: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r-----
Known issue:
heartbeat does not monitor resources. If drbd or nfs dies, nothing happens; heartbeat only acts once it considers the peer machine dead. In other words, failover is triggered only by a machine going down or the heartbeat network being cut. For resource-level monitoring there is an alternative stack: corosync+pacemaker.
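Until such a migration, a crude stopgap is an external watchdog (run from cron or a loop) that executes a health check and fires a recovery action when it fails. This sketch is purely illustrative and not a heartbeat feature; the helper name is invented:

```shell
# Run a health-check command; if it fails, run the recovery action.
# watch_resource is a hypothetical helper, e.g. called from cron on the active node.
watch_resource() {
    check="$1"; action="$2"
    if ! sh -c "$check"; then
        sh -c "$action"
    fi
}
```

For example: `watch_resource 'pidof nfsd >/dev/null' '/etc/init.d/nfs restart'`.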