I. What Is DRBD
DRBD (Distributed Replicated Block Device) is a software-based, shared-nothing storage replication solution that mirrors the contents of block devices between servers.
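II. Common Storage Types in Clusters
DAS: Direct Attached Storage; DRBD belongs to this category
NAS: Network Attached Storage; for example NFS
SAN: Storage Area Network; for example iSCSI
DAS and SAN export a block-device interface, so the kernel on each front-end node maintains its own file locks. If the front-end nodes need to read and write concurrently but cannot share lock information, resource contention can occur. NAS, by contrast, exports a file-system interface, and the file locks are maintained by the host that provides the NAS, so this contention does not arise.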
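III. How DRBD Works
DRBD consists of two parts: a kernel module and user-space management tools. The kernel module sits between the buffer cache and the disk scheduler. DRBD writes the data it receives to the local disk and sends a copy over the network to the peer node, where the peer's DRBD writes it to the corresponding local disk. This works much like RAID 1. In a high-availability (HA) cluster, DRBD can take the place of a shared back-end storage device.
DRBD is built on top of a lower-level physical device and exposes a new block device. To the user, a DRBD device looks like a physical disk on which a file system can be created. The underlying device can be:
① a disk or a disk partition;
② software RAID;
③ a logical volume;
④ any other block device.
The underlying devices used to build a DRBD device should be the same size on both nodes.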
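IV. Basic DRBD Concepts
1. Resource:
Resource name: several DRBD setups can be built on the same pair of nodes, and each independent setup is identified by a resource name. The name may contain only ASCII characters and no whitespace.
DRBD device: the name of the DRBD block device that belongs to the resource, /dev/drbd#, where # starts at 0.
Disk configuration: the disk or partition that backs the DRBD device, and where DRBD's metadata is stored.
Network configuration: the network settings used for replication, such as whether the peers authenticate each other.
2. Resource role: Primary or Secondary. The DRBD version used here (8.x) supports only two nodes:
Primary: the only node allowed to read and write the device; it can format and mount it.
Secondary: may not mount, read, or write the device.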
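3. DRBD operating modes:
primary/secondary (single-primary model): at any moment a resource can be read and written only by the Primary node.
primary/primary (dual-primary model): either node can read and write the DRBD device at any time. This model depends on a cluster file system: the two nodes must form a high-availability cluster, run a distributed lock manager, and format the DRBD device with a cluster file system.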
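As a rough illustration of the dual-primary model, the shell sketch below writes a sample resource fragment to a temporary file instead of touching /etc/drbd.d. The resource name r0 and the scratch file are hypothetical. The allow-two-primaries and become-primary-on options are written in DRBD 8.4 syntax, and protocol C is chosen on the assumption that two concurrent writers need fully synchronous replication; check both against the documentation for your version. The per-host subsections are left out, so this is a fragment and not a complete resource.

```shell
#!/bin/bash
# Write a sample dual-primary fragment to a scratch file for inspection.
# "r0" is a hypothetical resource name; nothing under /etc is modified.
fragment=$(mktemp)

cat > "$fragment" <<'EOF'
resource r0 {
    net {
        protocol C;
        allow-two-primaries yes;    # DRBD 8.4 syntax
    }
    startup {
        become-primary-on both;     # promote both nodes at startup
    }
    # the two "on <hostname> { ... }" subsections are omitted here
}
EOF

cat "$fragment"
```

Even with these options set, the device still has to carry a cluster file system; an ordinary ext4 file system mounted on both nodes at once would be corrupted.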
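4. Replication protocols
protocol A: asynchronous replication. A write is considered complete once it has been written to the local disk and the replication data has been placed in the send queue. This protocol gives the best performance.
protocol B: semi-synchronous replication. A write is considered complete once it has been written to the local disk and the replication data has reached the peer node's buffer.
protocol C: fully synchronous replication. A write is considered complete only after it has been confirmed on both the local disk and the peer's disk. This is the default, and it offers the best data safety.
The following case study demonstrates how to install and configure DRBD.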
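V. Case Study: corosync + pacemaker + mysql + drbd
1. Design
2. Preparation
Time synchronization, mutual SSH trust between the two hosts, host-name resolution, and so on are omitted here. See the blog post at http://9124573.blog.51cto.com/9114573/1763980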
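3. Installing and configuring DRBD
⑴ Install the packages on both nodes
The DRBD kernel module has been part of the Linux kernel since version 2.6.33. If the running kernel is at that version or later, only the management tools need to be installed; otherwise both the kernel module and the management tools must be installed, and their version numbers must match.
For CentOS 5 the available DRBD versions are 8.0, 8.2, and 8.3. The corresponding rpm packages are named drbd, drbd82, and drbd83, and the kernel-module packages are kmod-drbd, kmod-drbd82, and kmod-drbd83. For CentOS 6 the version is 8.4, packaged as drbd and drbd-kmdl. When choosing packages, remember that the drbd and drbd-kmdl versions must match each other, and that the drbd-kmdl version must also match the running kernel.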
uname -r    # determine the kernel version
rpm -ivh drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64.rpm drbd-8.4.3-33.el6.x86_64.rpm
modprobe drbd
[root@node1 ~]# uname -r
2.6.32-431.el6.x86_64
[root@node1 ~]# yum -y install drbd-8.4.3-33.el6.x86_64.rpm drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64.rpm
...
[root@node1 ~]# modprobe -l | grep drbd
updates/drbd.ko
[root@node1 ~]# modprobe drbd
[root@node1 ~]# lsmod | grep drbd
drbd                  326010  0
libcrc32c               1246  1 drbd
# install the packages on the other node as well
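The version rule above (a separate kernel-module package is needed only below 2.6.33) can be checked with a few lines of shell. This is a minimal sketch: needs_kmod is a hypothetical helper name, it relies on the version sort of GNU sort -V, and it assumes that a kernel at 2.6.33 or later was actually built with the DRBD module, which a distribution kernel may not be.

```shell
#!/bin/bash
# Succeed (exit status 0) when the given kernel release is older than 2.6.33,
# i.e. when a separate DRBD kernel-module package is needed.
needs_kmod() {
    local release=$1
    local version=${release%%-*}    # "2.6.32-431.el6.x86_64" -> "2.6.32"
    local oldest
    # sort -V orders version strings; the first line is the older of the two.
    oldest=$(printf '%s\n%s\n' "$version" "2.6.33" | sort -V | head -n 1)
    [ "$oldest" = "$version" ] && [ "$version" != "2.6.33" ]
}

if needs_kmod "$(uname -r)"; then
    echo "kernel $(uname -r): install the kernel-module package and the tools"
else
    echo "kernel $(uname -r): the tools alone may be enough"
fi
```

For the CentOS 6 kernel shown in the transcript, 2.6.32-431.el6.x86_64, the function succeeds, which matches the decision to install drbd-kmdl.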
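⑵ Configure DRBD
DRBD's main configuration file is /etc/drbd.conf. For easier management it is usually split into several files stored under /etc/drbd.d/, and the main file merely pulls them together with "include" directives. The files in /etc/drbd.d are typically global_common.conf plus every file ending in .res. global_common.conf holds the global and common sections, and each .res file defines one resource.
The global section may appear only once, and if the whole configuration is kept in a single file rather than split up, it must come at the very beginning. The only parameters currently allowed in the global section are minor-count, dialog-refresh, disable-ip-verification, and usage-count.
The common section defines parameters that every resource inherits by default; any parameter that is valid in a resource definition can be set there. The common section is optional, but putting parameters shared by several resources into it keeps the configuration simpler.
A resource section defines a DRBD resource, and each resource normally lives in its own .res file under /etc/drbd.d. Every resource must be named, and the name may consist of any non-whitespace ASCII characters. Each resource section must contain at least two host ("on") subsections naming the nodes the resource is bound to; all other parameters can be inherited from the common section or from DRBD's defaults.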
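The naming rule for resources (ASCII only, no whitespace) is easy to check before a .res file is written. The sketch below only encodes the rule as stated in this article; valid_resource_name is a hypothetical helper, not a DRBD command, and DRBD itself may accept or reject names differently.

```shell
#!/bin/bash
# Succeed when the name is non-empty, contains no whitespace,
# and consists only of printable ASCII characters.
valid_resource_name() {
    local name=$1
    case $name in
        ''|*[[:space:]]*) return 1 ;;    # empty, or contains whitespace
    esac
    # In the C locale the range !-~ covers exactly the printable ASCII bytes.
    printf '%s\n' "$name" | LC_ALL=C grep -qE '^[!-~]+$'
}

for candidate in mysql "my sql"; do
    if valid_resource_name "$candidate"; then
        echo "ok: $candidate"
    else
        echo "rejected: $candidate"
    fi
done
```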
DRBD's user-space tools: drbdadm, drbdmeta, drbdsetup
[root@node1 ~]# rpm -ql drbd
/etc/bash_completion.d
/etc/bash_completion.d/drbdadm
/etc/drbd.conf                            # main configuration file
/etc/drbd.d
/etc/drbd.d/global_common.conf            # configuration file
/etc/ha.d/resource.d/drbddisk
/etc/ha.d/resource.d/drbdupper
/etc/rc.d/init.d/drbd                     # service script
/etc/udev/rules.d/65-drbd.rules
/etc/xen/scripts/block-drbd
/lib/drbd/drbdadm-83
/lib/drbd/drbdsetup-83
/sbin/drbdadm                             # user-space tools
/sbin/drbdmeta
/sbin/drbdsetup
/usr/lib/drbd                             # helper scripts live in this directory
/usr/lib/drbd/crm-fence-peer.sh
/usr/lib/drbd/crm-unfence-peer.sh
/usr/lib/drbd/notify-emergency-reboot.sh
...
/usr/lib/ocf
/usr/lib/ocf/resource.d
/usr/lib/ocf/resource.d/linbit
/usr/lib/ocf/resource.d/linbit/drbd       # resource agent
/usr/sbin/drbd-overview                   # shows the state of the DRBD nodes
/usr/share/doc/drbd-8.4.3
...
[root@node1 ~]# vim /etc/drbd.conf
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
◆ First create the partition to be mirrored by DRBD on both nodes. Note: the partitions should be the same size on both nodes.
[root@node1 ~]# fdisk /dev/sda
WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').
Command (m for help): p
Disk /dev/sda: 32.2 GB, 32212254720 bytes
255 heads, 63 sectors/track, 3916 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000f3804
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          26      204800   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2              26        1332    10485760   83  Linux
/dev/sda3            1332        1593     2097152   82  Linux swap / Solaris
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
e
Selected partition 4
First cylinder (1593-3916, default 1593):
Using default value 1593
Last cylinder, +cylinders or +size{K,M,G} (1593-3916, default 3916):
Using default value 3916
Command (m for help): n
First cylinder (1593-3916, default 1593):
Using default value 1593
Last cylinder, +cylinders or +size{K,M,G} (1593-3916, default 3916): +2G
Command (m for help): p
Disk /dev/sda: 32.2 GB, 32212254720 bytes
255 heads, 63 sectors/track, 3916 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000f3804
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          26      204800   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2              26        1332    10485760   83  Linux
/dev/sda3            1332        1593     2097152   82  Linux swap / Solaris
/dev/sda4            1593        3916    18666534    5  Extended
/dev/sda5            1593        1854     2103487+  83  Linux
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.
[root@node1 ~]# partx -a /dev/sda
...
[root@node1 ~]# partx -a /dev/sda
...
# perform the same steps on the other node
◆ Configure /etc/drbd.d/global_common.conf
[root@node1 ~]# cd /etc/drbd.d
[root@node1 drbd.d]# vim global_common.conf
global {
    usage-count no;    # whether to report this installation to the DRBD project's usage counter; usually set to no
    # minor-count dialog-refresh disable-ip-verification
}
common {               # the common section holds settings shared by all resources
    protocol C;        # replication protocol between the nodes; protocol C is the default
    handlers {
        # These are EXAMPLE handlers only.
        # They may have severe implications,
        # like hard resetting the node under certain circumstances.
        # Be careful when choosing your poison.
        # pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        # pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        # local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
        # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
        # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
        # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
    }
    startup {
        #wfc-timeout 120;
        #degr-wfc-timeout 120;
        #become-primary-on both;    # enable this option for the dual-primary model
    }
    options {
        # cpu-mask on-no-data-accessible
    }
    disk {
        on-io-error detach;         # detach the backing disk when it reports an I/O error
        #fencing resource-only;
    }
    net {
        cram-hmac-alg "sha1";       # algorithm used for peer authentication
        shared-secret "mydrbdtest"; # shared secret
    }
    syncer {
        rate 300M;                  # maximum synchronization rate; raise it for the initial sync and lower it afterwards
    }
}
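The rate setting in the syncer section caps resynchronization traffic. A rule of thumb from the DRBD user guide is to give resynchronization roughly 30% of the available replication bandwidth, where the available bandwidth is the smaller of the disk write throughput and the network throughput. The sketch below does that arithmetic. The function name and the throughput figures (180 MB/s disk, 110 MB/s network) are illustrative assumptions, not measurements from this setup.

```shell
#!/bin/bash
# Suggest a resynchronization rate as 30% of the bottleneck throughput.
# Arguments: disk write throughput and network throughput, both in MB/s.
suggest_sync_rate() {
    local disk_mb=$1 net_mb=$2
    local bottleneck=$(( disk_mb < net_mb ? disk_mb : net_mb ))
    echo "$(( bottleneck * 30 / 100 ))M"
}

suggest_sync_rate 180 110    # prints 33M
```

A value well above what the link can carry, such as 300M on Gigabit Ethernet, lets a resynchronization take the whole link and compete with normal replication, which is why the comment in the configuration suggests lowering it after the initial sync.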
◆ Define a resource in /etc/drbd.d/mysql.res
[root@node1 drbd.d]# vim mysql.res
resource mysql {
    on node1 {
        device    /dev/drbd0;           # DRBD device name
        disk      /dev/sda5;            # backing disk or partition
        address   192.168.30.10:7798;
        meta-disk internal;             # where the metadata is stored
    }
    on node2 {
        device    /dev/drbd0;
        disk      /dev/sda5;
        address   192.168.30.20:7798;
        meta-disk internal;
    }
}
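When the same layout is needed for other resources, the definition can be generated instead of typed. The sketch below prints a two-node resource in the same shape as mysql.res; render_resource is a hypothetical helper, not part of the DRBD tools, and the values passed to it are the ones used in this article. Keep in mind that DRBD matches each on section against the node's host name as reported by uname -n, so the names passed in must be the real host names.

```shell
#!/bin/bash
# Print a resource definition in the same layout as mysql.res.
# Usage: render_resource NAME DEVICE DISK PORT HOST1 IP1 HOST2 IP2
render_resource() {
    local name=$1 device=$2 disk=$3 port=$4
    shift 4
    echo "resource $name {"
    # The remaining arguments are hostname/IP pairs, one pair per node.
    while [ $# -ge 2 ]; do
        cat <<EOF
    on $1 {
        device    $device;
        disk      $disk;
        address   $2:$port;
        meta-disk internal;
    }
EOF
        shift 2
    done
    echo "}"
}

render_resource mysql /dev/drbd0 /dev/sda5 7798 \
    node1 192.168.30.10 node2 192.168.30.20
```

Redirecting the output to a file under /etc/drbd.d and copying it to the peer gives both nodes an identical definition.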
◆ Copy the configuration files to the other node
[root@node1 drbd.d]# ls
global_common.conf  mysql.res
[root@node1 drbd.d]# scp ./* root@node2:/etc/drbd.d/
global_common.conf                            100% 1425     1.4KB/s   00:00
mysql.res                                     100%  272     0.3KB/s   00:00
◆ Initialize the resource by running the following on each of the two nodes: drbdadm create-md RESOURCE
[root@node1 ~]# drbdadm create-md mysql;ssh root@node2 'drbdadm create-md mysql'
Writing meta data...
initializing activity log
NOT initializing bitmap
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
New drbd meta data block successfully created.
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
NOT initializing bitmap
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
Writing meta data...
initializing activity log
New drbd meta data block successfully created.
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
◆ Start the service by running the following on each of the two nodes: /etc/init.d/drbd start
[root@node1 ~]# service drbd start;ssh root@node2 'service drbd start'
Starting DRBD resources: [
     create res: mysql
   prepare disk: mysql
    adjust disk: mysql
     adjust net: mysql
]
..........
***************************************************************
 DRBD's startup script waits for the peer node(s) to appear.
 - In case this node was already a degraded cluster before the
   reboot the timeout is 0 seconds. [degr-wfc-timeout]
 - If the peer was available before the reboot the timeout will
   expire after 0 seconds. [wfc-timeout]
   (These values are for resource 'mysql'; 0 sec -> wait forever)
 To abort waiting enter 'yes' [  80]: yes
.
Starting DRBD resources: [
     create res: mysql
   prepare disk: mysql
    adjust disk: mysql
     adjust net: mysql
]
WARN: stdin/stdout is not a TTY; using /dev/console.
◆ Check the state after startup: cat /proc/drbd or drbd-overview
[root@node1 ~]# drbd-overview
  0:mysql/0  Connected Secondary/Secondary Inconsistent/Inconsistent C r-----
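The drbd-overview line packs several pieces of state into one row: the connection state, the local/peer roles, and the local/peer disk states. The sketch below splits such a line into labelled fields so that scripts can test them. parse_overview and fully_synced are hypothetical helper names, and the field order is assumed from the sample output in this article (a single resource, DRBD 8.4.3).

```shell
#!/bin/bash
# Split one drbd-overview line into labelled fields.
# Assumed layout: minor:resource/volume cstate role/role disk/disk protocol flags
parse_overview() {
    awk '{
        split($1, id, "[:/]")     # "0:mysql/0" -> id[1]=0 id[2]=mysql id[3]=0
        split($3, role, "/")      # local role / peer role
        split($4, disk, "/")      # local disk state / peer disk state
        printf "resource=%s cstate=%s role=%s peer_role=%s disk=%s peer_disk=%s\n",
               id[2], $2, role[1], role[2], disk[1], disk[2]
    }'
}

# Succeed only when both sides report an UpToDate disk.
fully_synced() {
    parse_overview | grep -q 'disk=UpToDate peer_disk=UpToDate'
}

sample='0:mysql/0 Connected Secondary/Secondary Inconsistent/Inconsistent C r-----'
echo "$sample" | parse_overview
```

On a live node the input would come from drbd-overview itself, for example drbd-overview | fully_synced, which could gate a role switch until the initial synchronization has finished.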
◆ Promote one of the nodes to Primary. Run one of the following commands on the node that is to become Primary; this starts the initial synchronization with that node as the source:
drbdadm primary --force RESOURCE
or drbdadm -- --overwrite-data-of-peer primary RESOURCE
[root@node1 ~]# drbdadm primary --force mysql
[root@node1 ~]# drbd-overview
  0:mysql/0  SyncSource Primary/Secondary UpToDate/Inconsistent C r---n-
    [==>.................] sync'ed: 16.8% (1753172/2103380)K
[root@node2 ~]# drbd-overview
  0:mysql/0  Connected Secondary/Primary UpToDate/UpToDate C r-----
◆ Create a file system and mount it
A file system can be mounted only on the Primary node, so the DRBD device can be formatted only after a Primary has been set:
mke2fs -t ext4 -L DRBD /dev/drbd0
mkdir /mydata
mount /dev/drbd0 /mydata
[root@node1 ~]# mke2fs -t ext4 -L DRBD /dev/drbd0
...
[root@node1 ~]# mkdir /mydata
[root@node1 ~]# mount /dev/drbd0 /mydata
[root@node1 ~]# mkdir /mydata/{data,binlogs}
[root@node1 ~]# chown -R mysql.mysql /mydata/{data,binlogs}
◆ Install mysql-server on both nodes and initialize MySQL on the Primary node
[root@node1 ~]# yum -y install mysql-server;ssh root@node2 'yum -y install mysql-server'
...
[root@node1 ~]# vim /etc/my.cnf
[mysqld]
datadir=/mydata/data
socket=/var/lib/mysql/mysql.sock
user=mysql
log-bin=/mydata/binlogs/mysql-bin
innodb_file_per_table=ON
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
skip-name-resolve
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
[root@node1 ~]# scp /etc/my.cnf root@node2:/etc/
my.cnf                                        100%  326     0.3KB/s   00:00
[root@node1 ~]# service mysqld start
Initializing MySQL database:  Installing MySQL system tables...
OK
Filling help tables...
OK
...
Starting mysqld:                                           [  OK  ]
[root@node1 ~]# ls /mydata/data
ibdata1  ib_logfile0  ib_logfile1  mysql  test
[root@node1 ~]# mysql
...
mysql> create database hellodb;
Query OK, 1 row affected (0.00 sec)
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hellodb            |
| mysql              |
| test               |
+--------------------+
4 rows in set (0.01 sec)
mysql> grant all on *.* to [email protected]'192.168.30.%' identified by 'magedu';
Query OK, 0 rows affected (0.07 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> exit
Bye
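◆ Switching the Primary and Secondary roles
In the primary/secondary model only one node can be Primary at any moment. To swap the roles, the current Primary must first be demoted to Secondary, and only then can the former Secondary be promoted to Primary. Anything that is using the mount point (here mysqld) has to be stopped first, otherwise umount fails with "device is busy":
node1:
service mysqld stop
umount /mydata
drbdadm secondary mysql
node2:
drbdadm primary mysql
drbd-overview
mkdir /mydata
mount /dev/drbd0 /mydata
service mysqld start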
[root@node1 ~]# umount /mydata
umount: /mydata: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
[root@node1 ~]# service mysqld stop
Stopping mysqld:                                           [  OK  ]
[root@node1 ~]# umount /mydata
[root@node1 ~]# drbdadm secondary mysql
[root@node1 ~]# drbd-overview
  0:mysql/0  Connected Secondary/Secondary UpToDate/UpToDate C r-----
[root@node2 ~]# drbdadm primary mysql
[root@node2 ~]# drbd-overview
  0:mysql/0  Connected Primary/Secondary UpToDate/UpToDate C r-----
[root@node2 ~]# mkdir /mydata
[root@node2 ~]# mount /dev/drbd0 /mydata
[root@node2 ~]# service mysqld start
Starting mysqld:                                           [  OK  ]
[root@node2 ~]# mysql
...
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hellodb            |
| mysql              |
| test               |
+--------------------+
4 rows in set (0.00 sec)
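The manual switch-over shown in this section can be wrapped in two small functions, one for each node. This is a sketch under stated assumptions, not a replacement for a cluster manager such as pacemaker: demote, promote, run, and the DRY_RUN variable are hypothetical names, and by default the script only prints the commands it would execute.

```shell
#!/bin/bash
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on the current Primary: stop the service, unmount, then demote.
demote() {
    local res=$1 mnt=$2
    run service mysqld stop &&
    run umount "$mnt" &&
    run drbdadm secondary "$res"
}

# Run on the node taking over: promote, mount, then start the service.
promote() {
    local res=$1 dev=$2 mnt=$3
    run drbdadm primary "$res" &&
    run mkdir -p "$mnt" &&
    run mount "$dev" "$mnt" &&
    run service mysqld start
}

demote mysql /mydata
promote mysql /dev/drbd0 /mydata
```

In practice demote would be called on node1 and promote on node2, with DRY_RUN=0 set once the printed commands look right. The && chaining stops the sequence at the first failing step, so a busy mount point never leads to a demotion attempt.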