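Cluster
(1) Cluster concepts
Put simply, a cluster is a group of computers that together present a set of network resources to users as a single system. Each individual computer is a node of the cluster. Clustering is an important way to make full use of computing resources, because workload can be migrated from an overloaded system (node) to another system in the cluster. Typical hardware components are nodes, the network, and storage; typical software components are the cluster system software, the node operating systems, and application support software. A cluster provides high availability (HA): when one node fails, its tasks are handed over to another node, which guards against a single point of failure.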
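(2) Types of cluster systems
Clusters are generally divided into two types:
1) HA (High Availability) clusters
2) HPC clusters, also called scientific computing clusters
(3) Building a cluster
This walkthrough builds a two-node cluster. Make sure the clocks on the nodes are synchronized.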
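An HA cluster keeps a service highly available: only one node actually does the work, while the other is on hot standby. The two nodes are usually connected directly by a network cable and also have a heartbeat link, over which corosync (the heartbeat engine) checks whether each host is alive. Every host has its own IP address, but clients connect to a virtual IP (VIP). The VIP, the services, and the file system together form a resource group. When a node goes down, the whole resource group migrates to the other node. Because the VIP does not change, clients notice no difference. What changes is the MAC address behind the VIP, which is why ARP is involved: the new owner has to announce the new MAC address.
If the network between the two nodes is cut, or the active node stops responding because its load is too high, corosync no longer receives heartbeats and tells the other node to take over. If the original node then recovers, the two nodes compete for the resource group and may write to the file system at the same time, which corrupts it. This situation is called split-brain. Fencing prevents it: when a node misbehaves, the fence device cuts its power immediately. (It does not reboot the node, because a reboot would write what is in memory to disk, whereas cutting the power simply discards it.) The failed node therefore cannot compete for the resource group.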
I. Create the cluster
server1 (the first node):
1 vim /etc/yum.repos.d/rhel-source.repo
[rhel-source]
name=Red Hat Enterprise Linux $releasever - $basearch - Source
baseurl=http://172.25.38.250/rhel6.5
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
[HighAvailability]
name=HighAvailability
baseurl=http://172.25.38.250/rhel6.5/HighAvailability
gpgcheck=0
[LoadBalancer]
name=LoadBalancer
baseurl=http://172.25.38.250/rhel6.5/LoadBalancer
gpgcheck=0
[ResilientStorage]
name=ResilientStorage
baseurl=http://172.25.38.250/rhel6.5/ResilientStorage
gpgcheck=0
[ScalableFileSystem]
name=ScalableFileSystem
baseurl=http://172.25.38.250/rhel6.5/ScalableFileSystem
gpgcheck=0
2 yum repolist
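3 yum install -y ricci ###ricci is the cluster management agent; install it on every cluster node (its connections are encrypted)###
4 passwd ricci ###set a password for the ricci user; required on RHEL 6###
5 /etc/init.d/ricci start ###start the service (it listens on port 11111)###
6 chkconfig ricci on ###start at boot###
7 yum install -y luci ###luci provides a web interface for managing ricci (served over HTTPS)###
8 /etc/init.d/luci start ###start the service (it listens on port 8084)###
9 chkconfig luci on ###start at boot###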
server2 (the second node):
Repeat the same steps as on server1, except that luci does not need to be installed.
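The four add-on stanzas in step 1 differ only in their name, so they can be generated rather than typed on each node. The following is a minimal sketch, assuming the same mirror URL as in the repo file above; `gen_addon_repos` is a hypothetical helper, not part of yum.

```shell
# gen_addon_repos BASE_URL NAME...
# Print one yum repo stanza per add-on directory under BASE_URL.
# (Hypothetical helper; the URL below is the mirror used in step 1.)
gen_addon_repos() {
    local base_url="$1"
    local name
    shift
    for name in "$@"; do
        printf '[%s]\nname=%s\nbaseurl=%s/%s\ngpgcheck=0\n\n' \
            "$name" "$name" "$base_url" "$name"
    done
}

gen_addon_repos http://172.25.38.250/rhel6.5 \
    HighAvailability LoadBalancer ResilientStorage ScalableFileSystem
```

The output can be appended to /etc/yum.repos.d/rhel-source.repo on each node, after which yum repolist should show the four add-on repositories if the mirror is reachable.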
Test:
Visit https://172.25.78.1:8084
From the command line:
clustat
[[email protected] ~]# clustat
Cluster Status for haha @ Mon Jul 24 05:15:48 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1 1 Online, Local
server2 2 Online
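To check this result from a script rather than by eye, the clustat output can be parsed. The following is a minimal sketch, assuming the output layout shown above; `cluster_ok` is a hypothetical helper and the sample text is copied from the transcript.

```shell
# cluster_ok: read clustat output on stdin; succeed only if the cluster is
# quorate and every listed member is Online. (Hypothetical helper.)
cluster_ok() {
    awk '
        /^Member Status:/ { quorate = ($3 == "Quorate") }
        # Member rows look like: <name> <id> Online[, Local][, rgmanager]
        $2 ~ /^[0-9]+$/ && NF >= 3 { members++; if ($3 !~ /^Online/) down++ }
        END { exit !(quorate && members > 0 && down == 0) }
    '
}

sample='Cluster Status for haha @ Mon Jul 24 05:15:48 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1 1 Online, Local
server2 2 Online'

if printf '%s\n' "$sample" | cluster_ok; then
    echo "cluster healthy"
else
    echo "cluster degraded"
fi
```

On a cluster node the same function would be fed live data with `clustat | cluster_ok`.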
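II. Create the fence device
The physical host runs the libvirtd service, which manages virtual machines and other virtualization functions. libvirt consists of an API library, a daemon (libvirtd), and a command-line tool (virsh). If the service is stopped, virtual machines can no longer be managed through virsh. The physical host, through libvirt, is used here as the fence device.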
1 yum install fence-virtd-multicast.x86_64 fence-virtd-libvirt.x86_64 fence-virtd.x86_64 -y
2 fence_virtd -c ###configure the fence service interactively###
【【【【【【【【【【【 Session:
Module search path [/usr/lib64/fence-virt]:
Available backends:
libvirt 0.1
Available listeners:
multicast 1.2
Listener modules are responsible for accepting requests
from fencing clients.
Listener module [multicast]: ###listen using multicast###
The multicast listener module is designed for use environments
where the guests and hosts may communicate over a network using
multicast.
The multicast address is the address that a client will use to
send fencing requests to fence_virtd.
Multicast IP Address [225.0.0.12]: ###multicast address; the default is used here###
Using ipv4 as family.
Multicast IP Port [1229]: ###default port###
Setting a preferred interface causes fence_virtd to listen only
on that interface. Normally, it listens on all interfaces.
In environments where the virtual machines are using the host
machine as a gateway, this *must* be set (typically to virbr0).
Set to 'none' for no interface.
Interface [virbr0]: br0 ###the interface through which the cluster nodes reach the fence daemon###
The key file is the shared key information which is used to
authenticate fencing requests. The contents of this file must
be distributed to each physical host and virtual machine within
a cluster.
Key File [/etc/cluster/fence_xvm.key]: ###the virtual machines authenticate with this key###
Backend modules are responsible for routing requests to
the appropriate hypervisor or management layer.
Backend module [libvirt]: ###the backend to connect to###
Configuration complete.
=== Begin Configuration ===
backends {
libvirt {
uri = "qemu:///system";
}
}
listeners {
multicast {
port = "1229";
family = "ipv4";
interface = "br0";
address = "225.0.0.12";
key_file = "/etc/cluster/fence_xvm.key";
}
}
fence_virtd {
module_path = "/usr/lib64/fence-virt";
backend = "libvirt";
listener = "multicast";
}
=== End Configuration ===
Replace /etc/fence_virt.conf with the above [y/N]? y
】】】】】】】】】】】
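3 mkdir -p /etc/cluster ###create the directory if it does not exist###
4 dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=128 count=1 ###/dev/urandom produces random data, which is used as the key###
5 systemctl restart fence_virtd.service ###restart the service only after the key has been generated###
6 scp /etc/cluster/fence_xvm.key [email protected]:/etc/cluster/ ###copy the key to the cluster nodes###
7 scp /etc/cluster/fence_xvm.key [email protected]:/etc/cluster/
In the web interface provided by luci: fence devices ---> add ---> fill in the details ---> submit (submitting writes the whole configuration to /etc/cluster/cluster.conf)
The cluster only knows the nodes by host name (e.g. server1, server2), while the physical host that runs the fence daemon knows them by virtual machine name (e.g. vm1, vm2), so the two must be mapped: vm1 ---> server1 , vm2 ---> server2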
nodes ---> server1 ---> add fence method ---> add fence instance
nodes ---> server2 ---> add fence method ---> add fence instance
The changes just made in the web interface can be inspected in /etc/cluster/cluster.conf:
[[email protected] cluster]# cat cluster.conf
<?xml version="1.0"?>
<cluster config_version="6" name="haha">
<clusternodes>
<clusternode name="server1" nodeid="1">
<fence>
<method name="fence1">
<device domain="vm1" name="vmfence"/>
</method>
</fence>
</clusternode>
<clusternode name="server2" nodeid="2">
<fence>
<method name="fence2">
<device domain="vm2" name="vmfence"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman expected_votes="1" two_node="1"/>
<fencedevices>
<fencedevice agent="fence_xvm" name="vmfence"/>
</fencedevices>
</cluster>
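The node-to-guest mapping can be read back from cluster.conf to confirm that each node points at the right virtual machine. The following is a minimal sketch; `fence_map` is a hypothetical helper, and its line-oriented parsing assumes one element per line as in the listing above (a real XML parser would be more robust).

```shell
# fence_map: read cluster.conf on stdin and print "<node> <vm-domain>" pairs.
# Assumes each <clusternode ...> and <device .../> element sits on its own line.
fence_map() {
    awk '
        /<clusternode / {
            match($0, /name="[^"]*"/)
            node = substr($0, RSTART + 6, RLENGTH - 7)
        }
        /<device / {
            match($0, /domain="[^"]*"/)
            print node, substr($0, RSTART + 8, RLENGTH - 9)
        }
    '
}

fence_map <<'EOF'
<clusternode name="server1" nodeid="1">
<device domain="vm1" name="vmfence"/>
</clusternode>
<clusternode name="server2" nodeid="2">
<device domain="vm2" name="vmfence"/>
</clusternode>
<fencedevice agent="fence_xvm" name="vmfence"/>
EOF
```

For this excerpt the function prints `server1 vm1` and `server2 vm2`. On a node it would be run as `fence_map < /etc/cluster/cluster.conf`, and the second column compared with the guest names that `virsh list --all` reports on the physical host.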
Test:
fence_node server1 ###the fence device cuts the power to server1###
Bring down eth0 on server2 and server2 is powered off immediately; that is fencing at work.
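III. Create the failover domain and resources in the web interface
failover domains ---> add
Prioritized: enables priority ordering; the lower the number, the higher the priority
Restricted: the service may run only on the nodes that belong to this domain
No Failback: if server1 fails and later recovers, the service does not move back to it. (If server1 and server2 differ greatly in performance, it is better to leave this unchecked so that the service fails back.)
resources ---> add ---> ip address (the VIP)
resources ---> add ---> script
service groups ---> add ---> add resource (resources start in the order in which they are added, so add the IP address first and the script second)
Automatically Start This Service: start the service automatically when the cluster starts
Run Exclusive: the service runs only on a node that has no other services running on it
Recovery Policy: relocate ###when the service fails on one node, the other node takes it over right away###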
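How Prioritized and No Failback interact can be illustrated with a small model. This is a simplified sketch of the selection rule described above, not rgmanager's actual implementation; `choose_owner` and its input format are hypothetical.

```shell
# choose_owner CURRENT NOFAILBACK
# stdin: one "<node> <priority> <online|offline>" line per domain member.
# Prints the node that should own the service. (Simplified, hypothetical model.)
choose_owner() {
    awk -v cur="$1" -v nofailback="$2" '
        $3 == "online" {
            if ($1 == cur) cur_online = 1
            # A lower number means a higher priority.
            if (best == "" || $2 + 0 < min) { best = $1; min = $2 + 0 }
        }
        END {
            if (best == "") exit 1                         # no member is online
            if (nofailback == 1 && cur_online) print cur   # stay on the current node
            else print best                                # move to the preferred node
        }'
}

members='server1 1 online
server2 2 online'

# server1 has just come back while server2 holds the service.
printf '%s\n' "$members" | choose_owner server2 0   # prints server1 (fails back)
printf '%s\n' "$members" | choose_owner server2 1   # prints server2 (No Failback)
```

With No Failback unchecked the service returns to the higher-priority node as soon as it is online again, which is the behaviour recommended above when the two nodes differ in performance.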
Test:
Visit 172.25.78.100
From the command line:
[[email protected] ~]# clustat
Cluster Status for haha @ Tue Jul 25 10:47:59 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1 1 Online, Local, rgmanager
server2 2 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache server1 started
[[email protected] ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 52:54:00:d5:e3:a3 brd ff:ff:ff:ff:ff:ff
inet 172.25.78.1/24 brd 172.25.78.255 scope global eth0
inet 172.25.78.100/24 scope global secondary eth0 ###the VIP has been assigned###
inet6 fe80::5054:ff:fed5:e3a3/64 scope link
valid_lft forever preferred_lft forever
[[email protected] ~]# ps ax | grep httpd ###check that httpd processes are running###
8199 ? S<s 0:00 /usr/sbin/httpd
8201 ? S< 0:00 /usr/sbin/httpd
8202 ? S< 0:00 /usr/sbin/httpd
8203 ? S< 0:00 /usr/sbin/httpd
8204 ? S< 0:00 /usr/sbin/httpd
8205 ? S< 0:00 /usr/sbin/httpd
8206 ? S< 0:00 /usr/sbin/httpd
8207 ? S< 0:00 /usr/sbin/httpd
8208 ? S< 0:00 /usr/sbin/httpd
16568 pts/0 S+ 0:00 grep httpd
If the httpd service on server1 is stopped, server2 takes over the service:
[[email protected] ~]# /etc/init.d/httpd stop
Stopping httpd: [ OK ]
[[email protected] ~]# clustat
Cluster Status for haha @ Tue Jul 25 10:54:21 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1 1 Online, Local, rgmanager
server2 2 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache server2 started
Crash the kernel on server2 and server1 takes over the service:
[[email protected] ~]# echo c > /proc/sysrq-trigger
[[email protected] ~]# clustat
Cluster Status for haha @ Tue Jul 25 10:57:20 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1 1 Online, rgmanager
server2 2 Online, Local, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache server1 started
[[email protected] ~]# clusvcadm -r apache -m server2 ###relocate the service to server2###
Trying to relocate service:apache to server2...Success
service:apache is now running on server2
[[email protected] ~]# clustat
Cluster Status for haha @ Tue Jul 25 10:59:45 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1 1 Online, Local, rgmanager
server2 2 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache server2 started
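Each failover test above ends by reading the service row of clustat. That check can be scripted; the following is a minimal sketch, assuming the row layout shown in the transcripts. `service_owner` is a hypothetical helper.

```shell
# service_owner NAME
# Read clustat output on stdin and print "<owner> <state>" for service NAME.
# Fails if the service is not listed. (Hypothetical helper.)
service_owner() {
    awk -v svc="service:$1" '
        $1 == svc { print $2, $3; found = 1 }
        END { exit !found }
    '
}

service_owner apache <<'EOF'
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache server2 started
EOF
# prints: server2 started
```

After a relocation, `clustat | service_owner apache` can be polled until it reports the expected node with state started.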
IV. Create the shared iSCSI device
Start another virtual machine (server3) as the shared storage server, with an 8G disk.
1 yum install -y scsi-*
2 vim /etc/tgt/targets.conf
Contents:
<target iqn.2017-07.com.example:server.target1>
backing-store /dev/vdb
initiator-address 172.25.38.1 ###access control: only these initiators may connect###
initiator-address 172.25.38.2
</target>
3 /etc/init.d/tgtd start
4 tgt-admin -s ###show the details of the iSCSI target just created###
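When more initiators are added later, the target stanza grows by one initiator-address line per node. The following is a minimal sketch that generates the stanza from a list of addresses; `gen_target` is a hypothetical helper, and the IQN, device, and addresses are the ones used above.

```shell
# gen_target IQN BACKING_DEVICE INITIATOR_IP...
# Print a targets.conf stanza that restricts access to the listed initiators.
# (Hypothetical helper; it only prints text and does not touch tgtd.)
gen_target() {
    local iqn="$1" dev="$2"
    local ip
    shift 2
    printf '<target %s>\n' "$iqn"
    printf '    backing-store %s\n' "$dev"
    for ip in "$@"; do
        printf '    initiator-address %s\n' "$ip"
    done
    printf '</target>\n'
}

gen_target iqn.2017-07.com.example:server.target1 /dev/vdb 172.25.38.1 172.25.38.2
```

The output has the same shape as the stanza in step 2 and can be pasted into /etc/tgt/targets.conf before tgtd is started.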
On the cluster nodes server1 and server2:
lvmconf --enable-cluster ###enable clustered LVM locking###
vim /etc/lvm/lvm.conf ###check that locking_type = 3 is now set###
1 yum install -y iscsi-*
2 iscsiadm -m discovery -t st -p 172.25.38.3
3 iscsiadm -m node -l
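####The following steps must be run on one node only####
***************************************************************************************
###Format as ext4###
fdisk -cu /dev/sda ###create a 2G partition###
partprobe ###re-read the partition table; if one node sees the new partition and the other does not, run this on the node that does not###
mkfs.ext4 /dev/sda1 ###format as ext4; ext4 is a local file system, so what one node writes is not visible to the other###
mount /dev/sda1 /mnt/ ###mount on both nodes###
Test: write a file on one node and look on the other node; the file is not visible there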
[[email protected] ~]# cd /mnt/
[[email protected] mnt]# ls
lost+found
[[email protected] mnt]# cp /etc/passwd /mnt/
[[email protected] mnt]# ls
lost+found passwd
[[email protected] ~]# mount /dev/sda1 /mnt/
[[email protected] ~]# cd /mnt/
[[email protected] mnt]# ls
lost+found
[[email protected] ~]# umount /mnt/ ###after unmounting and mounting again, the file becomes visible###
[[email protected] ~]# mount /dev/sda1 /mnt/
[[email protected] ~]# cd /mnt/
[[email protected] mnt]# ls
lost+found passwd
***************************************************************************************
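#####Create the LVM volumes (on one node only)#####
4 fdisk -cu /dev/sda ###create a partition for LVM (partition type 8e)###
5 pvcreate /dev/sda1 (create on one node; check with pvs on the other)
6 vgcreate clustervg /dev/sda1 (create on one node; check with vgs on the other)
7 lvcreate -L +2G -n demo clustervg (create on one node; check with lvs on the other)
8 mkfs.ext4 /dev/clustervg/demo ###format###
【LVM creation transcript: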
server1:
[[email protected] ~]# pvcreate /dev/sda1
Physical volume "/dev/sda1" successfully created
[[email protected] ~]# vgcreate clustervg /dev/sda1
Clustered volume group "clustervg" successfully created.
[[email protected] ~]# lvcreate -L +2G -n demo clustervg
Logical volume "demo" created
server2:
[[email protected] ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/sda1 clustervg lvm2 a-- 8.00g 8.00g
/dev/vda2 VolGroup lvm2 a-- 19.51g 0
[[email protected] ~]# vgs
VG #PV #LV #SN Attr VSize VFree
VolGroup 1 2 0 wz--n- 19.51g 0
clustervg 1 0 0 wz--nc 8.00g 8.00g
[[email protected] ~]# lvs
LV VG Attr LSize Pool Origin Data% Move Log Cpy%Sync Convert
lv_root VolGroup -wi-ao---- 18.54g
lv_swap VolGroup -wi-ao---- 992.00m
demo clustervg -wi-a----- 2.00g
[[email protected] ~]# lvs
LV VG Attr LSize Pool Origin Data% Move Log Cpy%Sync Convert
lv_root VolGroup -wi-ao---- 18.54g
lv_swap VolGroup -wi-ao---- 992.00m
】
9 In the web interface (add the resource)
resources ---> add
service groups ---> apache --->
Test:
1
[[email protected] ~]# mount
/dev/mapper/VolGroup-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/vda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
none on /sys/kernel/config type configfs (rw)
/dev/mapper/clustervg-demo on /var/www/html type ext4 (rw)
[[email protected] ~]# cd /var/www/html/
[[email protected] html]# ls
lost+found
[[email protected] html]# vim index.html
[[email protected] ~]# curl 172.25.78.100
<h1>www.westos.org</h1>
2
[[email protected] html]# clusvcadm -r apache -m server2 ###relocate the service to server2###
Trying to relocate service:apache to server2...
[[email protected] ~]# mount
/dev/mapper/VolGroup-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/vda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
none on /sys/kernel/config type configfs (rw)
/dev/mapper/clustervg-demo on /var/www/html type ext4 (rw)
[[email protected] ~]#
V. Remove the iSCSI back-end storage
1 Remove the resource in the web interface:
service groups --- > apache ---> filesystem --- > remove ---> submit
resources ---> webdata ---> delete
2 Remove the LVM volumes
clusvcadm -d apache
lvremove /dev/clustervg/demo
vgremove clustervg
pvremove /dev/sda1
iscsiadm -m node -u ###log out on both nodes###
iscsiadm -m node -o delete ###delete the node records on both nodes###
3 Check that /dev/sda is gone
cat /proc/partitions
VI. GFS2 file system (a shared-disk cluster file system)
1 iscsiadm -m discovery -t st -p 172.25.78.3
2 iscsiadm -m node -l
####Run the following steps on one node only####
3 fdisk -cu /dev/sda
4 pvcreate /dev/sda1
5 vgcreate clustervg /dev/sda1
6 lvcreate -L +2G -n demo clustervg
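7 mkfs.gfs2 -j 3 -p lock_dlm -t haha:mygfs2 /dev/clustervg/demo ###-j: number of journals (each node that mounts the file system needs its own; the default journal size is 128M); -p: locking protocol; -t: lock table name in the form clustername:fsname###
8 mount /dev/clustervg/demo /var/www/html ###mount on both nodes###
Test: write a file on one node; the other node can see it as well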
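The -j value in step 7 trades disk space for the number of nodes that can mount the file system at once. The following is a minimal sketch of that arithmetic; `gfs2_journal_check` is a hypothetical helper, and 128 MiB is the default journal size mentioned above.

```shell
# gfs2_journal_check NODES JOURNALS [JOURNAL_MB]
# Print the space the journals occupy; fail if there are fewer journals than
# nodes that need to mount the file system. (Hypothetical helper.)
gfs2_journal_check() {
    local nodes="$1" journals="$2" journal_mb="${3:-128}"
    if [ "$journals" -lt "$nodes" ]; then
        echo "need at least $nodes journals, have $journals" >&2
        return 1
    fi
    echo "$(( journals * journal_mb )) MiB used by $journals journals"
}

gfs2_journal_check 2 3    # prints: 384 MiB used by 3 journals
```

For the two-node cluster here, three journals leave one spare for a future node at the cost of 384 MiB of the 2G logical volume.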
####Mount automatically at boot####
1 vim /etc/fstab
Contents:
UUID="6f35be88-3402-45a8-8cd1-2943906941f2" /var/www/html gfs2 _netdev,defaults 0 0
2 mount -a
###Extend the LVM volume###
1 lvextend -L +4G /dev/clustervg/demo
2 gfs2_grow /dev/clustervg/demo
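The order matters: the logical volume has to grow first, and gfs2_grow then expands the mounted file system into the new space. The following is a dry-run sketch that checks the free space and prints the two commands in order without running them. `plan_grow` is a hypothetical helper, and the 6144 MiB free figure is an assumed value that would be read from vgs in practice.

```shell
# plan_grow VFREE_MB ADD_MB LV_PATH
# Dry run: verify that the volume group has room, then print the commands.
# (Hypothetical helper; nothing is executed against LVM or GFS2.)
plan_grow() {
    local vfree_mb="$1" add_mb="$2" lv="$3"
    if [ "$add_mb" -gt "$vfree_mb" ]; then
        echo "volume group has only ${vfree_mb}M free" >&2
        return 1
    fi
    echo "lvextend -L +${add_mb}M $lv"
    echo "gfs2_grow $lv"
}

plan_grow 6144 4096 /dev/clustervg/demo
```

If the requested size exceeds the free space, the function prints a message on stderr and returns non-zero instead of emitting the commands.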