Basic Deployment of a High-Availability Cluster

Deploying the high-availability cluster

Preparing the lab environment:

Prepare three RHEL 6.5 virtual machines, use the physical host (real machine) for testing, and set up host-name resolution on every machine.

Name resolution

[root@server1 ~]# cat /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

172.25.50.10 server1.example.com

172.25.50.20 server2.example.com

172.25.50.30 server3.example.com

172.25.50.250 real50.example.com
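A quick sanity check that resolution works on each machine (a simple sketch using standard tools):

getent hosts server1.example.com server2.example.com server3.example.com
ping -c 1 server2.example.com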

Configuring the yum repositories

[root@server1 ~]# cat /etc/yum.repos.d/redhat6.repo

[Server]

name=rhel6.5 Server

baseurl=http://172.25.50.250/rhel6.5

gpgcheck=0

[HighAvailability]

name=rhel6.5 HighAvailability

baseurl=http://172.25.50.250/rhel6.5/HighAvailability

gpgcheck=0

[LoadBalancer]

name=rhel6.5 LoadBalancer

baseurl=http://172.25.50.250/rhel6.5/LoadBalancer

gpgcheck=0

[ResilientStorage]

name=rhel6.5 ResilientStorage

baseurl=http://172.25.50.250/rhel6.5/ResilientStorage

gpgcheck=0

[ScalableFileSystem]

name=rhel6.5 ScalableFileSystem

baseurl=http://172.25.50.250/rhel6.5/ScalableFileSystem

gpgcheck=0

[root@server1 ~]# yum repolist

Loaded plugins: product-id, subscription-manager

This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.

HighAvailability                                         | 3.9 kB     00:00

HighAvailability/primary_db                              |  43 kB     00:00

LoadBalancer                                             | 3.9 kB     00:00

LoadBalancer/primary_db                                  | 7.0 kB     00:00

ResilientStorage                                         | 3.9 kB     00:00

ResilientStorage/primary_db                              |  47 kB     00:00

ScalableFileSystem                                       | 3.9 kB     00:00

ScalableFileSystem/primary_db                            | 6.8 kB     00:00

Server                                                   | 3.9 kB     00:00

Server/primary_db                                        | 3.1 MB     00:00

repo id                          repo name                                status

HighAvailability                 rhel6.5 HighAvailability                    56

LoadBalancer                     rhel6.5 LoadBalancer                         4

ResilientStorage                 rhel6.5 ResilientStorage                    62

ScalableFileSystem               rhel6.5 ScalableFileSystem                   7

Server                           rhel6.5 Server                           3,690

repolist: 3,819

All three RHEL 6.5 virtual machines get the same /etc/hosts and yum configuration; one way to distribute the files is shown below.
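A minimal sketch, assuming root SSH access from server1 to the other two nodes (adjust the IPs and file names to your own layout):

scp /etc/hosts root@172.25.50.20:/etc/hosts
scp /etc/yum.repos.d/redhat6.repo root@172.25.50.20:/etc/yum.repos.d/
scp /etc/hosts root@172.25.50.30:/etc/hosts
scp /etc/yum.repos.d/redhat6.repo root@172.25.50.30:/etc/yum.repos.d/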

Installing the software

On server1

[root@server1 yum.repos.d]# yum install ricci -y

[root@server1 yum.repos.d]# passwd ricci

Changing password for user ricci.

New password: #westos

BAD PASSWORD: it is based on a dictionary word

BAD PASSWORD: is too simple

Retype new password: #westos

passwd: all authentication tokens updated successfully.

[root@server1 yum.repos.d]# /etc/init.d/ricci start    # start the service

Starting system message bus:                               [  OK  ]

Starting oddjobd:                                          [  OK  ]

generating SSL certificates...  done

Generating NSS database...  done

Starting ricci:                                            [  OK  ]

[root@server1 yum.repos.d]#

Broadcast message from root@server1.example.com

(unknown) at 14:48 ...

The system is going down for reboot NOW!

Connection to 172.25.50.10 closed by remote host.

Connection to 172.25.50.10 closed.
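It is also worth making ricci start at boot on each node (standard RHEL 6 chkconfig usage):

chkconfig ricci on    # start ricci automatically at boot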

 

On server2

[root@server2 yum.repos.d]# yum install ricci -y

[root@server2 yum.repos.d]# passwd ricci

Changing password for user ricci.

New password: #westos

BAD PASSWORD: it is based on a dictionary word

BAD PASSWORD: is too simple

Retype new password: #westos

passwd: all authentication tokens updated successfully.

[root@server2 yum.repos.d]# /etc/init.d/ricci start    # start the service

Starting system message bus:                               [  OK  ]

Starting oddjobd:                                          [  OK  ]

generating SSL certificates...  done

Generating NSS database...  done

Starting ricci:                                            [  OK  ]

[root@server2 yum.repos.d]#

Broadcast message from root@server2.example.com

(unknown) at 14:48 ...

The system is going down for reboot NOW!

Connection to 172.25.50.20 closed by remote host.

Connection to 172.25.50.20 closed.

 

 

On server3

Install the luci package

[root@server3 ~]# yum install luci -y

 

[root@server3 ~]# /etc/init.d/luci start

Adding following auto-detected host IDs (IP addresses/domain names), corresponding to `server3.example.com' address, to the configuration of self-managed certificate `/var/lib/luci/etc/cacert.config' (you can change them by editing `/var/lib/luci/etc/cacert.config', removing the generated certificate `/var/lib/luci/certs/host.pem' and restarting luci):

(none suitable found, you can still do it manually as mentioned above)

Generating a 2048 bit RSA private key

writing new private key to '/var/lib/luci/certs/host.pem'

Starting saslauthd:                                        [  OK  ]

Start luci...                                              [  OK  ]

Point your web browser to https://server3.example.com:8084 (or equivalent) to access luci

In a browser on the physical host:

https://server3.example.com:8084

Log in with the root account and password of the luci host --> Create --> enter the information for the cluster nodes.

The password asked for here is the ricci user's password (westos).

The options are selected as in the original screenshot (not reproduced here).

Terminology: cman -- cluster manager

rgmanager -- resource group manager

fence -- power fencing device

corosync -- cluster membership and messaging layer
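After Create is submitted, luci uses ricci to install cman, rgmanager and the other cluster packages on the nodes (rebooting them if that option was ticked). Once the nodes have rejoined, the cluster can be checked from either node, for example:

[root@server1 ~]# clustat              # both nodes should show Online
[root@server1 ~]# cman_tool status     # cluster name, quorum and vote information
[root@server1 ~]# cman_tool nodes      # membership as seen by cman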

Click Fence Devices

Then Fence virt (Multicast Mode) --> Name: vmfence --> Submit

In the luci web interface choose Nodes --> select server1.example.com --> Add Fence Method --> Method Name: fence-1 --> look up the corresponding UUID and enter it in the first field     ## the UUID is used here because the physical host cannot resolve the guests' host names

In the luci web interface choose Nodes --> select server2.example.com --> Add Fence Method --> Method Name: fence-2 --> look up the corresponding UUID and enter it in the first field     ## the UUID is used here because the physical host cannot resolve the guests' host names

The UUIDs to fill in can be listed with: virsh list --uuid (they appear in the same order as the virtual machines in virt-manager).
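On the physical host the name-to-UUID mapping can also be read out directly (guest1/guest2 below are placeholders for whatever the domains are actually called):

virsh list --name        # running domain names
virsh list --uuid        # UUIDs, listed in the same order
virsh domuuid guest1     # UUID of one specific domain, e.g. the VM behind server1
virsh domuuid guest2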

Install the following packages on the physical host:

fence-virtd-multicast-0.3.2-1.el7.x86_64

fence-virtd-0.3.2-2.el7.x86_64

fence-virtd-libvirt-0.3.2-2.el7.x86_64
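For reference, they can be pulled in with yum (assuming the host has access to repositories that provide them):

yum install -y fence-virtd fence-virtd-multicast fence-virtd-libvirt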

On the physical host

# mkdir /etc/cluster

[root@real50 cluster]# fence_virtd -c

Module search path [/usr/lib64/fence-virt]:

Available backends:

libvirt 0.1

Available listeners:

multicast 1.2

serial 0.4

Listener modules are responsible for accepting requests

from fencing clients.

Listener module [multicast]:

The multicast listener module is designed for use environments

where the guests and hosts may communicate over a network using

multicast.

The multicast address is the address that a client will use to

send fencing requests to fence_virtd.

Multicast IP Address [225.0.0.12]:

Using ipv4 as family.

Multicast IP Port [1229]:

Setting a preferred interface causes fence_virtd to listen only

on that interface.  Normally, it listens on all interfaces.

In environments where the virtual machines are using the host

machine as a gateway, this *must* be set (typically to virbr0).

Set to 'none' for no interface.

Interface [br0]:    ## if the default offered here is not br0, enter br0 (the bridge the virtual machines use)

The key file is the shared key information which is used to

authenticate fencing requests.  The contents of this file must

be distributed to each physical host and virtual machine within

a cluster.

Key File [/etc/cluster/fence_xvm.key]:

Backend modules are responsible for routing requests to

the appropriate hypervisor or management layer.

Backend module [libvirt]:

Configuration complete.

=== Begin Configuration ===

fence_virtd {
    listener = "multicast";
    backend = "libvirt";
    module_path = "/usr/lib64/fence-virt";
}

listeners {
    multicast {
        key_file = "/etc/cluster/fence_xvm.key";
        address = "225.0.0.12";
        interface = "br0";
        family = "ipv4";
        port = "1229";
    }
}

backends {
    libvirt {
        uri = "qemu:///system";
    }
}

=== End Configuration ===

Replace /etc/fence_virt.conf with the above [y/N]? y

[root@real50 etc]# dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=128 count=1

1+0 records in

1+0 records out

128 bytes (128 B) copied, 0.000185659 s, 689 kB/s

# scp fence_xvm.key root@172.25.50.10:/etc/cluster/

# scp fence_xvm.key root@172.25.50.20:/etc/cluster/

If the /etc/cluster directory does not exist on server1 or server2, create it first, for example as shown below.
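A minimal way to do that from the physical host (assuming root SSH access to the nodes):

# ssh root@172.25.50.10 'mkdir -p /etc/cluster'
# ssh root@172.25.50.20 'mkdir -p /etc/cluster'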

Finally, restart the fence_virtd service on the physical host: systemctl restart fence_virtd
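Optionally, enable the daemon so it survives host reboots, and verify from a cluster node that it answers (the fence_xvm client needs the fence-virt package and the shared key in /etc/cluster/fence_xvm.key on that node):

systemctl enable fence_virtd        # on the physical host
fence_xvm -o list                   # on server1 or server2: should list the VMs and their UUIDs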

Testing:

On server1 run: fence_node server2.example.com    ## the full domain name must be used here

or

On server2 run: fence_node server1.example.com    ## the full domain name must be used here

If fencing is working, the fenced node is immediately power-cycled by fence_virtd and reboots.

 

 

Configuring the high-availability cluster

# After the physical host reboots, the fence_virtd service on it must be started first:

systemctl start fence_virtd

In the web interface select the Failover Domains tab.

Click Add --> enter a name (webfile), tick all of the options below it --> Create.

Set the priorities: the smaller the number, the higher the priority. A rough idea of the resulting cluster.conf fragment is shown below.
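What luci writes into /etc/cluster/cluster.conf on the nodes looks roughly like this (a hand-written sketch, not the exact generated file; the attribute set depends on which boxes were ticked):

<rm>
    <failoverdomains>
        <failoverdomain name="webfile" ordered="1" restricted="1" nofailback="1">
            <failoverdomainnode name="server1.example.com" priority="1"/>
            <failoverdomainnode name="server2.example.com" priority="2"/>
        </failoverdomain>
    </failoverdomains>
</rm>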

 

Next select the Resources tab --> Add --> IP Address --> virtual IP: 172.25.50.100 --> netmask bits: 24 --> tick Monitor Link --> the last field (the delay/sleep time): 5 --> Submit

 

Select Resources --> Add --> Script --> Name: httpd --> Full Path to Script File: /etc/init.d/httpd --> Submit

 

Select Service Groups --> Add --> Service Name: apache --> tick all the options --> Failover Domain: webfile --> Recovery Policy: Relocate --> Add Resource --> add the IP Address resource first, then the Script resource --> Submit    ## Run Exclusive means the service runs exclusively (no other service is allowed on the same node); the corresponding cluster.conf fragment is sketched below
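In cluster.conf terms the resources and the service group land in the same <rm> section sketched earlier, roughly as follows. This is again an approximation, and it assumes httpd is already installed on server1 and server2 (e.g. yum install -y httpd), since the script resource only starts and stops it:

<rm>
    ...
    <resources>
        <ip address="172.25.50.100/24" monitor_link="on" sleeptime="5"/>
        <script file="/etc/init.d/httpd" name="httpd"/>
    </resources>
    <service domain="webfile" exclusive="1" name="apache" recovery="relocate">
        <ip ref="172.25.50.100/24"/>
        <script ref="httpd"/>
    </service>
</rm>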

Testing:

 

[root@server1 cluster]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:08:17 2017

Member Status: Quorate

Member Name                            ID   Status

------ ----                            ---- ------

server1.example.com                        1 Online, Local, rgmanager

server2.example.com                        2 Online, rgmanager

Service Name                  Owner (Last)                  State

------- ----                  ----- ------                  -----

service:apache                server1.example.com           started
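At this point the virtual IP should answer from whichever node currently owns service:apache; a quick check from the physical host (assuming a test page exists in /var/www/html on both nodes):

# curl http://172.25.50.100
# on the owning node the VIP can be seen with: ip addr show eth0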

Test 1: stop the httpd service on server1; rgmanager should detect the failed script resource and relocate the apache service.

[root@server1 cluster]# /etc/init.d/httpd stop

Stopping httpd:                                            [  OK  ]

 

Then check on server2 with the clustat command:

 

[root@server2 ~]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:09:04 2017

Member Status: Quorate

Member Name                            ID   Status

------ ----                            ---- ------

server1.example.com                        1 Online, rgmanager

server2.example.com                        2 Online, Local, rgmanager

Service Name                  Owner (Last)                  State

------- ----                  ----- ------                  -----

service:apache                server1.example.com           started    

   

[root@server2 ~]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:09:10 2017

Member Status: Quorate

Member Name                            ID   Status

------ ----                            ---- ------

server1.example.com                        1 Online, rgmanager

server2.example.com                        2 Online, Local, rgmanager

Service Name                  Owner (Last)                  State

------- ----                  ----- ------                  -----

service:apache                none                          recovering

[root@server2 ~]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:09:11 2017

Member Status: Quorate

Member Name                            ID   Status

------ ----                            ---- ------

server1.example.com                        1 Online, rgmanager

server2.example.com                        2 Online, Local, rgmanager

Service Name                  Owner (Last)                  State

------- ----                  ----- ------                  -----

service:apache                server2.example.com           starting

[root@server2 ~]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:09:12 2017

Member Status: Quorate

Member Name                            ID   Status

------ ----                            ---- ------

server1.example.com                        1 Online, rgmanager

server2.example.com                        2 Online, Local, rgmanager

Service Name                  Owner (Last)                  State

------- ----                  ----- ------                  -----

service:apache                server2.example.com           starting

As the successive clustat outputs show, the service switched from server1 over to server2.

Test 2:

On server2: ip link set eth0 down    # shut down the NIC; server2 loses cluster communication, is fenced and reboots, which is why it first shows Offline and then Online again below

Meanwhile, on server1:

[root@server1 cluster]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:21:05 2017

Member Status: Quorate

Member Name                            ID   Status

------ ----                            ---- ------

server1.example.com                        1 Online, Local, rgmanager

server2.example.com                        2 Offline

Service Name                  Owner (Last)                  State

------- ----                  ----- ------                  -----

service:apache                server1.example.com           started

[root@server1 cluster]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:21:11 2017

Member Status: Quorate

Member Name                            ID   Status

------ ----                            ---- ------

server1.example.com                        1 Online, Local, rgmanager

server2.example.com                        2 Online

Service Name                  Owner (Last)                  State

------- ----                  ----- ------                  -----

service:apache                server1.example.com           started

Test 3:

[root@server1 cluster]# echo c > /proc/sysrq-trigger    # crash the kernel on server1

Check on server2:

[root@server2 ~]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:30:31 2017

Member Status: Quorate

Member Name                              ID   Status

------ ----                              ---- ------

server1.example.com                          1 Offline

server2.example.com                          2 Online, Local, rgmanager

Service Name                    Owner (Last)                    State

------- ----                    ----- ------                    -----

service:apache                  server1.example.com             started

[root@server2 ~]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:30:34 2017

Member Status: Quorate

Member Name                              ID   Status

------ ----                              ---- ------

server1.example.com                          1 Offline

server2.example.com                          2 Online, Local, rgmanager

Service Name                    Owner (Last)                    State

------- ----                    ----- ------                    -----

service:apache                  server2.example.com             starting
