When a cluster approaches its capacity or compute limits, it needs to be expanded. There are two main ways to expand a cluster:
1. Scale up (vertical): add disks to existing nodes; capacity increases, but the cluster's compute performance stays the same.
2. Scale out (horizontal): add new nodes, bringing additional disks, memory, and CPU; this increases both capacity and performance.
1. Set flags in production so the new node does not impact performance
In production, you generally do not want data backfill to start the moment a new node joins the Ceph cluster, because it degrades cluster performance. Setting a few flags prevents this.
[[email protected] ~]# ceph osd set noin
[[email protected] ~]# ceph osd set nobackfill
During off-peak hours, unset these flags and the cluster will start rebalancing:
[[email protected] ~]# ceph osd unset noin
[[email protected] ~]# ceph osd unset nobackfill
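To confirm which flags are currently set (or that they have been cleared), a quick check that is not part of the original walkthrough is to look at the OSD map; the flags also appear in the osd line of ceph -s while they are active:
# ceph osd dump | grep flags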
2. Install Ceph on the new node
(1) Install the Ceph packages manually with yum
[[email protected] ~]# yum -y install ceph ceph-radosgw
(2) Check the installed packages
[[email protected] ~]# rpm -qa | egrep -i "ceph|rados|rbd"
(3) Check the installed Ceph version; it must match the version used by the rest of the cluster
[[email protected] ~]# ceph -v        # every node runs the Nautilus release
ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)
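Besides checking the local package, it can be worth confirming that every running daemon reports the same release cluster-wide; this supplementary check is not in the original post:
# ceph versions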
3. Add the new node to the Ceph cluster
Ceph scales out seamlessly; OSD and monitor nodes can be added online.
(1) Start from a healthy cluster
[[email protected] ~]# ceph -s
cluster:
id: 58a12719-a5ed-4f95-b312-6efd6e34e558
health: HEALTH_OK
services:
mon: 2 daemons, quorum node140,node142 (age 8d)
mgr: admin(active, since 8d), standbys: node140
mds: cephfs:1 {0=node140=up:active} 1 up:standby
osd: 16 osds: 16 up (since 5m), 16 in (since 2w)
data:
pools: 5 pools, 768 pgs
objects: 2.65k objects, 9.9 GiB
usage: 47 GiB used, 8.7 TiB / 8.7 TiB avail
pgs: 768 active+clean
(2) The cluster currently has three nodes
[[email protected] ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 8.71826 root default
-2 3.26935 host node140
0 hdd 0.54489 osd.0 up 1.00000 1.00000
1 hdd 0.54489 osd.1 up 1.00000 1.00000
2 hdd 0.54489 osd.2 up 1.00000 1.00000
3 hdd 0.54489 osd.3 up 1.00000 1.00000
4 hdd 0.54489 osd.4 up 1.00000 1.00000
5 hdd 0.54489 osd.5 up 1.00000 1.00000
-3 3.26935 host node141
12 hdd 0.54489 osd.12 up 1.00000 1.00000
13 hdd 0.54489 osd.13 up 1.00000 1.00000
14 hdd 0.54489 osd.14 up 1.00000 1.00000
15 hdd 0.54489 osd.15 up 1.00000 1.00000
16 hdd 0.54489 osd.16 up 1.00000 1.00000
17 hdd 0.54489 osd.17 up 1.00000 1.00000
-4 2.17957 host node142
6 hdd 0.54489 osd.6 up 1.00000 1.00000
9 hdd 0.54489 osd.9 up 1.00000 1.00000
10 hdd 0.54489 osd.10 up 1.00000 1.00000
11 hdd 0.54489 osd.11 up 1.00000 1.00000
(3) Copy the configuration file and admin keyring from an existing cluster node to the new node node143
[[email protected] ceph]# ls
ceph.client.admin.keyring ceph.conf
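The listing above shows /etc/ceph on an existing cluster node. The original does not show the copy command itself; a minimal sketch, assuming password-less root SSH to node143, would be:
# scp /etc/ceph/ceph.conf /etc/ceph/ceph.client.admin.keyring node143:/etc/ceph/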
(4) The new node now has access to the cluster
[[email protected] ceph]# ceph -s
cluster:
id: 58a12719-a5ed-4f95-b312-6efd6e34e558
health: HEALTH_OK
services:
mon: 2 daemons, quorum node140,node142 (age 8d)
mgr: admin(active, since 8d), standbys: node140
mds: cephfs:1 {0=node140=up:active} 1 up:standby
osd: 16 osds: 16 up (since 25m), 16 in (since 2w)
data:
pools: 5 pools, 768 pgs
objects: 2.65k objects, 9.9 GiB
usage: 47 GiB used, 8.7 TiB / 8.7 TiB avail
pgs: 768 active+clean
(5) Prepare the disks
[[email protected] ceph]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 557.9G 0 disk
├─sda1 8:1 0 200M 0 part /boot
└─sda2 8:2 0 519.4G 0 part
└─centos-root 253:0 0 519.4G 0 lvm /
sdb 8:16 0 558.9G 0 disk
sdc 8:32 0 558.9G 0 disk
sdd 8:48 0 558.9G 0 disk
sde 8:64 0 558.9G 0 disk
sdf 8:80 0 558.9G 0 disk
sdg 8:96 0 558.9G 0 disk
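If any of these disks were previously used (leftover partitions, LVM metadata, or an old OSD), it is safer to wipe them first. A hedged example using ceph-volume's zap subcommand, shown here for /dev/sdb only:
# ceph-volume lvm zap /dev/sdb --destroy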
(6) Label the disks that will become OSDs with a GPT partition table
[[email protected] ]# parted /dev/sdc mklabel GPT
[[email protected] ]# parted /dev/sdd mklabel GPT
[[email protected] ]# parted /dev/sdf mklabel GPT
[[email protected] ]# parted /dev/sdg mklabel GPT
[[email protected] ]# parted /dev/sdb mklabel GPT
[[email protected] ]# parted /dev/sde mklabel GPT
(7) Format the disks as XFS
[[email protected] ]# mkfs.xfs -f /dev/sdc
[[email protected] ]# mkfs.xfs -f /dev/sdd
[[email protected] ]# mkfs.xfs -f /dev/sdb
[[email protected] ]# mkfs.xfs -f /dev/sdf
[[email protected] ]# mkfs.xfs -f /dev/sdg
[[email protected] ]# mkfs.xfs -f /dev/sde
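The six parted/mkfs invocations above can also be written as a short loop; this is just a sketch over the same six device names, and ceph-volume will lay its own LVM/BlueStore structures over each disk in the next step anyway:
for dev in sdb sdc sdd sde sdf sdg; do
    parted -s /dev/$dev mklabel gpt     # -s runs parted non-interactively
    mkfs.xfs -f /dev/$dev
done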
(8) Create the OSDs
[[email protected] ~]# ceph-volume lvm create --data /dev/sdb
--> ceph-volume lvm activate successful for osd ID: 0
--> ceph-volume lvm create successful for: /dev/sdb
[[email protected] ~]# ceph-volume lvm create --data /dev/sdc
[[email protected] ~]# ceph-volume lvm create --data /dev/sdd
[[email protected] ~]# ceph-volume lvm create --data /dev/sdf
[[email protected] ~]# ceph-volume lvm create --data /dev/sdg
[[email protected] ~]# ceph-volume lvm create --data /dev/sde
[[email protected] ~]# blkid
/dev/mapper/centos-root: UUID="7616a088-d812-456b-8ae8-38d600eb9f8b" TYPE="xfs"
/dev/sda2: UUID="6V8bFT-ylA6-bifK-gmob-ah3I-zZ4G-N7EYwD" TYPE="LVM2_member"
/dev/sda1: UUID="eee4c9af-9f12-44d9-a386-535bde734678" TYPE="xfs"
/dev/sdb: UUID="TcjeCg-YsBQ-RHbm-UNYT-UoQv-iLFs-f1st2X" TYPE="LVM2_member"
/dev/sdd: UUID="aSLPmt-ohdJ-kG7W-JOB1-dzOD-D0zp-krWW5m" TYPE="LVM2_member"
/dev/sdc: UUID="7ARhbT-S9sC-OdZw-kUCq-yp97-gSpY-hfoPFa" TYPE="LVM2_member"
/dev/sdg: UUID="9MDhh1-bXIX-DwVf-RkIt-IUVm-fPEH-KSbsDd" TYPE="LVM2_member"
/dev/sde: UUID="oc2gSZ-j3WO-pOUs-qJk6-ZZS0-R8V7-1vYaZv" TYPE="LVM2_member"
/dev/sdf: UUID="jxQjNS-8xpV-Hc4p-d2Vd-1Q8O-U5Yp-j1Dn22" TYPE="LVM2_member"
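Equivalently, the six ceph-volume lvm create calls above can be scripted; a sketch over the same device list, where each iteration creates and activates one new OSD:
for dev in sdb sdc sdd sde sdf sdg; do
    ceph-volume lvm create --data /dev/$dev
done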
(9) Inspect the newly created OSDs
[[email protected] ~]# ceph-volume lvm list
[[email protected] ~]# lsblk
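ceph-volume lvm list can also be pointed at a single device when only one OSD is of interest, for example:
# ceph-volume lvm list /dev/sdb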
(10) The OSDs are started automatically
[[email protected] ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 11.98761 root default
-2 3.26935 host node140
0 hdd 0.54489 osd.0 up 1.00000 1.00000
1 hdd 0.54489 osd.1 up 1.00000 1.00000
2 hdd 0.54489 osd.2 up 1.00000 1.00000
3 hdd 0.54489 osd.3 up 1.00000 1.00000
4 hdd 0.54489 osd.4 up 1.00000 1.00000
5 hdd 0.54489 osd.5 up 1.00000 1.00000
-3 3.26935 host node141
12 hdd 0.54489 osd.12 up 1.00000 1.00000
13 hdd 0.54489 osd.13 up 1.00000 1.00000
14 hdd 0.54489 osd.14 up 1.00000 1.00000
15 hdd 0.54489 osd.15 up 1.00000 1.00000
16 hdd 0.54489 osd.16 up 1.00000 1.00000
17 hdd 0.54489 osd.17 up 1.00000 1.00000
-4 2.17957 host node142
6 hdd 0.54489 osd.6 up 1.00000 1.00000
9 hdd 0.54489 osd.9 up 1.00000 1.00000
10 hdd 0.54489 osd.10 up 1.00000 1.00000
11 hdd 0.54489 osd.11 up 1.00000 1.00000
-9 3.26935 host node143
7 hdd 0.54489 osd.7 up 1.00000 1.00000
8 hdd 0.54489 osd.8 up 1.00000 1.00000
18 hdd 0.54489 osd.18 up 0 1.00000
19 hdd 0.54489 osd.19 up 0 1.00000
20 hdd 0.54489 osd.20 up 0 1.00000
21 hdd 0.54489 osd.21 up 0 1.00000
The ceph-volume lvm list output from the previous step prints a header such as ====== osd.0 ======= for each local OSD; note the number after osd., as it is used in the systemctl unit names below.
[[email protected] ~]# systemctl enable ceph-osd@7
[[email protected] ~]# systemctl enable ceph-osd@8
[[email protected] ~]# systemctl enable ceph-osd@18
[[email protected] ~]# systemctl enable ceph-osd@19
[[email protected] ~]# systemctl enable ceph-osd@20
[[email protected] ~]# systemctl enable ceph-osd@21
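To double-check that the units are enabled and running, a small loop over the OSD IDs that node143 received in the tree above (7, 8, 18-21) can be used:
for id in 7 8 18 19 20 21; do
    systemctl is-enabled ceph-osd@$id
    systemctl is-active ceph-osd@$id
done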
(11) Check the cluster; the expansion has succeeded
[[email protected] ~]# ceph -s
cluster:
id: 58a12719-a5ed-4f95-b312-6efd6e34e558
health: HEALTH_WARN
noin,nobackfill flag(s) set
services:
mon: 2 daemons, quorum node140,node142 (age 8d)
mgr: admin(active, since 8d), standbys: node140
mds: cephfs:1 {0=node140=up:active} 1 up:standby
osd: 22 osds: 22 up (since 4m), 18 in (since 9m); 2 remapped pgs
flags noin,nobackfill
data:
pools: 5 pools, 768 pgs
objects: 2.65k objects, 9.9 GiB
usage: 54 GiB used, 12 TiB / 12 TiB avail
pgs: 766 active+clean
1 active+remapped+backfilling
1 active+remapped+backfill_wait
(12) Remember to unset the flags during off-peak hours
When user traffic is low, unset the flags and the cluster will start rebalancing.
[[email protected] ~]# ceph osd unset noin
[[email protected] ~]# ceph osd unset nobackfill
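Once the flags are cleared, backfill starts; progress can be followed with ceph -s, or streamed continuously with the cluster watch mode (Ctrl-C to stop):
# ceph -w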
Original article: https://blog.51cto.com/7603402/2439762