I. Layout
There are three hosts, node1, node2, and node3, each running three OSDs (nine in total, laid out in the table below); osd.1, 3, 5, 6, 7, and 8 are SSDs, while osd.0, 2, and 4 are SATA disks.
Each of the three hosts also runs one Monitor and one MDS.
We build a pool named ssd from osd.1, 3, and 5, using three-way replication; a pool named sata from osd.0, 2, and 4, using erasure coding with k=2, m=1, i.e. two OSDs hold data chunks and one OSD holds the coding chunk; and a pool named metadata from osd.6, 7, and 8, which stores the CephFS metadata.
The ssd and sata pools form a writeback-mode cache tier: ssd is the hot storage (the cache) and sata is the cold storage (the backing store). The sata and metadata pools together back a CephFS mounted at /mnt/cephfs.
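The OSD placement, reconstructed from the ceph.conf and CRUSH map later in this post:
host    sata pool (SATA)    ssd pool (SSD)    metadata pool (SSD)
node1   osd.0               osd.1             osd.8
node2   osd.2               osd.3             osd.7
node3   osd.4               osd.5             osd.6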
II. Procedure
1. Install the software
(1) Install dependencies
apt-get install autoconf automake autotools-dev libbz2-dev debhelper default-jdk git javahelper junit4 \
    libaio-dev libatomic-ops-dev libbabeltrace-ctf-dev libbabeltrace-dev libblkid-dev libboost-dev \
    libboost-program-options-dev libboost-system-dev libboost-thread-dev libcurl4-gnutls-dev libedit-dev \
    libexpat1-dev libfcgi-dev libfuse-dev libgoogle-perftools-dev libkeyutils-dev libleveldb-dev libnss3-dev \
    libsnappy-dev liblttng-ust-dev libtool libudev-dev libxml2-dev pkg-config python python-argparse \
    python-nose uuid-dev uuid-runtime xfslibs-dev yasm
apt-get install libgdata-common libgdata13
(2) Build and install from source
wget http://ceph.com/download/ceph-0.89.tar.gz
tar xzf ceph-0.89.tar.gz && cd ceph-0.89
./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
make -j4
make install
Reference: http://docs.ceph.com/docs/master/install/manual-deployment/
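A quick sanity check that the build installed; this only prints the version string, so no cluster is needed:
ceph --version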
2. Set up the first Monitor
(1) cp src/init-ceph /etc/init.d/ceph
(2) Generate an fsid and set it in /etc/ceph/ceph.conf:
uuidgen
2fc115bf-b7bf-439a-9c23-8f39f025a9da
vim /etc/ceph/ceph.conf   # set fsid = 2fc115bf-b7bf-439a-9c23-8f39f025a9da
(3) Create a Monitor keyring at /tmp/ceph.mon.keyring:
ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
(4) Create the admin keyring at /etc/ceph/ceph.client.admin.keyring:
ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --set-uid=0 --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'
(5) Import ceph.client.admin.keyring into ceph.mon.keyring:
ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
(6) Create the initial monmap with a single Monitor named node1, and store it at /tmp/monmap:
monmaptool --create --add node1 172.10.2.171 --fsid 2fc115bf-b7bf-439a-9c23-8f39f025a9da /tmp/monmap
(7) mkdir -p /var/lib/ceph/mon/ceph-node1
(8) ceph-mon --mkfs -i node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
(9) touch /var/lib/ceph/mon/ceph-node1/done
(10) /etc/init.d/ceph start mon.node1
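The first Monitor should now be running. Two minimal checks, assuming the admin keyring from step (4) is in /etc/ceph:
ceph-authtool -l /tmp/ceph.mon.keyring   # lists the mon. and client.admin keys created above
ceph -s                                  # health warnings are expected, since no OSDs exist yet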
3. Add OSDs
(1) Prepare (partition and format) the disk:
ceph-disk prepare --cluster ceph --cluster-uuid 2fc115bf-b7bf-439a-9c23-8f39f025a9da --fs-type xfs /dev/sdb
mkdir -p /var/lib/ceph/bootstrap-osd/
mkdir -p /var/lib/ceph/osd/ceph-0
(2) Activate (mount) the OSD:
ceph-disk activate /dev/sdb1 --activate-key /var/lib/ceph/bootstrap-osd/ceph.keyring
(3) After adding the [osd] sections to /etc/ceph/ceph.conf (shown in full below), /etc/init.d/ceph start will start all the OSDs.
If ceph osd stat still shows the OSD is not up after starting,
remove the upstart marker file, e.g. rm -rf /var/lib/ceph/osd/ceph-2/upstart
and start again with /etc/init.d/ceph start
(4) Set up passwordless SSH to node2:
ssh-keygen
ssh-copy-id node2
(5) Adding OSDs on the second node requires copying some configuration first:
scp /etc/ceph/ceph.conf node2:/etc/ceph/
scp /etc/ceph/ceph.client.admin.keyring node2:/etc/ceph/
scp /var/lib/ceph/bootstrap-osd/ceph.keyring node2:/var/lib/ceph/bootstrap-osd/
Then repeat steps (1)-(3) above.
Proceed likewise until node1, node2, and node3 each have 3 OSDs.
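With all nine OSDs created, the cluster view can be checked:
ceph osd tree   # all nine OSDs should be listed as up
ceph osd stat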
4. Create the MDSes (the filesystem itself is created in step 9)
(1) mkdir -p /var/lib/ceph/mds/ceph-node1/
(2) ceph auth get-or-create mds.node1 mon 'allow rwx' osd 'allow *' mds 'allow *' -o /var/lib/ceph/mds/ceph-node1/keyring
(3) /etc/init.d/ceph start mds.node1
Proceed likewise to create an MDS on each of node1, node2, and node3.
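A quick check that the MDS daemons have registered; they remain standby until the filesystem is created in step 9:
ceph mds stat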
5. Add a Monitor on the second node
(1) ssh node2
(2) mkdir -p /var/lib/ceph/mon/ceph-node2
(3) ceph auth get mon. -o /tmp/ceph.mon.keyring
(4) ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
(5) ceph mon getmap -o /tmp/monmap
(6) ceph-mon --mkfs -i node2 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
(7) touch /var/lib/ceph/mon/ceph-node2/done
(8) rm -f /var/lib/ceph/mon/ceph-node2/upstart
(9) /etc/init.d/ceph start mon.node2
Proceed likewise until node1, node2, and node3 each run a Monitor.
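With all three Monitors running, quorum can be verified:
ceph quorum_status --format json-pretty   # node1, node2, and node3 should all appear in the quorum list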
At this point, ps -ef | grep ceph should show one mon process, one mds process, and three osd processes on every node; ceph -s reports the same. The configuration file is as follows:
[global]
fsid = 2fc115bf-b7bf-439a-9c23-8f39f025a9da
mon initial members = node1,node2,node3
mon host = 172.10.2.171,172.10.2.172,172.10.2.173
public network = 172.10.2.0/24
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd journal size = 1024
filestore xattr use omap = true
osd pool default size = 3
osd pool default min size = 1
osd pool default pg num = 333
osd pool default pgp num = 333
osd crush chooseleaf type = 1
[mon.node1]
host = node1
mon addr = 172.10.2.171:6789
[mon.node2]
host = node2
mon addr = 172.10.2.172:6789
[mon.node3]
host = node3
mon addr = 172.10.2.173:6789
[osd]
osd crush update on start = false
[osd.0]
host = node1
addr = 172.10.2.171:6789
[osd.1]
host = node1
addr = 172.10.2.171:6789
[osd.2]
host = node2
addr = 172.10.2.172:6789
[osd.3]
host = node2
addr = 172.10.2.172:6789
[osd.4]
host = node3
addr = 172.10.2.173:6789
[osd.5]
host = node3
addr = 172.10.2.173:6789
[osd.6]
host = node3
addr = 172.10.2.173:6789
[osd.7]
host = node2
addr = 172.10.2.172:6789
[osd.8]
host = node1
addr = 172.10.2.171:6789
[mds.node1]
host = node1
[mds.node2]
host = node2
[mds.node3]
host = node3
6. Modify the CRUSH map
(1) Extract the current CRUSH map:
ceph osd getcrushmap -o compiled-crushmap-filename
(2) Decompile it:
crushtool -d compiled-crushmap-filename -o decompiled-crushmap-filename
(3) Edit decompiled-crushmap-filename: add three roots, one per pool; list the OSDs that belong under each root; then add one rule per pool that takes its root, and set each rule's type (replicated or erasure).
(4) Recompile:
crushtool -c decompiled-crushmap-filename -o compiled-crushmap-filename
(5) Inject the new CRUSH map:
ceph osd setcrushmap -i compiled-crushmap-filename
The edited CRUSH map (decompiled-crushmap-filename) is shown below; an offline sanity check of the rules follows the listing.
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root
root sata {
	id -1		# do not change unnecessarily
	# weight 0.000
	alg straw
	hash 0	# rjenkins1
	item osd.0 weight 0.1
	item osd.2 weight 0.1
	item osd.4 weight 0.1
}
root ssd {
	id -8		# do not change unnecessarily
	# weight 0.000
	alg straw
	hash 0	# rjenkins1
	item osd.1 weight 0.1
	item osd.3 weight 0.1
	item osd.5 weight 0.1
}
root metadata {
	id -9		# do not change unnecessarily
	# weight 0.000
	alg straw
	hash 0	# rjenkins1
	item osd.7 weight 0.1
	item osd.6 weight 0.1
	item osd.8 weight 0.1
}
rule ssd {
	ruleset 1
	type replicated
	min_size 1
	max_size 10
	step take ssd
	step chooseleaf firstn 0 type osd
	step emit
}
rule sata {
	ruleset 0
	type erasure
	min_size 1
	max_size 10
	step take sata
	step chooseleaf firstn 0 type osd
	step emit
}
rule metadata {
	ruleset 2
	type replicated
	min_size 1
	max_size 10
	step take metadata
	step chooseleaf firstn 0 type osd
	step emit
}
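Before injecting a hand-edited map, it is worth testing it offline. For example, to confirm that ruleset 1 (ssd) maps three replicas onto osd.1, 3, and 5:
crushtool --test -i compiled-crushmap-filename --rule 1 --num-rep 3 --show-mappings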
7. Create the pools
(1) Create the ssd pool:
ceph osd pool create ssd 128 128 replicated ssd
(2) Create the sata pool:
ceph osd pool create sata 128 128 erasure default sata
(3) Create the metadata pool:
ceph osd pool create metadata 128 128 replicated metadata
ceph pg dump shows the PG states and can be used to check which OSDs each PG maps to.
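The sata pool above uses the default erasure-code profile, which in this release is k=2, m=1, matching the layout in section I. The profile and each pool's rule assignment can be confirmed with:
ceph osd erasure-code-profile get default   # should show k=2, m=1
ceph osd pool get ssd crush_ruleset         # should return 1
ceph osd pool get sata crush_ruleset        # should return 0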
8. Build the cache tier
(1) Add ssd as a cache tier of sata:
ceph osd tier add sata ssd
(2) Set the cache mode; the options are writeback and readonly:
ceph osd tier cache-mode ssd writeback
Writeback cache tiers overlay the backing storage tier, so they require one additional step: you must direct all client traffic from the storage pool to the cache pool. To direct client traffic to the cache pool, execute the following:
ceph osd tier set-overlay sata ssd
(3) Set the tiering parameters:
ceph osd pool set ssd hit_set_type bloom
ceph osd pool set ssd hit_set_count 1
ceph osd pool set ssd hit_set_period 3600
ceph osd pool set ssd target_max_bytes 1000000000000
ceph osd pool set ssd cache_target_dirty_ratio 0.4
ceph osd pool set ssd cache_target_full_ratio 0.8
ceph osd pool set ssd target_max_objects 1000000
ceph osd pool set ssd cache_min_flush_age 600
ceph osd pool set ssd cache_min_evict_age 1800
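With these values, flushing of dirty objects to sata begins once they exceed 40% of target_max_bytes (0.4 × 1 TB ≈ 400 GB) and eviction from the cache begins at 80% (≈ 800 GB); objects are not flushed sooner than 600 s after modification, nor evicted sooner than 1800 s after last access.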
9. Create the CephFS
ceph fs new cephfs metadata sata
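To confirm the filesystem exists and an MDS has gone active:
ceph fs ls      # should list cephfs with metadata pool 'metadata' and data pool 'sata'
ceph mds stat   # one MDS active, the other two standby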
10. Mount the CephFS
(1) Obtain the secret key:
ceph-authtool --print-key /etc/ceph/ceph.client.admin.keyring
AQBNw5dU9K5MCxAAxnDaE0f9UCA/zAWo/hfnSg==
(2) Mount:
mount -t ceph node1:6789:/ /mnt/cephfs -o name=admin,secret=AQBNw5dU9K5MCxAAxnDaE0f9UCA/zAWo/hfnSg==
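To keep the key out of the shell history, it can instead be written to a file and passed via the secretfile option (the file path here is arbitrary):
ceph-authtool --print-key /etc/ceph/ceph.client.admin.keyring > /etc/ceph/admin.secret
mount -t ceph node1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret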