Ceph: Mix SATA and SSD Within the Same Box

The use case is simple: I want to use both SSD disks and SATA disks within the same machine and ultimately create pools that point either to the SSDs or to the SATA disks. To achieve this, we need to modify the CRUSH map. In my example each host has 2 SATA disks and 2 SSD disks, and there are 3 hosts in total.
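
If you are not sure which physical disk backs which OSD, the kernel tells you whether a block device is rotational; a quick check (the device name sdb is just an example, adapt it to whatever device each OSD uses):

$ cat /sys/block/sdb/queue/rotational   # prints 1 for a spinning SATA disk, 0 for an SSD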

I. CRUSH Map

CRUSH is very flexible and topology aware, which is extremely useful in our scenario. We are going to create two different roots, or entry points, from which the CRUSH algorithm will walk to store our objects: one root for the SSD disks and another one for the SATA disks. Looking at the CRUSH map below, you will see that we duplicated our topology; it is as if we let CRUSH think that we have two different platforms, which is not entirely true. We are only representing a logical view of what we want to accomplish.
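
In practice you do not write this map from scratch: you dump the current CRUSH map from the cluster, decompile it to text, edit it and recompile it. A quick sketch of the dump and decompile step (file names are arbitrary; lamap.txt is the file we compile again further down):

$ sudo ceph osd getcrushmap -o crushmap.bin
$ crushtool -d crushmap.bin -o lamap.txt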

Here is the CRUSH map:

##
# OSD SATA DECLARATION
##
host ceph-osd2-sata {
  id -2   # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item osd.0 weight 1.000
  item osd.3 weight 1.000
}
host ceph-osd1-sata {
  id -3   # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item osd.2 weight 1.000
  item osd.5 weight 1.000
}
host ceph-osd0-sata {
  id -4   # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item osd.1 weight 1.000
  item osd.4 weight 1.000
}

##
# OSD SSD DECLARATION
##

host ceph-osd2-ssd {
  id -22    # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item osd.6 weight 1.000
  item osd.9 weight 1.000
}
host ceph-osd1-ssd {
  id -23    # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item osd.8 weight 1.000
  item osd.11 weight 1.000
}
host ceph-osd0-ssd {
  id -24    # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item osd.7 weight 1.000
  item osd.10 weight 1.000
}
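
Note that only the bucket declarations are shown here; a full decompiled CRUSH map also starts with a # devices section listing every OSD (and a # types section, which you should keep exactly as it was decompiled). A sketch of the devices section matching the OSD numbering used in this example:

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9
device 10 osd.10
device 11 osd.11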

Now we create our two roots, each containing the corresponding host buckets:

##
# SATA ROOT DECLARATION
##

root sata {
  id -1   # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item ceph-osd2-sata weight 4.000
  item ceph-osd1-sata weight 4.000
  item ceph-osd0-sata weight 4.000
}

##
# SSD ROOT DECLARATION
##

root ssd {
  id -21    # do not change unnecessarily
  # weight 0.000
  alg straw
  hash 0  # rjenkins1
  item ceph-osd2-ssd weight 4.000
  item ceph-osd1-ssd weight 4.000
  item ceph-osd0-ssd weight 4.000
}

Then we create two new rules:

##
# SSD RULE DECLARATION
##

# rules
rule ssd {
 ruleset 0
 type replicated
 min_size 1
 max_size 10
 step take ssd
 step chooseleaf firstn 0 type host
 step emit
}

##
# SATA RULE DECLARATION
##

rule sata {
 ruleset 1
 type replicated
 min_size 1
 max_size 10
 step take sata
 step chooseleaf firstn 0 type host
 step emit
}

Compile and inject the new map:

$ crushtool -c lamap.txt -o lamap.coloc
$ sudo ceph osd setcrushmap -i lamap.coloc
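
Before or right after injecting the map, you can dry-run both rules against the compiled map to make sure the SSD rule only ever selects osd.6 to osd.11 and the SATA rule only osd.0 to osd.5; a sketch using crushtool's test mode:

$ crushtool -i lamap.coloc --test --rule 0 --num-rep 2 --show-mappings
$ crushtool -i lamap.coloc --test --rule 1 --num-rep 2 --show-utilization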

Then see the result:

$ sudo ceph osd tree
# id    weight  type name                 up/down reweight
-21     12      root ssd
-22     4           host ceph-osd2-ssd
6       1               osd.6             up      1
9       1               osd.9             up      1
-23     4           host ceph-osd1-ssd
8       1               osd.8             up      1
11      1               osd.11            up      1
-24     4           host ceph-osd0-ssd
7       1               osd.7             up      1
10      1               osd.10            up      1
-1      12      root sata
-2      4           host ceph-osd2-sata
0       1               osd.0             up      1
3       1               osd.3             up      1
-3      4           host ceph-osd1-sata
2       1               osd.2             up      1
5       1               osd.5             up      1
-4      4           host ceph-osd0-sata
1       1               osd.1             up      1
4       1               osd.4             up      1
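
You can also confirm that both rules made it into the cluster map:

$ sudo ceph osd crush rule ls
$ sudo ceph osd crush rule dump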

II. CRUSH rules

Pools configuration

Create pools:

$ sudo ceph osd pool create ssd 128 128
pool 'ssd' created
$ sudo ceph osd pool create sata 128 128
pool 'sata' created

Assign rules to the pools:

$ sudo ceph osd pool set ssd crush_ruleset 0
set pool 8 crush_ruleset to 0
$ sudo ceph osd pool set sata crush_ruleset 1
set pool 9 crush_ruleset to 1

Result from ceph osd dump:

pool 8 'ssd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 116 flags hashpspool stripe_width 0
pool 9 'sata' replicated size 2 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 117 flags hashpspool stripe_width 0
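
To double check that data really lands on the intended media, you can ask the cluster where an arbitrary object name would be placed in each pool (the object name below is made up and does not need to exist). The acting set reported for the ssd pool should only contain osd.6 to osd.11, and the sata pool only osd.0 to osd.5:

$ sudo ceph osd map ssd test-object
$ sudo ceph osd map sata test-object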

III. OSDs configuration

Finally, you can prevent the OSD daemons from updating the CRUSH map (and thus moving themselves back under the default hierarchy) when they start:

[osd]
osd crush update on start = false
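
With that flag in ceph.conf on the OSD hosts, a restarting OSD stays where you placed it. If you later add a new OSD, you place it in the right bucket yourself; a sketch for a hypothetical new SSD OSD (osd.12, with weight and bucket names matching this example):

$ sudo ceph osd crush set osd.12 1.000 root=ssd host=ceph-osd2-ssd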
