Using Ceph

Ceph uniquely delivers object, block, and file storage in one unified system.
Ceph can be used in three ways: as a filesystem, as a block device, and as an object store.
Strictly speaking, only the block device is covered below. Since all three sit on top of a working Ceph Storage Cluster, we first go over a few cluster commands.

1. Ceph commands

1) Checking and monitoring cluster status:

ceph health

ceph status

ceph osd stat

ceph osd dump

ceph osd tree

ceph mon dump

ceph quorum_status

ceph mds stat

ceph mds dump

You can try each of these commands and see what they report.
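If you want to keep an eye on the cluster rather than poll it by hand, two handy variants (a minimal sketch; the 10-second interval is arbitrary):

ceph -w

watch -n 10 ceph status

The first streams cluster events as they happen; the second simply re-runs the status summary.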
2) Pools, which can roughly be thought of as namespaces
List the pools that already exist:

[[email protected] ~]# ceph osd lspools

0 data,1 metadata,2 rbd,

Query the pg_num attribute of the data pool:

[[email protected] ~]# ceph osd pool get data pg_num

pg_num: 256

Query the pgp_num attribute of the data pool:

[[email protected] ~]# ceph osd pool get data pgp_num

pgp_num: 256
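If these values ever need to be raised (for example after adding OSDs), they can be changed with ceph osd pool set; the 512 below is purely illustrative, and pgp_num should be raised to match pg_num:

ceph osd pool set data pg_num 512

ceph osd pool set data pgp_num 512

Note that in Ceph releases of this vintage pg_num can only be increased, not decreased.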

Create a pool named 'test-pool':

[[email protected] ~]# ceph osd pool create test-pool 256 256

pool 'test-pool' created

[[email protected] ~]# ceph osd lspools

0 data,1 metadata,2 rbd,3 test-pool,
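While the pool exists you can also inspect or change its replication factor through the size attribute (shown only as an illustration; 2 replicas is just an example value):

ceph osd pool get test-pool size

ceph osd pool set test-pool size 2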

Delete 'test-pool':

[[email protected] ~]# ceph osd pool delete test-pool test-pool  --yes-i-really-really-mean-it

pool 'test-pool' deleted

[[email protected] ~]# ceph osd lspools

0 data,1 metadata,2 rbd,

3) The CRUSH map
Fetch the cluster's current CRUSH map:

[[email protected] ~]# ceph osd getcrushmap -o crush.map

got crush map from osdmap epoch 734

Decompile it with crushtool so it can be read and edited (crushtool -d crush.map -o crush.txt), then view the result:

[[email protected] ~]# cat crush.txt

# begin crush map

# devices

device 0 osd.0

device 1 osd.1

device 2 osd.2

# types

type 0 osd

type 1 host

type 2 rack

type 3 row

type 4 room

type 5 datacenter

type 6 root

# buckets

host test-1 {

id -2           # do not change unnecessarily

# weight 1.000

alg straw

hash 0  # rjenkins1

item osd.0 weight 1.000

}

host test-2 {

id -4           # do not change unnecessarily

# weight 1.000

alg straw

hash 0  # rjenkins1

item osd.1 weight 1.000

}

host test-3 {

id -5           # do not change unnecessarily

# weight 1.000

alg straw

hash 0  # rjenkins1

item osd.2 weight 1.000

}

rack unknownrack {

id -3           # do not change unnecessarily

# weight 3.000

alg straw

hash 0  # rjenkins1

item test-1 weight 1.000

item test-2 weight 1.000

item test-3 weight 1.000

}

root default {

id -1           # do not change unnecessarily

# weight 3.000

alg straw

hash 0  # rjenkins1

item unknownrack weight 3.000

}

# rules

rule data {

ruleset 0

type replicated

min_size 1

max_size 10

step take default

step chooseleaf firstn 0 type host

step emit

}

rule metadata {

ruleset 1

type replicated

min_size 1

max_size 10

step take default

step chooseleaf firstn 0 type host

step emit

}

rule rbd {

ruleset 2

type replicated

min_size 1

max_size 10

step take default

step chooseleaf firstn 0 type host

step emit

}

# end crush map

Take a close look at this output; notice anything interesting? See the official CRUSH documentation for the details.
Once you have finished editing it, compile the CRUSH map:

crushtool -c crush.txt -o crush.map

Then inject the newly compiled CRUSH map into the cluster:

ceph osd setcrushmap -i crush.map
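As an illustration of the kind of edit you might make (a sketch only, not taken from the cluster above): changing the failure-domain type in a rule from host to rack forces each replica onto a different rack. With the example map above this only becomes meaningful once the hosts are spread over more than one rack:

rule data {

ruleset 0

type replicated

min_size 1

max_size 10

step take default

step chooseleaf firstn 0 type rack

step emit

}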

2. Ceph block device commands

1) Basic operations
Create a block device image:

[[email protected] ~]# rbd create test-image --size 1024 --pool test-pool

[[email protected] ~]# rbd ls test-pool

test-image

Look at the image's details:

[[email protected] ~]# rbd --image test-image info --pool test-pool

rbd image 'test-image':

size 1024 MB in 256 objects

order 22 (4096 kB objects)

block_name_prefix: rb.0.1483.6b8b4567

format: 1
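A quick sanity check on those numbers: order 22 means each RADOS object backing the image is 2^22 bytes = 4 MB, and 1024 MB / 4 MB = 256 objects, which is exactly what the first line reports.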

Delete the image:

[[email protected] ~]# rbd rm test-image -p test-pool

Removing image: 100% complete...done.

2) Kernel modules
Sometimes we need to mount an image locally, for example to change some of its contents; this is what the map operation is for.
First, load the rbd module into the kernel (make sure rbd support was selected when the kernel was built or upgraded):

modprobe rbd
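To confirm the module actually loaded, something like:

lsmod | grep rbd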

Map test-image:

rbd map test-image --pool test-pool --id admin

Show the mapped devices:

[[email protected] mycephfs]# rbd showmapped

id pool      image      snap device

1  test-pool test-image -    /dev/rbd1

Let's look at the disk information for /dev/rbd1, run mkfs on it, mount it under /mnt/mycephfs, and create a small text file inside it:

[[email protected] ~]# fdisk -lu /dev/rbd1

Disk /dev/rbd1: 1073 MB, 1073741824 bytes

255 heads, 63 sectors/track, 130 cylinders, total 2097152 sectors

Units = sectors of 1 * 512 = 512 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 4194304 bytes / 4194304 bytes

Disk identifier: 0x00000000

[[email protected] ~]# mkfs.ext4 /dev/rbd1

mke2fs 1.41.12 (17-May-2010)

Filesystem label=

OS type: Linux

Block size=4096 (log=2)

Fragment size=4096 (log=2)

Stride=1024 blocks, Stripe width=1024 blocks

65536 inodes, 262144 blocks

13107 blocks (5.00%) reserved for the super user

First data block=0

Maximum filesystem blocks=268435456

8 block groups

32768 blocks per group, 32768 fragments per group

8192 inodes per group

Superblock backups stored on blocks:

32768, 98304, 163840, 229376

Writing inode tables: done

Creating journal (8192 blocks): done

Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 33 mounts or

180 days, whichever comes first.  Use tune2fs -c or -i to override.

[[email protected] ~]# mount /dev/rbd1 /mnt/mycephfs/

[[email protected] ~]# ll /mnt/mycephfs/

total 16

drwx------ 2 root root 16384 Nov 27 13:40 lost+found

[[email protected] ~]# cd /mnt/mycephfs/

[[email protected] mycephfs]# ls

lost+found

[[email protected] mycephfs]# echo 'hello' > hello.txt

[[email protected] mycephfs]# ls

hello.txt  lost+found

[[email protected] mycephfs]# df -h /mnt/mycephfs/

Filesystem            Size  Used Avail Use% Mounted on

/dev/rbd1             976M  1.3M  908M   1% /mnt/mycephfs

We can also change the size of the image:

[[email protected] mycephfs]# rbd resize --size 2048 test-image

rbd: error opening image test-image: (2) No such file or directory

2013-11-27 13:48:24.290564 7fcf3b185760 -1 librbd::ImageCtx: error finding header: (2) No such file or directory

[[email protected] mycephfs]# rbd resize --size 2048 test-image --pool test-pool

Resizing image: 100% complete...done.

[[email protected] mycephfs]# df -h /mnt/mycephfs/

Filesystem            Size  Used Avail Use% Mounted on

/dev/rbd1             976M  1.3M  908M   1% /mnt/mycephfs

[[email protected] mycephfs]# blockdev --getsize64 /dev/rbd1

2147483648

[[email protected] mycephfs]# resize2fs /dev/rbd1

resize2fs 1.41.12 (17-May-2010)

Filesystem at /dev/rbd1 is mounted on /mnt/mycephfs; on-line resizing required

old desc_blocks = 1, new_desc_blocks = 1

Performing an on-line resize of /dev/rbd1 to 524288 (4k) blocks.

The filesystem on /dev/rbd1 is now 524288 blocks long.

[[email protected] mycephfs]# df -h /mnt/mycephfs/

Filesystem            Size  Used Avail Use% Mounted on

/dev/rbd1             2.0G  1.6M  1.9G   1% /mnt/mycephfs

[[email protected] mycephfs]# ls

hello.txt  lost+found

Once we are done modifying the image's contents we can unmap it; run umount first. The next time the image is mapped, the hello.txt we created will still be in the mounted directory.

[[email protected] mnt]# umount /dev/rbd1

[[email protected] mnt]# rbd unmap /dev/rbd1
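To convince yourself the data really lives in the image and not on the local disk, map and mount it again (a quick sketch reusing the paths above; the device name may differ on re-map, so check rbd showmapped):

rbd map test-image --pool test-pool --id admin

mount /dev/rbd1 /mnt/mycephfs/

ls /mnt/mycephfs/

umount /dev/rbd1

rbd unmap /dev/rbd1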

3) Snapshots
Sometimes we want to take a snapshot of an image so that we can restore it to that state later.
Let's try snapshotting test-image in test-pool:

[[email protected] mnt]# rbd snap create test-pool/[email protected]

rbd: failed to create snapshot: (22) Invalid argument

2013-11-27 14:56:53.109819 7f5bea81d760 -1 librbd: failed to create snap id: (22) Invalid argument

The error 'Invalid argument' took a while to track down; the problem turned out to be the '-' in the names 'test-pool' and 'test-image'.
So let's create a new pool called 'mypool' and, inside it, an image called 'myimage':

[[email protected] ceph]# ceph osd pool create mypool 256 256

pool 'mypool' created

[[email protected] ceph]# rbd create myimage --size 1024 --pool mypool

[[email protected] ceph]# rbd --pool mypool ls

myimage

Next, create a snapshot named 'snapimage':

[[email protected] ceph]# rbd snap create mypool/myimage@snapimage

List myimage's snapshots:

[[email protected] ceph]# rbd snap ls mypool/myimage

SNAPID NAME         SIZE

2 snapimage 1024 MB

Now let's put snapshots to the test: take another snapshot, write a new file, and roll back to it.

[[email protected] ceph]# rbd snap create mypool/myimage@snapimage3

[[email protected] ceph]# rbd map mypool/myimage

[[email protected] ceph]# mount /dev/rbd1 /mnt/mycephfs/

[[email protected] ceph]# ls /mnt/mycephfs/

hello.txt  lost+found

[[email protected] ceph]# echo 'welcome to zhengtianbao.com' > /mnt/mycephfs/info.txt

[[email protected] ceph]# ls /mnt/mycephfs/

hello.txt  info.txt  lost+found

[[email protected] ceph]# umount /dev/rbd1

[[email protected] ceph]# rbd unmap /dev/rbd1

[[email protected] ceph]# rbd snap rollback mypool/myimage@snapimage3

Rolling back to snapshot: 100% complete...done.

[[email protected] ceph]# rbd map mypool/myimage

[[email protected] ceph]# mount /dev/rbd1 /mnt/mycephfs/

[[email protected] ceph]# ls /mnt/mycephfs/

hello.txt  lost+found

As expected, myimage is back in the state it was in when snapimage3 was taken: the info.txt created afterwards is gone.
Delete a snapshot:

[[email protected] ceph]# rbd snap ls mypool/myimage

SNAPID NAME          SIZE

2 snapimage  1024 MB

3 snapimage2 1024 MB

4 snapimage3 1024 MB

[[email protected] ceph]# rbd snap rm mypool/myimage@snapimage

[[email protected] ceph]# rbd snap ls mypool/myimage

SNAPID NAME          SIZE

3 snapimage2 1024 MB

4 snapimage3 1024 MB

Delete all of myimage's snapshots:

[[email protected] ceph]# rbd snap purge mypool/myimage

Removing all snapshots: 100% complete...done.

4) libvirt
Ceph block devices can also be used with libvirt, by defining a libvirt domain whose disk device is backed by a Ceph block device.
libvirt is essentially a middle layer here; the way it fits together with rbd looks roughly like this:

libvirt --> qemu --> librbd --> librados --> OSDs
                                    |
                                    +--> monitors

libvirt and qemu themselves will be covered some other time.
Also, make sure qemu had rbd enabled when it was configured.
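A quick, if rough, way to check an existing qemu build is to look for rbd among the formats qemu-img reports (just a sketch; the binary name may differ on your system):

qemu-img --help | grep rbd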
First we need a ready-made disk image; here I use a CentOS 6 image:

[[email protected]1 ~]# file centos6

centos6: x86 boot sector; GRand Unified Bootloader, stage1 version 0x3, boot drive 0x80, 1st sector stage2 0x849d4, GRUB version 0.94; partition 1: ID=0x83, active, starthead 32, startsector 2048, 1024000 sectors; partition 2: ID=0x8e, starthead 221, startsector 1026048, 19945472 sectors, code offset 0x48

Use qemu-img convert to push the image into mypool under the name centos:

[[email protected] ceph]# qemu-img convert ~/centos6 rbd:mypool/centos

[[email protected] ceph]# rbd ls --pool mypool

centos

myimage

[[email protected] ceph]# rbd info centos --pool mypool

rbd image 'centos':

size 10240 MB in 2560 objects

order 22 (4096 kB objects)

block_name_prefix: rb.0.14d4.6b8b4567

format: 1
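Before wiring the image into libvirt you can optionally sanity-check it straight from qemu, which understands rbd: URLs directly (a sketch only; whether this works, and the emulator path and options, depend on your qemu build):

/usr/libexec/qemu-kvm -m 1024 -drive format=raw,file=rbd:mypool/centos -vnc :1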

Next we create the domain XML file that libvirt needs; this is just a simple example:
test.xml

<domain type='kvm'>
  <name>test-ceph</name>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-1.5'>hvm</type>
    <boot dev='hd'/>
    <bootmenu enable='yes'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw'/>
      <source protocol='rbd' name='mypool/centos'>
        <host name='localhost' port='6789'/>
      </source>
      <target dev='hda' bus='ide'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'/>
    <video>
      <model type='vga' ram='65536' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
</domain>
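One caveat: this example assumes cephx authentication is disabled. On a cluster with cephx enabled, the disk definition would additionally need an auth element referencing a libvirt secret, roughly like the fragment below (the uuid is whatever virsh secret-define produced):

<auth username='admin'>
  <secret type='ceph' uuid='...'/>
</auth>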

Now define and start the VM with virsh, and look up its VNC display:

[[email protected] ceph]# virsh define test.xml

[[email protected] ceph]# virsh list --all

Id    Name                           State

----------------------------------------------------

-     test-ceph                      shut off

[[email protected] ceph]# virsh start test-ceph

Domain test-ceph started

[[email protected] ceph]# virsh list

Id    Name                           State

----------------------------------------------------

1     test-ceph                      running

[[email protected] ceph]# virsh vncdisplay 1

:0

OK, we can now connect to the VM through a VNC client on port 5900 of the host and work inside it; you can also use the VM to test Ceph's read and write performance.
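For a very rough write test inside the guest, dd with direct I/O gives a first impression (illustrative only; a real benchmark would use something like fio, or rados bench from a cluster node):

dd if=/dev/zero of=/root/ddtest bs=1M count=1024 oflag=direct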

Some links:

[1] IBM's introduction to Ceph: http://www.ibm.com/developerworks/cn/linux/l-ceph/
[2] Ceph architecture: http://www.ustack.com/blog/ceph_infra/
[3] Ceph performance testing: http://tech.uc.cn/?p=1223#more-1223
