Problems encountered in an OpenStack project, part 2 (instance migration, Ceph, and partition resizing)

2. Day-to-day work notes

2.1 Migration notes

2.1.1 Instance migration in a VM-based environment

Create several Linux VMs in VMware Workstation, deploy an OpenStack environment across them, and then run instance-migration experiments.

For example, the following experiment:

Hosts involved

Host IP         Hostname    Role

192.168.0.11    YUN-11      controller node

192.168.0.12    YUN-12      expansion (compute) node

The controller node is used as the example below, but every host involved in the migration needs the same steps.

1) Passwordless access between nodes for the nova account

1.1) On each node that needs passwordless access to the others, run:

# usermod -s /bin/bash nova

# su nova

$ cd

$ ssh-keygen

$ touch .ssh/authorized_keys

1.2) Copy the public keys over from the other nodes and append them to the local authorized_keys file.

Using the controller node as the example:

$ scp [email protected]:/var/lib/nova/.ssh/id_rsa.pub .

$ cat id_rsa.pub >> .ssh/authorized_keys

$ scp [email protected]:/var/lib/nova/.ssh/id_rsa.pub .

$ cat id_rsa.pub >> .ssh/authorized_keys

After this, the expansion nodes can access the controller node as the nova user without a password.

Repeat the procedure on the other nodes; in the end every node can reach every other node as nova without a password.
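The fetch-and-append steps above can be scripted. This is a minimal sketch, not the author's exact procedure; the key-deduplication helper is demonstrated on throwaway files, and the scp shown in the comment assumes the /var/lib/nova home directory used in this cluster:

```shell
#!/bin/sh
# append_key KEYFILE AUTHFILE: append the public key unless it is already present.
# In the real cluster each KEYFILE would first be fetched as the nova user, e.g.:
#   scp nova@192.168.0.12:/var/lib/nova/.ssh/id_rsa.pub .
append_key() {
    key=$1; auth=$2
    touch "$auth" && chmod 600 "$auth"
    grep -qxF "$(cat "$key")" "$auth" || cat "$key" >> "$auth"
}

# demonstration with throwaway files
rm -f /tmp/authorized_keys
echo "ssh-rsa AAAA...example-key nova@YUN-12" > /tmp/id_rsa.pub
append_key /tmp/id_rsa.pub /tmp/authorized_keys
append_key /tmp/id_rsa.pub /tmp/authorized_keys   # duplicate call is a no-op
wc -l < /tmp/authorized_keys                      # 1 line, not 2
```

The grep -qxF guard keeps repeated runs from growing authorized_keys, which matters when the same script is re-run on every node.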

2) [Optional; just verify] Some online guides change these settings, but this cluster keeps the defaults.

To be able to set the root password from the Dashboard:

inject_password=true

To resize an instance without migrating it to another host:

allow_resize_to_same_host=true

(Optional)

To skip manual confirmation after a migrate or resize; the value is the number of seconds you have to confirm, after which it proceeds automatically:

resize_confirm_window=1

Restart the service:

service openstack-nova-compute restart

3) Live migration (block migration)

3.1) Edit nova.conf on all nodes:

live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_UNSAFE

This enables live migration.

3.2) [Just verify; this cluster also keeps the system defaults]

Then configure passwordless virsh connections. Edit /etc/libvirt/libvirtd.conf.

Uncomment:

listen_tls = 0

listen_tcp = 1

Uncomment and change the value:

auth_tcp = "none" # This must be set to none, otherwise authentication is required.

Test it:

virsh --connect qemu+tcp://192.168.0.12/system list

If all the virtual machines are listed without being prompted for a username or password, the configuration succeeded.

Restart the nova-compute and libvirt-bin services on all compute nodes.

Now the novaclient commands can be used for migration. For example, to migrate vm1 from the test host to YUN-12:

nova live-migration --block-migrate vm1 YUN-12

Note that the --block-migrate option is required; otherwise the migration defaults to shared-storage mode. Also, hostname-to-IP resolution must be set up in /etc/hosts on the controller node.
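The /etc/hosts requirement above is easy to forget, so a pre-flight check helps. A minimal sketch that verifies hostnames appear in a hosts-format file; the file path and host list here are illustrative, demonstrated on a scratch file rather than the real /etc/hosts:

```shell
#!/bin/sh
# check_hosts HOSTSFILE NAME...: report any hostname that does not appear
# on a non-comment line of the hosts-format file.
check_hosts() {
    file=$1; shift
    missing=0
    for name in "$@"; do
        grep -v '^#' "$file" | grep -qw "$name" || { echo "missing: $name"; missing=1; }
    done
    return $missing
}

# demonstration against a throwaway hosts file (the real check would use /etc/hosts)
cat > /tmp/hosts.test <<'EOF'
192.168.0.11 YUN-11
192.168.0.12 YUN-12
EOF
check_hosts /tmp/hosts.test YUN-11 YUN-12 && echo "all migration hosts resolvable"
check_hosts /tmp/hosts.test YUN-13 || echo "add YUN-13 to /etc/hosts first"
```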

4) Testing migration

Disable the firewall on both VM systems before testing.

The difference between migrating in the VM environment and on physical machines: in VMs the steps above are enough for migration to work and the whole cloud platform stays healthy, but physical machines need extra configuration, and the system firewall must not be turned off there.

Another difference is resizing instances: in the VM environment resources can be both grown and shrunk, while on physical machines they can only be grown.

2.2 Instance migration on physical machines

2.2.1 Anomalies on physical machines after disabling the firewall

On the physical-machine OpenStack platform, after the system firewall was disabled on the controller node, none of the compute nodes could create instances. The firewall had to be turned back on and the machine rebooted; the fix is to add rules to the firewall configuration file instead.

SELinux must also be disabled. Edit /etc/sysconfig/selinux and change

SELINUX=enforcing

to

SELINUX=disabled

In the VM environment the firewall only needs the following change: edit /etc/sysconfig/iptables

and add

-A INPUT -p tcp -m multiport --dports 16509 -m comment --comment "libvirt" -j ACCEPT

-A INPUT -p tcp -m multiport --dports 49152:49216 -m comment --comment "migration" -j ACCEPT

On physical machines, however, the firewall configuration file needs the changes below.

State before the change:

YUN-11 firewall

-A INPUT -s 192.168.0.11/32 -p tcp -m multiport --dports 5900:5999,16509 -m comment --comment "001 nova compute incoming nova_compute" -j ACCEPT

-A INPUT -s 192.168.0.11/32 -p tcp -m multiport --dports 16509,49152:49215 -m comment --comment "001 nova qemu migration incoming nova_qemu_migration_192.168.0.11_192.168.0.11" -j ACCEPT

YUN-12 firewall

-A INPUT -s 192.168.0.11/32 -p tcp -m multiport --dports 5900:5999,16509 -m comment --comment "001 nova compute incoming nova_compute" -j ACCEPT

-A INPUT -s 192.168.0.12/32 -p tcp -m multiport --dports 16509,49152:49215 -m comment --comment "001 nova qemu migration incoming nova_qemu_migration_192.168.0.12_192.168.0.12" -j ACCEPT

Changes to make:

The YUN-11 firewall configuration needs the following added:

-A INPUT -s 192.168.0.12/32 -p tcp -m multiport --dports 16509,49152:49215 -m comment --comment "001 nova qemu migration incoming nova_qemu_migration_192.168.0.11_192.168.0.12" -j ACCEPT

The YUN-12 firewall configuration needs the following added:

-A INPUT -s 192.168.0.11/32 -p tcp -m multiport --dports 16509,49152:49215 -m comment --comment "001 nova qemu migration incoming nova_qemu_migration_192.168.0.12_192.168.0.11" -j ACCEPT

Following this example, add rules for any additional physical machines as needed.
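Writing these cross-node rules by hand gets error-prone as hosts are added, since each destination host needs one rule per source host. A minimal sketch that prints the qemu-migration ACCEPT rules for every ordered host pair; the IP list is an assumption matching the two hosts above, and the output should be reviewed before loading:

```shell
#!/bin/sh
# Print one nova qemu-migration ACCEPT rule per ordered (dst, src) host pair;
# each printed rule belongs in the destination host's firewall configuration.
IPS="192.168.0.11 192.168.0.12"
{
    for dst in $IPS; do
        for src in $IPS; do
            [ "$src" = "$dst" ] && continue
            echo "-A INPUT -s $src/32 -p tcp -m multiport --dports 16509,49152:49215" \
                 "-m comment --comment \"001 nova qemu migration incoming nova_qemu_migration_${dst}_${src}\" -j ACCEPT"
        done
    done
} > /tmp/migration_rules.txt
cat /tmp/migration_rules.txt
```

The comment string follows the nova_qemu_migration_DST_SRC pattern seen in the rules above, so the generated lines can be matched against what packstack already wrote.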

2.2.2 Cold-migrating instances from the old cluster to the newly built one, approach 1

Check the image information:

[[email protected] 51222f9c-5074-440d-92c6-fccaeadc8032_resize(keystone_admin)]# qemu-img info disk

image: disk

file format: qcow2

virtual size: 1.0G (1073741824 bytes)

disk size: 868K

cluster_size: 65536

backing file: /var/lib/nova/instances/_base/87ae9a3ca6476837c0cb656bd99ee1dcca238134

[[email protected] 51222f9c-5074-440d-92c6-fccaeadc8032(keystone_admin)]# ll

total 1508

-rw-rw----. 1 root root   16618 Apr 23 10:01 console.log

-rw-r--r--. 1 root root 1572864 Apr 23 14:08 disk

-rw-r--r--. 1 nova nova      79 Apr 23 10:01 disk.info

-rw-r--r--. 1 nova nova    1634 Apr 23 10:01 libvirt.xml

[[email protected] 51222f9c-5074-440d-92c6-fccaeadc8032(keystone_admin)]# chmod o+r console.log

[[email protected] 51222f9c-5074-440d-92c6-fccaeadc8032(keystone_admin)]# ll

total 1508

-rw-rw-r--. 1 root root   16618 Apr 23 10:01 console.log

-rw-r--r--. 1 root root 1572864 Apr 23 14:08 disk

-rw-r--r--. 1 nova nova      79 Apr 23 10:01 disk.info

-rw-r--r--. 1 nova nova    1634 Apr 23 10:01 libvirt.xml

# su nova

[[email protected] instances(keystone_admin)]$ cp -r 51222f9c-5074-440d-92c6-fccaeadc8032 51222f9c-5074-440d-92c6-fccaeadc8032_resize

[[email protected] instances(keystone_admin)]$ cd 51222f9c-5074-440d-92c6-fccaeadc8032_resize

[[email protected] 51222f9c-5074-440d-92c6-fccaeadc8032_resize(keystone_admin)]$ ls -la

total 900

drwxr-xr-x. 2 nova nova    4096 Apr 23 16:35 .

drwxr-xr-x. 8 nova nova    4096 Apr 23 16:35 ..

-rw-r--r--. 1 nova nova   16618 Apr 23 16:35 console.log

-rw-r--r--. 1 nova nova 1572864 Apr 23 16:35 disk

-rw-r--r--. 1 nova nova      79 Apr 23 16:35 disk.info

-rw-r--r--. 1 nova nova    1634 Apr 23 16:35 libvirt.xml

The final result of this procedure apparently went unrecorded; it needs to be retested later.

2.2.3 Cold-migrating instances from the old cluster to the newly built one, approach 2

Another cold-migration approach works much like migrating a virtual machine in VMware Workstation.

Preparing the instance:

A Linux instance is used; inside the system,

create a directory and edit a file, and after the migration check whether the created directory and modified file are intact.

Shut the instance down before migrating.

Check the files in the instance's directory:

[[email protected] 2dccde39-31a4-48d5-8f62-0f963ffec481_copy]# ll

total 6896

-rw-r-----. 1 root root       1 Apr 30 10:18 console.log

-rw-r--r--. 1 root root 7536640 Apr 30 10:18 disk

-rw-r--r--. 1 root root      79 Apr 30 10:18 disk.info

-rw-r--r--. 1 root root    1635 Apr 30 10:18 libvirt.xml

[[email protected] 2dccde39-31a4-48d5-8f62-0f963ffec481_copy]# qemu-img convert -O qcow disk disk4

Copy the image disk4 to YUN-11 (YUN-11 is the controller node of the cluster YUN-17 belongs to).

Add the image:

[[email protected] ~(keystone_admin)]# glance add name=test-26 is_public=true container_format=bare disk_format=raw < /root/disk4

Added new image with ID: 3573cf89-7697-48cd-b07c-51344f416156

Boot an instance from the test-26 image in the dashboard.

It boots successfully; entering the system's console shows the directory and files in the host intact.

If the instance cannot be pinged after a floating IP is bound, as in the following case:

the instance booted from image test-27 could not be pinged from external machines after a floating IP was bound.

Solution:

Inside the instance the NIC is eth1, but the NIC configuration file is ifcfg-eth0; the file has no MAC or IP information, only BOOTPROTO=dhcp.

Compare with a normal instance:

a normal instance's NIC is eth0 and its configuration file is ifcfg-eth0; that file likewise has no MAC or IP information, only BOOTPROTO=dhcp.

Also, migrated instances based on the cirros image did not show this problem: after binding a floating IP, external machines could ping them.

Fix applied:

On the test-27 instance, rename the NIC configuration file:

mv ifcfg-eth0 ifcfg-eth1

Change the file's parameters:

vi ifcfg-eth1

DEVICE=eth0

to

DEVICE=eth1

Save the change and restart the network service.

External machines can now ping the instance.
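The rename-and-edit fix above can be done in one short script. A minimal sketch, assuming the standard /etc/sysconfig/network-scripts layout; it is demonstrated here on a throwaway directory so nothing real is touched:

```shell
#!/bin/sh
# fix_ifcfg DIR OLD NEW: rename ifcfg-OLD to ifcfg-NEW and update its DEVICE= line.
fix_ifcfg() {
    dir=$1; old=$2; new=$3
    mv "$dir/ifcfg-$old" "$dir/ifcfg-$new"
    sed -i "s/^DEVICE=$old\$/DEVICE=$new/" "$dir/ifcfg-$new"
}

# demonstration on a scratch directory (on the real instance the directory
# would be /etc/sysconfig/network-scripts, followed by `service network restart`)
mkdir -p /tmp/netscripts
printf 'DEVICE=eth0\nBOOTPROTO=dhcp\nONBOOT=yes\n' > /tmp/netscripts/ifcfg-eth0
fix_ifcfg /tmp/netscripts eth0 eth1
cat /tmp/netscripts/ifcfg-eth1
```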

2.3 Upgrading the system kernel before testing Ceph

2.3.1 Kernel upgrade, problem 1

The kernel upgrade steps are as follows.

Check the current kernel version:

[[email protected] ~]# uname -r

2.6.32-431.el6.x86_64

Upload the kernel source package linux-3.19.3.tar.xz.

Unpack it:

# tar -Jxvf linux-3.19.3.tar.xz -C /usr/src

Install the required packages:

# yum install -y gcc

# yum install -y ncurses ncurses-devel

# yum install -y bc

Adjust the configuration:

# make menuconfig

Compile and install:

# make

# make modules_install install

Errors:

sh ./arch/x86/boot/install.sh 3.19.3 arch/x86/boot/bzImage \

System.map "/boot"

ERROR: modinfo: could not find module ipt_MASQUERADE

ERROR: modinfo: could not find module iptable_nat

ERROR: modinfo: could not find module crc_t10dif

ERROR: modinfo: could not find module scsi_tgt

Change the boot entry:

# vi /boot/grub/grub.conf

default=1

to

default=0

# reboot

The OpenStack environment already existed before the kernel upgrade.

Problems after the system booted:

Instances failed to start.

Resetting their state failed:

[[email protected] ~(keystone_admin)]# nova reset-state test

[[email protected] ~(keystone_admin)]# nova stop test

[[email protected] ~(keystone_admin)]# nova start test

ERROR: Instance 93efe724-7288-4269-92d7-0346a00a724a in vm_state error. Cannot start while the instance is in this state. (HTTP 409) (Request-ID: req-d080c99c-e18a-4013-b677-d0ac98bf4575)

Creating an instance failed:

[[email protected] ~]# nova boot --image test-mini --flavor 1 test-1 --availability-zone nova:YUN-15 --nic net-id=e49ae481-4

ERROR: You must provide a username via either --os-username or env[OS_USERNAME]

In the dashboard under "Admin" -> "Host Aggregates", the service on YUN-15 shows as down.

On the YUN-15 host:

# service openstack-nova-compute restart

Create the instance again.

Error message:

Error: failed to create instance "test-1": please try again later [Error: Unexpected vif_type=binding_failed].

Compare the services on the faulty host YUN-15 against a healthy host:

[[email protected] ~]# openstack-status

== Nova services ==

openstack-nova-api:                     dead      (disabled on boot)

openstack-nova-compute:                 active

openstack-nova-network:                 dead      (disabled on boot)

openstack-nova-scheduler:               dead      (disabled on boot)

== neutron services ==

neutron-server:                         inactive  (disabled on boot)

neutron-dhcp-agent:                     inactive  (disabled on boot)

neutron-l3-agent:                       inactive  (disabled on boot)

neutron-metadata-agent:                 inactive  (disabled on boot)

neutron-lbaas-agent:                    inactive  (disabled on boot)

neutron-openvswitch-agent:              active

== Ceilometer services ==

openstack-ceilometer-api:               dead      (disabled on boot)

openstack-ceilometer-central:           dead      (disabled on boot)

openstack-ceilometer-compute:           active

openstack-ceilometer-collector:         dead      (disabled on boot)

== Support services ==

libvirtd:                               active

openvswitch:                            active

messagebus:                             active

Warning novarc not sourced

[[email protected] ~]# openstack-status

== Nova services ==

openstack-nova-api:                     dead      (disabled on boot)

openstack-nova-compute:                 active

openstack-nova-network:                 dead      (disabled on boot)

openstack-nova-scheduler:               dead      (disabled on boot)

== neutron services ==

neutron-server:                         inactive  (disabled on boot)

neutron-dhcp-agent:                     inactive  (disabled on boot)

neutron-l3-agent:                       inactive  (disabled on boot)

neutron-metadata-agent:                 inactive  (disabled on boot)

neutron-lbaas-agent:                    inactive  (disabled on boot)

neutron-openvswitch-agent:              dead

== Ceilometer services ==

openstack-ceilometer-api:               dead      (disabled on boot)

openstack-ceilometer-central:           dead      (disabled on boot)

openstack-ceilometer-compute:           active

openstack-ceilometer-collector:         dead      (disabled on boot)

== Support services ==

libvirtd:                               active

openvswitch:                            dead

messagebus:                             active

Warning novarc not sourced

The difference between YUN-15 and YUN-14 is the state of the openvswitch and neutron-openvswitch-agent services, which are down on YUN-15.

Restart the services:

[[email protected] ~]# service openvswitch restart

Killing ovsdb-server (4862)                                [  OK  ]

Starting ovsdb-server                                      [  OK  ]

Configuring Open vSwitch system IDs                        [  OK  ]

Starting ovs-vswitchd                                      [  OK  ]

Enabling remote OVSDB managers                             [  OK  ]

[[email protected] ~]# service neutron-openvswitch-agent restart

Stopping neutron-openvswitch-agent:                        [FAILED]

Starting neutron-openvswitch-agent:                        [  OK  ]

Creating an instance on YUN-15 again, the instance can see its configured address but stays stuck in the creating state.

2.3.2 Kernel upgrade, problem 2

This was done on another physical machine,

essentially the same as above except for one step:

# make menuconfig

When selecting the "IPv4" modules, ipt_MASQUERADE underneath was left unchecked.

Compile-and-install errors:

# make modules_install install

sh ./arch/x86/boot/install.sh 3.19.3 arch/x86/boot/bzImage \

System.map "/boot"

ERROR: modinfo: could not find module crc_t10dif

ERROR: modinfo: could not find module scsi_tgt

After rebooting, the physical machine hung.

Rebooting again brought the system up successfully.

This machine with the upgraded kernel was then added as an OpenStack compute node.

The expansion failed.

Error message:

192.168.0.11_neutron.pp:                             [ DONE ]

192.168.0.16_neutron.pp:                          [ ERROR ]

Applying Puppet manifests                         [ ERROR ]

ERROR : Error appeared during Puppet run: 192.168.0.16_neutron.pp

Error: sysctl -p /etc/sysctl.conf returned 255 instead of one of [0]

You will find full trace in log /var/tmp/packstack/20150421-110210-5TXrme/manifests/192.168.0.16_neutron.pp.log

Please check log file /var/tmp/packstack/20150421-110210-5TXrme/openstack-setup.log for more information

Check the log file /var/tmp/packstack/20150421-110210-5TXrme/openstack-setup.log:

2015-04-21 11:09:22::ERROR::run_setup::921::root:: Traceback (most recent call last):

File "/usr/lib/python2.6/site-packages/packstack/installer/run_setup.py", line 916, in main

_main(confFile)

File "/usr/lib/python2.6/site-packages/packstack/installer/run_setup.py", line 605, in _main

runSequences()

File "/usr/lib/python2.6/site-packages/packstack/installer/run_setup.py", line 584, in runSequences

controller.runAllSequences()

File "/usr/lib/python2.6/site-packages/packstack/installer/setup_controller.py", line 68, in runAllSequences

sequence.run(config=self.CONF, messages=self.MESSAGES)

File "/usr/lib/python2.6/site-packages/packstack/installer/core/sequences.py", line 98, in run

step.run(config=config, messages=messages)

File "/usr/lib/python2.6/site-packages/packstack/installer/core/sequences.py", line 44, in run

raise SequenceError(str(ex))

SequenceError: Error appeared during Puppet run: 192.168.0.16_neutron.pp

Error: sysctl -p /etc/sysctl.conf returned 255 instead of one of [0]^[[0m

You will find full trace in log /var/tmp/packstack/20150421-110210-5TXrme/manifests/192.168.0.16_neutron.pp.log

Searching online, the preliminary conclusion is missing kernel modules (a kernel-compilation problem).

This raises the question of the order of kernel compilation versus node expansion:

previously, on YUN-15, the node was expanded first and the kernel upgraded afterwards, and once the expansion finished, instances would not start; this time the kernel was upgraded first and the node expanded afterwards, and the expansion fails because the bridge module is missing.

2.3.3 Upgrading the kernel the right way

1) Check the system version before the kernel upgrade:

[root@YUN-17 ~]# uname -r

2.6.32-431.el6.x86_64

2) Prepare the environment:

# yum install -y xz

# tar -Jxvf linux-3.19.3.tar.xz -C /usr/src       (the kernel chosen here is linux-3.19.3.tar.xz)

# yum install -y hmaccalc zlib-devel binutils-devel elfutils-libelf-devel

# yum install -y bc

# cd /usr/src/linux-3.19.3

# yum groupinstall -y "Development Tools"

3) The kernel configuration file

The old kernel configuration is reused here, because hand-editing the kernel configuration file leads to missing-module errors at compile-and-install time.

[root@YUN-17 linux-3.19.3]# cp /boot/config-2.6.32-431.el6.x86_64 .config

# sh -c 'yes "" | make oldconfig'

# make oldconfig

# make

# make modules_install install

sh ./arch/x86/boot/install.sh 3.19.3 arch/x86/boot/bzImage \

System.map "/boot"

ERROR: modinfo: could not find module crc_t10dif

ERROR: modinfo: could not find module scsi_tgt

The errors shown here can be ignored.

4) Add YUN-17 as a compute node and create an instance.

The instance is created successfully

and runs normally.

Caveat:

Mind the order of the kernel upgrade versus the node expansion: if the node is expanded first and the kernel upgraded afterwards, the compute node will misbehave and no more instances can be created.

2.4 Building a Ceph cluster after the kernel upgrade

2.4.1 Configure hostnames

2.4.2 Configure the hosts file

On the deploy node:

192.168.1.200    admin-node

192.168.1.201    node1

192.168.1.202    node2

192.168.1.203    node3

2.4.3 Configure local YUM repositories

A 163 mirror, a Ceph repository, and an EPEL repository:

[ceph]

name=Ceph packages for $basearch

gpgkey=http://192.168.1.199/ceph.com/release.asc

enabled=1

baseurl=http://192.168.1.199/ceph.com/rpm-giant/el6/$basearch

priority=1

gpgcheck=1

type=rpm-md

[ceph-source]

name=Ceph source packages

gpgkey=http://192.168.1.199/ceph.com/release.asc

enabled=1

baseurl=http://192.168.1.199/ceph.com/rpm-giant/el6/SRPMS

priority=1

gpgcheck=1

type=rpm-md

[ceph-noarch]

name=Ceph noarch packages

gpgkey=https://192.168.1.199/ceph.com/release.asc

enabled=1

baseurl=http://192.168.1.199/ceph.com/rpm-giant/el6/noarch

priority=1

gpgcheck=1

type=rpm-md

2.4.4 Upgrade the deploy node and install the deployment tool

2.4.5 NTP service

2.4.6 Create a user

On every node:

[[email protected] ~]# useradd -d /home/ceph -m ceph

[[email protected] ~]# passwd ceph

[[email protected] ~]# echo "ceph ALL = (root) NOPASSWD:ALL" | tee /etc/sudoers.d/ceph

ceph ALL = (root) NOPASSWD:ALL

[[email protected] ~]# chmod 0440 /etc/sudoers.d/ceph

2.4.7 Passwordless access

[[email protected] ~]# su ceph

[[email protected] root]$ cd

[[email protected] ~]$ ssh-keygen

[[email protected] ~]$ ssh-copy-id [email protected]

bash: ssh-copy-id: command not found

[[email protected] ~]$ sudo yum install openssh-clients -y

[[email protected] ~]$ ssh-copy-id [email protected]

[[email protected] ~]$ ssh-copy-id [email protected]

[[email protected] ~]$ ssh-copy-id [email protected]

Test:

[[email protected] ~]$ ssh [email protected]

[[email protected] ~]$ exit

logout

Connection to node1 closed.

[[email protected] ~]$ ssh [email protected]

[[email protected] ~]$ exit

logout

Connection to node2 closed.

[[email protected] ~]$ ssh [email protected]

[[email protected] ~]$ exit

logout

Connection to node3 closed.

[[email protected] ~]$ pwd

/home/ceph

[[email protected] ~]$ vi .ssh/config

Host node1

Hostname node1

User ceph

Host node2

Hostname node2

User ceph

Host node3

Hostname node3

User ceph

Verifying again fails:

[[email protected] ~]$ ssh [email protected]

Bad owner or permissions on /home/ceph/.ssh/config

Fixing the permissions resolves it:

[[email protected] .ssh]$ ll

total 16

-rw-rw-r--. 1 ceph ceph  135 Mar  2 18:47 config

-rw-------. 1 ceph ceph 1675 Mar  2 18:31 id_rsa

-rw-r--r--. 1 ceph ceph  397 Mar  2 18:31 id_rsa.pub

-rw-r--r--. 1 ceph ceph 1203 Mar  2 18:42 known_hosts

[[email protected] .ssh]$ ssh [email protected]

Bad owner or permissions on /home/ceph/.ssh/config

[[email protected] .ssh]$ chmod 600 *

[[email protected] .ssh]$ ll

total 16

-rw-------. 1 ceph ceph  135 Mar  2 18:47 config

-rw-------. 1 ceph ceph 1675 Mar  2 18:31 id_rsa

-rw-------. 1 ceph ceph  397 Mar  2 18:31 id_rsa.pub

-rw-------. 1 ceph ceph 1203 Mar  2 18:42 known_hosts

[[email protected] .ssh]$ ssh [email protected]

Last login: Mon Mar  2 20:40:38 2015 from 192.168.1.120

[[email protected] ~]$ exit

logout

Connection to node1 closed.
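The rule that bites here: OpenSSH refuses to use ~/.ssh/config unless it is accessible only by its owner. A minimal sketch that tightens and then verifies the permissions, shown on a scratch directory; on the real node the directory would be /home/ceph/.ssh:

```shell
#!/bin/sh
# Tighten permissions the way ssh expects: 700 on the .ssh directory,
# 600 on every file inside it (config, private key, known_hosts, ...).
tighten_ssh_dir() {
    dir=$1
    chmod 700 "$dir"
    for f in "$dir"/*; do [ -f "$f" ] && chmod 600 "$f"; done
}

mkdir -p /tmp/dot-ssh
printf 'Host node1\n  User ceph\n' > /tmp/dot-ssh/config
chmod 664 /tmp/dot-ssh/config          # reproduce the "Bad owner or permissions" state
tighten_ssh_dir /tmp/dot-ssh
stat -c '%a %n' /tmp/dot-ssh/config    # mode is now 600
```

This is slightly stricter than the chmod 600 * shown above, since it also covers the directory itself, which ssh checks as well.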

2.4.8 Firewall and SELinux

[[email protected] ~]$ ifconfig

eth2      Link encap:Ethernet  HWaddr 00:0C:29:59:C7:57

inet addr:192.168.1.201  Bcast:192.168.1.255  Mask:255.255.255.0

inet6 addr: fe80::20c:29ff:fe59:c757/64 Scope:Link

UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

RX packets:45308 errors:0 dropped:0 overruns:0 frame:0

TX packets:7103 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:1000

RX bytes:59108143 (56.3 MiB)  TX bytes:648018 (632.8 KiB)

eth3      Link encap:Ethernet  HWaddr 00:0C:29:59:C7:61

inet addr:172.16.1.201  Bcast:172.16.1.255  Mask:255.255.255.0

inet6 addr: fe80::20c:29ff:fe59:c761/64 Scope:Link

UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

RX packets:2317 errors:0 dropped:0 overruns:0 frame:0

TX packets:7 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:1000

RX bytes:219673 (214.5 KiB)  TX bytes:538 (538.0 b)

The NICs are the same on every node.

mon port:

[[email protected] .ssh]$ sudo iptables -A INPUT -i eth2 -p tcp -s 192.168.1.201/24 --dport 6789 -j ACCEPT

osd ports:

[[email protected] .ssh]$ sudo iptables -A INPUT -i eth2  -m multiport -p tcp -s 192.168.1.202/24 --dports 6800:7100 -j ACCEPT

[[email protected] .ssh]$ sudo iptables -A INPUT -i eth2  -m multiport -p tcp -s 192.168.1.203/24 --dports 6800:7100 -j ACCEPT

TTY

sudo visudo

Defaults requiretty

to

Defaults:ceph !requiretty

selinux

[[email protected] .ssh]$ sudo setenforce 0

2.4.9 Deployment

[[email protected] ~]$ cd my-cluster/

[[email protected] my-cluster]$ ceph-deploy new node1

Error:

[ceph_deploy][ERROR ] RuntimeError: remote connection got closed, ensure ``requiretty`` is disabled for node1

Error in sys.exitfunc:

Solution: run sudo visudo as the ceph user on node1, node2, and node3, and change

Defaults requiretty  to  Defaults:ceph !requiretty
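With many nodes, the substitution itself can be scripted. A minimal sketch demonstrated on a scratch copy; on real nodes the change should still go through visudo (or be validated with visudo -c) rather than editing /etc/sudoers blindly:

```shell
#!/bin/sh
# Relax requiretty for the ceph user only, leaving it in force for everyone else.
relax_requiretty() {
    sed -i 's/^Defaults[[:space:]]\{1,\}requiretty$/Defaults:ceph !requiretty/' "$1"
}

# demonstration on a throwaway file standing in for /etc/sudoers
printf 'Defaults    requiretty\nroot ALL=(ALL) ALL\n' > /tmp/sudoers.test
relax_requiretty /tmp/sudoers.test
cat /tmp/sudoers.test
```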

Purge the configuration and start over:

[[email protected] my-cluster]$ ceph-deploy purgedata node1

[[email protected] my-cluster]$ ceph-deploy forgetkeys

[[email protected] my-cluster]$ ceph-deploy purge node1

[[email protected] my-cluster]$ ceph-deploy new node1

This generates:

ceph.conf  ceph.log  ceph.mon.keyring

vi ceph.conf

osd pool default size = 2

[[email protected] my-cluster]$ ceph-deploy install node1 node2 node3

Error:

[node1][INFO  ] Running command: sudo rpm --import https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc

[node1][WARNIN] curl: (6) Couldn't resolve host 'ceph.com'

[node1][WARNIN] error: https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc: import read failed(2).

[node1][ERROR ] RuntimeError: command returned non-zero exit status: 1

[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: rpm --import https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc

Solution:

Find the files referencing the URL:

[[email protected] my-cluster]# find / -type f -name "*.py" | xargs grep "https://ceph.com/git"

/usr/lib/python2.6/site-packages/ceph_deploy/hosts/fedora/install.py:                    "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/{key}.asc".format(key=key)

/usr/lib/python2.6/site-packages/ceph_deploy/hosts/fedora/install.py:                "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/{key}.asc".format(key=key),

/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py:                    "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/{key}.asc".format(key=key)

/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py:                "https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/{key}.asc".format(key=key),

/usr/lib/python2.6/site-packages/ceph_deploy/hosts/debian/install.py:                'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/{key}.asc'.format(key=key),

/usr/lib/python2.6/site-packages/ceph_deploy/conf/cephdeploy.py:# gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/autobuild.asc

/usr/lib/python2.6/site-packages/ceph_deploy/conf/cephdeploy.py:# gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/autobuild.asc

/usr/lib/python2.6/site-packages/ceph_deploy/install.py:        gpg_fallback = 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'

cd /usr/lib/python2.6/site-packages/ceph_deploy

Three files are visible: install.py, install.pyc, and install.pyo.

A .pyc file is a binary generated by compiling the corresponding .py file; it contains Python byte code.

vi  install.py

#gpg_fallback = 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'

gpg_fallback = 'http://192.168.1.199/ceph.com/release.asc'

The error persists:

[[email protected] centos]$ pwd

/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos

[[email protected] centos]$ ls

__init__.py   __init__.pyo  install.pyc  mon     pkg.pyc  uninstall.py   uninstall.pyo

__init__.pyc  install.py    install.pyo  pkg.py  pkg.pyo  uninstall.pyc

Modify another file as well:

vi /usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py

    if adjust_repos:
        if version_kind != 'dev':
            remoto.process.run(
                distro.conn,
                [
                    'rpm',
                    '--import',
                    #"https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/{key}.asc".format(key=key)
                    "http://192.168.1.199/ceph.com/release.asc"
                ]
            )

        if version_kind == 'stable':
            url = 'http://192.168.1.199/ceph.com/rpm-{version}/{repo}/'.format(
                version=version,
                repo=repo_part,
            )
        elif version_kind == 'testing':
            url = 'http://192.168.1.199/ceph.com/rpm-testing/{repo}/'.format(repo=repo_part)

        #remoto.process.run(
        #    distro.conn,
        #    [
        #        'rpm',
        #        '-Uvh',
        #        '--replacepkgs',
        #        '{url}noarch/ceph-release-1-0.{dist}.noarch.rpm'.format(url=url, dist=dist),
        #    ],
        #)

Run ceph-deploy install node1 node2 node3 again.

It succeeds

and finishes:

[node3][DEBUG ] Complete!

[node3][INFO  ] Running command: sudo ceph --version

[node3][DEBUG ] ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)

Error in sys.exitfunc:
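The hand-edits above boil down to one substitution: point the ceph.com release-key URL at the local mirror. A minimal sketch on a scratch copy; the mirror URL 192.168.1.199 is this cluster's, and the real targets would be the install.py files under /usr/lib/python2.6/site-packages/ceph_deploy/:

```shell
#!/bin/sh
# Replace the ceph.com release-key URL with a local mirror in a ceph-deploy source file.
MIRROR_KEY="http://192.168.1.199/ceph.com/release.asc"
patch_key_url() {
    sed -i "s|https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/[a-z]*\.asc|$MIRROR_KEY|g" "$1"
}

# demonstration on a throwaway copy of the offending line
cat > /tmp/install.py <<'EOF'
gpg_fallback = 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'
EOF
patch_key_url /tmp/install.py
cat /tmp/install.py
```

Remember that ceph-deploy may run from the cached .pyc files, so after patching a real install.py the stale install.pyc/install.pyo alongside it should be removed or regenerated.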

2.5 Growing the system root partition on an LVM layout

[[email protected] ~]# umount /home

[[email protected] ~]# e2fsck -f /dev/mapper/vg_YUN2-lv_home

e2fsck 1.41.12 (17-May-2010)

e2fsck: No such file or directory while trying to open /dev/mapper/vg_YUN2-lv_home

The superblock could not be read or does not describe a correct ext2

filesystem.  If the device is valid and it really contains an ext2

filesystem (and not swap or ufs or something else), then the superblock

is corrupt, and you might try running e2fsck with an alternate superblock:

e2fsck -b 8193 <device>

[[email protected] ~]# e2fsck -f /dev/mapper/vg_YUN17-lv_home

e2fsck 1.41.12 (17-May-2010)

/dev/mapper/vg_YUN17-lv_home is in use.

e2fsck: Cannot continue, aborting.

[[email protected] ~]# resize2fs -p /dev/mapper/vg_YUN17-lv_home 2G

resize2fs 1.41.12 (17-May-2010)

resize2fs: Device or resource busy while trying to open /dev/mapper/vg_YUN17-lv_home

Couldn't find valid filesystem superblock.

[[email protected] ~]# mount /home

[[email protected] ~]# df -h

Filesystem            Size  Used Avail Use% Mounted on

/dev/mapper/vg_YUN17-lv_root

50G  9.9G   37G  22% /

tmpfs                 253G  4.0K  253G   1% /dev/shm

/dev/sda1             477M   62M  387M  14% /boot

/srv/loopback-device/swiftloopback

1.9G  3.1M  1.7G   1% /srv/node/swiftloopback

/dev/mapper/vg_YUN17-lv_home

769G   69M  730G   1% /home

[[email protected] ~]# lvreduce -L 2G /dev/mapper/vg_YUN17-lv_home

WARNING: Reducing active and open logical volume to 2.00 GiB

THIS MAY DESTROY YOUR DATA (filesystem etc.)

Do you really want to reduce lv_home? [y/n]: y

Size of logical volume vg_YUN17/lv_home changed from 780.90 GiB (199911 extents) to 2.00 GiB (512 extents).

Logical volume lv_home successfully resized

[[email protected] ~]# df -h

Filesystem            Size  Used Avail Use% Mounted on

/dev/mapper/vg_YUN17-lv_root

50G  9.9G   37G  22% /

tmpfs                 253G  4.0K  253G   1% /dev/shm

/dev/sda1             477M   62M  387M  14% /boot

/srv/loopback-device/swiftloopback

1.9G  3.1M  1.7G   1% /srv/node/swiftloopback

/dev/mapper/vg_YUN17-lv_home

769G   69M  730G   1% /home

[[email protected] ~]# vgdisplay

--- Volume group ---

VG Name               vg_YUN17

System ID

Format                lvm2

Metadata Areas        1

Metadata Sequence No  5

VG Access             read/write

VG Status             resizable

MAX LV                0

Cur LV                3

Open LV               3

Max PV                0

Cur PV                1

Act PV                1

VG Size               834.90 GiB

PE Size               4.00 MiB

Total PE              213735

Alloc PE / Size       14336 / 56.00 GiB

Free  PE / Size       199399 / 778.90 GiB

VG UUID               UY5pX2-BCLJ-x4ig-Cr0z-sAS1-mUEc-nixVw9

--- Volume group ---

VG Name               cinder-volumes

System ID

Format                lvm2

Metadata Areas        1

Metadata Sequence No  1

VG Access             read/write

VG Status             resizable

MAX LV                0

Cur LV                0

Open LV               0

Max PV                0

Cur PV                1

Act PV                1

VG Size               20.60 GiB

PE Size               4.00 MiB

Total PE              5273

Alloc PE / Size       0 / 0

Free  PE / Size       5273 / 20.60 GiB

VG UUID               ND2yf7-BRxo-sYm1-VB6A-t0nt-aVPg-Fq3MPH
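Before running lvextend below, it is worth checking the numbers from vgdisplay: free extents times the PE size must cover the requested growth. A minimal sketch of that arithmetic using this VG's figures (199399 free PEs at 4 MiB each):

```shell
#!/bin/sh
# Is lvextend -L +750G safe on vg_YUN17? Free space = free_PE * PE_size.
free_pe=199399      # "Free  PE / Size" from vgdisplay
pe_mib=4            # "PE Size" is 4.00 MiB
want_gib=750

free_gib=$(( free_pe * pe_mib / 1024 ))
echo "free: ${free_gib} GiB"          # 778 GiB, matching vgdisplay's 778.90 GiB
if [ "$free_gib" -ge "$want_gib" ]; then
    echo "lvextend -L +${want_gib}G fits"
fi
```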

[[email protected] ~]# lvextend -L +750G /dev/mapper/vg_YUN17-lv_root

Size of logical volume vg_YUN17/lv_root changed from 50.00 GiB (12800 extents) to 800.00 GiB (204800 extents).

Logical volume lv_root successfully resized

[[email protected] ~]# resize2fs -p /dev/mapper/vg_YUN17-lv_root

resize2fs 1.41.12 (17-May-2010)

Filesystem at /dev/mapper/vg_YUN17-lv_root is mounted on /; on-line resizing required

old desc_blocks = 4, new_desc_blocks = 50

Performing an on-line resize of /dev/mapper/vg_YUN17-lv_root to 209715200 (4k) blocks.

The filesystem on /dev/mapper/vg_YUN17-lv_root is now 209715200 blocks long.

[[email protected] ~]# df -h

Filesystem            Size  Used Avail Use% Mounted on

/dev/mapper/vg_YUN17-lv_root

788G   10G  745G   2% /

tmpfs                 253G  4.0K  253G   1% /dev/shm

/dev/sda1             477M   62M  387M  14% /boot

/srv/loopback-device/swiftloopback

1.9G  3.1M  1.7G   1% /srv/node/swiftloopback

/dev/mapper/vg_YUN17-lv_home

769G   69M  730G   1% /home

[[email protected] ~]# fdisk -l

Disk /dev/sda: 897.0 GB, 896998047744 bytes

255 heads, 63 sectors/track, 109053 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk identifier: 0x0000292d

Device Boot      Start         End      Blocks   Id  System

/dev/sda1   *           1          64      512000   83  Linux

Partition 1 does not end on cylinder boundary.

/dev/sda2              64      109054   875461632   8e  Linux LVM

Disk /dev/mapper/vg_YUN17-lv_root: 859.0 GB, 858993459200 bytes

255 heads, 63 sectors/track, 104433 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk identifier: 0x00000000

Disk /dev/mapper/vg_YUN17-lv_swap: 4294 MB, 4294967296 bytes

255 heads, 63 sectors/track, 522 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk identifier: 0x00000000

Disk /dev/mapper/vg_YUN17-lv_home: 2147 MB, 2147483648 bytes

255 heads, 63 sectors/track, 261 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 4096 bytes

I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk identifier: 0x00000000

Original article: http://blog.51cto.com/xiaoxiaozhou/2113356
