heartbeat + DRBD topology diagram
if 136 subnet = 60 subnet
then
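I. Before unplugging the network cable of the 60 subnet:
The drbd1 primary server was under heavy load, with %wa in top at around 60. cat /proc/drbd showed that the disk state on this server had become Diskless (on-io-error detach; # policy: the node on which the I/O error occurs drops its backing device and continues to work in diskless mode).
The messages log shows that the disk state had already changed to Diskless on the 15th, following an I/O error.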
Jun 15 08:56:06 drbd1 kernel: block drbd0: local WRITE IO error sector 1600487576+8 on dm-2
Jun 15 08:56:06 drbd1 kernel: block drbd0: disk( UpToDate -> Failed )
Jun 15 08:56:06 drbd1 kernel: block drbd0: Local IO failed in __req_mod. Detaching...
Jun 15 08:56:06 drbd1 kernel: block drbd0: receiver updated UUIDs to effective data uuid: C6B3D27C4098E93E
Jun 15 08:56:06 drbd1 kernel: block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Jun 15 08:56:06 drbd1 kernel: block drbd0: disk( Failed -> Diskless )
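The state changes in an excerpt like this are easy to miss among the other kernel messages. Below is a minimal sketch, assuming the log keeps the `field( Old -> New )` form seen above, that pulls out only the transitions. The function name `state_transitions` is made up for illustration and is not part of any DRBD tool.

```python
import re

# Matches DRBD state-change fragments such as "disk( UpToDate -> Failed )".
# Only the four field names that occur in this log are recognised.
TRANSITION = re.compile(r"\b(peer|conn|disk|pdsk)\( (\w+) -> (\w+) \)")


def state_transitions(log_text):
    """Return (field, old_state, new_state) tuples in order of appearance."""
    return [m.groups() for m in TRANSITION.finditer(log_text)]


# Two entries copied from the Jun 15 excerpt, still joined on one line.
sample = (
    "Jun 15 08:56:06 drbd1 kernel: block drbd0: disk( UpToDate -> Failed ) "
    "Jun 15 08:56:06 drbd1 kernel: block drbd0: disk( Failed -> Diskless )"
)

for field, old, new in state_transitions(sample):
    print(f"{field}: {old} -> {new}")
```

Run against the full messages file, the same pattern should also pick up the `peer(...)`, `conn(...)` and `pdsk(...)` changes logged on Jun 19.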
Note: running drbdadm detach all simulates the situation above: ds becomes Diskless, yet the DRBD partition can still be mounted and accessed.
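Since the Diskless state went unnoticed for several days, a small check on /proc/drbd could have flagged it earlier. The sketch below assumes the DRBD 8.4 layout of /proc/drbd, where each resource line carries `cs:`, `ro:` and `ds:` fields. The sample text is illustrative rather than captured from drbd1, and `diskless_minors` is a hypothetical helper name.

```python
import re

# One resource line of /proc/drbd in the DRBD 8.4 layout, for example:
#  0: cs:Connected ro:Primary/Secondary ds:Diskless/UpToDate C r-----
STATUS = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+ro:(?P<ro>\S+)\s+ds:(?P<ds>\S+)",
    re.MULTILINE,
)


def diskless_minors(proc_drbd_text):
    """Return the minor numbers whose local disk state is Diskless."""
    result = []
    for m in STATUS.finditer(proc_drbd_text):
        local_ds = m.group("ds").split("/")[0]  # "local/peer" -> local part
        if local_ds == "Diskless":
            result.append(int(m.group("minor")))
    return result


# Illustrative sample only; on a real node read the text from /proc/drbd.
sample = (
    "version: 8.4.11-1 (api:1/proto:86-101)\n"
    " 0: cs:Connected ro:Primary/Secondary ds:Diskless/UpToDate C r-----\n"
    "    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0\n"
)

print(diskless_minors(sample))
```

On a live node the input would come from `open("/proc/drbd").read()`, and a non-empty result on the primary is a reason to plan a switchover before anything happens to the replication link.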
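Authoritative explanation (from the DRBD book and the DRBD Chinese user guide): if the backing disk device that DRBD uses on a node fails, DRBD may either pass the I/O error on to the upper layer (usually the file system) or mask the I/O error from the upper layer.
Passing on I/O errors: if DRBD is configured to pass on I/O errors, any error on the lower-level device is passed transparently to the upper I/O layer, which is then left to deal with it (this may cause the file system to be remounted read-only). This strategy does not guarantee service continuity and is not recommended for most users.
Masking I/O errors: if DRBD is configured to detach on lower-level I/O errors, DRBD detaches from its backing device when such an error occurs. The I/O error is masked from the upper layers, and DRBD transparently fetches the affected blocks from the peer node over the network. From then on DRBD is said to run in diskless mode and handles all subsequent I/O accordingly: reads and writes actually take place on the peer, not locally. Performance suffers in diskless mode, but the service keeps running, and it can be migrated to the peer node calmly at a convenient time. (This is somewhat similar to software RAID1: when one mirror disk fails, the application keeps running and there is a chance to recover.)
See "Configuring I/O error handling strategies" for information on configuring the I/O error handling strategy.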
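The two strategies, and what happens once a diskless primary also loses its peer, can be illustrated with a toy model. This is only a sketch of the behaviour described above, not DRBD's implementation: the class `ToyDrbdNode` and its attributes are invented for illustration, and only the policy names `pass_on` and `detach` correspond to real `on-io-error` values.

```python
class IOFailure(Exception):
    """Raised when a request cannot be served from any copy of the data."""


class ToyDrbdNode:
    """Toy model of a primary node with a local disk and one peer."""

    def __init__(self, policy):
        self.policy = policy        # "pass_on" or "detach"
        self.local_disk_ok = True   # health of the backing device
        self.attached = True        # False once the node runs diskless
        self.peer_reachable = True  # state of the replication link

    def read(self, sector):
        if self.attached:
            if self.local_disk_ok:
                return f"local:{sector}"
            if self.policy == "pass_on":
                # The error goes straight up to the file system.
                raise IOFailure("local I/O error passed to upper layer")
            # detach: drop the backing device and continue diskless.
            self.attached = False
        if self.peer_reachable:
            return f"peer:{sector}"
        raise IOFailure("neither local nor remote data")


node = ToyDrbdNode("detach")
node.local_disk_ok = False       # the disk fails, as on Jun 15
print(node.read(8))              # served by the peer, node is now diskless
node.peer_reachable = False      # the cable is pulled, as on Jun 19
try:
    node.read(8)
except IOFailure as exc:
    print("I/O failure:", exc)
```

In this model the first failure is masked completely. Only the second failure, losing the link to the peer, surfaces as an error, which matches the "neither local nor remote data" messages that appear after the cable was unplugged.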
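II. After unplugging the network cable of the 60 subnet:
1. After the cable of the 60 subnet (DRBD data replication + heartbeat link) was unplugged, HA did not fail over (checked in /var/log/ha-debug), i.e. heartbeat itself was running normally.
However, the DRBD partition was inexplicably no longer mounted at that point. Running ls on the mount directory seemed to print an error, but I did not look at it closely (the floating IP address was still present). Under normal circumstances unplugging this cable should not break the DRBD mount (tested on virtual machines).
2. The messages log then reported a large number of I/O errors (this raises a question: why were there so many I/O errors after the cable was unplugged, when before that only a single I/O error had been reported and ds then became Diskless):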
Jun 19 10:28:04 drbd1 kernel: e1000: ens34 NIC Link is Down
Jun 19 10:28:10 drbd1 NetworkManager[806]: <info> [1529375290.6437] device (ens34): state change: activated -> unavailable (reason 'carrier-changed', sys-iface-state: 'managed')
Jun 19 10:28:10 drbd1 dbus[801]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Jun 19 10:28:10 drbd1 systemd: Starting Network Manager Script Dispatcher Service...
Jun 19 10:28:10 drbd1 dbus[801]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Jun 19 10:28:10 drbd1 systemd: Started Network Manager Script Dispatcher Service.
Jun 19 10:28:10 drbd1 nm-dispatcher: req:1 'down' [ens34]: new request (3 scripts)
Jun 19 10:28:10 drbd1 nm-dispatcher: req:1 'down' [ens34]: start running ordered scripts...
Jun 19 10:28:23 drbd1 kernel: drbd r0: PingAck did not arrive in time.
Jun 19 10:28:23 drbd1 kernel: drbd r0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Jun 19 10:28:23 drbd1 kernel: drbd r0: ack_receiver terminated
Jun 19 10:28:23 drbd1 kernel: drbd r0: Terminating drbd_a_r0
Jun 19 10:28:23 drbd1 kernel: drbd r0: error receiving DataReply, e: -5 l: 212992!
Jun 19 10:28:23 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm pri-on-incon-degr minor-0 Jun 19 10:28:24 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm pri-on-incon-degr minor-0 exit code 0 (0x0) Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607824 (offset 0 size 0 starting block 150058741) Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 150058741 Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 209230140) Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 209230140 Jun 19 10:28:24 drbd1 kernel: Aborting journal on device drbd0-8. Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 147366649) Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 147366649 Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 147366654) Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 147366654 Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 147366855) Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 147366855 Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 147366869) Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 147366869 Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 146704209) Jun 19 10:28:24 
drbd1 kernel: Buffer I/O error on device drbd0, logical block 146704209 Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 146704214) Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 146704214 Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 146704227) Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 146704227 Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on device drbd0, logical block 146704228 Jun 19 10:28:24 drbd1 kernel: EXT4-fs warning (device drbd0): ext4_end_bio:316: I/O error -5 writing to inode 50607811 (offset 1843200 size 0 starting block 146704233) Jun 19 10:28:24 drbd1 kernel: drbd r0: Connection closed Jun 19 10:28:24 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 427170072+1024 Jun 19 10:28:24 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1011089408+8 Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 126386176, lost sync page write Jun 19 10:28:24 drbd1 kernel: drbd r0: conn( NetworkFailure -> Unconnected ) Jun 19 10:28:24 drbd1 kernel: drbd r0: receiver terminated Jun 19 10:28:24 drbd1 kernel: drbd r0: Restarting receiver thread Jun 19 10:28:24 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 0+8 Jun 19 10:28:24 drbd1 kernel: drbd r0: receiver (re)started Jun 19 10:28:24 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 427170488+8 Jun 19 10:28:24 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 473052608+1024 Jun 19 10:28:24 drbd1 kernel: drbd r0: conn( Unconnected -> WFConnection ) Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:28:24 drbd1 kernel: EXT4-fs error 
(device drbd0): ext4_journal_check_start:56: Detected aborted journal Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): Remounting filesystem read-only Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:28:24 drbd1 kernel: drbd r0: bind before listen failed, err = -99 Jun 19 10:28:24 drbd1 kernel: drbd r0: create_listen_socket failed, err = -5 Jun 19 10:28:24 drbd1 kernel: drbd r0: conn( WFConnection -> Disconnecting ) Jun 19 10:28:24 drbd1 kernel: drbd r0: Connection closed Jun 19 10:28:24 drbd1 kernel: drbd r0: conn( Disconnecting -> StandAlone ) Jun 19 10:28:24 drbd1 kernel: drbd r0: Not fencing peer, I'm not even Consistent myself. Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): I/O error while writing superblock Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): ext4_journal_check_start:56: Detected aborted journal Jun 19 10:28:24 drbd1 kernel: drbd r0: Not fencing peer, I'm not even Consistent myself. Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0) in ext4_reserve_inode_write:5173: Journal has aborted Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0) in ext4_dirty_inode:5290: Journal has aborted Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0) in ext4_da_write_begin:2718: IO failure Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:28:24 drbd1 kernel: JBD2: Error -5 detected when updating journal superblock for drbd0-8. 
Jun 19 10:28:24 drbd1 kernel: JBD2: Detected IO errors while flushing file data on drbd0-8 Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): __ext4_get_inode_loc:4180: inode #50629208: block 202377413: comm mysqld: unable to read itable block Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): __ext4_get_inode_loc:4180: inode #50629208: block 202377413: comm mysqld: unable to read itable block Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): ext4_find_entry:1312: inode #50331649: comm mysqld: reading directory lblock 0 Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): ext4_find_entry:1312: inode #52428801: comm postgres: reading directory lblock 0 Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): __ext4_get_inode_loc:4180: inode #50629208: block 202377413: comm mysqld: unable to read itable block Jun 19 10:28:24 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): ext4_wait_block_bitmap:497: comm postgres: Cannot read block bitmap - block_group = 1251, block_bitmap = 40894467 Jun 19 10:28:24 drbd1 kernel: EXT4-fs error (device drbd0): ext4_discard_preallocations:4006: comm postgres: Error loading 
buddy information for 1251 Jun 19 10:28:24 drbd1 systemd: postgresql-9.5.service: main process exited, code=killed, status=6/ABRT Jun 19 10:28:24 drbd1 systemd: mongod.service: main process exited, code=exited, status=14/n/a Jun 19 10:28:24 drbd1 pg_ctl: pg_ctl: directory "/store/pgsql" is not a database cluster directory Jun 19 10:28:24 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1 Jun 19 10:28:24 drbd1 systemd: Unit postgresql-9.5.service entered failed state. Jun 19 10:28:24 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:28:24 drbd1 mongod: Stopping mongod: [FAILED] Jun 19 10:28:24 drbd1 systemd: Unit mongod.service entered failed state. Jun 19 10:28:24 drbd1 systemd: mongod.service failed. Jun 19 10:28:25 drbd1 kernel: drbd r0: State change failed: Need a connection to start verify or resync Jun 19 10:28:25 drbd1 kernel: drbd r0: mask = 0x1f0 val = 0x80 Jun 19 10:28:25 drbd1 kernel: drbd r0: old_conn:StandAlone wanted_conn:WFConnection Jun 19 10:28:25 drbd1 kernel: drbd r0: receiver terminated Jun 19 10:28:25 drbd1 kernel: drbd r0: Terminating drbd_r_r0 Jun 19 10:28:29 drbd1 kernel: block drbd0: 729 messages suppressed in /builddir/build/BUILD/drbd-8.4.11-1/drbd/drbd_req.c:1446. 
Jun 19 10:28:29 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1677722552+8 Jun 19 10:28:29 drbd1 kernel: buffer_io_error: 177 callbacks suppressed Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715319, lost async page write Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715482, lost async page write Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715619, lost async page write Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715647, lost async page write Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715726, lost async page write Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715801, lost async page write Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715857, lost async page write Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715879, lost async page write Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715885, lost async page write Jun 19 10:28:29 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 209715898, lost async page write Jun 19 10:28:33 drbd1 ipfail: [8426]: info: Link Status update: Link drbd2.db.com/ens34 now has status dead Jun 19 10:28:33 drbd1 ipfail: [8426]: info: Asking other side for ping node count. Jun 19 10:28:33 drbd1 ipfail: [8426]: info: Checking remote count of ping nodes. Jun 19 10:28:35 drbd1 ipfail: [8426]: info: Ping node count is balanced. Jun 19 10:28:35 drbd1 ipfail: [8426]: info: No giveup timer to abort. Jun 19 10:30:02 drbd1 systemd-logind: Removed session 6994. Jun 19 10:30:09 drbd1 kernel: block drbd0: 41 messages suppressed in /builddir/build/BUILD/drbd-8.4.11-1/drbd/drbd_req.c:1446. 
Jun 19 10:30:09 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 74960+8 Jun 19 10:30:09 drbd1 kernel: EXT4-fs warning: 117 callbacks suppressed Jun 19 10:30:09 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 2, block 0) Jun 19 10:30:32 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 74960+8 Jun 19 10:30:32 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 2, block 0) Jun 19 10:31:04 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 74960+8 Jun 19 10:31:04 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 2, block 0)
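Because the pasted log has many entries run together on each line, it helps to split it back into individual entries and count them by kind before reading it. The following is a rough sketch under two assumptions: every entry starts with a `Mon DD HH:MM:SS host` prefix, and the three categories chosen here are the interesting ones. The names `split_entries` and `tally` are hypothetical.

```python
import re
from collections import Counter

# Zero-width match in front of each "Mon DD HH:MM:SS host" syslog prefix.
# Splitting on an empty match requires Python 3.7 or later.
ENTRY_START = re.compile(r"(?=[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}\s+\S+)")


def split_entries(blob):
    """Split a run-together syslog dump into one string per entry."""
    parts = ENTRY_START.split(blob)
    return [" ".join(p.split()) for p in parts if p.strip()]


def tally(entries):
    """Count entries per rough category (categories chosen for this log)."""
    counts = Counter()
    for entry in entries:
        if "neither local nor remote data" in entry:
            counts["drbd: no local or remote data"] += 1
        elif "Buffer I/O error" in entry:
            counts["buffer I/O error"] += 1
        elif "EXT4-fs" in entry:
            counts["ext4"] += 1
        else:
            counts["other"] += 1
    return counts


# Three entries copied from the Jun 19 log, joined as in the paste.
blob = (
    "Jun 19 10:28:24 drbd1 kernel: drbd r0: Connection closed "
    "Jun 19 10:28:24 drbd1 kernel: block drbd0: IO ERROR: neither local nor "
    "remote data, sector 427170072+1024 "
    "Jun 19 10:28:24 drbd1 kernel: Buffer I/O error on dev drbd0, "
    "logical block 126386176, lost sync page write"
)

entries = split_entries(blob)
for kind, count in tally(entries).items():
    print(kind, count)
```

Note that the kernel rate-limits these messages ("729 messages suppressed"), so counts taken from the log understate the real number of failed requests.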
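3. Because the data was actually being served from the peer's DRBD partition, it became inaccessible as soon as the cable was unplugged, which affected the business. At this point the heartbeat service was stopped, and the resources were then switched over to the other node, drbd2.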
Jun 19 10:31:34 drbd1 systemd: Stopping Heartbeat High Availability Cluster Communication and Membership... Jun 19 10:31:35 drbd1 systemd: Starting MariaDB database server... Jun 19 10:31:35 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1610678528+8 Jun 19 10:31:35 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0) Jun 19 10:31:35 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1610678528+8 Jun 19 10:31:35 drbd1 kernel: EXT4-fs error: 174 callbacks suppressed Jun 19 10:31:35 drbd1 kernel: EXT4-fs error (device drbd0): ext4_find_entry:1312: inode #50331649: comm mariadb-prepare: reading directory lblock 0 Jun 19 10:31:35 drbd1 kernel: EXT4-fs: 176 callbacks suppressed Jun 19 10:31:35 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:31:35 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 0+8 Jun 19 10:31:35 drbd1 kernel: buffer_io_error: 32 callbacks suppressed Jun 19 10:31:35 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:31:35 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done. Jun 19 10:31:35 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir. Jun 19 10:31:35 drbd1 systemd: mariadb.service: control process exited, code=exited status=1 Jun 19 10:31:35 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:35 drbd1 systemd: Unit mariadb.service entered failed state. Jun 19 10:31:35 drbd1 systemd: mariadb.service failed. Jun 19 10:31:35 drbd1 systemd: Starting PostgreSQL 9.5 database server... Jun 19 10:31:35 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty. 
Jun 19 10:31:35 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster. Jun 19 10:31:35 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information. Jun 19 10:31:35 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1 Jun 19 10:31:35 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:35 drbd1 systemd: Unit postgresql-9.5.service entered failed state. Jun 19 10:31:35 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:35 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database.... Jun 19 10:31:35 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1673593088+8 Jun 19 10:31:35 drbd1 kernel: EXT4-fs error (device drbd0): ext4_find_entry:1312: inode #52297729: comm mongod: reading directory lblock 0 Jun 19 10:31:35 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected Jun 19 10:31:35 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 0+8 Jun 19 10:31:35 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write Jun 19 10:31:35 drbd1 mongod: Starting mongod: [FAILED] Jun 19 10:31:35 drbd1 systemd: mongod.service: control process exited, code=exited status=1 Jun 19 10:31:35 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:35 drbd1 systemd: Unit mongod.service entered failed state. Jun 19 10:31:35 drbd1 systemd: mongod.service failed. Jun 19 10:31:36 drbd1 systemd: Starting MariaDB database server... Jun 19 10:31:36 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done. 
Jun 19 10:31:36 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0) Jun 19 10:31:36 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir. Jun 19 10:31:36 drbd1 systemd: mariadb.service: control process exited, code=exited status=1 Jun 19 10:31:36 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:36 drbd1 systemd: Unit mariadb.service entered failed state. Jun 19 10:31:36 drbd1 systemd: mariadb.service failed. Jun 19 10:31:36 drbd1 systemd: Starting PostgreSQL 9.5 database server... Jun 19 10:31:36 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty. Jun 19 10:31:36 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1 Jun 19 10:31:36 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster. Jun 19 10:31:36 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information. Jun 19 10:31:36 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:36 drbd1 systemd: Unit postgresql-9.5.service entered failed state. Jun 19 10:31:36 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:36 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database.... Jun 19 10:31:36 drbd1 mongod: Starting mongod: [FAILED] Jun 19 10:31:36 drbd1 systemd: mongod.service: control process exited, code=exited status=1 Jun 19 10:31:36 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:36 drbd1 systemd: Unit mongod.service entered failed state. Jun 19 10:31:36 drbd1 systemd: mongod.service failed. Jun 19 10:31:37 drbd1 systemd: Starting MariaDB database server... 
Jun 19 10:31:37 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0) Jun 19 10:31:37 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done. Jun 19 10:31:37 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir. Jun 19 10:31:37 drbd1 systemd: mariadb.service: control process exited, code=exited status=1 Jun 19 10:31:37 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:37 drbd1 systemd: Unit mariadb.service entered failed state. Jun 19 10:31:37 drbd1 systemd: mariadb.service failed. Jun 19 10:31:37 drbd1 systemd: Starting PostgreSQL 9.5 database server... Jun 19 10:31:37 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty. Jun 19 10:31:37 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster. Jun 19 10:31:37 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information. Jun 19 10:31:37 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1 Jun 19 10:31:37 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:37 drbd1 systemd: Unit postgresql-9.5.service entered failed state. Jun 19 10:31:37 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:37 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database.... Jun 19 10:31:37 drbd1 mongod: Starting mongod: [FAILED] Jun 19 10:31:37 drbd1 systemd: mongod.service: control process exited, code=exited status=1 Jun 19 10:31:37 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:37 drbd1 systemd: Unit mongod.service entered failed state. Jun 19 10:31:37 drbd1 systemd: mongod.service failed. 
Jun 19 10:31:38 drbd1 systemd: Starting MariaDB database server... Jun 19 10:31:38 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0) Jun 19 10:31:38 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done. Jun 19 10:31:38 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir. Jun 19 10:31:38 drbd1 systemd: mariadb.service: control process exited, code=exited status=1 Jun 19 10:31:38 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:38 drbd1 systemd: Unit mariadb.service entered failed state. Jun 19 10:31:38 drbd1 systemd: mariadb.service failed. Jun 19 10:31:38 drbd1 systemd: Starting PostgreSQL 9.5 database server... Jun 19 10:31:38 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty. Jun 19 10:31:38 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster. Jun 19 10:31:38 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1 Jun 19 10:31:38 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information. Jun 19 10:31:38 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:38 drbd1 systemd: Unit postgresql-9.5.service entered failed state. Jun 19 10:31:38 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:38 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database.... Jun 19 10:31:38 drbd1 mongod: Starting mongod: [FAILED] Jun 19 10:31:38 drbd1 systemd: mongod.service: control process exited, code=exited status=1 Jun 19 10:31:38 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:38 drbd1 systemd: Unit mongod.service entered failed state. 
Jun 19 10:31:38 drbd1 systemd: mongod.service failed. Jun 19 10:31:39 drbd1 systemd: Starting MariaDB database server... Jun 19 10:31:39 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0) Jun 19 10:31:39 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done. Jun 19 10:31:39 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir. Jun 19 10:31:39 drbd1 systemd: mariadb.service: control process exited, code=exited status=1 Jun 19 10:31:39 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:39 drbd1 systemd: Unit mariadb.service entered failed state. Jun 19 10:31:39 drbd1 systemd: mariadb.service failed. Jun 19 10:31:39 drbd1 systemd: Starting PostgreSQL 9.5 database server... Jun 19 10:31:39 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty. Jun 19 10:31:39 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster. Jun 19 10:31:39 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1 Jun 19 10:31:39 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information. Jun 19 10:31:39 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:39 drbd1 systemd: Unit postgresql-9.5.service entered failed state. Jun 19 10:31:39 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:39 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database.... Jun 19 10:31:39 drbd1 mongod: Starting mongod: [FAILED] Jun 19 10:31:39 drbd1 systemd: mongod.service: control process exited, code=exited status=1 Jun 19 10:31:39 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. 
Jun 19 10:31:39 drbd1 systemd: Unit mongod.service entered failed state. Jun 19 10:31:39 drbd1 systemd: mongod.service failed. Jun 19 10:31:40 drbd1 systemd: start request repeated too quickly for mariadb.service Jun 19 10:31:40 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:40 drbd1 systemd: mariadb.service failed. Jun 19 10:31:40 drbd1 systemd: start request repeated too quickly for postgresql-9.5.service Jun 19 10:31:40 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:40 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:40 drbd1 systemd: start request repeated too quickly for mongod.service Jun 19 10:31:40 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:40 drbd1 systemd: mongod.service failed. Jun 19 10:31:41 drbd1 systemd: start request repeated too quickly for mariadb.service Jun 19 10:31:41 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:41 drbd1 systemd: mariadb.service failed. Jun 19 10:31:41 drbd1 systemd: start request repeated too quickly for postgresql-9.5.service Jun 19 10:31:41 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:41 drbd1 systemd: postgresql-9.5.service failed. Jun 19 10:31:41 drbd1 systemd: start request repeated too quickly for mongod.service Jun 19 10:31:41 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database.. Jun 19 10:31:41 drbd1 systemd: mongod.service failed. Jun 19 10:31:43 drbd1 systemd: start request repeated too quickly for mariadb.service Jun 19 10:31:43 drbd1 systemd: Failed to start MariaDB database server. Jun 19 10:31:43 drbd1 systemd: mariadb.service failed. Jun 19 10:31:43 drbd1 systemd: start request repeated too quickly for postgresql-9.5.service Jun 19 10:31:43 drbd1 systemd: Failed to start PostgreSQL 9.5 database server. Jun 19 10:31:43 drbd1 systemd: postgresql-9.5.service failed. 
Jun 19 10:31:43 drbd1 systemd: start request repeated too quickly for mongod.service
Jun 19 10:31:43 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database..
Jun 19 10:31:43 drbd1 systemd: mongod.service failed.
Jun 19 10:31:44 drbd1 systemd: start request repeated too quickly for mariadb.service
Jun 19 10:31:44 drbd1 systemd: Failed to start MariaDB database server.
Jun 19 10:31:44 drbd1 systemd: mariadb.service failed.
Jun 19 10:31:44 drbd1 systemd: start request repeated too quickly for postgresql-9.5.service
Jun 19 10:31:44 drbd1 systemd: Failed to start PostgreSQL 9.5 database server.
Jun 19 10:31:44 drbd1 systemd: postgresql-9.5.service failed.
Jun 19 10:31:44 drbd1 systemd: start request repeated too quickly for mongod.service
Jun 19 10:31:44 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database..
Jun 19 10:31:44 drbd1 systemd: mongod.service failed.
Jun 19 10:31:45 drbd1 systemd: Starting MariaDB database server...
Jun 19 10:31:45 drbd1 kernel: block drbd0: 4 messages suppressed in /builddir/build/BUILD/drbd-8.4.11-1/drbd/drbd_req.c:1446.
Jun 19 10:31:45 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1610678528+8
Jun 19 10:31:45 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0)
Jun 19 10:31:45 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done.
Jun 19 10:31:45 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir.
Jun 19 10:31:45 drbd1 systemd: mariadb.service: control process exited, code=exited status=1
Jun 19 10:31:45 drbd1 systemd: Failed to start MariaDB database server.
Jun 19 10:31:45 drbd1 systemd: Unit mariadb.service entered failed state.
Jun 19 10:31:45 drbd1 systemd: mariadb.service failed.
Jun 19 10:31:45 drbd1 systemd: start request repeated too quickly for postgresql-9.5.service
Jun 19 10:31:45 drbd1 systemd: Failed to start PostgreSQL 9.5 database server.
Jun 19 10:31:45 drbd1 systemd: postgresql-9.5.service failed.
Jun 19 10:31:45 drbd1 systemd: start request repeated too quickly for mongod.service
Jun 19 10:31:45 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database..
Jun 19 10:31:45 drbd1 systemd: mongod.service failed.
Jun 19 10:31:46 drbd1 systemd: Starting MariaDB database server...
Jun 19 10:31:46 drbd1 kernel: block drbd0: IO ERROR: neither local nor remote data, sector 1610678528+8
Jun 19 10:31:46 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0)
Jun 19 10:31:46 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done.
Jun 19 10:31:46 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir.
Jun 19 10:31:46 drbd1 systemd: mariadb.service: control process exited, code=exited status=1
Jun 19 10:31:46 drbd1 systemd: Failed to start MariaDB database server.
Jun 19 10:31:46 drbd1 systemd: Unit mariadb.service entered failed state.
Jun 19 10:31:46 drbd1 systemd: mariadb.service failed.
Jun 19 10:31:46 drbd1 systemd: Starting PostgreSQL 9.5 database server...
Jun 19 10:31:46 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty.
Jun 19 10:31:46 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster.
Jun 19 10:31:46 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information.
Jun 19 10:31:46 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1
Jun 19 10:31:46 drbd1 systemd: Failed to start PostgreSQL 9.5 database server.
Jun 19 10:31:46 drbd1 systemd: Unit postgresql-9.5.service entered failed state.
Jun 19 10:31:46 drbd1 systemd: postgresql-9.5.service failed.
Jun 19 10:31:46 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database....
Jun 19 10:31:46 drbd1 mongod: Starting mongod: [FAILED]
Jun 19 10:31:46 drbd1 systemd: mongod.service: control process exited, code=exited status=1
Jun 19 10:31:46 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database..
Jun 19 10:31:46 drbd1 systemd: Unit mongod.service entered failed state.
Jun 19 10:31:46 drbd1 systemd: mongod.service failed.
Jun 19 10:31:46 drbd1 systemd: Starting MariaDB database server...
Jun 19 10:31:46 drbd1 mariadb-prepare-db-dir: Database MariaDB is not initialized, but the directory /store/mysql is not empty, so initialization cannot be done.
Jun 19 10:31:46 drbd1 kernel: EXT4-fs warning (device drbd0): __ext4_read_dirblock:902: error reading directory block (ino 50331649, block 0)
Jun 19 10:31:46 drbd1 mariadb-prepare-db-dir: Make sure the /store/mysql is empty before running mariadb-prepare-db-dir.
Jun 19 10:31:46 drbd1 systemd: mariadb.service: control process exited, code=exited status=1
Jun 19 10:31:46 drbd1 systemd: Failed to start MariaDB database server.
Jun 19 10:31:46 drbd1 systemd: Unit mariadb.service entered failed state.
Jun 19 10:31:46 drbd1 systemd: mariadb.service failed.
Jun 19 10:31:46 drbd1 systemd: Starting PostgreSQL 9.5 database server...
Jun 19 10:31:46 drbd1 postgresql95-check-db-dir: "/store/pgsql/" is missing or empty.
Jun 19 10:31:46 drbd1 postgresql95-check-db-dir: Use "/usr/pgsql-9.5/bin/postgresql95-setup initdb" to initialize the database cluster.
Jun 19 10:31:46 drbd1 postgresql95-check-db-dir: See /usr/share/doc/postgresql95-9.5.13/README.rpm-dist for more information.
Jun 19 10:31:46 drbd1 systemd: postgresql-9.5.service: control process exited, code=exited status=1
Jun 19 10:31:46 drbd1 systemd: Failed to start PostgreSQL 9.5 database server.
Jun 19 10:31:46 drbd1 systemd: Unit postgresql-9.5.service entered failed state.
Jun 19 10:31:46 drbd1 systemd: postgresql-9.5.service failed.
Jun 19 10:31:46 drbd1 systemd: Starting SYSV: Mongo is a scalable, document-oriented database....
Jun 19 10:31:46 drbd1 mongod: Starting mongod: [FAILED]
Jun 19 10:31:46 drbd1 systemd: mongod.service: control process exited, code=exited status=1
Jun 19 10:31:46 drbd1 systemd: Failed to start SYSV: Mongo is a scalable, document-oriented database..
Jun 19 10:31:46 drbd1 systemd: Unit mongod.service entered failed state.
Jun 19 10:31:46 drbd1 systemd: mongod.service failed.
Jun 19 10:31:48 drbd1 kernel: EXT4-fs error (device drbd0): ext4_wait_block_bitmap:497: comm umount: Cannot read block bitmap - block_group = 5640, block_bitmap = 184549384
Jun 19 10:31:48 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:31:48 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:31:48 drbd1 kernel: EXT4-fs error (device drbd0): ext4_discard_preallocations:4013: comm umount: Error reading block bitmap for 5640
Jun 19 10:31:48 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:31:48 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:31:49 drbd1 kernel: EXT4-fs error (device drbd0): ext4_wait_block_bitmap:497: comm umount: Cannot read block bitmap - block_group = 6142, block_bitmap = 200802318
Jun 19 10:31:49 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:31:49 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:31:49 drbd1 kernel: EXT4-fs error (device drbd0): ext4_discard_preallocations:4013: comm umount: Error reading block bitmap for 6142
Jun 19 10:31:49 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:31:49 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:31:50 drbd1 kernel: EXT4-fs error (device drbd0): ext4_wait_block_bitmap:497: comm umount: Cannot read block bitmap - block_group = 1270, block_bitmap = 41418758
Jun 19 10:31:50 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:31:50 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:31:50 drbd1 kernel: EXT4-fs error (device drbd0): ext4_discard_preallocations:4013: comm umount: Error reading block bitmap for 1270
Jun 19 10:31:50 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:31:50 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:31:50 drbd1 kernel: EXT4-fs error (device drbd0): ext4_wait_block_bitmap:497: comm umount: Cannot read block bitmap - block_group = 1275, block_bitmap = 41418763
Jun 19 10:31:50 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:31:50 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:31:50 drbd1 kernel: EXT4-fs error (device drbd0): ext4_discard_preallocations:4013: comm umount: Error reading block bitmap for 1275
Jun 19 10:31:50 drbd1 kernel: EXT4-fs (drbd0): previous I/O error to superblock detected
Jun 19 10:31:50 drbd1 kernel: Buffer I/O error on dev drbd0, logical block 0, lost sync page write
Jun 19 10:31:50 drbd1 kernel: VFS: Dirty inode writeback failed for block device drbd0 (err=-5).
Jun 19 10:31:50 drbd1 kernel: block drbd0: role( Primary -> Secondary )
Jun 19 10:31:53 drbd1 systemd: Stopped Heartbeat High Availability Cluster Communication and Membership.
Jun 19 10:31:57 drbd1 systemd: Started Heartbeat High Availability Cluster Communication and Membership.
Jun 19 10:31:57 drbd1 systemd: Starting Heartbeat High Availability Cluster Communication and Membership...
Jun 19 10:31:57 drbd1 heartbeat: Jun 19 10:31:57 drbd1.db.com heartbeat: [4628]: WARN: heartbeat: udp port 1112 reserved for service "icp".
Jun 19 10:31:57 drbd1 heartbeat: heartbeat: udpport setting must precede media statementsheartbeat: baudrate setting must precede media statementsJun 19 10:31:57 drbd1.db.com heartbeat: [4628]: info: Pacemaker support: false
Jun 19 10:31:57 drbd1 heartbeat: Jun 19 10:31:57 drbd1.db.com heartbeat: [4628]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
Jun 19 10:31:57 drbd1 heartbeat: Jun 19 10:31:57 drbd1.db.com heartbeat: [4628]: info: **************************
Jun 19 10:31:57 drbd1 heartbeat: Jun 19 10:31:57 drbd1.db.com heartbeat: [4628]: info: Configuration validated. Starting heartbeat 3.0.6
Jun 19 10:32:03 drbd1 ipfail: [4655]: info: Ping node count is balanced.
Jun 19 10:32:20 drbd1 systemd: Stopping DRBD -- please disable. Unless you are NOT using a cluster manager....
Jun 19 10:32:21 drbd1 kernel: drbd r0: Terminating drbd_w_r0
Jun 19 10:32:21 drbd1 kernel: drbd: module cleanup done.
Jun 19 10:32:21 drbd1 drbd: Stopping all DRBD resources: .
Jun 19 10:32:21 drbd1 systemd: Starting DRBD -- please disable. Unless you are NOT using a cluster manager....
Jun 19 10:32:21 drbd1 kernel: Request for unknown module key 'The ELRepo Project (http://elrepo.org): ELRepo.org Secure Boot Key: f365ad3481a7b20e3427b61b2a26635b83fe427b' err -11
Jun 19 10:32:21 drbd1 kernel: drbd: initialized. Version: 8.4.11-1 (api:1/proto:86-101)
Jun 19 10:32:21 drbd1 kernel: drbd: GIT-hash: 66145a308421e9c124ec391a7848ac20203bb03c build by [email protected], 2018-04-26 12:10:42
Jun 19 10:32:21 drbd1 kernel: drbd: registered as block device major 147
Jun 19 10:32:21 drbd1 drbd: Starting DRBD resources: drbd.d/db.res:18: in resource r0, on drbd1.db.com:
Jun 19 10:32:21 drbd1 drbd: IP 192.168.60.54 not found on this host.
Jun 19 10:32:21 drbd1 systemd: drbd.service: main process exited, code=exited, status=20/n/a
Jun 19 10:32:21 drbd1 systemd: Failed to start DRBD -- please disable. Unless you are NOT using a cluster manager..
Jun 19 10:32:21 drbd1 systemd: Unit drbd.service entered failed state.
Jun 19 10:32:21 drbd1 systemd: drbd.service failed.
Jun 19 10:32:28 drbd1 systemd: Starting DRBD -- please disable. Unless you are NOT using a cluster manager....
Jun 19 10:32:28 drbd1 drbd: Starting DRBD resources: drbd.d/db.res:18: in resource r0, on drbd1.db.com:
Jun 19 10:32:28 drbd1 drbd: IP 192.168.60.54 not found on this host.
Jun 19 10:32:28 drbd1 systemd: drbd.service: main process exited, code=exited, status=20/n/a
Jun 19 10:32:28 drbd1 systemd: Failed to start DRBD -- please disable. Unless you are NOT using a cluster manager..
Jun 19 10:32:28 drbd1 systemd: Unit drbd.service entered failed state.
Jun 19 10:32:28 drbd1 systemd: drbd.service failed.
Jun 19 10:47:00 drbd1 kernel: e1000: ens34 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Jun 19 10:47:00 drbd1 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): ens34: link becomes ready
Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.7939] device (ens34): carrier: link connected
Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.7946] device (ens34): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed')
Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.7958] policy: auto-activating connection 'ens34'
Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.7975] device (ens34): Activation: starting connection 'ens34' (94aea789-efb3-ef4c-81b0-e8b18ecc9797)
Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.7978] device (ens34): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.7986] device (ens34): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.8010] device (ens34): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.9347] device (ens34): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.9363] device (ens34): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.9367] device (ens34): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Jun 19 10:47:00 drbd1 NetworkManager[806]: <info> [1529376420.9390] device (ens34): Activation: successful, device activated.
Jun 19 10:47:00 drbd1 dbus[801]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Jun 19 10:47:00 drbd1 systemd: Starting Network Manager Script Dispatcher Service...
Jun 19 10:47:00 drbd1 dbus[801]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Jun 19 10:47:00 drbd1 systemd: Started Network Manager Script Dispatcher Service.
Jun 19 10:47:00 drbd1 nm-dispatcher: req:1 'up' [ens34]: new request (3 scripts)
Jun 19 10:47:00 drbd1 nm-dispatcher: req:1 'up' [ens34]: start running ordered scripts...
Jun 19 10:47:01 drbd1 ipfail: [4655]: info: Link Status update: Link drbd2.db.com/ens34 now has status up
Jun 19 10:48:22 drbd1 systemd: Starting DRBD -- please disable. Unless you are NOT using a cluster manager....
Jun 19 10:48:22 drbd1 drbd: Starting DRBD resources: [
Jun 19 10:48:22 drbd1 drbd: create res: r0
Jun 19 10:48:22 drbd1 drbd: prepare disk: r0
Jun 19 10:48:23 drbd1 kernel: drbd r0: Starting worker thread (from drbdsetup-84 [4920])
Jun 19 10:48:23 drbd1 kernel: block drbd0: disk( Diskless -> Attaching )
Jun 19 10:48:23 drbd1 kernel: drbd r0: Method to ensure write ordering: flush
Jun 19 10:48:23 drbd1 kernel: block drbd0: max BIO size = 1048576
Jun 19 10:48:23 drbd1 kernel: block drbd0: Adjusting my ra_pages to backing device's (32 -> 1024)
Jun 19 10:48:23 drbd1 kernel: block drbd0: drbd_bm_resize called with capacity == 2023943792
Jun 19 10:48:23 drbd1 kernel: block drbd0: resync bitmap: bits=252992974 words=3953016 pages=7721
Jun 19 10:48:23 drbd1 kernel: block drbd0: size = 965 GB (1011971896 KB)
Jun 19 10:48:23 drbd1 kernel: block drbd0: recounting of set bits took additional 5 jiffies
Jun 19 10:48:23 drbd1 kernel: block drbd0: 4948 MB (1266688 bits) marked out-of-sync by on disk bit-map.
Jun 19 10:48:23 drbd1 kernel: block drbd0: disk( Attaching -> UpToDate )
Jun 19 10:48:23 drbd1 kernel: block drbd0: attached to UUIDs A918D4C3621EAB6C:0000000000000000:5A76D6F6AD549605:5A75D6F6AD549605
Jun 19 10:48:23 drbd1 drbd: adjust disk: r0
Jun 19 10:48:23 drbd1 drbd: adjust net: r0
Jun 19 10:48:23 drbd1 drbd: ]
Jun 19 10:48:23 drbd1 kernel: drbd r0: conn( StandAlone -> Unconnected )
Jun 19 10:48:23 drbd1 kernel: drbd r0: Starting receiver thread (from drbd_w_r0 [4921])
Jun 19 10:48:23 drbd1 kernel: drbd r0: receiver (re)started
Jun 19 10:48:23 drbd1 kernel: drbd r0: conn( Unconnected -> WFConnection )
Jun 19 10:48:23 drbd1 drbd: WARN: stdin/stdout is not a TTY; using /dev/consoleoutdated-wfc-timeout has to be shorter than degr-wfc-timeout
Jun 19 10:48:23 drbd1 drbd: outdated-wfc-timeout implicitly set to degr-wfc-timeout (120s)
Jun 19 10:48:23 drbd1 kernel: drbd r0: Handshake successful: Agreed network protocol version 101
Jun 19 10:48:23 drbd1 kernel: drbd r0: Feature flags enabled on protocol level: 0xf TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES.
Jun 19 10:48:23 drbd1 kernel: drbd r0: conn( WFConnection -> WFReportParams )
Jun 19 10:48:23 drbd1 kernel: drbd r0: Starting ack_recv thread (from drbd_r_r0 [4926])
Jun 19 10:48:23 drbd1 kernel: block drbd0: drbd_sync_handshake:
Jun 19 10:48:23 drbd1 kernel: block drbd0: self A918D4C3621EAB6C:0000000000000000:5A76D6F6AD549605:5A75D6F6AD549605 bits:1266688 flags:0
Jun 19 10:48:23 drbd1 kernel: block drbd0: peer C6B3D27C4098E93F:A918D4C3621EAB6C:5A76D6F6AD549604:5A75D6F6AD549605 bits:8839904 flags:0
Jun 19 10:48:23 drbd1 kernel: block drbd0: uuid_compare()=-1 by rule 50
Jun 19 10:48:23 drbd1 kernel: block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
Jun 19 10:48:23 drbd1 drbd: .
Jun 19 10:48:23 drbd1 systemd: Started DRBD -- please disable. Unless you are NOT using a cluster manager..
Jun 19 10:48:23 drbd1 kernel: block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 1081767(265), total 1081767; compression: 96.6%
Jun 19 10:48:23 drbd1 kernel: block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 1041300(255), total 1041300; compression: 96.8%
Jun 19 10:48:23 drbd1 kernel: block drbd0: conn( WFBitMapT -> WFSyncUUID )
Jun 19 10:48:24 drbd1 kernel: block drbd0: updated sync uuid A919D4C3621EAB6C:0000000000000000:5A76D6F6AD549605:5A75D6F6AD549605
Jun 19 10:48:24 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
Jun 19 10:48:24 drbd1 kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
Jun 19 10:48:24 drbd1 kernel: block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
Jun 19 10:48:24 drbd1 kernel: block drbd0: Began resync as SyncTarget (will sync 39533428 KB [9883357 bits set]).
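With this many entries, the DRBD state changes are easy to lose among the service restart failures. The following is a minimal sketch, not part of the original troubleshooting, that pulls only the `role( A -> B )`, `peer(...)`, `conn(...)`, `disk(...)` and `pdsk(...)` transitions out of the message log. The names `drbd_transitions` and `SAMPLE` are hypothetical, and the sketch assumes every entry starts with a syslog timestamp of the form `Jun 19 10:48:23`, as in the excerpts above.

```python
import re

# Syslog timestamp that starts every entry, e.g. "Jun 19 10:48:23" (15 characters).
TIMESTAMP = re.compile(r"[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d")
# DRBD kernel state change, e.g. "disk( UpToDate -> Outdated )".
TRANSITION = re.compile(r"\b(role|peer|conn|disk|pdsk)\( (\w+) -> (\w+) \)")


def drbd_transitions(log_text):
    """Return (timestamp, field, old_state, new_state) tuples in log order."""
    # Slice the text at each timestamp so that entries run together on one
    # line are handled the same way as entries on separate lines.
    starts = [m.start() for m in TIMESTAMP.finditer(log_text)]
    events = []
    for begin, end in zip(starts, starts[1:] + [len(log_text)]):
        entry = log_text[begin:end]
        stamp = entry[:15]
        for field, old, new in TRANSITION.findall(entry):
            events.append((stamp, field, old, new))
    return events


# Three entries copied from the log above, deliberately joined on one line.
SAMPLE = (
    "Jun 19 10:31:50 drbd1 kernel: block drbd0: role( Primary -> Secondary ) "
    "Jun 19 10:48:23 drbd1 kernel: block drbd0: disk( Diskless -> Attaching ) "
    "Jun 19 10:48:24 drbd1 kernel: block drbd0: conn( WFSyncUUID -> SyncTarget ) "
    "disk( Outdated -> Inconsistent )"
)

for event in drbd_transitions(SAMPLE):
    print(*event)
```

Run against the full message log, this kind of filter would show the sequence Diskless, Attaching, UpToDate, Outdated, Inconsistent for the local disk on drbd1 without the surrounding noise.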
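4. Finally, I synced the data from drbd2 back to the original primary server, and the resources were then switched back to it as well. I do not know whether the original primary will hit I/O errors again.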
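Since the local disk had already been Diskless for several days before the cable was pulled, one way to catch a repeat early is to check the `ds:` field of `/proc/drbd` periodically. The sketch below is only an illustration under assumptions: `degraded_devices` and `SAMPLE` are hypothetical names, the sample text is made up to mimic the DRBD 8.4 status line format (`N: cs:... ro:.../... ds:.../...`), and minors that are not fully configured (no `ro:`/`ds:` fields) are simply ignored.

```python
import re

# One status line per minor in DRBD 8.4's /proc/drbd, for example:
#  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
STATUS = re.compile(
    r"^\s*(\d+): cs:(\S+) ro:(\S+?)/(\S+) ds:(\S+?)/(\S+)", re.MULTILINE
)


def degraded_devices(proc_drbd_text):
    """Return (minor, problem) pairs for disks that are not UpToDate."""
    problems = []
    for minor, _cs, _ro, _peer_ro, ds, peer_ds in STATUS.findall(proc_drbd_text):
        if ds != "UpToDate":
            problems.append((int(minor), f"local disk is {ds}"))
        if peer_ds != "UpToDate":
            problems.append((int(minor), f"peer disk is {peer_ds}"))
    return problems


# Illustrative text only (not captured from the servers in this article):
# the state the primary was in before the cable was pulled.
SAMPLE = (
    "version: 8.4.11-1 (api:1/proto:86-101)\n"
    " 0: cs:Connected ro:Primary/Secondary ds:Diskless/UpToDate C r-----\n"
)

print(degraded_devices(SAMPLE))
```

On a real node the input would be the contents of `/proc/drbd`, for example `open("/proc/drbd").read()`, and any non-empty result could be sent to whatever alerting is already in place.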
For comparison, in a virtual machine test environment I cut the 60-subnet link; the logs below show DRBD behaving normally:
message log:
Jun 19 11:18:02 drbd1 kernel: e1000: eth1 NIC Link is Down
Jun 19 11:18:03 drbd1 kernel: d-con r0: PingAck did not arrive in time.
Jun 19 11:18:03 drbd1 kernel: d-con r0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Jun 19 11:18:03 drbd1 kernel: d-con r0: asender terminated
Jun 19 11:18:03 drbd1 kernel: d-con r0: Terminating drbd_a_r0
Jun 19 11:18:03 drbd1 kernel: block drbd0: new current UUID 5BEECA7E9782384F:457559D613869C63:90760DB10A2C370F:90750DB10A2C370F
Jun 19 11:18:03 drbd1 kernel: d-con r0: Connection closed
Jun 19 11:18:03 drbd1 kernel: d-con r0: out of mem, failed to invoke fence-peer helper
Jun 19 11:18:03 drbd1 kernel: d-con r0: conn( NetworkFailure -> Unconnected )
Jun 19 11:18:03 drbd1 kernel: d-con r0: receiver terminated
Jun 19 11:18:03 drbd1 kernel: d-con r0: Restarting receiver thread
Jun 19 11:18:03 drbd1 kernel: d-con r0: receiver (re)started
Jun 19 11:18:03 drbd1 kernel: d-con r0: conn( Unconnected -> WFConnection )
Jun 19 11:18:30 drbd1 ipfail: [1729]: info: Link Status update: Link drbd2.gxm.com/eth1 now has status dead
Jun 19 11:18:31 drbd1 ipfail: [1729]: info: Asking other side for ping node count.
Jun 19 11:18:31 drbd1 ipfail: [1729]: info: Checking remote count of ping nodes.
Jun 19 11:18:33 drbd1 ipfail: [1729]: info: Ping node count is balanced.
Jun 19 11:18:34 drbd1 ipfail: [1729]: info: No giveup timer to abort.
ha-debug log:
Jun 19 11:18:30 drbd1.gxm.com heartbeat: [1680]: info: Link drbd2.gxm.com:eth1 dead.
Jun 19 11:18:30 drbd1.gxm.com ipfail: [1729]: info: Link Status update: Link drbd2.gxm.com/eth1 now has status dead
Jun 19 11:18:30 drbd1.gxm.com ipfail: [1729]: debug: Found ping node 192.168.1.1!
Jun 19 11:18:31 drbd1.gxm.com ipfail: [1729]: info: Asking other side for ping node count.
Jun 19 11:18:31 drbd1.gxm.com ipfail: [1729]: debug: Message [num_ping] sent.
Jun 19 11:18:31 drbd1.gxm.com ipfail: [1729]: info: Checking remote count of ping nodes.
Jun 19 11:18:32 drbd1.gxm.com ipfail: [1729]: debug: Got asked for num_ping.
Jun 19 11:18:32 drbd1.gxm.com ipfail: [1729]: debug: Found ping node 192.168.1.1!
Jun 19 11:18:33 drbd1.gxm.com ipfail: [1729]: info: Ping node count is balanced.
Jun 19 11:18:33 drbd1.gxm.com ipfail: [1729]: debug: Abort message sent.
Jun 19 11:18:34 drbd1.gxm.com ipfail: [1729]: info: No giveup timer to abort
Original article: http://blog.51cto.com/net881004/2130642