[MongoDB] Replica Set Automatic Failover

A replica set gives us automatic failover, handled by MongoDB itself: it picks the new primary based on the secondaries' priority or data freshness (that is, the node whose data is most up to date with the old primary), and when the former primary comes back up it rejoins as a secondary and applies the oplog from the new primary.
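If you want a particular member to win elections, raise its priority. A minimal mongo shell sketch (run on the current primary; the member index here is only illustrative, check rs.conf() to find the host you actually want to favor):

PRIMARY> cfg = rs.conf()                 // fetch the current replica set configuration
PRIMARY> cfg.members[0].priority = 2     // favor this member in elections (default priority is 1)
PRIMARY> rs.reconfig(cfg)                // apply the new configuration; this may trigger a brief election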

A complete replica set

The primary goes down

MongoDB elects the next primary based on data freshness


Continuing from the previous article, with the replica set in place, check the status of the two instances on ports 27018 and 27020:

[mongodb@rac4 bin]$ ./mongo 127.0.0.1:27018

MongoDB shell version: 2.0.1

connecting to: 127.0.0.1:27018/test

PRIMARY> db.isMaster();

{

"setName" : "myset",

"ismaster" : true,  --为主库

"secondary" : false,

"hosts" : [

"10.250.7.220:27018",

"10.250.7.220:27020",

"10.250.7.220:27019"

],

"primary" : "10.250.7.220:27018",

"me" : "10.250.7.220:27018",

"maxBsonObjectSize" : 16777216,

"ok" : 1

}

PRIMARY> exit

bye

[mongodb@rac4 bin]$ ./mongo 127.0.0.1:27020

MongoDB shell version: 2.0.1

connecting to: 127.0.0.1:27020/test

SECONDARY>

SECONDARY> db.isMaster();

{

"setName" : "myset",

"ismaster" : false,

   "secondary" : true, --为从库

"hosts" : [

"10.250.7.220:27020",

"10.250.7.220:27019",

"10.250.7.220:27018"

],

"primary" : "10.250.7.220:27018",

"me" : "10.250.7.220:27020",

"maxBsonObjectSize" : 16777216,

"ok" : 1

}
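db.isMaster() only describes the node you are connected to. rs.status(), run from any member, reports the state of every member of the set; a quick sketch against the set above, which should print something like:

SECONDARY> rs.status().members.forEach(function (m) { print(m.name + "  " + m.stateStr); })
10.250.7.220:27018  PRIMARY
10.250.7.220:27019  SECONDARY
10.250.7.220:27020  SECONDARY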

Manually kill the primary:

[root@rac4 ~]# ps -ef | grep 27018

mongodb  14826 14794  1 20:24 pts/4    00:00:05 ./mongod --dbpath /opt/mongodata/r1 --port 27018 --replSet myset --rest

mongodb  14999 14430  0 20:28 pts/2    00:00:00 ./mongo 127.0.0.1:27018

[root@rac4 ~]# kill -9 14826 14794

[root@rac4 ~]# ps -ef | grep mongodb | grep -v root

mongodb  14883 14853  1 20:26 pts/7    00:00:05 ./mongod --dbpath /opt/mongodata/r2 --port 27019 --replSet myset --rest

mongodb  14901 14548  1 20:27 pts/6    00:00:07 ./mongod --dbpath /opt/mongodata/r3 --port 27020 --replSet myset --rest

mongodb  14999 14430  0 20:28 pts/2    00:00:00 ./mongo 127.0.0.1:27018

mongodb  15102 15072  0 20:30 pts/5    00:00:00 ./mongo 127.0.0.1:27019

mongodb  15136 15106  0 20:30 pts/8    00:00:00 ./mongo 127.0.0.1:27020

[root@rac4 ~]#
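The kill -9 above simulates a hard crash. If you just want to force a failover in a controlled way, the primary can also be asked to step down from the mongo shell; a small aside (the argument is how many seconds the node stays ineligible for re-election):

PRIMARY> rs.stepDown(60)    // the primary demotes itself to SECONDARY and the remaining members elect a new primary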

The log of the mongod on port 27019: it records the initial sync and, after 27018 is killed, the election of 10.250.7.220:27020 as the new primary.

Mon Oct 31 20:27:59 [FileAllocator] allocating new datafile /opt/mongodata/r2/local.2, filling with zeroes...

Mon Oct 31 20:27:59 [rsHealthPoll] replSet info member 10.250.7.220:27018 is up

Mon Oct 31 20:27:59 [rsHealthPoll] replSet member 10.250.7.220:27018 is now in state SECONDARY

Mon Oct 31 20:27:59 [rsHealthPoll] replSet info 10.250.7.220:27020 is down (or slow to respond): still initializing

Mon Oct 31 20:27:59 [rsHealthPoll] replSet member 10.250.7.220:27020 is now in state DOWN

Mon Oct 31 20:28:01 [initandlisten] connection accepted from 10.250.7.220:10857 #3

Mon Oct 31 20:28:05 [conn2] replSet RECOVERING

Mon Oct 31 20:28:05 [conn2] replSet info voting yea for 10.250.7.220:27018 (0)

Mon Oct 31 20:28:07 [rsHealthPoll] replSet member 10.250.7.220:27018 is now in state PRIMARY

Mon Oct 31 20:28:09 [FileAllocator] done allocating datafile /opt/mongodata/r2/local.2, size: 1024MB,  took 10.89 secs

Mon Oct 31 20:28:10 [rsSync] ******

Mon Oct 31 20:28:10 [rsSync] replSet initial sync pending

Mon Oct 31 20:28:10 [rsSync] replSet syncing to: 10.250.7.220:27018

Mon Oct 31 20:28:10 [rsSync] build index local.me { _id: 1 }

Mon Oct 31 20:28:10 [rsSync] build index done 0 records 0.001 secs

Mon Oct 31 20:28:10 [rsSync] replSet initial sync drop all databases

Mon Oct 31 20:28:10 [rsSync] dropAllDatabasesExceptLocal 1

Mon Oct 31 20:28:10 [rsSync] replSet initial sync clone all databases

Mon Oct 31 20:28:10 [rsSync] replSet initial sync query minValid

Mon Oct 31 20:28:10 [rsSync] replSet initial oplog application from 10.250.7.220:27018 starting at Oct 31 20:27:53:1 to Oct 31 20:27:53:1

Mon Oct 31 20:28:13 [rsHealthPoll] replSet info member 10.250.7.220:27020 is up

Mon Oct 31 20:28:13 [rsHealthPoll] replSet member 10.250.7.220:27020 is now in state STARTUP2

Mon Oct 31 20:28:14 [rsSync] replSet initial sync finishing up

Mon Oct 31 20:28:14 [rsSync] replSet set minValid=4eae9449:1

Mon Oct 31 20:28:14 [rsSync] build index local.replset.minvalid { _id: 1 }

Mon Oct 31 20:28:14 [rsSync] build index done 0 records 0.005 secs

Mon Oct 31 20:28:14 [rsSync] replSet initial sync done

Mon Oct 31 20:28:15 [rsSync] replSet syncing to: 10.250.7.220:27018

Mon Oct 31 20:28:15 [rsSync] replSet SECONDARY

Mon Oct 31 20:28:15 [rsHealthPoll] replSet member 10.250.7.220:27020 is now in state RECOVERING

Mon Oct 31 20:28:26 [clientcursormon] mem (MB) res:16 virt:2677 mapped:1232

Mon Oct 31 20:28:52 [initandlisten] connection accepted from 10.250.7.220:10872 #4

Mon Oct 31 20:28:52 [initandlisten] connection accepted from 10.250.7.220:10873 #5

Mon Oct 31 20:28:52 [rsGhostSync] handshake between 2 and 10.250.7.220:27018

Mon Oct 31 20:28:53 [slaveTracking] build index local.slaves { _id: 1 }

Mon Oct 31 20:28:53 [slaveTracking] build index done 0 records 0.003 secs

Mon Oct 31 20:28:55 [conn5] end connection 10.250.7.220:10873

Mon Oct 31 20:28:55 [conn4] end connection 10.250.7.220:10872

Mon Oct 31 20:28:57 [rsHealthPoll] replSet member 10.250.7.220:27020 is now in state SECONDARY

Mon Oct 31 20:29:27 [clientcursormon] mem (MB) res:19 virt:2693 mapped:1232

Mon Oct 31 20:30:21 [initandlisten] connection accepted from 127.0.0.1:44672 #6

Mon Oct 31 20:33:35 [conn2] end connection 10.250.7.220:42493

Mon Oct 31 20:33:35 [rsSync] replSet syncThread: 10278 dbclient error communicating with server: 10.250.7.220:27018

Mon Oct 31 20:33:35 [rsHealthPoll] DBClientCursor::init call() failed

Mon Oct 31 20:33:35 [rsHealthPoll] replSet info 10.250.7.220:27018 is down (or slow to respond): DBClientBase::findN: transport error: 10.250.7.220:27018 query: { replSetHeartbeat: "myset", v: 1, pv: 1, checkEmpty: false, from: "10.250.7.220:27019" }

Mon Oct 31 20:33:35 [rsHealthPoll] replSet member 10.250.7.220:27018 is now in state DOWN

Mon Oct 31 20:33:35 [rsMgr] not electing self, 10.250.7.220:27020 would veto

Mon Oct 31 20:33:36 [conn3] replSet info voting yea for 10.250.7.220:27020 (2)

Mon Oct 31 20:33:37 [rsHealthPoll] replSet member 10.250.7.220:27020 is now in state PRIMARY

Mon Oct 31 20:33:46 [rsSync] replSet syncing to: 10.250.7.220:27020

Mon Oct 31 20:34:27 [clientcursormon] mem (MB) res:19 virt:2693 mapped:1232

The log of the mongod on port 27020: it detects that 27018 is down and elects itself (10.250.7.220:27020) as the new primary.

Mon Oct 31 20:33:35 [rsSync] replSet syncThread: 10278 dbclient error communicating with server: 10.250.7.220:27018

Mon Oct 31 20:33:36 [rsHealthPoll] DBClientCursor::init call() failed

Mon Oct 31 20:33:36 [rsHealthPoll] replSet info 10.250.7.220:27018 is down (or slow to respond): DBClientBase::findN: transport error: 10.250.7.220:27018 query: { replSetHeartbeat: "myset", v: 1, pv: 1, checkEmpty: false, from: "10.250.7.220:27020" }

Mon Oct 31 20:33:36 [rsHealthPoll] replSet member 10.250.7.220:27018 is now in state DOWN

Mon Oct 31 20:33:36 [rsMgr] replSet info electSelf 2

Mon Oct 31 20:33:36 [rsMgr] replSet PRIMARY

Mon Oct 31 20:33:46 [initandlisten] connection accepted from 10.250.7.220:37261 #5

Mon Oct 31 20:33:47 [slaveTracking] build index local.slaves { _id: 1 }

Mon Oct 31 20:33:47 [slaveTracking] build index done 0 records 0.001 secs

Mon Oct 31 20:33:48 [clientcursormon] mem (MB) res:19 virt:2692 mapped:1232

Mon Oct 31 20:34:35 [conn4] end connection 127.0.0.1:17500

Mon Oct 31 20:34:37 [initandlisten] connection accepted from 127.0.0.1:36525 #6

Connect to the new primary and check:

[mongodb@rac4 bin]$ ./mongo 127.0.0.1:27020

MongoDB shell version: 2.0.1

connecting to: 127.0.0.1:27020/test

PRIMARY>

PRIMARY>

PRIMARY> db.isMaster();

{

"setName" : "myset",

"ismaster" : true,--成为主库master

"secondary" : false,

"hosts" : [

"10.250.7.220:27020",

"10.250.7.220:27019",

"10.250.7.220:27018"

],

"primary" : "10.250.7.220:27020",

"me" : "10.250.7.220:27020",

"maxBsonObjectSize" : 16777216,

"ok" : 1

}

PRIMARY>

Restart the mongod on port 27018; its log shows the journal recovery it performs and that it rejoins the set as a secondary:

[mongodb@rac4 bin]$ ./mongod --dbpath /opt/mongodata/r1 --port 27018 --rest --replSet myset &

[1] 16290

[mongodb@rac4 bin]$ Mon Oct 31 20:48:32 [initandlisten] MongoDB starting : pid=16290 port=27018 dbpath=/opt/mongodata/r1 64-bit host=rac4

Mon Oct 31 20:48:32 [initandlisten] db version v2.0.1, pdfile version 4.5

Mon Oct 31 20:48:32 [initandlisten] git version: 3a5cf0e2134a830d38d2d1aae7e88cac31bdd684

Mon Oct 31 20:48:32 [initandlisten] build info: Linux bs-linux64.10gen.cc 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41

Mon Oct 31 20:48:32 [initandlisten] options: { dbpath: "/opt/mongodata/r1", port: 27018, replSet: "myset", rest: true }

Mon Oct 31 20:48:32 [initandlisten] journal dir=/opt/mongodata/r1/journal

Mon Oct 31 20:48:32 [initandlisten] recover begin

Mon Oct 31 20:48:32 [initandlisten] recover lsn: 231055

Mon Oct 31 20:48:32 [initandlisten] recover /opt/mongodata/r1/journal/j._0

Mon Oct 31 20:48:32 [initandlisten] recover skipping application of section seq:198962 < lsn:231055

Mon Oct 31 20:48:32 [initandlisten] recover cleaning up

Mon Oct 31 20:48:32 [initandlisten] removeJournalFiles

Mon Oct 31 20:48:32 [initandlisten] recover done

Mon Oct 31 20:48:32 [initandlisten] waiting for connections on port 27018

Mon Oct 31 20:48:32 [websvr] admin web console waiting for connections on port 28018

Mon Oct 31 20:48:32 [initandlisten] connection accepted from 127.0.0.1:11930 #1

Mon Oct 31 20:48:32 [rsStart] replSet STARTUP2

Mon Oct 31 20:48:32 [rsHealthPoll] replSet info member 10.250.7.220:27019 is up

Mon Oct 31 20:48:32 [rsHealthPoll] replSet member 10.250.7.220:27019 is now in state SECONDARY

Mon Oct 31 20:48:32 [rsHealthPoll] replSet info member 10.250.7.220:27020 is up

Mon Oct 31 20:48:32 [rsHealthPoll] replSet member 10.250.7.220:27020 is now in state PRIMARY

Mon Oct 31 20:48:32 [rsSync] replSet SECONDARY

Mon Oct 31 20:48:33 [initandlisten] connection accepted from 10.250.7.220:35971 #2

Mon Oct 31 20:48:34 [initandlisten] connection accepted from 10.250.7.220:35972 #3

Mon Oct 31 20:48:36 [rsSync] replSet syncing to: 10.250.7.220:27020

Mon Oct 31 20:48:36 [rsSync] build index local.me { _id: 1 }

Mon Oct 31 20:48:36 [rsSync] build index done 0 records 0 secs

[mongodb@rac4 bin]$ ./mongo 127.0.0.1:27018

MongoDB shell version: 2.0.1

connecting to: 127.0.0.1:27018/test

SECONDARY>

SECONDARY> db.isMaster();

{

"setName" : "myset",

"ismaster" : false,   --端口为 27018的数据库服务变为从库

"secondary" : true,

"hosts" : [

"10.250.7.220:27018",

"10.250.7.220:27020",

"10.250.7.220:27019"

],

"primary" : "10.250.7.220:27020",

"me" : "10.250.7.220:27018",

"maxBsonObjectSize" : 16777216,

"ok" : 1

}

SECONDARY>
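For applications the failover is only transparent if the driver knows about the whole set: give it the full seed list and the replica set name rather than a single host. A sketch of such a connection string for the set used here (the database name test matches the shell sessions above):

mongodb://10.250.7.220:27018,10.250.7.220:27019,10.250.7.220:27020/test?replicaSet=myset

When the primary changes, the driver rediscovers the new primary through whichever seed it can still reach and redirects writes to it.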


通过Keepalived实现Redis Failover自动故障切换功能[实践分享] 参考资料: http://patrick-tang.blogspot.com/2012/06/redis-keepalived-failover-system.html http://deidara.blog.51cto.com/400447/302402 目前,Redis还没有一个类似于MySQL Proxy或Oracle RAC的官方HA方案.Redis作者有一个名为Redis Sentinel的计划(ht