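Overview of MongoDB Replica Sets
A replica set keeps additional copies of the data by synchronizing it across multiple servers. This provides redundancy and increases data availability, so the service can recover from hardware failures and interruptions.
How a Replica Set Works
- A MongoDB replica set needs at least two nodes to provide redundancy (three members is the usual recommended minimum, so that a majority can still elect a primary when one member fails). One node is the primary, which handles client requests; the rest are secondaries, which replicate the data on the primary.
- Common layouts are one primary with one secondary, or one primary with several secondaries. The primary records every operation it performs in the oplog. The secondaries poll the primary periodically to fetch these operations and then apply them to their own copy of the data, which keeps the secondaries consistent with the primary.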
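The oplog mechanism described above can be pictured with a small toy model. This is only a sketch of the idea, not MongoDB's implementation: the class names, the integer timestamps and the two operation types are assumptions made for illustration.

```javascript
// Toy model of oplog-based replication (not MongoDB's real implementation).
class Primary {
  constructor() {
    this.data = new Map();
    this.oplog = []; // ordered list of { ts, op, key, value }
  }
  write(key, value) {
    this.data.set(key, value);
    this.oplog.push({ ts: this.oplog.length + 1, op: "set", key, value });
  }
  remove(key) {
    this.data.delete(key);
    this.oplog.push({ ts: this.oplog.length + 1, op: "delete", key });
  }
  // Entries newer than the timestamp the secondary has already applied.
  entriesAfter(ts) {
    return this.oplog.filter((entry) => entry.ts > ts);
  }
}

class Secondary {
  constructor() {
    this.data = new Map();
    this.lastAppliedTs = 0;
  }
  // One polling round: fetch the new entries and replay them in order.
  syncFrom(primary) {
    for (const entry of primary.entriesAfter(this.lastAppliedTs)) {
      if (entry.op === "set") this.data.set(entry.key, entry.value);
      else if (entry.op === "delete") this.data.delete(entry.key);
      this.lastAppliedTs = entry.ts;
    }
  }
}

const primary = new Primary();
const secondary = new Secondary();
primary.write("a", 1);
primary.write("b", 2);
secondary.syncFrom(primary); // first polling round
primary.remove("a");
secondary.syncFrom(primary); // second round picks up only the delete
console.log([...secondary.data]);
```

Because the secondary remembers the last timestamp it applied, each polling round only replays what is new, which is why a secondary that was offline for a while can catch up as long as the entries it needs are still in the oplog.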
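Replica Set Features
- A cluster of N nodes
- Any node can act as the primary
- All write operations go to the primary
- Automatic failover
- Automatic recovery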
Deploying a MongoDB Replica Set
1. Configure the replica set
(1) Create the data directories and log files
# mkdir -p /data/mongodb/mongodb{2,3,4}
# cd /data/mongodb/
# mkdir logs
# touch logs/mongodb{2,3,4}.log
# cd logs/
# ls
mongodb2.log mongodb3.log mongodb4.log
# chmod 777 *.log
(2) Edit the configuration files of the four MongoDB instances
First edit the MongoDB configuration file and set the replSetName parameter (the replSet option) to kgcrs, then make three copies of it (mongod2.conf, mongod3.conf and mongod4.conf), as follows:
# vim /etc/mongod.conf
systemLog:
  ....
  path: /var/log/mongodb/mongod.log
# Where and how to store data.
storage:
  dbPath: /var/lib/mongo
  journal:
    enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:
# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
# network interfaces
net:
  port: 27017
  bindIp: 0.0.0.0  # 0.0.0.0 listens on all interfaces; 127.0.0.1 would listen on the local interface only
#security:
#operationProfiling:
replication:
  replSetName: kgcrs
#sharding:
## Enterprise-Only Options
#auditLog:
#snmp:
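Then set port to 27018 in mongod2.conf, 27019 in mongod3.conf and 27020 in mongod4.conf. Also change dbPath (under storage) and path (under systemLog) in each file to the matching data directory and log file created in step (1).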
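The per-instance differences can be listed with a short helper. This is a sketch under the naming used in this article; the function name instanceSettings is hypothetical, and the port formula simply encodes the 27018 to 27020 mapping given above.

```javascript
// Hypothetical helper: derive the values that differ between mongod2.conf,
// mongod3.conf and mongod4.conf, following the paths used in this article.
function instanceSettings(n) {
  // n is the instance suffix: 2, 3 or 4.
  return {
    configFile: `/etc/mongod${n}.conf`,
    port: 27016 + n, // 2 -> 27018, 3 -> 27019, 4 -> 27020
    dbPath: `/data/mongodb/mongodb${n}`,
    logPath: `/data/mongodb/logs/mongodb${n}.log`,
  };
}

const overrides = [2, 3, 4].map(instanceSettings);
for (const o of overrides) {
  console.log(`${o.configFile}: port=${o.port} dbPath=${o.dbPath} path=${o.logPath}`);
}
```

Everything else in the copied files, including replSetName, stays identical so that all four instances join the same replica set.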
(3) Start the four MongoDB instances and check the processes
# mongod -f /etc/mongod.conf --shutdown //shut down first//
# mongod -f /etc/mongod.conf //then start again//
# mongod -f /etc/mongod2.conf
# mongod -f /etc/mongod3.conf
# mongod -f /etc/mongod4.conf
# netstat -ntap | grep mongod
tcp 0 0 0.0.0.0:27019 0.0.0.0:* LISTEN 17868/mongod
tcp 0 0 0.0.0.0:27020 0.0.0.0:* LISTEN 17896/mongod
tcp 0 0 0.0.0.0:27017 0.0.0.0:* LISTEN 17116/mongod
tcp 0 0 0.0.0.0:27018 0.0.0.0:* LISTEN 17413/mongod
(4) Configure a three-node replica set
# mongo
> rs.status() //check the replica set status//
{
"info" : "run rs.initiate(...) if not yet done for the set",
"ok" : 0,
"errmsg" : "no replset config has been received",
"code" : 94,
"codeName" : "NotYetInitialized",
"$clusterTime" : {
"clusterTime" : Timestamp(0, 0),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
> cfg={"_id":"kgcrs","members":[{"_id":0,"host":"192.168.126.132:27017"},{"_id":1,"host":"192.168.126.132:27018"},{"_id":2,"host":"192.168.126.132:27019"}]} //define the replica set configuration//
{
"_id" : "kgcrs",
"members" : [
{
"_id" : 0,
"host" : "192.168.126.132:27017"
},
{
"_id" : 1,
"host" : "192.168.126.132:27018"
},
{
"_id" : 2,
"host" : "192.168.126.132:27019"
}
]
}
> rs.initiate(cfg) //when initializing, make sure the secondaries hold no data//
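Typing the configuration document by hand is error-prone once there are several members. A minimal sketch of building it programmatically is shown here; buildReplicaSetConfig is a hypothetical helper name, and the result has the same shape as the cfg value above.

```javascript
// Hypothetical helper that assembles the document passed to rs.initiate().
function buildReplicaSetConfig(setName, host, ports) {
  return {
    _id: setName, // must match replSetName in every mongod configuration file
    members: ports.map((port, index) => ({ _id: index, host: `${host}:${port}` })),
  };
}

const cfg = buildReplicaSetConfig("kgcrs", "192.168.126.132", [27017, 27018, 27019]);
console.log(JSON.stringify(cfg, null, 2));
```

Since the mongo shell is a JavaScript interpreter, a function like this can be pasted into the shell and its result handed to rs.initiate().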
(5) Check the replica set status
After the replica set has been initiated, run rs.status() again to see its full status.
kgcrs:SECONDARY> rs.status()
{
"set" : "kgcrs",
"date" : ISODate("2018-07-17T07:18:52.047Z"),
"myState" : 1,
"term" : NumberLong(1),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1531811928, 1),
"t" : NumberLong(1)
},
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1531811928, 1),
"t" : NumberLong(1)
},
"appliedOpTime" : {
"ts" : Timestamp(1531811928, 1),
"t" : NumberLong(1)
},
"durableOpTime" : {
"ts" : Timestamp(1531811928, 1),
"t" : NumberLong(1)
}
},
"members" : [
{
"_id" : 0,
"name" : "192.168.126.132:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY", //primary//
"uptime" : 2855,
"optime" : {
"ts" : Timestamp(1531811928, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2018-07-17T07:18:48Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "could not find member to sync from",
"electionTime" : Timestamp(1531811847, 1),
"electionDate" : ISODate("2018-07-17T07:17:27Z"),
"configVersion" : 1,
"self" : true,
"lastHeartbeatMessage" : ""
},
{
"_id" : 1,
"name" : "192.168.126.132:27018",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY", //secondary//
"uptime" : 95,
"optime" : {
"ts" : Timestamp(1531811928, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1531811928, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2018-07-17T07:18:48Z"),
"optimeDurableDate" : ISODate("2018-07-17T07:18:48Z"),
"lastHeartbeat" : ISODate("2018-07-17T07:18:51.208Z"),
"lastHeartbeatRecv" : ISODate("2018-07-17T07:18:51.720Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "192.168.126.132:27017",
"syncSourceHost" : "192.168.126.132:27017",
"syncSourceId" : 0,
"infoMessage" : "",
"configVersion" : 1
},
{
"_id" : 2,
"name" : "192.168.126.132:27019",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY", //secondary//
"uptime" : 95,
"optime" : {
"ts" : Timestamp(1531811928, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1531811928, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2018-07-17T07:18:48Z"),
"optimeDurableDate" : ISODate("2018-07-17T07:18:48Z"),
"lastHeartbeat" : ISODate("2018-07-17T07:18:51.208Z"),
"lastHeartbeatRecv" : ISODate("2018-07-17T07:18:51.822Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "192.168.126.132:27017",
"syncSourceHost" : "192.168.126.132:27017",
"syncSourceId" : 0,
"infoMessage" : "",
"configVersion" : 1
}
],
"ok" : 1,
"operationTime" : Timestamp(1531811928, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1531811928, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
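In this output, health 1 means the member is healthy and 0 means it is down; state 1 marks the primary and state 2 a secondary.
When the replica set is first initialized, the secondaries must not contain any data.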
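The health and state codes can be turned into a one-line summary per member. The sketch below works on a trimmed copy of the members array from the output above; the summarize function and the STATE_NAMES table are illustrative names, and the table lists only some of the state codes MongoDB defines.

```javascript
// Some of the member state codes reported by rs.status().
const STATE_NAMES = { 1: "PRIMARY", 2: "SECONDARY", 3: "RECOVERING", 7: "ARBITER", 8: "DOWN" };

// Returns one readable line per member of an rs.status()-shaped document.
function summarize(status) {
  return status.members.map((m) => {
    const health = m.health === 1 ? "healthy" : "down";
    const state = STATE_NAMES[m.state] || `state ${m.state}`;
    return `${m.name} ${health} ${state}`;
  });
}

// Trimmed-down copy of the members array shown above.
const sampleStatus = {
  members: [
    { name: "192.168.126.132:27017", health: 1, state: 1 },
    { name: "192.168.126.132:27018", health: 1, state: 2 },
    { name: "192.168.126.132:27019", health: 1, state: 2 },
  ],
};
console.log(summarize(sampleStatus).join("\n"));
```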
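Switching the Primary in a MongoDB Replica Set
A MongoDB replica set makes the cluster highly available: when the primary fails, another node takes over automatically. The primary can also be switched manually.
1. Failover
# ps aux | grep mongod //list the processes//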
root 17116 1.2 5.8 1546916 58140 ? Sl 14:31 0:51 mongod -f /etc/mongod.conf
root 17413 1.0 5.7 1445624 57444 ? Sl 14:34 0:39 mongod -f /etc/mongod2.conf
root 17868 1.2 5.5 1446752 55032 ? Sl 15:05 0:23 mongod -f /etc/mongod3.conf
root 17896 0.8 4.7 1037208 47552 ? Sl 15:05 0:16 mongod -f /etc/mongod4.conf
root 18836 0.0 0.0 112676 980 pts/1 S+ 15:38 0:00 grep --color=auto mongod
# kill -9 17116 //kill the instance on port 27017//
# ps aux | grep mongod
root 17413 1.0 5.7 1453820 57456 ? Sl 14:34 0:40 mongod -f /etc/mongod2.conf
root 17868 1.2 5.5 1454948 55056 ? Sl 15:05 0:24 mongod -f /etc/mongod3.conf
root 17896 0.8 4.7 1037208 47552 ? Sl 15:05 0:16 mongod -f /etc/mongod4.conf
root 18843 0.0 0.0 112676 976 pts/1 R+ 15:38 0:00 grep --color=auto mongod
# mongo --port 27019
kgcrs:PRIMARY> rs.status()
"members" : [
{
"_id" : 0,
"name" : "192.168.126.132:27017",
"health" : 0, //down//
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
{
"_id" : 1,
"name" : "192.168.126.132:27018",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY", //secondary//
"uptime" : 1467,
"optime" : {
"ts" : Timestamp(1531813296, 1),
"t" : NumberLong(2)
},
"optimeDurable" : {
"ts" : Timestamp(1531813296, 1),
"t" : NumberLong(2)
},
{
"_id" : 2,
"name" : "192.168.126.132:27019",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY", //primary//
"uptime" : 2178,
"optime" : {
"ts" : Timestamp(1531813296, 1),
"t" : NumberLong(2)
}
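Comparing two rs.status() results is a simple way to confirm that a failover happened. The following sketch does this on trimmed copies of the status documents before and after the kill; primaryOf and detectFailover are hypothetical helper names.

```javascript
// Returns the name of the member reporting state 1 (PRIMARY), or null if there is none.
function primaryOf(status) {
  const primary = status.members.find((m) => m.health === 1 && m.state === 1);
  return primary ? primary.name : null;
}

// Compares two status snapshots and reports whether the primary role moved.
function detectFailover(before, after) {
  const oldPrimary = primaryOf(before);
  const newPrimary = primaryOf(after);
  return { oldPrimary, newPrimary, failedOver: newPrimary !== null && newPrimary !== oldPrimary };
}

const beforeKill = {
  members: [
    { name: "192.168.126.132:27017", health: 1, state: 1 },
    { name: "192.168.126.132:27018", health: 1, state: 2 },
    { name: "192.168.126.132:27019", health: 1, state: 2 },
  ],
};
const afterKill = {
  members: [
    { name: "192.168.126.132:27017", health: 0, state: 8 },
    { name: "192.168.126.132:27018", health: 1, state: 2 },
    { name: "192.168.126.132:27019", health: 1, state: 1 },
  ],
};
console.log(detectFailover(beforeKill, afterKill));
```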
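2. Switching the primary manually
kgcrs:PRIMARY> rs.freeze(30) //do not seek election for 30 seconds
kgcrs:PRIMARY> rs.stepDown(60,30) //step down as primary and stay a secondary for at least 60 seconds, waiting up to 30 seconds for a secondary to catch up with the primary's oplog
2018-07-17T15:46:19.079+0800 E QUERY [thread1] Error: error doing query: failed: network error while attempting to run command 'replSetStepDown' on host '127.0.0.1:27019' :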
DB.prototype.runCommand@src/mongo/shell/db.js:168:1
DB.prototype.adminCommand@src/mongo/shell/db.js:186:16
rs.stepDown@src/mongo/shell/utils.js:1341:12
@(shell):1:1
2018-07-17T15:46:19.082+0800 I NETWORK [thread1] trying reconnect to 127.0.0.1:27019 (127.0.0.1) failed
2018-07-17T15:46:19.085+0800 I NETWORK [thread1] reconnect 127.0.0.1:27019 (127.0.0.1) ok
kgcrs:SECONDARY> //after stepping down, the node immediately becomes a secondary//
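The network error in this transcript is expected with this shell version: the stepping-down primary closes its client connections, and the shell reconnects afterwards. A script that automates the switch can treat that one error as success. The sketch below assumes an object with a stepDown method standing in for the shell's rs helper; stepDownQuietly and the fakeRs stub are hypothetical names used only for illustration.

```javascript
// Sketch: treat the dropped connection during stepDown as expected.
// `rsHelper` stands in for the mongo shell's `rs` object.
function stepDownQuietly(rsHelper, stepDownSecs, catchUpSecs) {
  try {
    rsHelper.stepDown(stepDownSecs, catchUpSecs);
    return "stepped down";
  } catch (err) {
    if (/network error/.test(String(err.message))) {
      return "stepped down (connection was reset, as expected)";
    }
    throw err; // anything else is a real failure
  }
}

// Stub that mimics the behaviour shown in the transcript above.
const fakeRs = {
  stepDown() {
    throw new Error("error doing query: failed: network error while attempting to run command 'replSetStepDown'");
  },
};
console.log(stepDownQuietly(fakeRs, 60, 30));
```

Errors that do not mention a network error, for example a refusal because no secondary is caught up, are rethrown so that they are not silently ignored.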
kgcrs:SECONDARY> rs.status()
"_id" : 0,
"name" : "192.168.126.132:27017",
"health" : 0, //down//
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
{
"_id" : 1,
"name" : "192.168.126.132:27018",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY", //now the primary//
"uptime" : 1851,
"optime" : {
"ts" : Timestamp(1531813679, 1),
"t" : NumberLong(3)
{
"_id" : 2,
"name" : "192.168.126.132:27019",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY", //now a secondary//
"uptime" : 2563,
"optime" : {
"ts" : Timestamp(1531813689, 1),
"t" : NumberLong(3)
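How Elections Work in a MongoDB Replica Set
Nodes fall into three types: standard nodes (host), passive nodes (passive) and arbiter nodes (arbiter).
- Only a standard node can be elected as the active (primary) node; it also has a vote. A passive node holds a full copy of the data and has a vote, but can never become the primary. An arbiter holds no data and can never become the primary; it only votes.
- The difference between standard and passive nodes lies in priority: a member with a priority above 0 is a standard node, and a member with priority 0 is a passive node.
- priority is a value from 0 to 1000. As a simplified picture, the election goes to the candidate with the most votes, and a higher priority acts like 0 to 1000 extra votes; if the votes are tied, the node with the newest data wins.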
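The rules in this list can be written down as a small selection function. This follows the simplified picture given in the text and is not MongoDB's actual election protocol; the pickPrimary name, the up flag and the integer optime values are assumptions made for the example.

```javascript
// Simplified model from the text above, NOT MongoDB's actual election protocol:
// arbiters and priority-0 members never win, the highest priority wins,
// and ties go to the member with the newest data.
function pickPrimary(members) {
  const candidates = members.filter((m) => m.up && !m.arbiterOnly && m.priority > 0);
  candidates.sort((a, b) => b.priority - a.priority || b.optime - a.optime);
  return candidates.length > 0 ? candidates[0].host : null;
}

// optime is an illustrative "how new is the data" counter.
const electionMembers = [
  { host: "192.168.126.132:27017", priority: 100, optime: 10, up: false },
  { host: "192.168.126.132:27018", priority: 100, optime: 12, up: true },
  { host: "192.168.126.132:27019", priority: 0, optime: 12, up: true },
  { host: "192.168.126.132:27020", priority: 0, arbiterOnly: true, up: true },
];
console.log(pickPrimary(electionMembers));
```

With the first standard node down, the function returns the second standard node; if both were down it would return null, because neither the passive node nor the arbiter is eligible.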
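1. Configure replica set priorities
1) Configure MongoDB again as a four-node replica set with two standard nodes, one passive node and one arbiter. Note that rs.initiate() only works on instances that have not been initiated yet; the configuration of a running replica set is changed with rs.reconfig(cfg).
# mongo
> cfg={"_id":"kgcrs","members":[{"_id":0,"host":"192.168.126.132:27017","priority":100},{"_id":1,"host":"192.168.126.132:27018","priority":100},{"_id":2,"host":"192.168.126.132:27019","priority":0},{"_id":3,"host":"192.168.126.132:27020","arbiterOnly":true}]}
> rs.initiate(cfg) //apply the new configuration//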
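Which group each member of this configuration falls into can be worked out from priority and arbiterOnly alone. The sketch below groups the members under the same three headings that rs.isMaster() uses (hosts, passives, arbiters); classifyMembers is a hypothetical helper, and treating a missing priority as 1 reflects MongoDB's default.

```javascript
// Groups members the way isMaster reports them: hosts, passives, arbiters.
function classifyMembers(config) {
  const groups = { hosts: [], passives: [], arbiters: [] };
  for (const m of config.members) {
    const priority = m.priority === undefined ? 1 : m.priority; // default priority is 1
    if (m.arbiterOnly) groups.arbiters.push(m.host);
    else if (priority === 0) groups.passives.push(m.host);
    else groups.hosts.push(m.host);
  }
  return groups;
}

const priorityCfg = {
  _id: "kgcrs",
  members: [
    { _id: 0, host: "192.168.126.132:27017", priority: 100 },
    { _id: 1, host: "192.168.126.132:27018", priority: 100 },
    { _id: 2, host: "192.168.126.132:27019", priority: 0 },
    { _id: 3, host: "192.168.126.132:27020", arbiterOnly: true },
  ],
};
console.log(classifyMembers(priorityCfg));
```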
kgcrs:SECONDARY> rs.isMaster()
{
"hosts" : [ //standard nodes//
"192.168.126.132:27017",
"192.168.126.132:27018"
],
"passives" : [ //passive node//
"192.168.126.132:27019"
],
"arbiters" : [ //arbiter//
"192.168.126.132:27020"
2) Simulate a primary failure
If the primary fails, the other standard node is elected as the new primary.
# mongod -f /etc/mongod.conf --shutdown //standard node 27017//
# mongo --port 27018 //the second standard node is now elected primary//
kgcrs:PRIMARY> rs.status()
"_id" : 0,
"name" : "192.168.126.132:27017",
"health" : 0, //down//
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
"_id" : 1,
"name" : "192.168.126.132:27018",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY", //standard node//
"uptime" : 879,
"optime" : {
"ts" : Timestamp(1531817473, 1),
"t" : NumberLong(2)
"_id" : 2,
"name" : "192.168.126.132:27019",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY", //passive node//
"uptime" : 569,
"optime" : {
"ts" : Timestamp(1531817473, 1),
"t" : NumberLong(2)
"_id" : 3,
"name" : "192.168.126.132:27020",
"health" : 1,
"state" : 7,
"stateStr" : "ARBITER", //arbiter//
"uptime" : 569,
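3) Simulate the failure of all standard nodes
When every standard node has failed, the passive node still cannot become the primary: its priority is 0, and the two members that remain are not a majority of the four voting members.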
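Both conditions can be checked mechanically. The sketch below assumes every member has one vote unless votes is set to 0, and uses a hypothetical up flag for reachability; canElectPrimary is an illustrative name, not a MongoDB function.

```javascript
// A primary needs (a) a strict majority of the voting members to be reachable and
// (b) at least one reachable data-bearing member with priority above 0.
function canElectPrimary(members) {
  const voters = members.filter((m) => m.votes !== 0);
  const reachableVoters = voters.filter((m) => m.up);
  const hasMajority = reachableVoters.length > voters.length / 2;
  const hasCandidate = members.some((m) => m.up && !m.arbiterOnly && m.priority > 0);
  return { hasMajority, hasCandidate, electable: hasMajority && hasCandidate };
}

// The situation in this step: both standard nodes are down.
const allStandardDown = [
  { host: "192.168.126.132:27017", priority: 100, up: false },
  { host: "192.168.126.132:27018", priority: 100, up: false },
  { host: "192.168.126.132:27019", priority: 0, up: true },
  { host: "192.168.126.132:27020", priority: 0, arbiterOnly: true, up: true },
];
console.log(canElectPrimary(allStandardDown));
```

In the previous step, where only 27017 was down, the same check gives three reachable voters out of four and one eligible candidate, so an election could succeed.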
# mongod -f /etc/mongod2.conf --shutdown //shut down standard node 27018//
# mongo --port 27019
kgcrs:SECONDARY> rs.status()
"_id" : 0,
"name" : "192.168.126.132:27017",
"health" : 0, //down//
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"_id" : 1,
"name" : "192.168.126.132:27018",
"health" : 0, //down//
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"_id" : 2,
"name" : "192.168.126.132:27019",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY", //passive node//
"uptime" : 1403,
"_id" : 3,
"name" : "192.168.126.132:27020",
"health" : 1,
"state" : 7,
"stateStr" : "ARBITER", //arbiter//
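Managing a MongoDB Replica Set
1. Allow reads from secondaries
By default, secondaries in a MongoDB replica set refuse read operations. Running rs.slaveOk() allows reads on a secondary for the current shell connection.
# mongo --port 27017
kgcrs:SECONDARY> show dbs //the database list cannot be read//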
2018-07-17T17:11:31.570+0800 E QUERY [thread1] Error: listDatabases failed:{
"operationTime" : Timestamp(1531818690, 1),
"ok" : 0,
"errmsg" : "not master and slaveOk=false",
"code" : 13435,
"codeName" : "NotMaste
kgcrs:SECONDARY> rs.slaveOk()
kgcrs:SECONDARY> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
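rs.slaveOk() only affects the current shell session. Applications that connect through a driver usually express the same intent with the readPreference option of the connection string. The sketch below only builds such a string; replicaSet and readPreference are standard connection string options, while the replicaSetUri helper name is hypothetical.

```javascript
// Builds a MongoDB connection string for a replica set with a read preference.
function replicaSetUri(hosts, setName, readPreference) {
  const params = new URLSearchParams({ replicaSet: setName, readPreference });
  return `mongodb://${hosts.join(",")}/?${params.toString()}`;
}

const uri = replicaSetUri(
  ["192.168.126.132:27017", "192.168.126.132:27018", "192.168.126.132:27019"],
  "kgcrs",
  "secondaryPreferred"
);
console.log(uri);
```

secondaryPreferred sends reads to a secondary when one is available and falls back to the primary otherwise; reads from secondaries may return slightly stale data.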
2. View replication status
rs.printReplicationInfo() shows the oplog size and time range, and rs.printSlaveReplicationInfo() shows how far each secondary lags behind the primary.
kgcrs:SECONDARY> rs.printReplicationInfo()
configured oplog size: 990MB
log length start to end: 2092secs (0.58hrs)
oplog first event time: Tue Jul 17 2018 16:41:48 GMT+0800 (CST)
oplog last event time: Tue Jul 17 2018 17:16:40 GMT+0800 (CST)
now: Tue Jul 17 2018 17:16:46 GMT+0800 (CST)
kgcrs:SECONDARY> rs.printSlaveReplicationInfo()
source: 192.168.126.132:27017
syncedTo: Tue Jul 17 2018 17:16:50 GMT+0800 (CST)
0 secs (0 hrs) behind the primary
source: 192.168.126.132:27019
syncedTo: Tue Jul 17 2018 17:16:50 GMT+0800 (CST)
0 secs (0 hrs) behind the primary
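The "log length start to end" value is simply the time between the first and last oplog events. As a quick check, the sketch below recomputes it from the two timestamps in the output above; oplogWindow is a hypothetical helper name, and the rounding to two decimals is chosen to match the printed figure.

```javascript
// Recomputes the "log length start to end" line from the first and last oplog event times.
function oplogWindow(firstEvent, lastEvent) {
  const secs = Math.round((lastEvent.getTime() - firstEvent.getTime()) / 1000);
  const hrs = Math.round((secs / 3600) * 100) / 100;
  return { secs, hrs };
}

const firstEvent = new Date("2018-07-17T16:41:48+08:00");
const lastEvent = new Date("2018-07-17T17:16:40+08:00");
const oplogSpan = oplogWindow(firstEvent, lastEvent);
console.log(`${oplogSpan.secs}secs (${oplogSpan.hrs}hrs)`);
```

This window is how long a secondary can stay offline and still catch up from the oplog alone.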
3. Deploy authentication for the replica set
kgcrs:PRIMARY> use admin
kgcrs:PRIMARY> db.createUser({"user":"root","pwd":"123","roles":["root"]})
# vim /etc/mongod.conf //edit each of the four configuration files//
....
security:
  keyFile: /usr/bin/kgcrskey1  # path of the key file
  clusterAuthMode: keyFile  # authentication type
# vim /etc/mongod2.conf
# vim /etc/mongod3.conf
# vim /etc/mongod4.conf
# echo "kgcrs key" > /usr/bin/kgcrskey1 //create a key file for each of the four instances//
# echo "kgcrs key" > /usr/bin/kgcrskey2
# echo "kgcrs key" > /usr/bin/kgcrskey3
# echo "kgcrs key" > /usr/bin/kgcrskey4
# chmod 600 /usr/bin/kgcrskey{1..4}
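A short phrase such as "kgcrs key" works for a demonstration, but it is easy to guess. A keyfile may hold between 6 and 1024 characters from the base64 set, and every member must use the same content. The sketch below generates a random key with Node.js; the generateKeyfileContent name and the 756-byte default are choices made for this example.

```javascript
const crypto = require("crypto");

// 756 random bytes encode to 1008 base64 characters,
// which stays within the 1024-character keyfile limit.
function generateKeyfileContent(byteLength = 756) {
  return crypto.randomBytes(byteLength).toString("base64");
}

const key = generateKeyfileContent();
console.log(key.length);
```

The generated string would then be written, unchanged, to each instance's key file, with the files kept at mode 600 as in the commands above.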
# mongod -f /etc/mongod.conf //restart the four instances (shut each one down first, e.g. with --shutdown)//
# mongod -f /etc/mongod2.conf
# mongod -f /etc/mongod3.conf
# mongod -f /etc/mongod4.conf
# mongo --port 27017 //connect to a standard node//
kgcrs:PRIMARY> show dbs //databases cannot be listed//
kgcrs:PRIMARY> rs.status() //the replica set status cannot be viewed//
kgcrs:PRIMARY> use admin //authenticate//
kgcrs:PRIMARY> db.auth("root","123")
kgcrs:PRIMARY> show dbs //databases can now be listed//
admin 0.000GB
config 0.000GB
local 0.000GB
kgcrs:PRIMARY> rs.status() //the replica set status can now be viewed//
"_id" : 0,
"name" : "192.168.126.132:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 411,
"_id" : 1,
"name" : "192.168.126.132:27018",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 324,
"_id" : 2,
"name" : "192.168.126.132:27019",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 305,
"_id" : 3,
"name" : "192.168.126.132:27020",
"health" : 1,
"state" : 7,
"stateStr" : "ARBITER",
"uptime" : 280,
Original article: http://blog.51cto.com/13642258/2146722