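Deploying and Managing a MongoDB Replica Set on CentOS 7
Overview of MongoDB Replica Sets
A replica set keeps additional copies of the data by synchronizing it across multiple servers. This provides redundancy and increases data availability, and it makes it possible to recover from hardware failures and service interruptions.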
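How a Replica Set Works
- A MongoDB replica set needs at least two nodes (an odd number of voting members, typically three, is recommended so that a majority can still elect a primary). One node is the primary, which handles client requests; the rest are secondaries, which replicate the data on the primary.
- Common layouts are one primary with one secondary, or one primary with several secondaries. The primary records every operation it performs in the oplog. The secondaries periodically poll the primary for these operations and apply them to their own copy of the data, which keeps the secondaries consistent with the primary.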
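The oplog mechanism described above can be illustrated with a toy model. This is only a sketch in plain JavaScript (runnable in Node.js), not MongoDB's real implementation: the functions createNode, writeToPrimary and syncFromPrimary and the shape of the oplog entries are all assumptions made for illustration.

```javascript
// Toy model of oplog-based replication; NOT how MongoDB is implemented.
// Each hypothetical oplog entry records one write: an insert or a delete.
function applyOp(data, op) {
  if (op.type === "insert") {
    data[op.doc._id] = op.doc;
  } else if (op.type === "delete") {
    delete data[op.id];
  }
}

// A node holds its data, its oplog, and how many entries it has applied.
function createNode() {
  return { data: {}, oplog: [], applied: 0 };
}

// The primary applies a write to its own data and appends it to its oplog.
function writeToPrimary(primary, op) {
  applyOp(primary.data, op);
  primary.oplog.push(op);
}

// A secondary fetches every entry it has not applied yet and replays it.
function syncFromPrimary(secondary, primary) {
  const pending = primary.oplog.slice(secondary.applied);
  for (const op of pending) {
    applyOp(secondary.data, op);
    secondary.applied += 1;
  }
  return pending.length;
}

const primary = createNode();
const secondary = createNode();
writeToPrimary(primary, { type: "insert", doc: { _id: 1, name: "a" } });
writeToPrimary(primary, { type: "insert", doc: { _id: 2, name: "b" } });
writeToPrimary(primary, { type: "delete", id: 1 });

// One polling round replays all three operations on the secondary.
const replayed = syncFromPrimary(secondary, primary);
console.log(replayed, JSON.stringify(secondary.data));
```

After one polling round the secondary holds the same data as the primary, and a second round finds nothing left to replay.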
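Characteristics of a Replica Set
- A cluster of N nodes
- Any node can act as the primary
- All writes go to the primary
- Automatic failover
- Automatic recovery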
Deploying a MongoDB Replica Set
1. Configure the replica set
(1) Create the storage paths for the data files and log files
[root@localhost ~]# mkdir -p /data/mongodb/mongodb{2,3,4}
[root@localhost ~]# cd /data/mongodb/
[root@localhost mongodb]# mkdir logs
[root@localhost mongodb]# touch logs/mongodb{2,3,4}.log
[root@localhost mongodb]# cd logs/
[root@localhost logs]# ls
mongodb2.log  mongodb3.log  mongodb4.log
[root@localhost logs]# chmod 777 *.log
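(2) Edit the configuration files of the 4 MongoDB instances
First edit the MongoDB configuration file and set the replica set name (replSetName) to kgcrs, then make three copies of the file so that every instance uses the same name: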
[root@localhost etc]# vim mongod.conf
  path: /var/log/mongodb/mongod.log

# Where and how to store data.
storage:
  dbPath: /var/lib/mongo
  journal:
    enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:

# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo

# network interfaces
net:
  port: 27017
  bindIp: 0.0.0.0  # Listen to local interface only, comment to listen on all interfaces.

#security:

#operationProfiling:

replication:
  replSetName: kgcrs

#sharding:

## Enterprise-Only Options

#auditLog:

#snmp:
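Then set the port parameter to 27018 in mongod2.conf, 27019 in mongod3.conf, and 27020 in mongod4.conf. In each copy, also change the data path (dbPath) and the log file path to the directory and log file created for that instance in step (1).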
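The per-instance values described above follow a simple pattern, which a short script can spell out. The sketch below is plain JavaScript (Node.js). The function name instanceSettings is hypothetical, and the paths assume the directories created in step (1) and the /etc/mongodN.conf file names used by the mongod -f commands in this article.

```javascript
// Derive the settings that differ between the extra mongod instances.
// Assumption: instance n listens on basePort + (n - 1) and uses the
// data and log paths created in step (1).
function instanceSettings(n, basePort) {
  return {
    configFile: "/etc/mongod" + n + ".conf",
    port: basePort + (n - 1),
    dbPath: "/data/mongodb/mongodb" + n,
    logPath: "/data/mongodb/logs/mongodb" + n + ".log",
  };
}

// Instances 2, 3 and 4 sit next to the default instance on port 27017.
const instances = [2, 3, 4].map((n) => instanceSettings(n, 27017));
for (const inst of instances) {
  console.log(inst.configFile, inst.port, inst.dbPath, inst.logPath);
}
```

The script only prints the values; each configuration file still has to be edited by hand or with a tool of your choice.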
(3) Start the 4 MongoDB instances and check their processes
[root@localhost etc]# mongod -f /etc/mongod.conf --shutdown    //stop the instance first//
[root@localhost etc]# mongod -f /etc/mongod.conf    //then start it again//
[root@localhost etc]# mongod -f /etc/mongod2.conf
[root@localhost etc]# mongod -f /etc/mongod3.conf
[root@localhost etc]# mongod -f /etc/mongod4.conf
[root@localhost etc]# netstat -ntap | grep mongod
tcp        0      0 0.0.0.0:27019           0.0.0.0:*               LISTEN      17868/mongod
tcp        0      0 0.0.0.0:27020           0.0.0.0:*               LISTEN      17896/mongod
tcp        0      0 0.0.0.0:27017           0.0.0.0:*               LISTEN      17116/mongod
tcp        0      0 0.0.0.0:27018           0.0.0.0:*               LISTEN      17413/mongod
(4) Configure a replica set with three nodes
[root@localhost etc]# mongo
> rs.status()    //check the replica set status//
{
    "info" : "run rs.initiate(...) if not yet done for the set",
    "ok" : 0,
    "errmsg" : "no replset config has been received",
    "code" : 94,
    "codeName" : "NotYetInitialized",
    "$clusterTime" : {
        "clusterTime" : Timestamp(0, 0),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
> cfg={"_id":"kgcrs","members":[{"_id":0,"host":"192.168.126.132:27017"},{"_id":1,"host":"192.168.126.132:27018"},{"_id":2,"host":"192.168.126.132:27019"}]}    //define the replica set members//
{
    "_id" : "kgcrs",
    "members" : [
        { "_id" : 0, "host" : "192.168.126.132:27017" },
        { "_id" : 1, "host" : "192.168.126.132:27018" },
        { "_id" : 2, "host" : "192.168.126.132:27019" }
    ]
}
> rs.initiate(cfg)    //make sure the secondaries hold no data when initializing//
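Typing the configuration document by hand is error-prone once there are several members. As a minimal sketch, the helper below builds the same document from a list of hosts. The name buildReplSetConfig is hypothetical; the code is plain JavaScript that runs in Node.js, and the resulting object has the same shape as the cfg variable passed to rs.initiate(cfg) in this article.

```javascript
// Build a replica set configuration document from a set name and host list.
// Member _id values are assigned in list order, starting at 0.
function buildReplSetConfig(setName, hosts) {
  return {
    _id: setName,
    members: hosts.map((host, index) => ({ _id: index, host: host })),
  };
}

const cfg = buildReplSetConfig("kgcrs", [
  "192.168.126.132:27017",
  "192.168.126.132:27018",
  "192.168.126.132:27019",
]);

// Print the document so it can be compared with the hand-written one.
console.log(JSON.stringify(cfg, null, 2));
```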
(5) Check the replica set status
Once the replica set has been initiated, run rs.status() again to see its complete status information.
kgcrs:SECONDARY> rs.status()
{
    "set" : "kgcrs",
    "date" : ISODate("2018-07-17T07:18:52.047Z"),
    "myState" : 1,
    "term" : NumberLong(1),
    "syncingTo" : "",
    "syncSourceHost" : "",
    "syncSourceId" : -1,
    "heartbeatIntervalMillis" : NumberLong(2000),
    "optimes" : {
        "lastCommittedOpTime" : { "ts" : Timestamp(1531811928, 1), "t" : NumberLong(1) },
        "readConcernMajorityOpTime" : { "ts" : Timestamp(1531811928, 1), "t" : NumberLong(1) },
        "appliedOpTime" : { "ts" : Timestamp(1531811928, 1), "t" : NumberLong(1) },
        "durableOpTime" : { "ts" : Timestamp(1531811928, 1), "t" : NumberLong(1) }
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "192.168.126.132:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",    //primary//
            "uptime" : 2855,
            "optime" : { "ts" : Timestamp(1531811928, 1), "t" : NumberLong(1) },
            "optimeDate" : ISODate("2018-07-17T07:18:48Z"),
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "could not find member to sync from",
            "electionTime" : Timestamp(1531811847, 1),
            "electionDate" : ISODate("2018-07-17T07:17:27Z"),
            "configVersion" : 1,
            "self" : true,
            "lastHeartbeatMessage" : ""
        },
        {
            "_id" : 1,
            "name" : "192.168.126.132:27018",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",    //secondary//
            "uptime" : 95,
            "optime" : { "ts" : Timestamp(1531811928, 1), "t" : NumberLong(1) },
            "optimeDurable" : { "ts" : Timestamp(1531811928, 1), "t" : NumberLong(1) },
            "optimeDate" : ISODate("2018-07-17T07:18:48Z"),
            "optimeDurableDate" : ISODate("2018-07-17T07:18:48Z"),
            "lastHeartbeat" : ISODate("2018-07-17T07:18:51.208Z"),
            "lastHeartbeatRecv" : ISODate("2018-07-17T07:18:51.720Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "192.168.126.132:27017",
            "syncSourceHost" : "192.168.126.132:27017",
            "syncSourceId" : 0,
            "infoMessage" : "",
            "configVersion" : 1
        },
        {
            "_id" : 2,
            "name" : "192.168.126.132:27019",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",    //secondary//
            "uptime" : 95,
            "optime" : { "ts" : Timestamp(1531811928, 1), "t" : NumberLong(1) },
            "optimeDurable" : { "ts" : Timestamp(1531811928, 1), "t" : NumberLong(1) },
            "optimeDate" : ISODate("2018-07-17T07:18:48Z"),
            "optimeDurableDate" : ISODate("2018-07-17T07:18:48Z"),
            "lastHeartbeat" : ISODate("2018-07-17T07:18:51.208Z"),
            "lastHeartbeatRecv" : ISODate("2018-07-17T07:18:51.822Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "192.168.126.132:27017",
            "syncSourceHost" : "192.168.126.132:27017",
            "syncSourceId" : 0,
            "infoMessage" : "",
            "configVersion" : 1
        }
    ],
    "ok" : 1,
    "operationTime" : Timestamp(1531811928, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1531811928, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
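In this output, a health value of 1 means the member is healthy and 0 means it is down. A state value of 1 marks the primary and 2 marks a secondary.
Note that the secondaries must not contain any data when the replica set configuration is initialized.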
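The health and state fields explained above are easy to check programmatically. The following sketch is plain JavaScript (Node.js). The helper names summarizeMembers and findPrimary are hypothetical, and sampleStatus is a hand-trimmed, made-up status document that keeps only the fields used here; the state numbers and labels are the ones that appear in the rs.status() outputs in this article.

```javascript
// State codes as they appear in the rs.status() outputs in this article.
const STATE_NAMES = {
  1: "PRIMARY",
  2: "SECONDARY",
  7: "ARBITER",
  8: "(not reachable/healthy)",
};

// Reduce each member to its name, whether it is healthy, and its role.
function summarizeMembers(status) {
  return status.members.map((m) => ({
    name: m.name,
    healthy: m.health === 1,
    role: STATE_NAMES[m.state] || "state " + m.state,
  }));
}

// Return the name of the healthy primary, or null if there is none.
function findPrimary(status) {
  const primary = status.members.find((m) => m.health === 1 && m.state === 1);
  return primary ? primary.name : null;
}

// Hypothetical sample: one member down, one secondary, one primary.
const sampleStatus = {
  set: "kgcrs",
  members: [
    { name: "192.168.126.132:27017", health: 0, state: 8 },
    { name: "192.168.126.132:27018", health: 1, state: 2 },
    { name: "192.168.126.132:27019", health: 1, state: 1 },
  ],
};

console.log(summarizeMembers(sampleStatus));
console.log("primary:", findPrimary(sampleStatus));
```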
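Replica Set Switchover
A MongoDB replica set provides high availability for the cluster: when the primary fails, another node automatically takes over. The primary can also be switched manually.
1. Failover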
[root@localhost etc]# ps aux | grep mongod    //list the processes//
root      17116  1.2  5.8 1546916 58140 ?       Sl   14:31   0:51 mongod -f /etc/mongod.conf
root      17413  1.0  5.7 1445624 57444 ?       Sl   14:34   0:39 mongod -f /etc/mongod2.conf
root      17868  1.2  5.5 1446752 55032 ?       Sl   15:05   0:23 mongod -f /etc/mongod3.conf
root      17896  0.8  4.7 1037208 47552 ?       Sl   15:05   0:16 mongod -f /etc/mongod4.conf
root      18836  0.0  0.0 112676   980 pts/1    S+   15:38   0:00 grep --color=auto mongod
[root@localhost etc]# kill -9 17116    //kill the process listening on port 27017//
[root@localhost etc]# ps aux | grep mongod
root      17413  1.0  5.7 1453820 57456 ?       Sl   14:34   0:40 mongod -f /etc/mongod2.conf
root      17868  1.2  5.5 1454948 55056 ?       Sl   15:05   0:24 mongod -f /etc/mongod3.conf
root      17896  0.8  4.7 1037208 47552 ?       Sl   15:05   0:16 mongod -f /etc/mongod4.conf
root      18843  0.0  0.0 112676   976 pts/1    R+   15:38   0:00 grep --color=auto mongod
[root@localhost etc]# mongo --port 27019
kgcrs:PRIMARY> rs.status()
    "members" : [
        {
            "_id" : 0,
            "name" : "192.168.126.132:27017",
            "health" : 0,    //down//
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1)
        {
            "_id" : 1,
            "name" : "192.168.126.132:27018",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",    //secondary//
            "uptime" : 1467,
            "optime" : { "ts" : Timestamp(1531813296, 1), "t" : NumberLong(2) },
            "optimeDurable" : { "ts" : Timestamp(1531813296, 1), "t" : NumberLong(2) },
        {
            "_id" : 2,
            "name" : "192.168.126.132:27019",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",    //primary//
            "uptime" : 2178,
            "optime" : { "ts" : Timestamp(1531813296, 1), "t" : NumberLong(2)}
2. Manual primary/secondary switchover
kgcrs:PRIMARY> rs.freeze(30)    //do not take part in elections for 30 s
kgcrs:PRIMARY> rs.stepDown(60,30)    //give up the primary role, stay a secondary for at least 60 s, and wait up to 30 s for the oplogs of the primary and secondaries to sync
2018-07-17T15:46:19.079+0800 E QUERY    [thread1] Error: error doing query: failed: network error while attempting to run command 'replSetStepDown' on host '127.0.0.1:27019'  :
DB.prototype.runCommand@src/mongo/shell/db.js:168:1
DB.prototype.adminCommand@src/mongo/shell/db.js:186:16
rs.stepDown@src/mongo/shell/utils.js:1341:12
@(shell):1:1
2018-07-17T15:46:19.082+0800 I NETWORK  [thread1] trying reconnect to 127.0.0.1:27019 (127.0.0.1) failed
2018-07-17T15:46:19.085+0800 I NETWORK  [thread1] reconnect 127.0.0.1:27019 (127.0.0.1) ok
kgcrs:SECONDARY>    //the node becomes a secondary as soon as it steps down//
kgcrs:SECONDARY> rs.status()
            "_id" : 0,
            "name" : "192.168.126.132:27017",
            "health" : 0,    //down//
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) },
        {
            "_id" : 1,
            "name" : "192.168.126.132:27018",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",    //now the primary//
            "uptime" : 1851,
            "optime" : { "ts" : Timestamp(1531813679, 1), "t" : NumberLong(3)
        {
            "_id" : 2,
            "name" : "192.168.126.132:27019",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",    //now a secondary//
            "uptime" : 2563,
            "optime" : { "ts" : Timestamp(1531813689, 1), "t" : NumberLong(3)
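How Elections Work in a MongoDB Replica Set
Nodes fall into three types: standard nodes (host), passive nodes (passive), and arbiter nodes (arbiter).
- Only a standard node can be elected as the active (primary) node, and it has a vote. A passive node holds a full copy of the data and has a vote, but can never become the primary. An arbiter does not replicate data and can never become the primary; it only votes.
- Standard and passive nodes differ in priority: the nodes with the higher priority value are standard nodes and those with the lower value are passive nodes. In the configuration used in this article, the passive node has priority 0.
- The candidate with the most votes wins the election. priority is a value from 0 to 1000 and, in this simplified picture, acts like 0 to 1000 extra votes. If the vote counts are equal, the node with the most recent data wins. (Strictly speaking, each MongoDB member casts at most one vote; priority makes an eligible member more likely to become the primary rather than literally adding votes.)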
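The rules in the list above can be written down as a small decision function. This is a deliberately simplified model of the rule as this article describes it, not MongoDB's actual election protocol. It is plain JavaScript (Node.js); electPrimary, withDown, and the numeric optime field (larger means newer data) are assumptions made for illustration.

```javascript
// Simplified model: among healthy, non-arbiter members with priority > 0,
// the highest priority wins; on a tie, the member with the newest data wins.
function electPrimary(members) {
  const candidates = members.filter(
    (m) => m.healthy && !m.arbiterOnly && m.priority > 0
  );
  if (candidates.length === 0) return null; // nobody is eligible
  candidates.sort((a, b) => b.priority - a.priority || b.optime - a.optime);
  return candidates[0].host;
}

// Return a copy of the member list with the given hosts marked as down.
function withDown(members, downHosts) {
  return members.map((m) =>
    downHosts.includes(m.host) ? Object.assign({}, m, { healthy: false }) : m
  );
}

// Two standard nodes, one passive node, one arbiter (hypothetical optimes).
const members = [
  { host: "192.168.126.132:27017", priority: 100, optime: 10, healthy: true },
  { host: "192.168.126.132:27018", priority: 100, optime: 12, healthy: true },
  { host: "192.168.126.132:27019", priority: 0, optime: 12, healthy: true },
  { host: "192.168.126.132:27020", priority: 0, arbiterOnly: true, optime: 0, healthy: true },
];

// Equal priority: the node with newer data is preferred.
console.log(electPrimary(members));
// One standard node down: the other standard node takes over.
console.log(electPrimary(withDown(members, ["192.168.126.132:27018"])));
// Both standard nodes down: no primary can be elected.
console.log(electPrimary(withDown(members, ["192.168.126.132:27017", "192.168.126.132:27018"])));
```

In this model, once both standard nodes are down there is no eligible candidate, so the function returns null.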
1. Configure replica set priorities
1) Reconfigure the MongoDB replica set with all 4 nodes: two standard nodes, one passive node, and one arbiter.
[root@localhost etc]# mongo
> cfg={"_id":"kgcrs","members":[{"_id":0,"host":"192.168.126.132:27017","priority":100},{"_id":1,"host":"192.168.126.132:27018","priority":100},{"_id":2,"host":"192.168.126.132:27019","priority":0},{"_id":3,"host":"192.168.126.132:27020","arbiterOnly":true}]}
> rs.initiate(cfg)    //reconfigure//
kgcrs:SECONDARY> rs.isMaster()
{
    "hosts" : [    //standard nodes//
        "192.168.126.132:27017",
        "192.168.126.132:27018"
    ],
    "passives" : [    //passive node//
        "192.168.126.132:27019"
    ],
    "arbiters" : [    //arbiter node//
        "192.168.126.132:27020"
2) Simulate a primary failure
If the primary fails, the other standard node is elected as the new primary.
[root@localhost etc]# mongod -f /etc/mongod.conf --shutdown    //standard node 27017//
[root@localhost etc]# mongo --port 27018    //the second standard node is now elected primary//
kgcrs:PRIMARY> rs.status()
            "_id" : 0,
            "name" : "192.168.126.132:27017",
            "health" : 0,    //down//
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1)

            "_id" : 1,
            "name" : "192.168.126.132:27018",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",    //standard node//
            "uptime" : 879,
            "optime" : { "ts" : Timestamp(1531817473, 1), "t" : NumberLong(2)

            "_id" : 2,
            "name" : "192.168.126.132:27019",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",    //passive node//
            "uptime" : 569,
            "optime" : { "ts" : Timestamp(1531817473, 1), "t" : NumberLong(2)

            "_id" : 3,
            "name" : "192.168.126.132:27020",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",    //arbiter node//
            "uptime" : 569,
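3) Simulate failure of all standard nodes
When every standard node has failed, the passive node still cannot become the primary, so the replica set is left without a primary.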
[root@localhost etc]# mongod -f /etc/mongod2.conf --shutdown    //shut down standard node 27018//
[root@localhost etc]# mongo --port 27019
kgcrs:SECONDARY> rs.status()
            "_id" : 0,
            "name" : "192.168.126.132:27017",
            "health" : 0,    //down//
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,

            "_id" : 1,
            "name" : "192.168.126.132:27018",
            "health" : 0,    //down//
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,

            "_id" : 2,
            "name" : "192.168.126.132:27019",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",    //passive node//
            "uptime" : 1403,

            "_id" : 3,
            "name" : "192.168.126.132:27020",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",    //arbiter node//
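Managing a MongoDB Replica Set
1. Allow reads from secondaries
By default, data cannot be read from the secondaries of a MongoDB replica set. Running rs.slaveOk() on a secondary allows reads over the current shell connection.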
[root@localhost etc]# mongo --port 27017
kgcrs:SECONDARY> show dbs    //the database list cannot be read//
2018-07-17T17:11:31.570+0800 E QUERY    [thread1] Error: listDatabases failed:{
    "operationTime" : Timestamp(1531818690, 1),
    "ok" : 0,
    "errmsg" : "not master and slaveOk=false",
    "code" : 13435,
    "codeName" : "NotMaste
kgcrs:SECONDARY> rs.slaveOk()
kgcrs:SECONDARY> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
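2. View replication status
Use rs.printReplicationInfo() to see the size and time range of the oplog, and rs.printSlaveReplicationInfo() to see how far each secondary is behind the primary.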
kgcrs:SECONDARY> rs.printReplicationInfo()
configured oplog size:   990MB
log length start to end: 2092secs (0.58hrs)
oplog first event time:  Tue Jul 17 2018 16:41:48 GMT+0800 (CST)
oplog last event time:   Tue Jul 17 2018 17:16:40 GMT+0800 (CST)
now:                     Tue Jul 17 2018 17:16:46 GMT+0800 (CST)
kgcrs:SECONDARY> rs.printSlaveReplicationInfo()
source: 192.168.126.132:27017
    syncedTo: Tue Jul 17 2018 17:16:50 GMT+0800 (CST)
    0 secs (0 hrs) behind the primary
source: 192.168.126.132:27019
    syncedTo: Tue Jul 17 2018 17:16:50 GMT+0800 (CST)
    0 secs (0 hrs) behind the primary
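The "secs behind the primary" figure can be approximated from the optimeDate fields of an rs.status() document. The sketch below is plain JavaScript (Node.js) and only illustrates the idea; it is not how the shell helper is implemented. The function name replicationLagSeconds and the sample document (including the 5-second lag) are made up for illustration.

```javascript
// For every secondary (state 2), compute how many seconds its last applied
// operation is behind the primary's (state 1). Returns null without a primary.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.state === 1);
  if (!primary) return null;
  const lag = {};
  for (const m of status.members) {
    if (m.state !== 2) continue; // skip the primary, arbiters, down members
    lag[m.name] =
      (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
  }
  return lag;
}

// Hypothetical status document reduced to the fields used above.
const lagSample = {
  members: [
    { name: "192.168.126.132:27017", state: 1, optimeDate: new Date("2018-07-17T07:18:48Z") },
    { name: "192.168.126.132:27018", state: 2, optimeDate: new Date("2018-07-17T07:18:48Z") },
    { name: "192.168.126.132:27019", state: 2, optimeDate: new Date("2018-07-17T07:18:43Z") },
  ],
};

console.log(replicationLagSeconds(lagSample));
```

In the mongo shell, ISODate values are JavaScript Date objects, so the same arithmetic should apply to a real status document.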
3. Deploy replication with authentication
kgcrs:PRIMARY> use admin
kgcrs:PRIMARY> db.createUser({"user":"root","pwd":"123","roles":["root"]})
[root@localhost ~]# vim /etc/mongod.conf    //edit each of the four configuration files//
....
security:
  keyFile: /usr/bin/kgcrskey1    //path to the key file//
  clusterAuthMode: keyFile    //authentication type//
[root@localhost ~]# vim /etc/mongod2.conf
[root@localhost ~]# vim /etc/mongod3.conf
[root@localhost ~]# vim /etc/mongod4.conf
[root@localhost bin]# echo "kgcrs key"> kgcrskey1    //create the key files for the 4 instances//
[root@localhost bin]# echo "kgcrs key"> kgcrskey2
[root@localhost bin]# echo "kgcrs key"> kgcrskey3
[root@localhost bin]# echo "kgcrs key"> kgcrskey4
[root@localhost bin]# chmod 600 kgcrskey{1..4}
[root@localhost bin]# mongod -f /etc/mongod.conf    //restart the 4 instances//
[root@localhost bin]# mongod -f /etc/mongod2.conf
[root@localhost bin]# mongod -f /etc/mongod3.conf
[root@localhost bin]# mongod -f /etc/mongod4.conf
[root@localhost bin]# mongo --port 27017    //connect to a standard node//
kgcrs:PRIMARY> show dbs    //the databases cannot be listed//
kgcrs:PRIMARY> rs.status()    //the replica set status cannot be viewed//
kgcrs:PRIMARY> use admin    //authenticate//
kgcrs:PRIMARY> db.auth("root","123")
kgcrs:PRIMARY> show dbs    //the databases can now be listed//
admin   0.000GB
config  0.000GB
local   0.000GB
kgcrs:PRIMARY> rs.status()    //the replica set status can now be viewed//
            "_id" : 0,
            "name" : "192.168.126.132:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 411,

            "_id" : 1,
            "name" : "192.168.126.132:27018",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 324,

            "_id" : 2,
            "name" : "192.168.126.132:27019",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 305,

            "_id" : 3,
            "name" : "192.168.126.132:27020",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 280,
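The key files above all contain the same short string, "kgcrs key"; what matters is that every member shares identical key content. For anything beyond a lab setup a long random key is the usual choice. The sketch below generates such content with Node.js's built-in crypto module. It runs in Node.js, not in the mongo shell, and the function name generateKeyFileContent is hypothetical. The 756-byte size is an assumption chosen so that the base64 text comes to 1008 characters, which stays within the 6 to 1024 character range that, as far as I know, MongoDB accepts for key files.

```javascript
const crypto = require("crypto");

// Produce random key material encoded with the base64 character set.
// byteCount random bytes become 4 * ceil(byteCount / 3) base64 characters.
function generateKeyFileContent(byteCount) {
  return crypto.randomBytes(byteCount).toString("base64");
}

// 756 bytes encode to 1008 base64 characters.
const key = generateKeyFileContent(756);
console.log(key.length);
```

The generated text would then be written to one file, copied unchanged to every instance, and protected with chmod 600 as in the steps above.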