千家信息网

MongoDB sharding分片

发表于:2025-01-21 作者:千家信息网编辑
千家信息网最后更新 2025年01月21日,背景当MongoDB存储海量的数据时,一台机器可能不足以存储数据,也可能不足以提供可接受的读写吞吐量。这时,我们就可以通过在多台机器上分割数据,使得数据库系统能存储和处理更多的数据。1、MongoDB
千家信息网最后更新 2025年01月21日MongoDB sharding分片

背景

当MongoDB存储海量的数据时,一台机器可能不足以存储数据,也可能不足以提供可接受的读写吞吐量。这时,我们就可以通过在多台机器上分割数据,使得数据库系统能存储和处理更多的数据。


1、MongoDB sharding简介

三种角色:

配置服务器(config):是一个独立的mongod进程,保存集群和分片的元数据,即各分片包含了哪些数据的信息。

路由服务器(mongos):起到一个路由的功能,供程序连接。本身不保存数据,在启动时从配置服务器加载集群信息.

分片服务器(sharding):是一个独立mongod进程,保存数据信息。可以是一台服务器,如果想要高可用也可以配置成副本集。


2、实验环境

两台机器的IP:

172.16.101.54 sht-sgmhadoopcm-01

172.16.101.55 sht-sgmhadoopnn-01


config server:

172.16.101.55:27017


mongos:

172.16.101.55:27018


sharding:

172.16.101.54:27017

172.16.101.54:27018

172.16.101.54:27019


2、启动config服务

修改配置服务器的配置文件,主要是参数clusterRole指定角色为configsvr

[root@sht-sgmhadoopnn-01 mongodb]# cat /etc/mongod27017.confsystemLog:   destination: file   path: "/usr/local/mongodb/log/mongod27017.log"   logAppend: true   storage:   dbPath: /usr/local/mongodb/data/db27017   journal:      enabled: true      processManagement:   fork: true   pidFilePath: /usr/local/mongodb/data/db27017/mongod27017.pidnet:   port: 27017   bindIp: 0.0.0.0   setParameter:   enableLocalhostAuthBypass: false   sharding:   clusterRole: configsvr   archiveMovedChunks: true

[root@sht-sgmhadoopnn-01 mongodb]# bin/mongod --config /etc/mongod27017.conf

warning: bind_ip of 0.0.0.0 is unnecessary; listens on all ips by default

about to fork child process, waiting until server is ready for connections.

forked process: 31033

child process started successfully, parent exiting


3、启动mongos服务

修改路由服务器的配置文件,主要是参数configDB指定config服务器的IP和port,不需要配置有关数据文件的信息,因为路由服务器不存储数据

[root@sht-sgmhadoopnn-01 mongodb]# cat /etc/mongod27018.confsystemLog:   destination: file   path: "/usr/local/mongodb/log/mongod27018.log"   logAppend: true   processManagement:   fork: true   pidFilePath: /usr/local/mongodb/data/db27018/mongod27018.pid   net:   port: 27018   bindIp: 0.0.0.0   setParameter:   enableLocalhostAuthBypass: false   sharding:   autoSplit: true   configDB: 172.16.101.55:27017   chunkSize: 64


[root@sht-sgmhadoopnn-01 mongodb]# bin/mongos --config /etc/mongod27018.conf

warning: bind_ip of 0.0.0.0 is unnecessary; listens on all ips by default

2018-11-10T18:57:13.705+0800 W SHARDING running with 1 config server should be done only for testing purposes and is not recommended for production

about to fork child process, waiting until server is ready for connections.

forked process: 31167

child process started successfully, parent exiting


4、启动sharding服务

就是一个普通的mongodb进程,普通的配置文件

[root@sht-sgmhadoopcm-01 mongodb]# cat /etc/mongod27017.confsystemLog:   destination: file   path: "/usr/local/mongodb/log/mongod27017.log"   logAppend: true   storage:   dbPath: /usr/local/mongodb/data/db27017   journal:      enabled: true      processManagement:   fork: true   pidFilePath: /usr/local/mongodb/data/db27017/mongod27017.pid   net:   port: 27017   bindIp: 0.0.0.0   setParameter:   enableLocalhostAuthBypass: false

[root@sht-sgmhadoopcm-01 mongodb]# bin/mongod --config /etc/mongod27017.conf

[root@sht-sgmhadoopcm-01 mongodb]# bin/mongod --config /etc/mongod27018.conf

[root@sht-sgmhadoopcm-01 mongodb]# bin/mongod --config /etc/mongod27019.conf


5、登陆mongos服务并添加sharding信息

[root@sht-sgmhadoopnn-01 mongodb]# bin/mongo --port=27018

mongos> sh.addShard("172.16.101.54:27017")

{ "shardAdded" : "shard0000", "ok" : 1 }

mongos> sh.addShard("172.16.101.54:27018")

{ "shardAdded" : "shard0001", "ok" : 1 }

mongos> sh.addShard("172.16.101.54:27019")

{ "shardAdded" : "shard0002", "ok" : 1 }


查看集群分片信息

mongos> sh.status()--- Sharding Status ---  sharding version: {    "_id" : 1,    "minCompatibleVersion" : 5,    "currentVersion" : 6,    "clusterId" : ObjectId("5be6b98a507b3e0370eb36b4")}  shards:    {  "_id" : "shard0000",  "host" : "172.16.101.54:27017" }    {  "_id" : "shard0001",  "host" : "172.16.101.54:27018" }    {  "_id" : "shard0002",  "host" : "172.16.101.54:27019" }  balancer:    Currently enabled:  yes    Currently running:  no    Failed balancer rounds in last 5 attempts:  0    Migration Results for the last 24 hours:        No recent migrations  databases:    {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
mongos> db.runCommand({listshards:1}){    "shards" : [        {            "_id" : "shard0000",            "host" : "172.16.101.54:27017"        },        {            "_id" : "shard0001",            "host" : "172.16.101.54:27018"        },        {            "_id" : "shard0002",            "host" : "172.16.101.54:27019"        }    ],    "ok" : 1}


6、开启分片

需要执行分片的库和集合,以及分片模式,分片模式分为两种hash和range

[root@sht-sgmhadoopnn-01 mongodb]# bin/mongo --port=27018

分片库是testdb

mongos> sh.enableSharding("testdb")

{ "ok" : 1 }


(1)hash分片模式测试

分片的集合是collection1,根据id进行hash分片

mongos> sh.shardCollection("testdb.collection1",{"_id":"hashed"})

{ "collectionsharded" : "testdb.collection1", "ok" : 1 }


共插入10个测试document

mongos> use testdb

switched to db testdb

mongos> for(var i=0;i<10;i++){db.collection1.insert({name:"jack"+i});}

WriteResult({ "nInserted" : 1 })

mongos> sh.status()--- Sharding Status ---  sharding version: {    "_id" : 1,    "minCompatibleVersion" : 5,    "currentVersion" : 6,    "clusterId" : ObjectId("5be6b98a507b3e0370eb36b4")}  shards:    {  "_id" : "shard0000",  "host" : "172.16.101.54:27017" }    {  "_id" : "shard0001",  "host" : "172.16.101.54:27018" }    {  "_id" : "shard0002",  "host" : "172.16.101.54:27019" }  balancer:    Currently enabled:  yes    Currently running:  no    Failed balancer rounds in last 5 attempts:  0    Migration Results for the last 24 hours:        2 : Success  databases:    {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }    {  "_id" : "test",  "partitioned" : false,  "primary" : "shard0000" }    {  "_id" : "testdb",  "partitioned" : true,  "primary" : "shard0000" }        testdb.collection1            shard key: { "_id" : "hashed" }            chunks:                shard0000    2                shard0001    2                shard0002    2            { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6148914691236517204") } on : shard0000 Timestamp(3, 2)            { "_id" : NumberLong("-6148914691236517204") } -->> { "_id" : NumberLong("-3074457345618258602") } on : shard0000 Timestamp(3, 3)            { "_id" : NumberLong("-3074457345618258602") } -->> { "_id" : NumberLong(0) } on : shard0001 Timestamp(3, 4)            { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("3074457345618258602") } on : shard0001 Timestamp(3, 5)            { "_id" : NumberLong("3074457345618258602") } -->> { "_id" : NumberLong("6148914691236517204") } on : shard0002 Timestamp(3, 6)            { "_id" : NumberLong("6148914691236517204") } -->> { "_id" : { "$maxKey" : 1 } } on : shard0002 Timestamp(3, 7)

查看每个sharding上的数据分布情况db.collection.stats()

mongos> db.collection1.stats(){    "sharded" : true,    "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",    "userFlags" : 1,    "capped" : false,    "ns" : "testdb.collection1",    "count" : 10,   #共十个document数据    "numExtents" : 3,    "size" : 480,    "storageSize" : 24576,    "totalIndexSize" : 49056,    "indexSizes" : {        "_id_" : 24528,        "_id_hashed" : 24528    },    "avgObjSize" : 48,    "nindexes" : 2,    "nchunks" : 6,    "shards" : {        "shard0000" : {            "ns" : "testdb.collection1",            "count" : 0,            "size" : 0,            "numExtents" : 1,            "storageSize" : 8192,            "lastExtentSize" : 8192,            "paddingFactor" : 1,            "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",            "userFlags" : 1,            "capped" : false,            "nindexes" : 2,            "totalIndexSize" : 16352,            "indexSizes" : {                "_id_" : 8176,                "_id_hashed" : 8176            },            "ok" : 1        },        "shard0001" : {            "ns" : "testdb.collection1",            "count" : 6,            "size" : 288,            "avgObjSize" : 48,            "numExtents" : 1,            "storageSize" : 8192,            "lastExtentSize" : 8192,            "paddingFactor" : 1,            "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",            "userFlags" : 1,            "capped" : false,            "nindexes" : 2,            "totalIndexSize" : 16352,            "indexSizes" : {                "_id_" : 8176,                "_id_hashed" : 8176            },            "ok" : 1        },        "shard0002" : {            "ns" : "testdb.collection1",            "count" : 4,            "size" : 192,            "avgObjSize" : 48,            "numExtents" : 1,            "storageSize" : 8192,            "lastExtentSize" : 8192,            "paddingFactor" : 1,            "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",            "userFlags" : 1,            "capped" : false,            "nindexes" : 2,            "totalIndexSize" : 16352,            "indexSizes" : {                "_id_" : 8176,                "_id_hashed" : 8176            },            "ok" : 1        }    },    "ok" : 1}

分别登陆sharding节点查看数据分布,和通过命令db.collection.stats()看到的结果一致

可以发现节点27017上没有数据,节点27018上有6个document,节点27019上有4个document,出现这种情况的原因可能是插入的数据量太小,没有分布均匀,数据量越大,分布越均匀。

[root@sht-sgmhadoopcm-01 mongodb]# bin/mongo 172.16.101.54:27017/testdb

> db.collection1.find()


[root@sht-sgmhadoopcm-01 mongodb]# bin/mongo 172.16.101.54:27018/testdb

> db.collection1.find()

{ "_id" : ObjectId("5be6c467e6467cc8077da816"), "name" : "jack1" }

{ "_id" : ObjectId("5be6c467e6467cc8077da817"), "name" : "jack2" }

{ "_id" : ObjectId("5be6c467e6467cc8077da818"), "name" : "jack3" }

{ "_id" : ObjectId("5be6c467e6467cc8077da81a"), "name" : "jack5" }

{ "_id" : ObjectId("5be6c467e6467cc8077da81c"), "name" : "jack7" }

{ "_id" : ObjectId("5be6c467e6467cc8077da81e"), "name" : "jack9" }


[root@sht-sgmhadoopcm-01 mongodb]# bin/mongo 172.16.101.54:27019/testdb

> db.collection1.find()

{ "_id" : ObjectId("5be6c467e6467cc8077da815"), "name" : "jack0" }

{ "_id" : ObjectId("5be6c467e6467cc8077da819"), "name" : "jack4" }

{ "_id" : ObjectId("5be6c467e6467cc8077da81b"), "name" : "jack6" }

{ "_id" : ObjectId("5be6c467e6467cc8077da81d"), "name" : "jack8" }


(2)range分片模式测试

分片的集合是collection2,根据name进行range分片

mongos> sh.shardCollection("testdb.collection2",{"name":1})

{ "collectionsharded" : "testdb.collection2", "ok" : 1 }

mongos> for(var i=0;i<1000;i++){db.collection2.insert({name:"jack"+i});}

WriteResult({ "nInserted" : 1 })

mongos> sh.status()--- Sharding Status ---  sharding version: {    "_id" : 1,    "minCompatibleVersion" : 5,    "currentVersion" : 6,    "clusterId" : ObjectId("5be6b98a507b3e0370eb36b4")}  shards:    {  "_id" : "shard0000",  "host" : "172.16.101.54:27017" }    {  "_id" : "shard0001",  "host" : "172.16.101.54:27018" }    {  "_id" : "shard0002",  "host" : "172.16.101.54:27019" }  balancer:    Currently enabled:  yes    Currently running:  no    Failed balancer rounds in last 5 attempts:  0    Migration Results for the last 24 hours:        4 : Success  databases:    {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }    {  "_id" : "test",  "partitioned" : false,  "primary" : "shard0000" }    {  "_id" : "testdb",  "partitioned" : true,  "primary" : "shard0000" }        testdb.collection1            shard key: { "_id" : "hashed" }            chunks:                shard0000    2                shard0001    2                shard0002    2            { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6148914691236517204") } on : shard0000 Timestamp(3, 2)            { "_id" : NumberLong("-6148914691236517204") } -->> { "_id" : NumberLong("-3074457345618258602") } on : shard0000 Timestamp(3, 3)            { "_id" : NumberLong("-3074457345618258602") } -->> { "_id" : NumberLong(0) } on : shard0001 Timestamp(3, 4)            { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("3074457345618258602") } on : shard0001 Timestamp(3, 5)            { "_id" : NumberLong("3074457345618258602") } -->> { "_id" : NumberLong("6148914691236517204") } on : shard0002 Timestamp(3, 6)            { "_id" : NumberLong("6148914691236517204") } -->> { "_id" : { "$maxKey" : 1 } } on : shard0002 Timestamp(3, 7)        testdb.collection2            shard key: { "name" : 1 }            chunks:                shard0000    1                shard0001    1                shard0002    1            { "name" : { "$minKey" : 1 } } -->> { "name" : "jack1" } on : shard0001 Timestamp(2, 0)            { "name" : "jack1" } -->> { "name" : "jack5" } on : shard0002 Timestamp(3, 0)            { "name" : "jack5" } -->> { "name" : { "$maxKey" : 1 } } on : shard0000 Timestamp(3, 1)


mongos> db.collection2.stats(){    "sharded" : true,    "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",    "userFlags" : 1,    "capped" : false,    "ns" : "testdb.collection2",    "count" : 1000,    "numExtents" : 6,    "size" : 48032,    "storageSize" : 221184,    "totalIndexSize" : 130816,    "indexSizes" : {        "_id_" : 65408,        "name_1" : 65408    },    "avgObjSize" : 48.032,    "nindexes" : 2,    "nchunks" : 3,    "shards" : {        "shard0000" : {            "ns" : "testdb.collection2",            "count" : 555,            "size" : 26656,            "avgObjSize" : 48,            "numExtents" : 3,            "storageSize" : 172032,            "lastExtentSize" : 131072,            "paddingFactor" : 1,            "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",            "userFlags" : 1,            "capped" : false,            "nindexes" : 2,            "totalIndexSize" : 65408,            "indexSizes" : {                "_id_" : 32704,                "name_1" : 32704            },            "ok" : 1        },        "shard0001" : {            "ns" : "testdb.collection2",            "count" : 1,            "size" : 48,            "avgObjSize" : 48,            "numExtents" : 1,            "storageSize" : 8192,            "lastExtentSize" : 8192,            "paddingFactor" : 1,            "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",            "userFlags" : 1,            "capped" : false,            "nindexes" : 2,            "totalIndexSize" : 16352,            "indexSizes" : {                "_id_" : 8176,                "name_1" : 8176            },            "ok" : 1        },        "shard0002" : {            "ns" : "testdb.collection2",            "count" : 444,            "size" : 21328,            "avgObjSize" : 48,            "numExtents" : 2,            "storageSize" : 40960,            "lastExtentSize" : 32768,            "paddingFactor" : 1,            "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",            "userFlags" : 1,            "capped" : false,            "nindexes" : 2,            "totalIndexSize" : 49056,            "indexSizes" : {                "_id_" : 24528,                "name_1" : 24528            },            "ok" : 1        }    },    "ok" : 1}

参考链接

Sharding

0