Self-Hosted MongoDB in Practice: MongoDB Sharded Clusters
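The second way to improve performance is to use servers of the same capacity and increase their number, which is generally called horizontal scaling (or scaling out).
When the data volume is large, we need to split the data across different machines to reduce the pressure on CPU, memory, and I/O. Sharding is the database technique for doing this.
MongoDB sharding is comparable to horizontal partitioning in MySQL. Broadly, a database can be scaled in two ways: vertical scaling and horizontal partitioning (sharding).
Vertical scaling: add more CPU, memory, disk space, and so on to a single server.
Horizontal partitioning: split the data into shards and serve it through a cluster that acts as one service.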
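A MongoDB sharded cluster consists of the following components:
- shard: each shard contains a subset of the sharded data. Each shard is deployed as a replica set.
- mongos: the mongos acts as a query router, providing an interface between client applications and the sharded cluster. Starting in MongoDB 4.4, mongos can support hedged reads to minimize latency.
- config servers: config servers store the cluster's metadata and configuration settings.
MongoDB shards data at the collection level, distributing a collection's data across the shards in the cluster.
In a production cluster, make sure the data is redundant and the system is highly available. For a production-grade sharded cluster, consider the following points:
- Deploy the config servers as a 3-member replica set
- Deploy each shard as a 3-member replica set
- Deploy one or more mongos routers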
Environment preparation
- Hosts used in the demonstration:
Hostname | IP | Roles |
mongo01.tyun.cn | 10.20.20.19 | mongos1(27017),config1(27000),shard0 primary(27010) |
mongo02.tyun.cn | 10.20.20.11 | mongos2(27017),config2(27000),shard0 secondary(27010) |
mongo03.tyun.cn | 10.20.20.41 | mongos3(27017),config3(27000),shard0 secondary(27010) |
mongo04.tyun.cn | 10.20.20.14 | shard1 primary(27010) |
mongo05.tyun.cn | 10.20.20.53 | shard1 secondary(27010) |
mongo06.tyun.cn | 10.20.20.61 | shard1 secondary(27010) |
mongo07.tyun.cn | 10.20.20.62 | shard2 primary(27010) |
mongo08.tyun.cn | 10.20.20.89 | shard2 secondary(27010) |
mongo09.tyun.cn | 10.20.20.99 | shard2 secondary(27010) |
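If you do not have enough machines at hand when following this walkthrough, you can run several roles on one machine (using different port numbers); it is not necessary to dedicate one machine to each role.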
- Environment topology (figure omitted)
Here we use static name resolution through /etc/hosts; if possible, use a DNS service to resolve the hostnames instead. The /etc/hosts file is as follows:
10.20.20.19 mongo01.tyun.cn cfg1.tyun.cn mongos1.tyun.cn
10.20.20.11 mongo02.tyun.cn cfg2.tyun.cn mongos2.tyun.cn
10.20.20.41 mongo03.tyun.cn cfg3.tyun.cn mongos3.tyun.cn
10.20.20.14 mongo04.tyun.cn
10.20.20.53 mongo05.tyun.cn
10.20.20.61 mongo06.tyun.cn
10.20.20.62 mongo07.tyun.cn
10.20.20.89 mongo08.tyun.cn
10.20.20.99 mongo09.tyun.cn
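The layout above can also be written down as data and checked against the three production guidelines (a 3-member config server replica set, 3 members per shard, at least one mongos). The following is a minimal sketch in plain JavaScript (Node.js) and is not part of the original setup; the shape of the plan object and the checkTopology helper are hypothetical names introduced for illustration. In the legacy mongo shell, printjson() would take the place of console.log().

```javascript
// Hypothetical helper: returns a list of problems found in a planned topology.
// The plan shape (configServers, shards, mongos) is an assumption of this sketch.
function checkTopology(plan) {
  const problems = [];
  if (plan.configServers.length < 3) {
    problems.push("config server replica set has fewer than 3 members");
  }
  Object.keys(plan.shards).forEach(function (name) {
    if (plan.shards[name].length < 3) {
      problems.push(name + " has fewer than 3 members");
    }
  });
  if (plan.mongos.length < 1) {
    problems.push("at least one mongos router is required");
  }
  return problems;
}

// Hostnames and ports taken from the tables above.
const clusterPlan = {
  configServers: ["cfg1.tyun.cn:27000", "cfg2.tyun.cn:27000", "cfg3.tyun.cn:27000"],
  shards: {
    shard1: ["mongo04.tyun.cn:27010", "mongo05.tyun.cn:27010", "mongo06.tyun.cn:27010"],
    shard2: ["mongo07.tyun.cn:27010", "mongo08.tyun.cn:27010", "mongo09.tyun.cn:27010"],
  },
  mongos: ["mongos1.tyun.cn:27017", "mongos2.tyun.cn:27017", "mongos3.tyun.cn:27017"],
};

// An empty array means the plan meets all three guidelines.
console.log(checkTopology(clusterPlan));
```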
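Configuring the Config Servers
01 Prepare the configuration file
On each of the 3 config server nodes, create the configuration file /etc/mongo-cfg.conf with the following content:
# Configuration file for cfg1.tyun.cn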
(venv36) [root@mongo01 ~]# cat /etc/mongo-cfg.conf
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongo-cfg.log
storage:
  dbPath: /var/lib/mongocfg
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1
processManagement:
  fork: true # fork and run in background
  pidFilePath: /var/run/mongodb/mongo-cfg.pid # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27000
  bindIp: cfg1.tyun.cn
sharding:
  clusterRole: configsvr
replication:
  replSetName: config
# Configuration file for cfg2.tyun.cn
(venv36) [root@mongo02 ~]# cat /etc/mongo-cfg.conf
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongo-cfg.log
storage:
  dbPath: /var/lib/mongocfg
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1
processManagement:
  fork: true # fork and run in background
  pidFilePath: /var/run/mongodb/mongo-cfg.pid # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27000
  bindIp: cfg2.tyun.cn
sharding:
  clusterRole: configsvr
replication:
  replSetName: config
# Configuration file for cfg3.tyun.cn
(venv36) [root@mongo03 ~]# cat /etc/mongo-cfg.conf
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongo-cfg.log
storage:
  dbPath: /var/lib/mongocfg
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1
processManagement:
  fork: true # fork and run in background
  pidFilePath: /var/run/mongodb/mongo-cfg.pid # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27000
  bindIp: cfg3.tyun.cn
sharding:
  clusterRole: configsvr
replication:
  replSetName: config
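The three files above differ only in net.bindIp, so they could be generated from a single template instead of being edited by hand. The following is a rough sketch in plain JavaScript (Node.js); renderConfigServerConf is a hypothetical helper name, and the settings are copied from the files shown above rather than from any recommended baseline.

```javascript
// Sketch: render /etc/mongo-cfg.conf for one config server.
// Only bindIp varies per host; everything else mirrors the files shown above.
function renderConfigServerConf(bindHost) {
  return [
    "systemLog:",
    "  destination: file",
    "  logAppend: true",
    "  path: /var/log/mongodb/mongo-cfg.log",
    "storage:",
    "  dbPath: /var/lib/mongocfg",
    "  journal:",
    "    enabled: true",
    "  wiredTiger:",
    "    engineConfig:",
    "      cacheSizeGB: 1",
    "processManagement:",
    "  fork: true",
    "  pidFilePath: /var/run/mongodb/mongo-cfg.pid",
    "  timeZoneInfo: /usr/share/zoneinfo",
    "net:",
    "  port: 27000",
    "  bindIp: " + bindHost,
    "sharding:",
    "  clusterRole: configsvr",
    "replication:",
    "  replSetName: config",
  ].join("\n") + "\n";
}

// Print one file body per config server host.
["cfg1.tyun.cn", "cfg2.tyun.cn", "cfg3.tyun.cn"].forEach(function (host) {
  console.log("# " + host);
  console.log(renderConfigServerConf(host));
});
```

Keeping the template in one place makes it harder for the three files to drift apart when a setting such as cacheSizeGB changes.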
02 Start the Config Servers
Run the following command on each of the 3 config server nodes:
[root@mongo01 ~]# systemctl start mongocfg
[root@mongo02 ~]# systemctl start mongocfg
[root@mongo03 ~]# systemctl start mongocfg
Check that the processes have started successfully:
(venv36) [root@mongo01 ~]# ansible -i hosts 'cfg' -m shell -a "systemctl status mongocfg" |grep "Active: active (running)"
Active: active (running) since Fri 2022-08-05 05:24:56 UTC; 1min 4s ago
Active: active (running) since Fri 2022-08-05 05:25:25 UTC; 35s ago
Active: active (running) since Fri 2022-08-05 05:25:36 UTC; 24s ago
03 Initialize the Config Servers
Log in to the first node. No user or password has been created yet, so you can log in without specifying a password.
(venv36) [root@mongo01 ~]# mongo cfg1.tyun.cn:27000
MongoDB shell version v4.4.15
connecting to: mongodb://cfg1.tyun.cn:27000/test?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("651fb6a5-9e7e-43f9-91ee-1ae6a2b3365f") }
MongoDB server version: 4.4.15
>
> show dbs
> use test
switched to db test
> db.test.insert({a: 1})
WriteCommandError({
"ok" : 0,
"errmsg" : "command insert requires authentication",
"code" : 13,
"codeName" : "Unauthorized"
})
Apart from creating a user, nothing else is allowed at this point. So the first thing to do is create a user and password:
> db.createUser({user: "root", pwd: "root123", roles: [{role: "root", db: "admin" }]})
Next, initialize the Config Server replica set:
(venv36) [root@mongo01 ~]# mongo -u root -p --host cfg1.tyun.cn:27000 --authenticationDatabase admin
MongoDB shell version v4.4.15
Enter password:
connecting to: mongodb://cfg1.tyun.cn:27000/?authSource=admin&compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("a1827479-b741-4f8b-be49-5ca0be4852aa") }
MongoDB server version: 4.4.15
---
The server generated these startup warnings when booting:
2022-08-05T06:30:24.135+00:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
---
---
Enable MongoDB's free cloud-based monitoring service, which will then receive and display
metrics about your deployment (disk utilization, CPU, operation statistics, etc).
The monitoring data will be available on a MongoDB website with a unique URL accessible to you
and anyone you share the URL with. MongoDB may use this information to make product
improvements and to suggest MongoDB products and deployment options to you.
To enable free monitoring, run the following command: db.enableFreeMonitoring()
To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
>
> rs.initiate({
_id: "config",
"members" : [
{
"_id": 0,
"host" : "cfg1.tyun.cn:27000"
},
{
"_id": 1,
"host" : "cfg2.tyun.cn:27000"
},
{
"_id": 2,
"host" : "cfg3.tyun.cn:27000"
}
]
});
{ "ok" : 1 }
Wait about 10 seconds for the 3 Config Servers to elect a primary.
Note how the prompt changes:
config:SECONDARY>
config:PRIMARY>
......
config:PRIMARY> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
config:PRIMARY> use admin
switched to db admin
config:PRIMARY> show users
{
"_id" : "admin.admin",
"userId" : UUID("0c0d5bc1-062c-4204-963f-bba842ffda7d"),
"user" : "admin",
"db" : "admin",
"roles" : [
{
"role" : "dbAdminAnyDatabase",
"db" : "admin"
},
{
"role" : "userAdminAnyDatabase",
"db" : "admin"
}
],
"mechanisms" : [
"SCRAM-SHA-1",
"SCRAM-SHA-256"
]
}
{
"_id" : "admin.root",
"userId" : UUID("aa54a433-e9a2-452b-bd1d-d6ef54f4a46e"),
"user" : "root",
"db" : "admin",
"roles" : [
{
"role" : "root",
"db" : "admin"
}
],
"mechanisms" : [
"SCRAM-SHA-1",
"SCRAM-SHA-256"
]
}
At this point, the Config Server setup is complete.
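The document passed to rs.initiate() above follows a regular pattern: the replica set name plus one member entry per host, with _id values counting up from 0. As a small illustration, it could be built from a host list as sketched below in plain JavaScript (Node.js); buildReplSetConfig is a hypothetical helper, not a MongoDB API.

```javascript
// Sketch: build the document that rs.initiate() expects from a list of hosts.
// Member _id values are assigned in list order, starting at 0.
function buildReplSetConfig(setName, hosts) {
  return {
    _id: setName,
    members: hosts.map(function (host, index) {
      return { _id: index, host: host };
    }),
  };
}

const configReplSet = buildReplSetConfig("config", [
  "cfg1.tyun.cn:27000",
  "cfg2.tyun.cn:27000",
  "cfg3.tyun.cn:27000",
]);

// Shows the same structure that was typed by hand in the transcript above.
console.log(JSON.stringify(configReplSet, null, 2));
```

In the mongo shell, the resulting object would presumably be passed straight to rs.initiate(); the same helper would also fit the shard replica sets, which use the same document shape.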
Configuring the Replica Set
For the replica set configuration, refer to the Replica Set chapter. The nodes of shard 1 are:
- mongo04.tyun.cn:27010
- mongo05.tyun.cn:27010
- mongo06.tyun.cn:27010
Configuring mongos
01 Prepare the mongos configuration file
sharding:
  configDB: <configReplSetName>/cfg1.example.net:27019,cfg2.example.net:27019
net:
  bindIp: localhost,<hostname(s)|ip address(es)>
A relatively complete configuration file (using mongos1 as an example):
[root@mongo01 ~]# cat /etc/mongos.conf
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongos.log
processManagement:
  fork: true
  pidFilePath: /var/run/mongodb/mongos.pid
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27017
  bindIp: mongos1.tyun.cn
# security:
#   authorization: enabled
#   keyFile: /etc/mongod.keyfile
sharding:
  configDB: config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000
02 Start mongos
[root@mongo01 ~]# mongos \
--bind_ip mongos1.tyun.cn \
--port 27017 \
--logpath /var/log/mongodb/mongos.log \
--configdb config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000 \
--fork
[root@mongo02 ~]# mongos \
--bind_ip mongos2.tyun.cn \
--port 27017 \
--logpath /var/log/mongodb/mongos.log \
--configdb config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000 \
--fork
[root@mongo03 ~]# mongos \
--bind_ip mongos3.tyun.cn \
--port 27017 \
--logpath /var/log/mongodb/mongos.log \
--configdb config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000 \
--fork
mongos can also be started from a configuration file:
[root@mongo01 ~]# cat /etc/mongos.conf
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongos.log
processManagement:
  fork: true # fork and run in background
  pidFilePath: /var/run/mongodb/mongos.pid # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27017
  bindIp: mongos1.tyun.cn
security:
  # authorization: enabled
  keyFile: /etc/mongo.keyfile
sharding:
  configDB: config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000
The start command is as follows:
[root@mongo01 ~]# mongos -f /etc/mongos.conf
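Both the --configdb option and the sharding.configDB setting take a string of the form <replSetName>/<host:port>,<host:port>,..., and sh.addShard() uses the same form for shards. A short sketch in plain JavaScript (Node.js) shows how such a string is assembled; replSetSeedString is a hypothetical helper name used only for this illustration.

```javascript
// Sketch: join a replica set name and its members into "name/host1,host2,...".
function replSetSeedString(setName, hosts) {
  if (hosts.length === 0) {
    throw new Error("at least one host is required");
  }
  return setName + "/" + hosts.join(",");
}

const configDB = replSetSeedString("config", [
  "cfg1.tyun.cn:27000",
  "cfg2.tyun.cn:27000",
  "cfg3.tyun.cn:27000",
]);

// prints "config/cfg1.tyun.cn:27000,cfg2.tyun.cn:27000,cfg3.tyun.cn:27000"
console.log(configDB);
```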
Adding shard1 to the cluster
Add the first shard, shard1, to the cluster:
[root@mongo01 ~]# mongo --host mongos1.tyun.cn:27017
MongoDB shell version v4.4.15
connecting to: mongodb://mongos1.tyun.cn:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("f1ade2c4-c071-4e8a-9fbb-f1093e9d9753") }
MongoDB server version: 4.4.15
---
The server generated these startup warnings when booting:
2022-08-05T09:17:59.537+00:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
2022-08-05T09:17:59.537+00:00: You are running this process as the root user, which is not recommended
---
mongos> show dbs
admin 0.000GB
config 0.000GB
mongos>
mongos> sh.addShard("shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010");
{
"shardAdded" : "shard1",
"ok" : 1,
"operationTime" : Timestamp(1659691403, 8),
"$clusterTime" : {
"clusterTime" : Timestamp(1659691403, 8),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("62ecc377dc19b0487fcd62e6")
}
shards:
{ "_id" : "shard1", "host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010", "state" : 1 }
active mongoses:
"4.4.15" : 3
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
Creating a sharded collection
Next, we enable sharding on a test database named test and shard the collection test.shard on a hashed _id key.
mongos> sh.enableSharding("test");
{
"ok" : 1,
"operationTime" : Timestamp(1659755432, 7),
"$clusterTime" : {
"clusterTime" : Timestamp(1659755432, 7),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
mongos> sh.shardCollection("test.shard", {_id: 'hashed'});
{
"collectionsharded" : "test.shard",
"collectionUUID" : UUID("329f4308-bff9-453a-bec2-7f3a757d95dd"),
"ok" : 1,
"operationTime" : Timestamp(1659755452, 13),
"$clusterTime" : {
"clusterTime" : Timestamp(1659755452, 13),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("62ecc377dc19b0487fcd62e6")
}
shards:
{ "_id" : "shard1", "host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010", "state" : 1 }
active mongoses:
"4.4.15" : 3
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
config.system.sessions
shard key: { "_id" : 1 }
unique: false
balancing: true
chunks:
shard1 1024
too many chunks to print, use verbose if you want to force print
{ "_id" : "test", "primary" : "shard1", "partitioned" : true, "version" : { "uuid" : UUID("8c333889-11b2-4de0-9f54-f0c56b622124"), "lastMod" : 1 } }
test.shard
shard key: { "_id" : "hashed" }
unique: false
balancing: true
chunks:
shard1 2 // note this output
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong(0) } on : shard1 Timestamp(1, 0)
{ "_id" : NumberLong(0) } -->> { "_id" : { "$maxKey" : 1 } } on : shard1 Timestamp(1, 1)
We can see that shard1 holds 2 chunks.
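Each chunk printed by sh.status() covers a range of hashed key values, with the lower bound inclusive and the upper bound exclusive. The sketch below, in plain JavaScript (Node.js), models the two chunks shown above and looks up which chunk a given hashed value falls into. It takes the hashed value as input and does not reproduce MongoDB's hash function; -Infinity and Infinity stand in for $minKey and $maxKey, and findChunk is a hypothetical helper. Real hashed values are 64-bit integers, which JavaScript numbers cannot represent exactly beyond 2^53, so this is only a conceptual model.

```javascript
// The two chunks of test.shard as printed above; bounds are [min, max).
const shardChunks = [
  { min: -Infinity, max: 0, shard: "shard1" },
  { min: 0, max: Infinity, shard: "shard1" },
];

// Sketch: return the chunk whose range contains the hashed shard key value.
function findChunk(chunkList, hashedValue) {
  for (const chunk of chunkList) {
    if (hashedValue >= chunk.min && hashedValue < chunk.max) {
      return chunk;
    }
  }
  return null; // no chunk covers the value (should not happen for a full range)
}

// A negative hash lands in the first chunk, zero and above in the second.
console.log(findChunk(shardChunks, -12345));
console.log(findChunk(shardChunks, 0));
```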
Insert test data:
mongos> use test
switched to db test
mongos> for (var i = 0; i < 100000; i++) {
db.shard.insert({i: i});
}
mongos> db.shard.find().limit(10)
{ "_id" : ObjectId("62eddc26f659b8344f42c837"), "i" : 0 }
{ "_id" : ObjectId("62eddc26f659b8344f42c838"), "i" : 1 }
{ "_id" : ObjectId("62eddc26f659b8344f42c839"), "i" : 2 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83a"), "i" : 3 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83b"), "i" : 4 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83c"), "i" : 5 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83d"), "i" : 6 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83e"), "i" : 7 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83f"), "i" : 8 }
{ "_id" : ObjectId("62eddc26f659b8344f42c840"), "i" : 9 }
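The loop above sends 100000 single-document inserts through mongos. A batched variant would likely need far fewer round trips. The sketch below, in plain JavaScript (Node.js), only prepares the batches; makeBatches is a hypothetical helper and the batch size of 1000 is an arbitrary choice for illustration.

```javascript
// Sketch: split the documents { i: 0 } .. { i: total - 1 } into fixed-size batches.
function makeBatches(total, batchSize) {
  const batches = [];
  for (let start = 0; start < total; start += batchSize) {
    const end = Math.min(start + batchSize, total);
    const batch = [];
    for (let i = start; i < end; i++) {
      batch.push({ i: i });
    }
    batches.push(batch);
  }
  return batches;
}

const docBatches = makeBatches(100000, 1000);
console.log(docBatches.length); // prints 100

// In the mongo shell, each batch could then be written with one call
// (assumption: run against the test database used above):
//   docBatches.forEach(function (batch) { db.shard.insertMany(batch); });
```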
At this point we can also log in to the shard1 replica set and look at the data (find the primary node and log in to it):
[root@mongo01 ~]# mongo --host mongo05.tyun.cn:27010
MongoDB shell version v4.4.15
connecting to: mongodb://mongo05.tyun.cn:27010/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("b14a4b9b-f6b9-48d5-980a-a7fd3bbf2d73") }
MongoDB server version: 4.4.15
---
shard1:PRIMARY> show dbs
admin 0.000GB
config 0.000GB
local 0.004GB
test 0.006GB
shard1:PRIMARY> use test
switched to db test
shard1:PRIMARY> db.shard
db.shard
shard1:PRIMARY> db.shard.find().limit(6)
{ "_id" : ObjectId("62eddc26f659b8344f42c837"), "i" : 0 }
{ "_id" : ObjectId("62eddc26f659b8344f42c838"), "i" : 1 }
{ "_id" : ObjectId("62eddc26f659b8344f42c839"), "i" : 2 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83a"), "i" : 3 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83b"), "i" : 4 }
{ "_id" : ObjectId("62eddc26f659b8344f42c83c"), "i" : 5 }
shard1:PRIMARY>
Adding shard2 to the cluster
For the replica set configuration, refer to the Replica Set chapter. The nodes of shard 2 are:
- mongo07.tyun.cn:27010
- mongo08.tyun.cn:27010
- mongo09.tyun.cn:27010
Verify the shard2 replica set:
[root@mongo01 ~]# mongo --host mongo07.tyun.cn:27010
shard2:PRIMARY> rs.status()
{
"set" : "shard2",
"date" : ISODate("2022-08-06T03:31:26.564Z"),
"myState" : 1,
"term" : NumberLong(1),
"syncSourceHost" : "",
"syncSourceId" : -1,
"heartbeatIntervalMillis" : NumberLong(2000),
"majorityVoteCount" : 2,
"writeMajorityCount" : 2,
"votingMembersCount" : 3,
"writableVotingMembersCount" : 3,
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1659756685, 1),
"t" : NumberLong(1)
},
"lastCommittedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1659756685, 1),
"t" : NumberLong(1)
},
"readConcernMajorityWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
"appliedOpTime" : {
"ts" : Timestamp(1659756685, 1),
"t" : NumberLong(1)
},
"durableOpTime" : {
"ts" : Timestamp(1659756685, 1),
"t" : NumberLong(1)
},
"lastAppliedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
"lastDurableWallTime" : ISODate("2022-08-06T03:31:25.927Z")
},
"lastStableRecoveryTimestamp" : Timestamp(1659756625, 4),
"electionCandidateMetrics" : {
"lastElectionReason" : "electionTimeout",
"lastElectionDate" : ISODate("2022-08-06T03:30:25.877Z"),
"electionTerm" : NumberLong(1),
"lastCommittedOpTimeAtElection" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"lastSeenOpTimeAtElection" : {
"ts" : Timestamp(1659756615, 1),
"t" : NumberLong(-1)
},
"numVotesNeeded" : 2,
"priorityAtElection" : 1,
"electionTimeoutMillis" : NumberLong(10000),
"numCatchUpOps" : NumberLong(0),
"newTermStartDate" : ISODate("2022-08-06T03:30:25.915Z"),
"wMajorityWriteAvailabilityDate" : ISODate("2022-08-06T03:30:26.890Z")
},
"members" : [
{
"_id" : 0,
"name" : "mongo07.tyun.cn:27010",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 213,
"optime" : {
"ts" : Timestamp(1659756685, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2022-08-06T03:31:25Z"),
"lastAppliedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
"lastDurableWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"electionTime" : Timestamp(1659756625, 1),
"electionDate" : ISODate("2022-08-06T03:30:25Z"),
"configVersion" : 1,
"configTerm" : -1,
"self" : true,
"lastHeartbeatMessage" : ""
},
{
"_id" : 1,
"name" : "mongo08.tyun.cn:27010",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 71,
"optime" : {
"ts" : Timestamp(1659756675, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1659756675, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2022-08-06T03:31:15Z"),
"optimeDurableDate" : ISODate("2022-08-06T03:31:15Z"),
"lastAppliedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
"lastDurableWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
"lastHeartbeat" : ISODate("2022-08-06T03:31:25.890Z"),
"lastHeartbeatRecv" : ISODate("2022-08-06T03:31:24.933Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncSourceHost" : "mongo07.tyun.cn:27010",
"syncSourceId" : 0,
"infoMessage" : "",
"configVersion" : 1,
"configTerm" : -1
},
{
"_id" : 2,
"name" : "mongo09.tyun.cn:27010",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 71,
"optime" : {
"ts" : Timestamp(1659756675, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1659756675, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2022-08-06T03:31:15Z"),
"optimeDurableDate" : ISODate("2022-08-06T03:31:15Z"),
"lastAppliedWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
"lastDurableWallTime" : ISODate("2022-08-06T03:31:25.927Z"),
"lastHeartbeat" : ISODate("2022-08-06T03:31:25.890Z"),
"lastHeartbeatRecv" : ISODate("2022-08-06T03:31:24.872Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncSourceHost" : "mongo07.tyun.cn:27010",
"syncSourceId" : 0,
"infoMessage" : "",
"configVersion" : 1,
"configTerm" : -1
}
],
"ok" : 1,
"$clusterTime" : {
"clusterTime" : Timestamp(1659756685, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
},
"operationTime" : Timestamp(1659756685, 1)
}
shard2:PRIMARY>
Next, add shard2 to the cluster (connect to any mongos):
[root@mongo01 ~]# mongo --host mongos1.tyun.cn:27017
MongoDB shell version v4.4.15
connecting to: mongodb://mongos1.tyun.cn:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("1bb0a6ed-dad1-4440-95cb-2f60e0be506f") }
MongoDB server version: 4.4.15
---
The server generated these startup warnings when booting:
2022-08-05T09:17:59.537+00:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
2022-08-05T09:17:59.537+00:00: You are running this process as the root user, which is not recommended
---
mongos>
mongos> sh.addShard("shard2/mongo07.tyun.cn:27010,mongo08.tyun.cn:27010,mongo09.tyun.cn:27010");
{
"shardAdded" : "shard2",
"ok" : 1,
"operationTime" : Timestamp(1659756859, 4),
"$clusterTime" : {
"clusterTime" : Timestamp(1659756859, 4),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("62ecc377dc19b0487fcd62e6")
}
shards:
{ "_id" : "shard1", "host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010", "state" : 1 }
{ "_id" : "shard2", "host" : "shard2/mongo07.tyun.cn:27010,mongo08.tyun.cn:27010,mongo09.tyun.cn:27010", "state" : 1 }
active mongoses:
"4.4.15" : 3
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
31 : Success
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
config.system.sessions
shard key: { "_id" : 1 }
unique: false
balancing: true
chunks:
shard1 994
shard2 30
too many chunks to print, use verbose if you want to force print
{ "_id" : "test", "primary" : "shard1", "partitioned" : true, "version" : { "uuid" : UUID("8c333889-11b2-4de0-9f54-f0c56b622124"), "lastMod" : 1 } }
test.shard
shard key: { "_id" : "hashed" }
unique: false
balancing: true
chunks:
shard1 1
shard2 1
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong(0) } on : shard2 Timestamp(2, 0)
{ "_id" : NumberLong(0) } -->> { "_id" : { "$maxKey" : 1 } } on : shard1 Timestamp(2, 1)
mongos>
The output shows that one of the 2 chunks of test.shard on shard1 has been moved to shard2. This is MongoDB's automatic balancer at work.
How many documents does each shard hold?
mongos> status = db.shard.stats()
// Document count on shard1
mongos> status.shards.shard1.count
50184
// Document count on shard2
mongos> status.shards.shard2.count
49816
// Compare the document counts of the two shards
mongos> status.shards.shard1.count - status.shards.shard2.count
368
Judging by the document counts on the two shards, the data is distributed almost evenly.
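To put a number on "almost evenly", the per-shard counts can be turned into fractions of the total. A minimal sketch in plain JavaScript (Node.js) follows; shardShares is a hypothetical helper, and the input uses the two counts reported above.

```javascript
// Sketch: convert per-shard document counts into fractions of the total.
function shardShares(counts) {
  const names = Object.keys(counts);
  const total = names.reduce(function (sum, name) {
    return sum + counts[name];
  }, 0);
  const shares = {};
  names.forEach(function (name) {
    shares[name] = total === 0 ? 0 : counts[name] / total;
  });
  return shares;
}

// Counts taken from the stats() output above.
const shares = shardShares({ shard1: 50184, shard2: 49816 });
Object.keys(shares).forEach(function (name) {
  console.log(name + ": " + (shares[name] * 100).toFixed(2) + "%");
});
```

With these counts each shard holds close to half of the 100000 documents, which is what a hashed shard key is expected to produce.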
mongos> use admin
switched to db admin
mongos> db.runCommand({listShards: 1})
{
"shards" : [
{
"_id" : "shard1",
"host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010",
"state" : 1
},
{
"_id" : "shard2",
"host" : "shard2/mongo07.tyun.cn:27010,mongo08.tyun.cn:27010,mongo09.tyun.cn:27010",
"state" : 1
}
],
"ok" : 1,
"operationTime" : Timestamp(1660384940, 3),
"$clusterTime" : {
"clusterTime" : Timestamp(1660384940, 3),
"signature" : {
"hash" : BinData(0,"kAzOU7gYu5MWoNSYPEZanw1KYd4="),
"keyId" : NumberLong("7128287226089177110")
}
}
}
Removing a shard:
mongos> db.adminCommand( { removeShard: "shard2" } )
{
"msg" : "draining started successfully",
"state" : "started",
"shard" : "shard2",
"note" : "you need to drop or movePrimary these databases",
"dbsToMove" : [
"testdb"
],
"ok" : 1,
"operationTime" : Timestamp(1660384982, 2),
"$clusterTime" : {
"clusterTime" : Timestamp(1660384982, 2),
"signature" : {
"hash" : BinData(0,"ToGrQJZSWqSfiFwe/Hop2eykOAM="),
"keyId" : NumberLong("7128287226089177110")
}
}
}
Check the draining status:
mongos> db.adminCommand( { removeShard: "shard2" } )
{
"msg" : "draining ongoing",
"state" : "ongoing", // in progress
"remaining" : {
"chunks" : NumberLong(406), // chunks remaining
"dbs" : NumberLong(1),
"jumboChunks" : NumberLong(0)
},
"note" : "you need to drop or movePrimary these databases",
"dbsToMove" : [
"testdb"
],
"ok" : 1,
"operationTime" : Timestamp(1660385198, 21),
"$clusterTime" : {
"clusterTime" : Timestamp(1660385198, 21),
"signature" : {
"hash" : BinData(0,"HVDmppA+MhUor9a72JKDjWErLKo="),
"keyId" : NumberLong("7128287226089177110")
}
}
}
// Check again
mongos> db.adminCommand( { removeShard: "shard2" } )
{
"msg" : "draining ongoing",
"state" : "ongoing",
"remaining" : {
"chunks" : NumberLong(345), // note the decreasing count
"dbs" : NumberLong(1),
"jumboChunks" : NumberLong(0)
},
"note" : "you need to drop or movePrimary these databases",
"dbsToMove" : [
"testdb"
],
"ok" : 1,
"operationTime" : Timestamp(1660385328, 3),
"$clusterTime" : {
"clusterTime" : Timestamp(1660385328, 3),
"signature" : {
"hash" : BinData(0,"Wi6BxDNErUjsHYTdVpvbiEyGUrw="),
"keyId" : NumberLong("7128287226089177110")
}
}
}
// Check again after a while
mongos> db.adminCommand( { removeShard: "shard2" } )
{
"msg" : "draining ongoing",
"state" : "ongoing",
"remaining" : {
"chunks" : NumberLong(87), // note the decreasing count
"dbs" : NumberLong(1),
"jumboChunks" : NumberLong(0)
},
"note" : "you need to drop or movePrimary these databases",
"dbsToMove" : [
"testdb"
],
"ok" : 1,
"operationTime" : Timestamp(1660385870, 3),
"$clusterTime" : {
"clusterTime" : Timestamp(1660385870, 6),
"signature" : {
"hash" : BinData(0,"R5LJzYTNv+s+aJaiJZVZ9arr+84="),
"keyId" : NumberLong("7128287226089177110")
}
}
}
Move the database to another shard:
mongos> db.adminCommand( { movePrimary: "testdb", to: "shard0" })
{
"ok" : 1,
"operationTime" : Timestamp(1660386323, 42852),
"$clusterTime" : {
"clusterTime" : Timestamp(1660386323, 42852),
"signature" : {
"hash" : BinData(0,"wpJWCc5pzEghDEgRjXl9NiA9Gxs="),
"keyId" : NumberLong("7128287226089177110")
}
}
}
// Check the status again
mongos> db.adminCommand( { removeShard: "shard2" } )
{
"msg" : "removeshard completed successfully",
"state" : "completed",
"shard" : "shard2",
"ok" : 1,
"operationTime" : Timestamp(1660386353, 3),
"$clusterTime" : {
"clusterTime" : Timestamp(1660386353, 3),
"signature" : {
"hash" : BinData(0,"EoqSZ6a4MbSrQcBHH6rVAI1DtyA="),
"keyId" : NumberLong("7128287226089177110")
}
}
}
mongos> db.runCommand({listShards: 1})
{
"shards" : [
{
"_id" : "shard1",
"host" : "shard1/mongo04.tyun.cn:27010,mongo05.tyun.cn:27010,mongo06.tyun.cn:27010",
"state" : 1
},
{
"_id" : "shard0",
"host" : "shard0/mongo01.tyun.cn:27010,mongo02.tyun.cn:27010,mongo03.tyun.cn:27010",
"state" : 1
}
],
"ok" : 1,
"operationTime" : Timestamp(1660386367, 23),
"$clusterTime" : {
"clusterTime" : Timestamp(1660386367, 23),
"signature" : {
"hash" : BinData(0,"yBy7UjBzOh1RIbm4fj/q+Docptg="),
"keyId" : NumberLong("7128287226089177110")
}
}
}
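The removal above was tracked by re-running removeShard by hand until the state changed from "started" or "ongoing" to "completed". That polling could be wrapped in a small loop, sketched below in plain JavaScript (Node.js). waitForDrain is a hypothetical helper: it receives the command and the pause as callbacks, so the demo can run against canned responses instead of a live cluster. As the transcript shows, draining only completes once the databases listed in dbsToMove have been moved with movePrimary, which this loop does not do.

```javascript
// Sketch: poll until removeShard reports "completed"; returns the number of polls.
// runRemoveShard() must return the command result; pause() waits between polls.
function waitForDrain(runRemoveShard, pause, maxPolls) {
  for (let attempt = 1; attempt <= maxPolls; attempt++) {
    const result = runRemoveShard();
    if (result.ok !== 1) {
      throw new Error("removeShard failed: " + JSON.stringify(result));
    }
    if (result.state === "completed") {
      return attempt;
    }
    pause();
  }
  throw new Error("shard still draining after " + maxPolls + " polls");
}

// Demo with canned responses shaped like the output above.
const cannedResponses = [
  { ok: 1, state: "started" },
  { ok: 1, state: "ongoing", remaining: { chunks: 87 } },
  { ok: 1, state: "completed" },
];
let nextResponse = 0;
const pollsNeeded = waitForDrain(
  function () { return cannedResponses[nextResponse++]; },
  function () {}, // no real waiting in the demo
  10
);
console.log(pollsNeeded); // prints 3

// Possible mongo shell usage (assumption, not taken from the original):
//   waitForDrain(
//     function () { return db.adminCommand({ removeShard: "shard2" }); },
//     function () { sleep(10000); },
//     360
//   );
```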
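Summary
Sharding offers a great deal of flexibility.
However, some operations are subject to restrictions in a sharded environment.
The most important ones are highlighted in the following list:
01 The group() command does not work. Use aggregate() and the aggregation framework, or mapReduce(), instead.
02 The db.eval() command does not work; for security reasons it should be disabled in most cases anyway.
03 The $isolated option for update operations does not work. This is a feature that is missing in sharded environments. The $isolated option of update() guarantees that, when we update multiple documents at once, other readers and writers never see some documents with the new values while others still have the old values. In an unsharded environment this is implemented by holding a global write lock and/or serializing the operation onto a single thread, so that no other thread or operation can access the documents affected by the update() while it runs. This implementation is not performant and allows no concurrency, which is why the $isolated operator is not permitted in sharded environments.
04 The $snapshot query operator is not supported. The $snapshot operator on a find() cursor prevents a document from appearing more than once in the results because it was moved to a different location on disk after an update. $snapshot is expensive and usually not a hard requirement. The alternative is to query using an index on a field whose keys do not change for the duration of the query.
05 If a query does not include the shard key, an index cannot cover it. Results in a sharded environment then come from disk rather than from the index alone. The only exception is a query that filters only on the built-in _id field and returns only the _id field; in that case MongoDB can still cover the query with the built-in index.
06 update() and remove() work differently. In a sharded environment, update() and remove() operations must include the _id or the shard key of the documents to be affected; otherwise the mongos router would have to scan across all shards, which is very time-consuming.
07 A unique index across shards must contain the shard key as a prefix of the index. In other words, to enforce uniqueness across shards, the index has to follow the same data distribution that MongoDB uses for sharding.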
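08 In MongoDB 4.2 and earlier, the shard key may not exceed 512 bytes; MongoDB 4.4 removes this limit. The shard key index must be an ascending index on the shard key fields, optionally followed by other fields, or a hashed index on the shard key.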
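Items 06 and 07 both come down to simple checks on field names. The sketch below, in plain JavaScript (Node.js), expresses them as two functions. It is a deliberately simplified model: canTargetSingleDocument and hasShardKeyPrefix are hypothetical helpers, they look only at top-level field names, and they ignore details such as operators and nested fields that the real mongos routing logic takes into account.

```javascript
// Item 06 (simplified): a filter names the target document(s) if it contains
// _id or every field of the shard key.
function canTargetSingleDocument(filter, shardKeyFields) {
  const has = function (field) {
    return Object.prototype.hasOwnProperty.call(filter, field);
  };
  return has("_id") || shardKeyFields.every(has);
}

// Item 07 (simplified): a unique index is acceptable if its leading fields
// are exactly the shard key fields, in the same order.
function hasShardKeyPrefix(indexFields, shardKeyFields) {
  if (indexFields.length < shardKeyFields.length) {
    return false;
  }
  return shardKeyFields.every(function (field, position) {
    return indexFields[position] === field;
  });
}

// Example with a hypothetical shard key { region: 1, userId: 1 }.
const shardKey = ["region", "userId"];
console.log(canTargetSingleDocument({ region: "eu", userId: 7 }, shardKey)); // true
console.log(canTargetSingleDocument({ email: "a@example.com" }, shardKey));   // false
console.log(hasShardKeyPrefix(["region", "userId", "email"], shardKey));      // true
console.log(hasShardKeyPrefix(["email", "region", "userId"], shardKey));      // false
```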