MongoDB sharding: an end-to-end example on macOS

This article has three sections. First, we set up all the mongos and mongod processes. Next, we use this setup to shard some actual data. Finally, we look at how sharding helps increase availability.

1. Create two mongod config server processes:

mkdir -p cfg1 cfg2
mongod --configsvr --replSet Config_server_replicas --logpath "cfg1.log" --dbpath cfg1 --port 20000 &
mongod --configsvr --replSet Config_server_replicas --logpath "cfg2.log" --dbpath cfg2 --port 20001 &

Note: the configsvr flag indicates that these mongod instances are config servers, which the mongos server will use to store the cluster's metadata.

Log in to the MongoDB instance on port 20000 and combine these two servers into a single replica set:

mongo --port 20000

config = {
  _id: "Config_server_replicas",
  members: [
    {_id: 0, host: "localhost:20000"},
    {_id: 1, host: "localhost:20001"}
  ]
};

rs.initiate(config);

Exit the mongo shell.

After a few seconds, check rs.status(): one of the two servers will have been elected primary and the other will be a secondary replica.
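
As a quick sanity check, you can print each member's state from the same mongo shell (a minimal sketch; the field names are those returned by rs.status()):

rs.status().members.forEach(function (m) {
  print(m.name + " -> " + m.stateStr);   // expect one PRIMARY and one SECONDARY
});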

2. Create the first shard's server processes:

mkdir -p shrad1_rep1 shrad1_rep2
mongod --shardsvr --replSet Tarun_shrad1 --logpath "shrad1rep1.log" --dbpath shrad1_rep1 --port 40000 &
mongod --shardsvr --replSet Tarun_shrad1 --logpath "shrad1rep2.log" --dbpath shrad1_rep2 --port 40001 &

Note: the shardsvr option marks these mongod instances as shard servers.

mongo --port 40000

config = {
  _id: "Tarun_shrad1",
  members: [
    {_id: 0, host: "localhost:40000"},
    {_id: 1, host: "localhost:40001"}
  ]
};

Run this:
rs.initiate(config);

Exit the mongo shell.

3. Create the second shard's server processes:

mkdir -p shrad2_rep1 shrad2_rep2
mongod --shardsvr --replSet Tarun_shrad2 --logpath "shrad2rep1.log" --dbpath shrad2_rep1 --port 41000 &
mongod --shardsvr --replSet Tarun_shrad2 --logpath "shrad2rep2.log" --dbpath shrad2_rep2 --port 41001 &

mongo --port 41000

config = {
  _id: "Tarun_shrad2",
  members: [
    {_id: 0, host: "localhost:41000"},
    {_id: 1, host: "localhost:41001"}
  ]
};

Run this:
rs.initiate(config);

Exit the mongo shell.

4. Create a mongos server:

mongos --configdb Config_server_replicas/localhost:20000,localhost:20001 --bind_ip 0.0.0.0 --port 50000 &

Note: the configdb option specifies which config server replica set this mongos server will use.

5. Add the shards to mongos:

Log in to the mongos server:

mongo --port 50000

sh.addShard("Tarun_shrad1/localhost:40000,localhost:40001");

sh.status();

sh.addShard("Tarun_shrad2/localhost:41000,localhost:41001");

sh.status();

The setup should now be complete. Use ps aux | grep mongo to check that all of the above processes are running, and log in to each replica set from the earlier steps to verify rs.status().
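
From the mongos shell you can also ask the cluster directly which shards it knows about (a minimal check; listShards is an admin command and must be run against mongos):

db.adminCommand({ listShards: 1 });   // should list Tarun_shrad1 and Tarun_shrad2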

Note: applications should always communicate with the mongos instances; they shouldn't connect to the shard mongod instances directly.
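
For example, an application driver or the mongo shell would point at the mongos on port 50000; the database name testDB below is simply the one used later in this article:

mongo mongodb://localhost:50000/testDB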

Now that the cluster is in place, we can shard some actual data.

1. Mark the database as shardable:

use testDB -> This is the database

db.createCollection("pubg_scores"); -> This is the collection

Use this command to enable sharding at the database level:

sh.enableSharding(“testDB”);
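
To confirm it took effect (the exact output format varies by MongoDB version), sh.status() run from the mongos shell should now list testDB as a sharded (partitioned) database:

sh.status();   // testDB should appear under the databases section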

2. Shard the actual collection:

sh.shardCollection("testDB.pubg_scores", {"title": "hashed"});
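
Sharding on a hashed key builds the supporting index for you if it does not already exist; a quick way to verify from the same mongos shell is:

db.pubg_scores.getIndexes();   // should include a { "title": "hashed" } index alongside the default _id index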

3. Add some data into the collection:

Insert entries into the collection and observe the shard distribution:

db.pubg_scores.insertOne({"title": "Tarun1"});

db.pubg_scores.insertOne({"title": "Tarun2"});

db.pubg_scores.insertOne({"title": "Tarun3"});

db.pubg_scores.getShardDistribution(); -> Tells you how the data is distributed, i.e. which of the two shards the above 3 entries were written to.
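
You can also ask the cluster which shard serves a particular document. This is a sketch using explain(); its output layout differs between versions, but it names the shard(s) involved:

db.pubg_scores.find({"title": "Tarun1"}).explain();   // the winning plan names the shard (e.g. Tarun_shrad1) holding this document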

To cross-check that this is working, you can log in to the shard mongod instances and query for the data using db.pubg_scores.find();
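
For example, connecting to the first shard's replica set (a sketch; if you land on a secondary member you may need rs.slaveOk() or rs.secondaryOk(), depending on your shell version):

mongo --port 40000
use testDB
db.pubg_scores.find();   // shows only the documents whose hashed title values landed on Tarun_shrad1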

Replica sets give us redundancy by duplicating the same data onto multiple servers; the secondary replicas serve as backups. In MongoDB we can even serve reads from the secondary replicas, distributing read load away from the primary.
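
As an illustration, the mongo shell exposes a read-preference helper (a minimal sketch; drivers offer an equivalent option):

db.getMongo().setReadPref("secondaryPreferred");   // subsequent reads from this shell prefer secondary members
db.pubg_scores.find();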

Through sharding, we divide the data into pieces (which is what "shard" means) and store them on different mongo servers, so the load is spread across multiple machines. Even if one server goes down, only a subset of users is affected and the other shards keep functioning as before, which increases the availability of our application.

Thanks for reading. Please leave a clap if this article helped you, and leave a comment if you have any questions.

