Your browser doesn't support the features required by impress.js, so you are presented with a simplified version of this presentation.

For the best experience please use the latest Chrome, Safari or Firefox browser.

Large tree

5 boring performance improvements in MongoDB 3.4 that matter

Henrik Ingo
Senior Performance Engineer

HighLoad++, 2017

v2

@h_ingo

5 boring topics

  1. Faster initial sync for replica nodes
  2. Moving balancer process from the mongos to config server
  3. Wire compression
  4. Removing throttling from shard re-balancing
  5. Parallel migrations between shards
  6. Future improvements to expect

1. Faster initial sync

Initial sync recap

MongoDB sharded cluster with a chunk migration happening

At scale, ops become more interesting...

MongoDB 3.4: Faster initial sync

Main improvement: rebuild all indexes in parallel

We test this with a 1 TB sample of a real world DB

MongoDB 3.2 initial sync: ~ 1 week

MongoDB 3.4 initial sync: < 24 h

Balancer recap

MongoDB sharded cluster with a chunk migration happening

A MongoDB cluster:

2. Moving balancer to config server

MongoDB sharded cluster with a chunk migration happening

3. Compression of Wire Protocol (SERVER-3018)

4. Balancer throttling

Throttling & balancer

Configurability of the balancer

Example

use config
db.settings.update(
   { "_id" : "balancer" },
   { $set : { "_secondaryThrottle" : false ,
              "writeConcern": { "w": 1 } } },
   { upsert : true }
)

Let's benchmark this

YCSB load with 30 threads x 100 batchsize

3 shards * 3 replicas

MongoDB 3.2.4

AWS c3.large with standard EBS*

Throughput: 5k - 25k docs / sec **

 

*) Test is short enough that boosting provides good EBS perf.
**) This is a very low throughput, which is the point of this test.
This test demonstrates a common effect where sharded clusters become unbalanced
even with a simple insert workload when not using hashed shard key.

$ ./bin/ycsb load mongodb-async -s -P workloads/throttlingtest -threads 30
$ cat workloads/throttlingtest 
mongodb.url=mongodb://throttl-4.henrikingo3.8300.mongodbdns.com:27017/test
mongodb.batchsize=100
recordcount=50000000


operationcount=1000
workload=com.yahoo.ycsb.workloads.CoreWorkload

readallfields=true

readproportion=0.5
updateproportion=0.5
scanproportion=0
insertproportion=0

requestdistribution=zipfian

What did we learn?

5. Parallel chunk migrations

MongoDB sharded cluster with a chunk migration happening

As with initial sync, adding a new shard to a cluster is easy. Just add an empty shard, and the balancer starts moving data to it.

At scale, ops become more interesting...

MongoDB sharded cluster with 10 shards MongoDB sharded cluster with 20 shards and 1 migration happening

MongoDB 3.2: Only 1 migration at a time.
All new shards but one remain empty.
The DBA just needs to wait... but how long?

MongoDB 3.4: N/2 migrations at a time.
Double nr of shards in same time as adding 1 shard.

MongoDB sharded cluster with 20 shards and 10 migrations happening

Future work