
Measuring performance variability of EC2

Henrik Ingo (cc by)
HighLoad++ 2017 v1

Note: This presentation works fully only in Chrome. Firefox will just show the cloud image on the front; Chrome will zoom in behind it, so that you see the content behind it.

@h_ingo

Agenda

MongoDB performance testing

100+ Projects
1500+ Hosts
100+ Build Variants
400k hours/month
Performance testing = 5% of those hours
(a bigger share of the $$$)

github.com/evergreen-ci

Microbenchmarks

System performance test
EC2 (this talk)

c3.8xlarge, SSD

The goal

Repeatable results

(NOT max performance)

Assumption | True / False
Dedicated instance = more stable performance | Not tested
Placement groups minimize network latency & variance | Not used yet
Different availability zones have different hardware | Not tested yet
For write heavy tests, noise comes from disk | False
Ephemeral (SSD) disks have least variance | False
There are good and bad EC2 instances | False
Just use i2 instances (better SSD) | False (True in theory)
You can't use cloud for performance testing | False

We tested many aspects of EC2 and of our own system. To help you follow the presentation, I will reveal up front the assumptions that were made when the system was first built, and how those assumptions fared in our testing.

In the rest of the presentation I will share how we tested different EC2 configurations and came to these conclusions.

It's common to see engineers making design decisions based on things they read on the internet. As you can see, our system included LOTS of them! I call it witchcraft: old wives' tales, not based in science. The point of this presentation is that this is a bad idea! There are no shortcuts. Assume nothing. Measure everything.

Method for testing noise in our EC2 clusters

What is noise?

noise = (max - min) / median

Goal is to minimize this single metric
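
To make the metric concrete, here is a minimal sketch of computing it over repeated runs, grouped by test and thread level. The input format and the numbers are hypothetical, purely for illustration.

```python
from collections import defaultdict
from statistics import median

def noise(samples):
    # noise = (max - min) / median, over repeated runs of the same test
    return (max(samples) - min(samples)) / median(samples)

# Hypothetical throughput results (ops/sec) from repeated runs,
# keyed by (test name, thread level).
runs = [
    {"test": "insert_vector", "threads": 16, "ops_per_sec": 52000},
    {"test": "insert_vector", "threads": 16, "ops_per_sec": 48000},
    {"test": "insert_vector", "threads": 16, "ops_per_sec": 50000},
]

grouped = defaultdict(list)
for run in runs:
    grouped[(run["test"], run["threads"])].append(run["ops_per_sec"])

for (test, threads), samples in grouped.items():
    print(f"{test} @ {threads} threads: noise = {noise(samples):.1%}")
```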

There are good and bad EC2 instances: False

(min - median - max) for each test & thread level

mmapv1 left, wiredTiger right

insert_vector, insert_ttl, index_build highest; jtrue lowest

Ephemeral (SSD) disks have least variance: False
Remote EBS disks have unreliable performance: False (with PIOPS)
Just use i2 instances (better SSD): False (True in theory)
i2.8xlarge has much more RAM, and the wiredTiger cacheSizeGB default is 50% of RAM.
This caused checkpointing issues not seen on c3.8xlarge.
At this point we switched to c3.8xlarge + EBS PIOPS.
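
A back-of-the-envelope sketch of why the switch mattered, assuming the "50% of RAM" rule of thumb from the slide above; the RAM figures are the published EC2 specs for these instance types.

```python
# Rough illustration only: assumes the "cacheSizeGB defaults to ~50% of RAM"
# rule of thumb from the slide above.
INSTANCE_RAM_GB = {
    "c3.8xlarge": 60,    # published EC2 spec
    "i2.8xlarge": 244,   # published EC2 spec
}

for instance, ram_gb in INSTANCE_RAM_GB.items():
    default_cache_gb = 0.5 * ram_gb
    print(f"{instance}: ~{default_cache_gb:.0f} GB default wiredTiger cache")

# A ~122 GB cache lets far more dirty data accumulate between checkpoints
# than a ~30 GB cache, consistent with the checkpointing issues seen only
# on i2.8xlarge.
```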

CPU tuning

For write heavy tests, noise comes from disk: False
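
The slides don't list the individual tuning steps here; as one common example (an assumption on my part, not necessarily what was done in this system), pinning the cpufreq governor to "performance" removes one source of CPU-side variance.

```python
import glob

# Hypothetical example of one CPU tuning step: pin every core's frequency
# scaling governor to "performance" so clock speed doesn't fluctuate
# between runs. Requires root; paths are the standard Linux sysfs ones.
for path in glob.glob("/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor"):
    with open(path, "w") as f:
        f.write("performance")
```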

Network noise tests
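
The deck doesn't spell out the procedure on this slide; one simple way to quantify network noise between two instances is to apply the same (max - min) / median metric to repeated round-trip measurements. The peer address and port below are hypothetical.

```python
import socket
import time
from statistics import median

def rtt_samples(host, port, count=100):
    # Crude network probe: measure TCP connect latency to a peer instance.
    samples = []
    for _ in range(count):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=5):
            pass
        samples.append(time.perf_counter() - start)
        time.sleep(0.1)
    return samples

samples = rtt_samples("10.0.0.12", 27017)  # hypothetical peer in the cluster
print(f"network noise: {(max(samples) - min(samples)) / median(samples):.1%}")
```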

Canary tests
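
The slide doesn't define them here; a canary test is generally a fixed workload whose result should only move when the environment moves, so a sketch of the idea (names, data and threshold hypothetical) looks like this:

```python
from statistics import median

def canary_ok(history, new_result, max_noise=0.10):
    # If a fixed canary workload drifts from its historical median by more
    # than the allowed noise, suspect the environment rather than the code.
    baseline = median(history)
    deviation = abs(new_result - baseline) / baseline
    return deviation <= max_noise, deviation

# Hypothetical canary history (ops/sec) and a new run to validate.
ok, deviation = canary_ok([101000, 99000, 100500, 100000], new_result=87000)
print(f"canary ok: {ok}, deviation: {deviation:.1%}")
```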

Summary

Assume nothing. Measure everything.

You can't use cloud for performance testing: False

Image credits: l2f1 @ Flickr (CC BY), belenko @ Flickr (CC BY), pinkmoose @ Flickr (CC BY), jo7ueb @ OpenClipart.org (PD), ivanlasso @ OpenClipart.org (PD)