Tuesday, July 16, 2019

MySQL Master Replication Crash Safety Part #5a: making things faster without reducing durability - using better hardware

This is a follow-up post in the MySQL Master Replication Crash Safety series.  In the previous posts, we explored the consequences of reducing durability on masters (different data inconsistencies after an OS crash depending on the replication type) and the performance boost associated with this configuration (benchmark results obtained on Google Cloud Platform / GCP).  The consequences are summarised in the introduction of Part #4, and the tests are the subject of that post.  In that post, I also mentioned that my results for high durability are limited by the sync latencies of GCP persistent disks.  I have since found a system with better latencies, so I am able to present new results.  That system is a vm in Amazon Web Services (AWS) with local SSD.

The previously presented results for the co-located tests (the details are in Part #4) are the following (I am omitting replication results as they are relatively similar):
  • ~220 TPS for high durability (sync_binlog = 1 and trx_commit = 1)
  • ~6280 TPS for low durability (sync_binlog = 0 and trx_commit = 2)
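
As a reminder, trx_commit above is shorthand for the innodb_flush_log_at_trx_commit variable.  A minimal sketch of switching between the two profiles at runtime (both variables are dynamic; the connection options are omitted and would depend on your environment):

    # High durability: sync the binary log and the InnoDB redo log to disk
    # on every transaction commit.
    mysql -e "SET GLOBAL sync_binlog = 1;
              SET GLOBAL innodb_flush_log_at_trx_commit = 1;"

    # Low durability: let the OS decide when to flush the binary log, and
    # only write the redo log at commit, syncing it about once per second.
    mysql -e "SET GLOBAL sync_binlog = 0;
              SET GLOBAL innodb_flush_log_at_trx_commit = 2;"
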
Before continuing, I have to tell you that this post has an annexe: Under the Hood.  Benchmarking is a complex art and reporting results accurately is even harder.  If all the details were put in a single article, it would make a very long post.  The links to the annexe should satisfy readers eager for more details.

So my new test environment with better sync latencies is a vm in AWS with local SSD.  The write latencies I am observing with ioping are ~50 us (they were ~590 us in my previous tests with GCP persistent disks).  GCP also has local SSDs, but their write latencies are 2.27 ms and 6.22 ms for SCSI and NVMe respectively, which is worse than persistent disks (if you find this weird, you are not alone: I have a ticket open with Google about this; and as Mark Callaghan commented on my last post, this might be because GCP local SSDs are only fast for reads and still slow for writes).  The details of those numbers are in the GCP vs AWS latencies for local SSD section of the annexe.
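
For reference, below is a sketch of the kind of ioping invocation used for such a measurement (the exact command I used is in the annexe; the flags and path shown here are assumptions): -W issues write requests, -D bypasses the page cache with direct I/O, -s sets the request size, and -c the number of requests.

    # Measure write latency on the disk backing the given directory
    # (ioping creates and writes to a temporary file in it).
    ioping -W -D -s 4k -c 100 /path/to/mysql/datadir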

Before giving the results with faster disks, we have to think about the consequences of running a database on local SSDs in a cloud environment.  Local SSDs are not persistent (they are volatile), so their content can be lost in several failure scenarios, including a failure of the underlying disk, an instance stop (normal shutdown or crash), or an instance termination.  If you choose to run MySQL on such volatile storage, you need to design your infrastructure to cope with those failure scenarios (and you might simply want to run MySQL with low durability instead, unless you also want to take advantage of the very fast reads of local SSDs, but that is a different benchmark).  A solution could be failing-over to slaves, but this is not trivial to implement, so I would be very careful about deploying such a volatile architecture in production.

So now the results !  With a vm in AWS, still using the same dbdeployer and sysbench tests as in Part #4 (a sketch of the commands is shown below, after the results), I get the following results:
  • Co-located, high durability: ~3790 TPS
  • Co-located, low durability: ~11,190 TPS
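
The shape of those tests, as a sketch (the exact parameters are in the annexe of Part #4; the socket path, the msandbox credentials, which are dbdeployer defaults, and the table size are assumptions):

    # Prepare a single table, then run the insert-only workload.
    sysbench oltp_insert --mysql-socket=/tmp/mysql_sandbox5727.sock \
      --mysql-user=msandbox --mysql-password=msandbox --mysql-db=test \
      --tables=1 --table-size=1000000 prepare
    sysbench oltp_insert --mysql-socket=/tmp/mysql_sandbox5727.sock \
      --mysql-user=msandbox --mysql-password=msandbox --mysql-db=test \
      --tables=1 --table-size=1000000 --threads=1 --time=60 run
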
The high durability numbers are very different from my previous tests (a GCP persistent SSD disk was giving ~220 TPS).  In an AWS environment with local SSD, I get a throughput that is ~17 times better than GCP with persistent SSD.  The AWS environment is very close to running MySQL on-premises on physical servers with SSDs or with a battery-backed-up RAID cache.  If you plan to move to the cloud from on-premises, make sure you take this into account.

If you plan to move MySQL from on-premises to the cloud,
make sure to take higher disk sync latencies into account !

I also have ~78% faster results (11,190 vs 6,280 TPS) with low durability in AWS compared to GCP.  Faster syncs should not influence the result of a low durability configuration with an in-memory benchmark (I am running the oltp_insert benchmark, details in the sysbench section of the annexe of Part #4), so we have to find another explanation.  My guess is that AWS has faster vms than GCP, and this is confirmed by more tests whose results are in the GCP vs AWS vm performance section of the annexe.

AWS and GCP vm have different performance characteristics !
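
The annexe describes how I compared the vms; as an illustration of the idea, sysbench's built-in CPU test is a quick way to compare raw single-thread vm speed (this specific command is an assumption, not necessarily what I ran for the annexe):

    # Report events per second for prime verification up to 20,000 with one
    # thread; a higher number means a faster vm for this CPU-bound workload.
    sysbench cpu --cpu-max-prime=20000 --threads=1 --time=30 run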

AWS and GCP also have different pricing for the instances I am using for my tests:
  • a GCP n1-standard-4 instance in europe-west4 is $0.2092 per hour
  • an AWS m5d.xlarge instance in Ireland is $0.252 per hour
This comparison is somewhat unfair as the AWS instance includes 150 GB of local/volatile SSD.  In GCP, persistent SSDs are $0.187 per month per GB, and local SSDs (volatile) are $0.088 per month per GB, so 150 GB costs $0.0390 per hour for persistent SSD and $0.0183 per hour for local SSD.  I will let you decide which one is better/cheaper considering the difference in vm speed and sync latencies.
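
The per-hour disk prices above come from spreading the monthly per-GB price over the month; a sketch of the arithmetic (assuming a 720-hour month, which matches the rounding above):

    # Convert GCP's monthly per-GB disk prices to a per-hour cost for 150 GB.
    awk 'BEGIN { printf "persistent SSD: $%.4f/hour\n", 150 * 0.187 / 720 }'
    awk 'BEGIN { printf "local SSD:      $%.4f/hour\n", 150 * 0.088 / 720 }'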

This is all I have for now; the next post (5b) should be about multi-threaded benchmarks.

2 comments:

  1. Hi Jeff:

    Thanks for writing about the benefits of running MySQL in local SSD environments and the precautions needed in such an environment. At ScaleGrid, we support MySQL deployments running on local SSDs that not only offer great performance but also ensure data reliability in case of failures. For reference and more details, I recently wrote a blog post about this: https://scalegrid.io/blog/how-to-improve-mysql-aws-performance-2x-over-amazon-rds-at-the-same-cost/

    Thanks!

    Replies
    1. Hi Prasad, to be perfectly clear, I do not recommend running MySQL on local SSDs. This post is more a demonstration of better TPS when syncs are fast than a description of a reference architecture. Moreover, in the case of the benchmark presented in this post, running with sync_binlog = 0 and trx_commit = 2 would also give good TPS, with fewer drawbacks than local SSDs IMHO. A situation where local SSDs might be useful is one where the latency to remote storage penalises reads, but that is not the most common use case IMHO.
      Cheers, JFG
