Tuesday, July 16, 2019

MySQL Master Replication Crash Safety Part #5: faster without reducing durability (under the hood)

This post is a sister post to MySQL Master Replication Crash Safety Part #5: making things faster without reducing durability.  There is no introduction or conclusion to this post, only landing sections: reading this post without its context is not recommended. You should start with the main post and come back here for more details.

And this Part #5 of the series has many sub-parts.  So far, the following has been published:
I will edit this post as I publish the next sub-parts, because I prefer to have only one annexe for the many sub-parts of Part #5.

GCP vs AWS for latencies of local SSD


GCP has two types of local SSD: SCSI and NVMe.  AWS only has one unnamed type.  Below are write latencies for the three flavours of local SSD.  You can see that AWS is much better: its average latency is 48 us, against 2.27 ms for GCP NVMe and 6.22 ms for GCP SCSI.
GCP SCSI$ ioping -WWW -c 200 -i 0 -L -s 4K /dev/sdd
[...]
--- /dev/sdd (block device 375 GiB) ioping statistics ---
199 requests completed in 1.24 s, 796 KiB written, 160 iops, 642.8 KiB/s
generated 200 requests in 1.25 s, 800 KiB, 159 iops, 638.1 KiB/s
min/avg/max/mdev = 5.40 ms / 6.22 ms / 7.20 ms / 329.5 us

GCP NVMe$ ioping -WWW -c 200 -i 0 -L -s 4K /dev/nvme0n1
[...]
--- /dev/nvme0n1 (block device 375 GiB) ioping statistics ---
199 requests completed in 451.0 ms, 796 KiB written, 441 iops, 1.72 MiB/s
generated 200 requests in 459.5 ms, 800 KiB, 435 iops, 1.70 MiB/s
min/avg/max/mdev = 1.61 ms / 2.27 ms / 3.09 ms / 603.6 us

AWS$ ioping -WWW -c 200 -i 0 -L -s 4K /dev/nvme3n1
[...]
--- /dev/nvme3n1 (block device 139.7 GiB) ioping statistics ---
200 requests completed in 11.7 ms, 20.8 k iops, 81.1 MiB/s
min/avg/max/mdev = 44 us / 48 us / 69 us / 2 us
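
The numbers above can be cross-checked with a little arithmetic. The sketch below assumes the requests are issued strictly one at a time with no pause between them (my reading of the `-i 0` runs), in which case IOPS should be close to the inverse of the average latency. The dictionary and function names are made up for this illustration; the values are copied from the three summaries.

```python
# Average latencies and reported IOPS copied from the ioping summaries above.
RUNS = {
    "GCP SCSI": {"avg_latency_s": 6.22e-3, "reported_iops": 160},
    "GCP NVMe": {"avg_latency_s": 2.27e-3, "reported_iops": 441},
    "AWS": {"avg_latency_s": 48e-6, "reported_iops": 20800},
}


def serial_iops(avg_latency_s):
    """IOPS of a strictly serial workload: one request per average latency."""
    return 1.0 / avg_latency_s


def latency_ratio(slow, fast):
    """How many times higher the average latency of `slow` is than `fast`."""
    return RUNS[slow]["avg_latency_s"] / RUNS[fast]["avg_latency_s"]


if __name__ == "__main__":
    for name, run in RUNS.items():
        estimate = serial_iops(run["avg_latency_s"])
        print(f"{name}: estimated {estimate:.0f} iops, "
              f"reported {run['reported_iops']} iops")
    for name in ("GCP SCSI", "GCP NVMe"):
        print(f"{name} average latency is {latency_ratio(name, 'AWS'):.0f}x AWS")
```

The estimates land within a few iops of what ioping reports, which suggests the throughput differences are entirely explained by per-request latency: GCP SCSI is roughly 130 times slower than AWS per write, and GCP NVMe roughly 47 times.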


GCP vs AWS for VM performance


As I am getting higher TPS for low durability in AWS than in GCP, I wanted to evaluate VM speed.  The first place I checked is /proc/cpuinfo, and the results are below.  We can see that AWS CPUs have a higher clock speed and higher BogoMips than GCP CPUs.
GCP$ grep -e processor -e model.name -e bogomips /proc/cpuinfo
processor       : 0
model name      : Intel(R) Xeon(R) CPU @ 2.00GHz
bogomips        : 4000.34
processor       : 1
model name      : Intel(R) Xeon(R) CPU @ 2.00GHz
bogomips        : 4000.34
processor       : 2
model name      : Intel(R) Xeon(R) CPU @ 2.00GHz
bogomips        : 4000.34
processor       : 3
model name      : Intel(R) Xeon(R) CPU @ 2.00GHz
bogomips        : 4000.34

AWS$ grep -e processor -e model.name -e bogomips /proc/cpuinfo
processor       : 0
model name      : Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
bogomips        : 4999.99
processor       : 1
model name      : Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
bogomips        : 4999.99
processor       : 2
model name      : Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
bogomips        : 4999.99
processor       : 3
model name      : Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
bogomips        : 4999.99
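
To compare the two outputs above at a glance, here is a minimal sketch that pulls the nominal clock and the BogoMips value out of cpuinfo-style text. The function name and the `SAMPLE` string are hypothetical; the sample contains one processor entry from each output above, and the parser assumes the clock appears as `@ x.xxGHz` in the model name, which is true here but not guaranteed on every CPU.

```python
import re


def summarize_cpuinfo(text):
    """Return (model name, nominal clock in MHz, bogomips) for each processor."""
    models = re.findall(r"^model name\s*:\s*(.+)$", text, flags=re.MULTILINE)
    bogomips = re.findall(r"^bogomips\s*:\s*([\d.]+)", text, flags=re.MULTILINE)
    rows = []
    for model, bogo in zip(models, bogomips):
        # The nominal clock is only available if the model name embeds it.
        clock = re.search(r"@\s*([\d.]+)GHz", model)
        mhz = float(clock.group(1)) * 1000 if clock else None
        rows.append((model.strip(), mhz, float(bogo)))
    return rows


# One entry from the GCP output and one from the AWS output above.
SAMPLE = """\
processor       : 0
model name      : Intel(R) Xeon(R) CPU @ 2.00GHz
bogomips        : 4000.34
processor       : 0
model name      : Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
bogomips        : 4999.99
"""

if __name__ == "__main__":
    for model, mhz, bogo in summarize_cpuinfo(SAMPLE):
        print(f"{model}: {mhz:.0f} MHz, {bogo} bogomips, ratio {bogo / mhz:.2f}")
```

In both outputs the BogoMips value is about twice the nominal clock in MHz, so here it carries no information beyond the clock speed itself.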
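From the Wikipedia page for BogoMips, we can read that "[BogoMips] is not usable for performance comparisons among different CPUs".  So let's do sysbench CPU and memory tests.  The results are below.  In both cases, AWS is better than GCP: 969.03 vs 895.56 events per second for CPU (about 8% more), and 4173.39 vs 3562.70 MiB/sec for memory (about 17% more).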
# CPU tests.

GCP$ sysbench --test=cpu --time=10 run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
sysbench 1.0.17 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time


Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:   895.56

General statistics:
    total time:                          10.0007s
    total number of events:              8958

Latency (ms):
         min:                                    1.08
         avg:                                    1.12
         max:                                    4.17
         95th percentile:                        1.16
         sum:                                 9995.03

Threads fairness:
    events (avg/stddev):           8958.0000/0.00
    execution time (avg/stddev):   9.9950/0.00

AWS$ sysbench --test=cpu --time=10 run
[...]
CPU speed:
    events per second:   969.03
[...]


# Memory tests.

GCP$ sysbench --test=memory --time=10 run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
sysbench 1.0.17 (using bundled LuaJIT 2.1.0-beta2)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time


Running memory speed test with the following options:
  block size: 1KiB
  total size: 102400MiB
  operation: write
  scope: global

Initializing worker threads...

Threads started!

Total operations: 36490424 (3648207.15 per second)

35635.18 MiB transferred (3562.70 MiB/sec)


General statistics:
    total time:                          10.0002s
    total number of events:              36490424

Latency (ms):
         min:                                    0.00
         avg:                                    0.00
         max:                                    3.90
         95th percentile:                        0.00
         sum:                                 4330.33

Threads fairness:
    events (avg/stddev):           36490424.0000/0.00
    execution time (avg/stddev):   4.3303/0.00

AWS$ sysbench --test=memory --time=10 run
[...]
Total operations: 42742604 (4273549.54 per second)

41740.82 MiB transferred (4173.39 MiB/sec)
[...]
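
The sysbench figures above can be turned into relative differences with a few lines of arithmetic. This is only a sketch: the names are made up for the illustration, the values are copied from the outputs above, and each figure comes from a single 10-second, single-threaded run, so the percentages are indicative at best.

```python
BLOCK_SIZE_KIB = 1  # "block size: 1KiB" in the sysbench memory test output

# Values copied from the sysbench outputs above.
RESULTS = {
    "GCP": {"cpu_events_per_s": 895.56, "mem_ops": 36490424, "mem_mib_per_s": 3562.70},
    "AWS": {"cpu_events_per_s": 969.03, "mem_ops": 42742604, "mem_mib_per_s": 4173.39},
}


def percent_faster(aws, gcp):
    """By how many percent the AWS figure exceeds the GCP figure."""
    return (aws / gcp - 1.0) * 100.0


def mib_transferred(ops, block_kib=BLOCK_SIZE_KIB):
    """Data volume implied by the operation count and the block size."""
    return ops * block_kib / 1024.0


if __name__ == "__main__":
    gcp, aws = RESULTS["GCP"], RESULTS["AWS"]
    cpu = percent_faster(aws["cpu_events_per_s"], gcp["cpu_events_per_s"])
    mem = percent_faster(aws["mem_mib_per_s"], gcp["mem_mib_per_s"])
    print(f"CPU: AWS is {cpu:.1f}% faster")
    print(f"Memory: AWS is {mem:.1f}% faster")
    for name, res in RESULTS.items():
        print(f"{name}: {mib_transferred(res['mem_ops']):.2f} MiB transferred")
```

The transferred volumes computed from the operation counts match the "MiB transferred" lines reported by sysbench, which confirms that each operation writes one 1 KiB block.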

EOA: End Of Annexe.

2 comments:

  1. Would be great if vendors explained the difference in write latency

    1. By vendors, I guess you mean Cloud Providers.

      IMHO, AWS does not have much to explain: their performance is what I expect. However, GCP has more to explain, but I have not been able to get much from the support case I opened with them. I was able to have the public issue [1] opened, and there are also [2] and [3] already open.

      [1]: https://issuetracker.google.com/issues/137385946
      [2]: https://issuetracker.google.com/issues/124631079
      [3]: https://issuetracker.google.com/issues/135215628
