Monday, November 11, 2024

Long and Silent / Stressful MySQL Startup with Many Tables

In the last weeks / months, I have been working on understanding / improving MySQL startup with many tables.  I already wrote two posts on the subject, the links are below.  So far, I did not share what brought my attention to this, and it is the subject of this post.  Also, and because it is related, I come back to the optimization / contribution I already made on the subject, looking at it with new information from this post.

This post is written in the context of my work on MySQL at Aiven : check our Developer Center, blog or careers page to know more about us.

The two posts I already wrote on the subject of long MySQL startup with many tables are :

  1. Faster MySQL Startup with Many Tables;
  2. Impressed by Disabling InnoDB Redo Logging for Creating Tables as Fast as Possible.

Where I work and a few months ago, someone waiting for MySQL to start asked what was taking so long.  My first reaction was to tell them to check the logs, but these were not useful.  One special thing about this MySQL instance is that it contains a lot of tables.  I went on to reproduce the problem, and this threw me down a rabbit hole that I am just starting to get out of.

Below is the full error log of the startup of a MySQL instance with 1 million tables (on a m6i.large AWS instance, using a gp3 volume with the default configuration : 3k iops and 125 MiB/s).

2024-08-26T02:12:54.209842Z mysqld_safe Logging to '/home/jgagne/sandboxes/msb_mysql_8_0_39/data/msandbox.err'.
2024-08-26T02:12:54.282572Z mysqld_safe Starting mysqld daemon with databases from /home/jgagne/sandboxes/msb_mysql_8_0_39/data
2024-08-26T02:12:57.668709Z 0 [System] [MY-010116] [Server] /home/jgagne/opt/mysql/mysql_8.0.39/bin/mysqld (mysqld 8.0.39) starting as process 66873
2024-08-26T02:12:58.409124Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
2024-08-26T02:19:00.014759Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
2024-08-26T02:20:03.386016Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2024-08-26T02:20:03.386062Z 0 [System] [MY-013602] [Server] Channel mysql_main configured to support TLS. Encrypted connections are now supported for this channel.
2024-08-26T02:20:03.545458Z 0 [System] [MY-011323] [Server] X Plugin ready for connections. Bind-address: '::' port: 18039, socket: /tmp/mysqlx-18039.sock
2024-08-26T02:20:03.545591Z 0 [System] [MY-010931] [Server] /home/jgagne/opt/mysql/mysql_8.0.39/bin/mysqld: ready for connections. Version: '8.0.39'  socket: '/tmp/mysql_sandbox8039.sock'  port: 8039  MySQL Community Server - GPL.

These logs are not verbose enough !  In the 7:09 (7 minutes and 9 seconds) needed for this instance to start, there are two significant gaps without sign of progress in the logs.  And for this, I opened Bug #116619 : Startup without Signs of Progress with Many Tables (1M+).  These gaps are :

  • gap of 6:02 between MY-013576 and MY-013577;
  • gap of 1:03 between MY-013577 and MY-010068.

MySQL 8.0 default configuration for log_error_verbosity is 2, so only logging errors and warnings.  Let's try with 3 to also log information messages.  I cannot copy all logs, it would be too long, but the relevant extract is below (the full logs are in my GitHub blog repo).

2024-08-26T02:35:59.954165Z mysqld_safe Logging to '/home/jgagne/sandboxes/msb_mysql_8_0_39/data/msandbox.err'.
[...]
2024-08-26T02:36:04.166470Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started.
[...]
2024-08-26T02:36:04.292857Z 1 [Note] [MY-012203] [InnoDB] Directories to scan './'
2024-08-26T02:36:04.306208Z 1 [Note] [MY-012204] [InnoDB] Scanning './'
2024-08-26T02:36:14.306525Z 1 [Note] [MY-012205] [InnoDB] Files found so far: 418594 data files and 2 undo files
2024-08-26T02:36:24.306648Z 1 [Note] [MY-012205] [InnoDB] Files found so far: 858081 data files and 2 undo files
2024-08-26T02:36:27.504504Z 1 [Note] [MY-012206] [InnoDB] Found 1000002 '.ibd' and 2 undo files
2024-08-26T02:36:27.504569Z 1 [Note] [MY-012207] [InnoDB] Using 5 threads to scan 1000002 tablespace files
2024-08-26T02:36:37.505041Z 0 [Note] [MY-012200] [InnoDB] Thread# 3 - Checked 8279/250000 files
2024-08-26T02:36:37.505351Z 0 [Note] [MY-012200] [InnoDB] Thread# 0 - Checked 8274/250000 files
2024-08-26T02:36:37.505812Z 0 [Note] [MY-012200] [InnoDB] Thread# 2 - Checked 8273/250000 files
2024-08-26T02:36:37.506261Z 0 [Note] [MY-012200] [InnoDB] Thread# 1 - Checked 8280/250000 files
2024-08-26T02:36:47.505979Z 0 [Note] [MY-012200] [InnoDB] Thread# 3 - Checked 15657/250000 files
[...]
2024-08-26T02:41:57.536858Z 0 [Note] [MY-012200] [InnoDB] Thread# 1 - Checked 244948/250000 files
2024-08-26T02:42:04.302732Z 0 [Note] [MY-012201] [InnoDB] Checked 250000 files
2024-08-26T02:42:04.328662Z 0 [Note] [MY-012201] [InnoDB] Checked 250000 files
2024-08-26T02:42:04.350075Z 0 [Note] [MY-012201] [InnoDB] Checked 250000 files
2024-08-26T02:42:04.650573Z 0 [Note] [MY-012201] [InnoDB] Checked 250000 files
2024-08-26T02:42:04.758139Z 1 [Note] [MY-012208] [InnoDB] Completed  check of 1000004 files.
[...]
2024-08-26T02:42:05.565297Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended.
[...]
2024-08-26T02:42:06.718098Z 1 [Note] [MY-012357] [InnoDB] Reading DD tablespace files
2024-08-26T02:42:36.390311Z 0 [Note] [MY-012350] [InnoDB] 4threads have validated 330925 out of 1000006 tablespaces so far.
2024-08-26T02:42:46.390349Z 0 [Note] [MY-012350] [InnoDB] 4threads have validated 610870 out of 1000006 tablespaces so far.
2024-08-26T02:42:56.390389Z 0 [Note] [MY-012350] [InnoDB] 4threads have validated 828885 out of 1000006 tablespaces so far.
2024-08-26T02:43:05.290580Z 1 [Note] [MY-012356] [InnoDB] Scanned 1000006 tablespaces. Validated 1000006.
[...]
2024-08-26T02:43:09.535269Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
[...]
2024-08-26T02:43:09.702744Z 0 [System] [MY-010931] [Server] /home/jgagne/opt/mysql/mysql_8.0.39/bin/mysqld: ready for connections. Version: '8.0.39'  socket: '/tmp/mysql_sandbox8039.sock'  port: 8039  MySQL Community Server - GPL.

There is a lot of say about these logs.  For the details not covered in this post, I opened Bug #116620 : Suboptimal Startup Info Logging with Many Tables (1M+).  With these Info / Note logs, we now see the four long phases of a MySQL startup with 1 million tables.  These phases are listed below, the names being my own.

  • InnoDB Directories Scanning, lasting 0:23 (2:36:04 to 2:36:27);
  • InnoDB Tablespace Duplicate Check, lasting 5:37 (2:36:27 to 2:42:04);
  • DD Tablespace Files Reading, lasting 0:20 (2:42:06 to 2:42:36 - 10 — minus 10 because at MY-012350, we already are in the next phase which logs every 10 seconds);
  • InnoDB Tablespace Validation, lasting 0:39 (2:42:36 - 10 to 2:43:05).

The InnoDB Tablespace Duplicate Check (Duplicate Check for short) is the part of startup that I previously made faster with the fadvise and light optimizations.  The name of this phase does not exactly match the logging (from the logs we could name it Tablespace Scanning or Tablespace Space ID Checking), but it matches the code : some details are in my previous post, and I will probably come back to this in a follow-up post.

The InnoDB Tablespace Validation (Tablespace Validation for short) has an interesting history, but I keep this for a follow-up post.  For now, it is enough to know that this is where is happening InnoDB Tablespace Path Validation (Path Validation for short).  It is possible to disable Path Validation since July 2020 / 8.0.21 with the innodb_validate_tablespace_paths parameter, which makes startup faster (more about this in WL #14008).  But be very careful with this and old MySQL versions, because there is a data corruption bug lurking, fixed in 8.0.36 and 8.3.0 from January 2024 (affected versions 8.0.21 to 8.0.35 and 8.1.0 to 8.2.0).  Also, be careful to not mixing-up Tablespace Validation and Tablespace Path Validation, it took me time to disentangle these two, and that confusion lead to Bug #115568 : innodb_validate_tablespace_paths is dummy.

In this post, I am not going further than naming InnoDB Directories Scanning and DD Tablespace Files Reading (respectively Directories Scanning and Tablespace Reading for short), I am keeping these for a follow-up post.

Back at Local SSDs

While we are checking logs and naming startup phases, let's look back at the behavior on local SSDs (up to here, the discussion has been with default gp3).  With local SSDs, we have below for a 2:36 non-optimized MySQL startup (partial logs in annex #1 and the full logs are in my GitHub blog repo).

  • InnoDB Directories Scanning, lasting 0:11;
  • InnoDB Tablespace Duplicate Check, lasting 1:32;
  • DD Tablespace Files Reading, lasting 0:12;
  • InnoDB Tablespace Validation, lasting 0:21.

With the light optimization and a reduced MySQL startup of 1:06, Duplicate Check now takes 14 seconds (partial logs in annex #2, full logs in GitHub).  I was curious about how much of these 14 seconds were the disk time, so I wrote a program to prefetch the first block of each tablespace (it is in annex #3).  With a warm cache, Duplicate Check takes 3 seconds (partial logs in annex #4, full logs in GitHub).  Note that with prefetching, Directories Scanning only takes 2 seconds (it was taking 10 before).  These results probably mean that not much can be done to additionally "individually" optimize Directories Scanning and Duplicate Check (because 3 and 2 seconds are not something worth spending much time on), but global improvements might still be worth pursing, and I will come back to this in a follow-up post.

Thanks to Dan Reif comment on my LinkedIn post that led me to do above tests, the results are interesting and I did not think to check this before his comment.

This concludes this post about MySQL startup phases.  In a follow-up post, I will describe InnoDB Tablespace Duplicate Check in greater detail, which will allow us to better understand the fadvise and light optimizations, and which will introduce a new optimization to make Duplicate Check faster in some specific circumstances.


Annex #1 : Log Extract for not-Optimized on Local SSDs

2024-09-06T14:55:09.113400Z mysqld_safe Logging to '/home/jgagne/sandboxes/msb_mysql_9_0_1/data/msandbox.err'.
[...]
2024-09-06T14:55:14.052953Z 1 [Note] [MY-012203] [InnoDB] Directories to scan './'
2024-09-06T14:55:14.060219Z 1 [Note] [MY-012204] [InnoDB] Scanning './'
2024-09-06T14:55:24.060248Z 1 [Note] [MY-012205] [InnoDB] Files found so far: 882267 data files and 2 undo files
2024-09-06T14:55:25.057412Z 1 [Note] [MY-012206] [InnoDB] Found 1000002 '.ibd' and 2 undo files
2024-09-06T14:55:25.057458Z 1 [Note] [MY-012207] [InnoDB] Using 9 threads to scan 1000002 tablespace files
[...]
2024-09-06T14:57:07.838606Z 1 [Note] [MY-012208] [InnoDB] Completed space ID check of 1000004 files.
[...]
2024-09-06T14:57:09.023041Z 1 [Note] [MY-012357] [InnoDB] Reading DD tablespace files
2024-09-06T14:57:31.075684Z 0 [Note] [MY-012350] [InnoDB] 8threads have validated 622649 out of 1000006 tablespaces so far.
2024-09-06T14:57:41.075698Z 0 [Note] [MY-012350] [InnoDB] 8threads have validated 965123 out of 1000006 tablespaces so far.
2024-09-06T14:57:42.301111Z 1 [Note] [MY-012356] [InnoDB] Scanned 1000006 tablespaces. Validated 1000006.
[...]
2024-09-06T14:57:45.154020Z 0 [System] [MY-010931] [Server] /home/jgagne/opt/mysql/mysql_9.0.1/bin/mysqld: ready for connections. Version: '9.0.1'  socket: '/tmp/mysql_sandbox9001.sock'  port: 9001  Source distribution.

Annex #2 : Log Extract for Optimized / Light on Local SSDs

2024-09-06T15:11:29.288099Z mysqld_safe Logging to '/home/jgagne/sandboxes/msb_mysql_9_0_1/data/msandbox.err'.
[...]
2024-09-06T15:11:35.317580Z 1 [Note] [MY-012203] [InnoDB] Directories to scan './'
2024-09-06T15:11:35.318065Z 1 [Note] [MY-012204] [InnoDB] Scanning './'
2024-09-06T15:11:45.318154Z 1 [Note] [MY-012205] [InnoDB] Files found so far: 909747 data files and 2 undo files
2024-09-06T15:11:45.999238Z 1 [Note] [MY-012206] [InnoDB] Found 1000002 '.ibd' and 2 undo files
2024-09-06T15:11:45.999286Z 1 [Note] [MY-012207] [InnoDB] Using 9 threads to scan 1000002 tablespace files
[...]
2024-09-06T15:11:59.826734Z 1 [Note] [MY-012208] [InnoDB] Completed space ID check of 1000004 files.
[...]
2024-09-06T15:12:00.688682Z 1 [Note] [MY-012357] [InnoDB] Reading DD tablespace files
2024-09-06T15:12:22.488548Z 0 [Note] [MY-012350] [InnoDB] 8threads have validated 631164 out of 1000006 tablespaces so far.
2024-09-06T15:12:32.488583Z 0 [Note] [MY-012350] [InnoDB] 8threads have validated 975234 out of 1000006 tablespaces so far.
2024-09-06T15:12:33.367277Z 1 [Note] [MY-012356] [InnoDB] Scanned 1000006 tablespaces. Validated 1000006.
[...]
2024-09-06T15:12:35.562828Z 0 [System] [MY-010931] [Server] /home/jgagne/opt/mysql/mysql_9.0.1/bin/mysqld: ready for connections. Version: '9.0.1'  socket: '/tmp/mysql_sandbox9001.sock'  port: 9001  Source distribution.

Annex #3 : Prefetching, Code and Executions

#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>

int main(int argc, char** argv) {
  char buf[1024*4];
  DIR * pdir = opendir(".");

  while (1) {
    struct dirent *pdirent = readdir(pdir);
    if (!pdirent) break;
    if (pdirent->d_name[0] == '.') continue;

    FILE *pf = fopen(pdirent->d_name, "rb");
    posix_fadvise(fileno(pf), 0, 1024*4, POSIX_FADV_RANDOM);
    setvbuf(pf, 0, _IONBF, 0);
    fread(buf, 1, 1024*4, pf);
    fclose(pf);

    printf(".");
  }

  return 0;
}

# Because the program prints one character per file, piping to pv allows checking progress.
~/sandboxes/msb_mysql_9_0_1/data/test_jfg$ sudo sync; sudo /usr/bin/bash -c "echo 3 > /proc/sys/vm/drop_caches"
~/sandboxes/msb_mysql_9_0_1/data/test_jfg$ ~/tmp/a.out | pv -tbea -s 1000000 > /dev/null
 976KiB 0:01:30 [10.7KiB/s]

# Running the program a second time is much faster, meaning it worked.
~/sandboxes/msb_mysql_9_0_1/data/test_jfg$ ~/tmp/a.out | pv -tbea -s 1000000 > /dev/null
 976KiB 0:00:02 [ 385KiB/s]

Annex #4 : Log Extract for Cached Optimized / Light on Local SSDs

2024-09-06T16:07:37.641388Z mysqld_safe Logging to '/home/jgagne/sandboxes/msb_mysql_9_0_1/data/msandbox.err'.
[...]
2024-09-06T16:07:43.919517Z 1 [Note] [MY-012203] [InnoDB] Directories to scan './'
2024-09-06T16:07:43.919972Z 1 [Note] [MY-012204] [InnoDB] Scanning './'
2024-09-06T16:07:45.789184Z 1 [Note] [MY-012207] [InnoDB] Using 9 threads to scan 1000002 tablespace files
2024-09-06T16:07:48.668345Z 1 [Note] [MY-012208] [InnoDB] Completed space ID check of 1000004 files.
[...]
2024-09-06T16:07:49.449304Z 1 [Note] [MY-012357] [InnoDB] Reading DD tablespace files
2024-09-06T16:08:11.441441Z 0 [Note] [MY-012350] [InnoDB] 8threads have validated 628476 out of 1000006 tablespaces so far.
2024-09-06T16:08:21.441347Z 0 [Note] [MY-012350] [InnoDB] 8threads have validated 971386 out of 1000006 tablespaces so far.
2024-09-06T16:08:22.416577Z 1 [Note] [MY-012356] [InnoDB] Scanned 1000006 tablespaces. Validated 1000006.
[...]
2024-09-06T16:08:24.602894Z 0 [System] [MY-010931] [Server] /home/jgagne/opt/mysql/mysql_9.0.1/bin/mysqld: ready for connections. Version: '9.0.1'  socket: '/tmp/mysql_sandbox9001.sock'  port: 9001  Source distribution.

No comments:

Post a Comment