Monday, March 6, 2017

Better InnoDB Crash Recovery in MariaDB 10.1


Recently, I had to go through crash recovery of a large MariaDB 10.1.21 instance.  After starting MariaDB, I started tailing the error log, expecting to wait many minutes while InnoDB was scanning ibd files.  I was surprised (and actually delighted) by this:
[...]
[...]:34:36 [...] [Note] InnoDB: Reading tablespace information from the .ibd files...
[...]:34:53 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:35:09 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:35:25 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:35:41 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:35:57 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:36:13 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:36:29 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:36:46 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:37:02 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:37:18 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:37:34 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:37:50 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:38:06 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:38:22 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:38:38 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:38:54 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:39:10 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:39:26 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:39:42 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:39:58 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:40:14 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:40:30 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:40:46 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:41:02 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:41:18 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:41:34 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:41:50 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:42:06 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:42:22 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:42:38 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:42:54 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:43:10 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:43:26 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:43:57 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:44:13 [...] [Note] InnoDB: Processed [...] .ibd/.isl files
[...]:44:15 [...] [Note] InnoDB: Restoring possible half-written data pages
[...]
In previous versions (if my memory is right), I would only have seen something like this, with no sign of life for almost 10 minutes:
[...]
[...]:34:36 [...] [Note] InnoDB: Reading tablespace information from the .ibd files...
[...]:44:15 [...] [Note] InnoDB: Restoring possible half-written data pages
[...]
Since 10.1.1, MariaDB regularly reports that it is (still) scanning ibd files, so you no longer have to worry that crash recovery is stuck.  I was so happy that I shared this with colleagues: one of them said "that's a really nice feature".  It was pointed out that this should be publicly and widely acknowledged (hence this post).
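
To make the idea concrete, here is a minimal sketch of time-based progress reporting during a long file scan.  It is not the InnoDB implementation: the function names, the message text and the 15-second default interval (picked only because it is close to the spacing of the log lines above) are all assumptions made for illustration.

```python
import time
from typing import Callable, Iterable


def scan_with_progress(
    paths: Iterable[str],
    process: Callable[[str], None],
    interval_s: float = 15.0,                    # assumed interval, not an InnoDB setting
    clock: Callable[[], float] = time.monotonic,
    log: Callable[[str], None] = print,
) -> int:
    """Process every path and log a progress note at most once per interval."""
    processed = 0
    last_report = clock()
    for path in paths:
        process(path)
        processed += 1
        now = clock()
        # Report based on elapsed time, not on file count, so that slow
        # storage still produces a regular sign of life.
        if now - last_report >= interval_s:
            log(f"Processed {processed} files so far")  # illustrative message text
            last_report = now
    return processed


def make_fake_clock(step: float) -> Callable[[], float]:
    """Deterministic clock for demos: returns 0, step, 2*step, ... on each call."""
    t = -step

    def clock() -> float:
        nonlocal t
        t += step
        return t

    return clock


if __name__ == "__main__":
    # Pretend each file takes 4 "seconds" to process.
    scan_with_progress(
        [f"t{i}.ibd" for i in range(10)],
        process=lambda path: None,
        clock=make_fake_clock(4.0),
    )
```

Reporting on elapsed time rather than every N files is what keeps the messages evenly spaced in the log, whatever the speed of the underlying storage.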

This work was implemented in MDEV-6456 by Sergei Petrunia of the MariaDB Corporation.  Thank you Sergei (and Jan Lindström for the review) and thank you MariaDB Corporation for improving InnoDB crash recovery: this is indeed a very nice feature!

I am not sure what the status of this is in MySQL; if you know more, please leave a comment below.

4 comments:

  1. My understanding is that Oracle would not add any features to GA releases lightly. There have been some exceptions, but usually they have been at most a few months after the GA release. MySQL 5.6.10 was released as GA several years ago.

    I would say that I fixed the problem differently in MySQL 5.7, whose InnoDB was merged to MariaDB Server 10.2:
    http://mysqlserverteam.com/innodb-crash-recovery-improvements-in-mysql-5-7/
    WL#7142 adds some consistency checks and should speed up the crash recovery in the case when you have a very large number of .ibd files, but only a few of them were modified since the latest redo log checkpoint. The scan for .ibd or .isl files was completely eliminated.

    This works by writing new MLOG_FILE_NAME records to the InnoDB redo log. There is also an MLOG_CHECKPOINT mini-transaction that is needed for repeating those MLOG_FILE_NAME records that would otherwise be ‘lost’ when the redo log checkpoint ‘cuts the start of the redo log’.

    Someone filed MySQL Bug #80788 to point out that the search for the MLOG_CHECKPOINT requires an extra scan phase of the redo log, up until that record. I implemented my suggestion in the upcoming MariaDB 10.2.5 in https://jira.mariadb.org/browse/MDEV-12103

    When it comes to progress reporting during crash recovery, there is MySQL Bug #78844, which I plan to address in MariaDB Server 10.1 and 10.2 under https://jira.mariadb.org/browse/MDEV-11027
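
The file-name logging described in the comment above can be illustrated with a toy model.  This is only a sketch under heavy assumptions: the record kinds `FILE_NAME`, `WRITE` and `CHECKPOINT`, the `ToyRedoLog` class and the index-based checkpoint are hypothetical simplifications, not the InnoDB redo log format or recovery code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Record = Tuple[str, int, str]  # (kind, tablespace id, payload); toy format


@dataclass
class ToyRedoLog:
    names: Dict[int, str]                          # tablespace id -> file path
    records: List[Record] = field(default_factory=list)
    named: Set[int] = field(default_factory=set)   # spaces named since last checkpoint

    def write_page(self, space_id: int, change: str) -> None:
        # The first change to a tablespace since the last checkpoint is
        # preceded by a record that names the file.
        if space_id not in self.named:
            self.records.append(("FILE_NAME", space_id, self.names[space_id]))
            self.named.add(space_id)
        self.records.append(("WRITE", space_id, change))

    def checkpoint(self, keep_from: int) -> None:
        # Cut the start of the log: only records from index keep_from on survive.
        kept = self.records[keep_from:]
        live = sorted({sid for kind, sid, _ in kept if kind == "WRITE"})
        self.records = kept
        # Repeat the names that would otherwise be lost with the cut,
        # then mark the checkpoint.
        for sid in live:
            self.records.append(("FILE_NAME", sid, self.names[sid]))
        self.records.append(("CHECKPOINT", -1, ""))
        self.named = set(live)


def files_needed_for_recovery(records: List[Record]) -> List[str]:
    """Return only the files the log refers to; no directory scan is needed."""
    # Pass 1: collect file names (they may appear after the writes they cover).
    names = {sid: path for kind, sid, path in records if kind == "FILE_NAME"}
    # Pass 2: resolve each logged change to its file.
    return sorted({names[sid] for kind, sid, _ in records if kind == "WRITE"})


if __name__ == "__main__":
    log = ToyRedoLog(names={1: "db/a.ibd", 2: "db/b.ibd", 3: "db/c.ibd"})
    log.write_page(1, "x")
    log.write_page(2, "y")
    log.checkpoint(keep_from=3)   # changes to tablespace 1 are already flushed
    log.write_page(2, "z")
    print(files_needed_for_recovery(log.records))
```

In this toy run only one of the three files is opened during recovery.  The first pass over the records, needed because the repeated names sit after the writes they describe, loosely corresponds to the extra scan phase mentioned in the comment.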

    1. Hi Marko, glad to know that Bug #80788 will be fixed in MariaDB. Is it possible that some other bug reports I filed last year (for example, Bug #82937 and Bug #82176) could be fixed in MariaDB? :-)

  2. Hi, Weixiang!

    Bug #82937 would change the redo log format. I do not think that it can be done in MariaDB 10.2, which is close to being released. But something similar can and should definitely be done in MariaDB 10.3.

    Bug #82176 can be fixed without changing any file format, so it is easier.

    In my opinion, InnoDB is sometimes using too high-level redo log format types. The dummy index information is costing not only CPU time when applying the logs, but it is also making the redo logs bigger. For example, instead of MLOG_COMP_REC_UPDATE_IN_PLACE or MLOG_COMP_REC_CLUST_DELETE_MARK or MLOG_REC_SEC_DELETE_MARK we could use MLOG_WRITE_STRING or MLOG_1BYTE. These changes could reduce the impact of the two bugs that you have filed.

    Only some bigger and less frequent operations, such as MLOG_PAGE_REORGANIZE, could be worth a dedicated high-level record type.

    1. Hi Marko, about changing the redo log format in 10.3: how will that impact the "on disk" compatibility with MySQL 8.0? Thanks.
