Tuesday, November 19, 2024

Understanding InnoDB Tablespace Duplicate Check (MySQL Startup with Many Tables)

This post is a little long, please bear with me as after the intro, you can skip to the essential.

In the last weeks / months, I have been working on understanding / improving MySQL startup with many tables.  I already wrote three posts on the subject, they are listed below.  In this post, I focus on describing the startup phase InnoDB Tablespace Duplicate Check (Duplicate Check for short).  This allows to better understand the optimization I already submitted, and find another optimization that could be leveraged.

Monday, November 11, 2024

Long and Silent / Stressful MySQL Startup with Many Tables

In the last weeks / months, I have been working on understanding / improving MySQL startup with many tables.  I already wrote two posts on the subject, the links are below.  So far, I did not share what brought my attention to this, and it is the subject of this post.  Also, and because it is related, I come back to the optimization / contribution I already made on the subject, looking at it with new information from this post.

Monday, October 28, 2024

Impressed by Disabling InnoDB Redo Logging for Creating Tables as Fast as Possible

In the last weeks / months, I have been working on understanding / improving MySQL startup with many tables.  I already wrote a post about my work (Faster MySQL Startup with Many Tables).  In that post, I link to a bug report (Bug #115988 : Too Much Disk Read on Startup, penalizing deployments with many tables).  In that bug report, I write, without much details, that the quickest way to create tables in 8.0+ is by reducing durability and disabling redo logging.  We will explore this statement in this post.  The TLDR is that I was very impressed by the speedup introduced by disabling Redo Logging, and I think there might be other situations where this is useful.

Tuesday, September 3, 2024

Faster MySQL Startup with Many Tables (1M+)

I have been scratching my head about MySQL startup for some time.  There is much to say about this, and many other posts will probably follow.  For now, it is enough to know that with many tables (millions) the startup of MySQL 8.0+ (including 8.0, 8.4 and 9.0) is suboptimal (to say the least).  With very little changes, I was able to speed it up, from 2:39 to 1:09 (1 minute and 9 seconds).  This result is obtained with 1 million tables on a m6id.xlarge AWS instance (4 vcpu and local SSD).  It does not translate directly to EBS volumes, even though there are still things I think can be done there.  I describe all the details of my optimizations in the rest of this post.

Friday, June 28, 2024

Trick to Simulate Higher Latency IOs on Linux

When running tests, it is sometimes useful to slow down IOs to clearly show the impact of reading from the disk.  Modern laptops and servers usually have fast local SSDs, and most cloud vendors provide, in addition to fast local disks, relatively fast network block devices (EBS for AWS and Persistent Disk for GCP).  Even when using magnetic network block devices, IOs are not guaranteed to be slow because they might hit the cache of the storage appliance (EBS volumes sometime have sub-millisecond latency).  This blog post describes the solution I am using for simulating high latency IOs on Linux. TL&DR: device-mapper’s delay target.

Wednesday, June 19, 2024

Trusting mysqldump and Insecure MySQL Client Lead to Remote Code Execution

Yes, you read this correctly: because the MySQL client is insecure and allows running arbitrary commands, and because mysqldump blindly trusts the server it is dumping from, a hostile MySQL Server on which mysqldump is executed could trigger arbitrary command execution (also known as a remote code execution).  This post raises awareness on this vulnerability and shows how a secure MySQL client makes things harder to exploit.

Monday, June 17, 2024

RFC: Database / Schema in the Slow Query Log File

A month ago, I announced that I am working on a MySQL Contribution to add database / schema in the slow query log file (bug report, LinkedIn, Twitter, Facebook, and MySQL Community Slack).  My contribution is ready, and before submitting it, I would like feedback, so I am writing this post as a Request for Comments (RFC).  If you are in a hurry, you can go directly to the RFC Fake PR.  But what is a Fake PR ?  It is part of my way of working on MySQL Contributions, which I describe below, and on which I would also welcome feedback.  Something else covered in this post are the design choices made for this contribution, and again, I would welcome your feedback on them (I can make adjustments before submitting my contribution).  In this post, I also explain why having the database / schema in the slow query log file is important.

Wednesday, April 10, 2024

17 Years of Insecure MySQL Client !

Yes, this is a catchy title, but it is true, and it got you reading this post :-).  Another title could have been “Please load this mysql-dump: what could go wrong ?”.  As you guessed, loading a dump is not a risk-free operation.  In this post, I explain how the insecure MySQL client makes this operation risky and how to protect against it.

And if you think this post is not worth reading because you do not load dumps, consider the similarity with running any untrusted SQL scripts.  You might be doing this more often than you think (have you ever had to run a script to prepare MySQL for an upgrade of WordPress, Grafana, or any other software: these are untrusted SQL scripts).

Thursday, April 4, 2024

dbdeployer Tutorial on Mac

Not very long ago (well, maybe a little longer, this post is in draft for more than a year), in the spawn of less than 5 days, I suggested many colleagues to reproduce a problem they had with MySQL in a "more simple environment".  Such more simple environment can be created with dbdeployer.  dbdeployer is a tool to create "MySQL Sandboxes" on a Mac (laptop or desktop) or on Linux (vm, laptop or desktop).  It is relatively simple to use, but if you do not know what to do, getting things set up the first time can be challenging.  The goal of this post is to ease this first setup.  In this tutorial, I describe how to install dbdeployer on a Mac and how to create your first sandboxes.  I might do a Linux tutorial in the future.