Wednesday, November 22, 2023

Thoughts on the October 2023 MySQL Releases

A few days ago, Oracle released three new MySQL GA versions: 8.2.0, 8.0.35 and 5.7.44.  I skimmed the release notes (8.2.0, 8.0.35 and 5.7.44), and I am not impressed.  I guess that I would be even less impressed / more disappointed if I had checked in greater detail, and if I had reviewed the 8.1.0, 8.0.34 and 5.7.43 release notes.  The subject of my disappointment is Oracle not fixing bugs in ALL of the LTS releases, sometimes only fixing them in the latest Innovation Release.  This post summarizes my findings and thoughts.

Monday, June 26, 2023

Rows Examined Blindspot when Looking for non-Existing Data

When trying to understand queries in the slow log, an interesting metric to look at is rows examined.  Also, when trying to understand CPU spikes on a MySQL instance, the InnoDB Metric dml_reads and the global status Innodb_rows_read are interesting to check.  In a similar way, when trying to gather information about which queries are loading the system, SUM_ROWS_EXAMINED from the Performance Schema table events_statements_summary_by_digest is interesting to analyze.  But rows examined might not mean what you think, and I give an example in this post.

TL&DR: in MySQL and from what I understand, rows examined means rows returned by the storage engine, even if the name would lead you to think that it is rows read.  A consequence of this is that a query looking for non-existing data has 0 rows examined, and this is what I call in this post a rows examined blindspot.

Monday, May 22, 2023

Avoiding a STOP SLAVE Crash with MTR in Percona Server older than 5.7.37-40

I am finalizing my Percona Live talk MySQL and Vitess (and Kubernetes) at HubSpot.  In this talk, I mentioned that I like that Percona is providing better MySQL with Percona Server.  This comes with a little inconvenience though: with improvements, sometimes comes regression.  This post is about such regression and a workaround I implemented some time ago (I should have shared it earlier).  This regression is fixed in Percona Server 5.7.37-40, so this is mostly interesting for historical purposes, or if you are running a Percona Server version older than that.

Thursday, March 30, 2023

Running and Mountain Climbing at Percona Live Denver

I just booked my travel arrangements for Percona Live 2023.  In case you missed it, one of the most important MySQL Conference of the year is happening in Denver from Monday May 22 to Wednesday 24.  I will be there and I am giving a talk about how HubSpot operates Percona Server / MySQL with Vitess in Kubernetes.  My colleague, Mali Akmanalp, is also speaking about the tools we provide developers at HubSpot to keep their query workloads efficient with as little involvement from the database team as possible.  But this post is not about our talks, it is about running and mountain climbing.

Monday, January 23, 2023

Planet for the MySQL Community Graduating from Beta

In May 2020, I published a Planet for the MySQL Community Pluto Beta.  I was satisfied with the result and was considering it done, but I did not invest the time to promote it to release (a non-beta site was running for a few months without being advertised).  I finally came to it, and today I am happy to deprecate the beta and announce the release of Planet for the MySQL Community.  In this post, I recall what led to the creation of this new blog aggregator, I list recent changes, and I mention where I would like things to go in the future.  But before continuing to read this post, if you are using the beta, please update your bookmarks :

Also, if you are using Planet for the MySQL Community, could you follow Planet for the MySQL Community / @PlanetMysqlComT on Twitter.  This allows me to know my users and to gauge the interest for this work.

Monday, December 12, 2022

Free Page Consumption by InnoDB Table Compression (Percona backoff Algo. part #2)

In my previous post about InnoDB Stalls on Empty Free List, I used a test environment that might look a little artificial : a table larger than the InnoDB Buffer Pool but fitting in the Linux Page Cache.  This configuration allows serving very quickly what MySQL thinks are IOs because these are hit in the filesystem cache.  In this post, I explain why this environment is not totally artificial and can be a good representation of another more complex environment.  I also explain how an Empty Free List backoff stall can occur when no IO is done.  You have probably already guessed what this post is covering from the title, so let's dive in.

Wednesday, November 30, 2022

Tail Latencies in Percona Server because of InnoDB Stalls on Empty Free List

If, in Percona Server, you are observing tail latencies on queries that should be fast, this might be a side effect of Percona's improved InnoDB Empty Free List Algorithm.  When using this algorithm (the default in 5.6 and 5.7 and optional configuration in 8.0), a query needing a free page while none are available waits until the LRU Manager Thread refills the free list.  Because this thread is waking-up only at regular intervals, the query is blocked (which explains the latency) until the LRU Manager Thread is done sleeping and completes its work.  These latencies / stalls might last up to one second in the worse case.  All the details are in this post, starting with InnoDB Free Pages.

Tuesday, November 15, 2022

Bad Optimizer Plan on Queries Combining WHERE, ORDER BY and LIMIT

Sometimes, the MySQL Optimizer chooses a wrong plan, and a query that should execute in less than 0.1 second ends-up running for 12 minutes !  This is not a new problem: bugs about this can be traced back to 2014, and a blog post on the subject dates of 2015.  But even if this is old news, because this problem recently came to my attention, it is a problem worth writing on.

Tuesday, May 31, 2022

Triggering Replication Lag for Testing a Script

I am currently working on a script to auto-enable parallel replication / multi-threaded replication (MTR) when there is replication lag.  For testing this script, I need to trigger replication lag that would disappear after enabling MTR.  I came-up with a simple solution for that, and I thought it could be useful to more people, so I am writing this blog post about it.  Read-on for the details.

Tuesday, March 8, 2022

Rows Examined not Trustworthy because of Index Condition Pushdown

When Index Condition Pushdown is used, you cannot trust rows examined as reported by the slow query log (nor in some Performance Schema tables) and you cannot trust the global status Innodb_rows_read (nor the InnoDB Metrics dml_reads).  These can lead to an incorrect query complexity analysis, to a mistaken estimation of the total load of a MySQL server, and to a lot of wasted time.  As we go through the details in this post, we will understand why a SELECT * is sometimes faster than only selecting the columns from an index (we would normally expect the opposite) and we will visit inner workings of the MySQL Storage Engine Architecture.  But first, let's start by exposing the problem.

Wednesday, January 19, 2022

Crashing MySQL with Malicious Intent and a lot of Determination

A year ago, I blogged about An Unprivileged User can crash your MySQL Server.  At the time, I explained how to protect yourself against this problem.  A few weeks ago, I revisited this vulnerability in a follow-up post in which I explained the fix, claimed that the MySQL 5.7 default configuration for Group Replication is still problematic, and explained a tuning to avoid the vulnerability.  In this last post in the series, I explain how to exploit this vulnerability to crash older version of MySQL (or untuned Group Replication in 5.7), but this needs a lot of determination.

Monday, December 13, 2021

Trick to Simulate a Linux Server with less RAM

I created the first draft of this post many years ago.  At that time, I was working with physical servers having 192 GB of RAM or more.  On such systems, doing memory pressure tests with MySQL is complicated.  I used a trick to simulate a Linux server with less RAM (also works with vms, probably not with Kubernetes or containers).  I recently needed the trick again and as I will refer to it in a future post, now is a good time to complete and publish this.  TL&DR: huge pages...

Tuesday, November 30, 2021

Follow-up on an Unprivileged User can Crash your MySQL Server

A year ago, I blogged about An Unprivileged User can Crash your MySQL Server.  At the time, I presented how to protect yourself against this problem without explaining how to generate a crash.  In this post, I am revisiting this vulnerability, not giving the exploit yet, but presenting the fix.  Also, because the default configuration of Group Replication in 5.7 is still vulnerable (it is not in 8.0), I show the adjustment to make to avoid problems.

Friday, November 12, 2021

Duplicate Entry in ALTER TABLE (and in OPTIMIZE TABLE)

A few weeks ago and in MySQL 5.7, I had an ALTER TABLE that failed with a duplicate entry error.  This is old news as it is happening since MySQL 5.6, but I only saw this recently because I am normally using online schema change from the Percona Toolkit (pt-osc) or GitHub's online schema migration (gh-ost).  I do not like that and I am disappointed this has not been improved, so this post is partially a rant, but not only, because I explain why ALTER TABLE is failing and I give solutions to avoid the failure.

Saturday, December 5, 2020

Fixing low durability GTID replica with Voodoo

At my FOSDEM talk earlier this year, I gave a trick for fixing a crashed GTID replica.  I never blogged about this, so now is a good time.  What is pushing me to write on this today is my talk at MinervaDB Athena 2020 this Friday.  At this conference, I will present more details about MySQL replication crash safety.  So you know what to do if you want to learn more about this subject.  Let's now talk about voodoo.

Sunday, October 18, 2020

An Unprivileged User can crash your MySQL Server

Yes, your read the title correctly: an unprivileged user can crash your MySQL Server.  This applies for the default configuration of MySQL 8.0.21 (and it is probably the case for all MySQL 8 GA versions).  Depending on your configuration, it might also be the case for MySQL 5.7.  This needs malicious intent and a lot of determination, so no need to panic as this will not happen by accident.  I am not sharing the exploit in this post (I might do it later), but I am telling you how to avoid the problem.

Sunday, June 7, 2020

Update on Planet for the MySQL Community (June 7, 2020)

Three weeks ago, I released the Pluto Beta of Planet for the MySQL Community.  Since then, a few things changed and I think it is worth doing an update to the Ecosystem.  The change I am the most happy about is that Ivan Groenewold started helping with the project (you can find him in the people of the oursqlcommunity.org GitHub organization).  So Planet for the MySQL Community is not a one-man initiative anymore, and it is growing into being closer to a true community project (we could still use more help, feel free to contact me if you want to join the team, some sysadmin skills required).

Thursday, May 14, 2020

Planet [for the] MySQL Community: Pluto Beta

As indicated in a previous post, I am working on Planet [for the] MySQL Community for which I did a RFC on the Requirements.  I am happy to announce the beta release of Planet [for the] MySQL Community using the Planet Pluto Free Feed Reader and (Static) Website Generator.  If you are in a hurry, you can check it out at https://planet-beta-pluto.oursqlcommunity.org/.  This post describes how I got there, how it was built, how it works and a little more.

Tuesday, May 12, 2020

Planet MySQL Community: Requirements RFC

As indicated in a previous post, I am working on a Planet MySQL Community (this name is not final, see below).  In this post, I want to present what I think the requirements should be for this new aggregator.  This is a Request for Comment (RFC): I do not claim completeness nor perfection, this might (will) evolve, and your remarks are welcome.  A tl&dr is available in the Summary of Requirements below.

Friday, April 24, 2020

Planning for the AFTER Planet MySQL (bis)

Oups, my previous post had a forbidden word and did not show-up on Planet MySQL, so retry...

As written in a previous post, the state of Planet MySQL is unhealthy !  I am still aggregated there for now as, before leaving what was the best news-feed for the MySQL Community, we need a replacement.  This post aims at starting a discussion on this replacement.

Planning for the AFTER Planet MySQL

As written in my previous post, the state of Planet MySQL is unhealthy !  I am still aggregated there for now as, before leaving what was the best news-feed for the MySQL Community, we need a replacement.  This post aims at starting a discussion on this replacement.

State of Planet MySQL: unhealthy !

Also, the state of bugs.mysql.com: unhealthy (last paragraph below, but this would be too long a title).

I have been complaining for some time about Planet MySQL filtering content that I think it should not.  Shlomi made a courageous decision on this and explained it in his recent post (Pulling his blog out of Planet MySQL aggregator, over community concerns).  I am thinking of doing the same.  What are your thoughts on this ?  Please share them in the comments below (or on Shlomi's blog).

Monday, March 30, 2020

JFG MySQL Advice during COVID-19

Update 2020-05-05: I stopped providing consulting hours.  They were not very popular, but it was worth trying.

We are all impacted by COVID-19.  On my side, this virus disrupted my travel, professional and personal plans for the upcoming months.  To overcome these disruptions, and following the idea of someone on MySQL Community Slack, I am experimenting with giving MySQL advice on a 30-minute basis.  You can book a slot using this link:
While booking, you will be asked to describe what you want to discuss.  I will confirm the meeting by mail if it is a subject I can help you with.  The discussion can be on Skype, Google Hangout, or your favorite video conferencing system (while booking, you will be asked for it).

This is an experiment !  I opened consulting hours between March 31st and April 11th.  If this goes well, I will add more; if it does not go well, I will cancel everything.  Also, I will do my best to answer your questions and help you with your MySQL challenges, but I cannot replace a full-time DBA on your team.  Also, I will not be able to get a complete picture of your situation in such a short time, so you will have to be smart in deciding if you want to follow my advice or not.

Good luck to you all and courage during these strange times !

Monday, January 27, 2020

A Legacy Behavior of MySQL Corrupting Restored Backups (replicate-same-server-id = OFF)

In my previous post (Puzzled by MySQL Replication), I describe a weird, but completely documented, behavior of replication that had me scratching my head for hours because it was causing data corruption.  I did not give too many details then as I also wanted allowing you to scratch your head if you wished.  In this post, I describe this behavior in more details.

Thursday, January 9, 2020

Puzzled by MySQL Replication (War Story)

Recently, I was puzzled by MySQL replication !  Some weird, but completely documented, behavior of replication had me scratching my head for hours.  I am sharing this war story so you can avoid losing time like me (and also maybe avoid corrupting your data when restoring a backup).  The exact justification will come in a follow-up post, so you can also scratch your head trying to understand what I faced.  So let's dive-in.

Monday, October 7, 2019

The dark side of super_read_only

There is an interesting feature in MySQL 5.7+: the global variable super_read_only.  MySQL 5.6 and before only had the read_only global variable which is not preventing a user with SUPER privilege to write to the database.  With super_read_only, those users cannot write anymore.  But this feature comes with a surprise.

Tuesday, July 16, 2019

MySQL Master Replication Crash Safety Part #5a: making things faster without reducing durability - using better hardware

This is a follow-up post in the MySQL Master Replication Crash Safety series.  In the previous posts, we explored the consequences of reducing durability on masters (different data inconsistencies after an OS crash depending on replication type) and the performance boost associated with this configuration (benchmark results done on Google Cloud Platform / GCP).  The consequences are summarised in the introduction of Part #4, and the tests are the subject of this last post.  Also in this last post, I mentioned that my results for high durability are limited by the sync latencies of GCP persistent disks.  As I found a system with better latencies, I am able to present new results.  And this system is a vm in Amazon Web Services (AWS) with local SSD.

MySQL Master Replication Crash Safety Part #5: faster without reducing durability (under the hood)

This post is a sister post to MySQL Master Replication Crash Safety Part #5: making things faster without reducing durability.  There is no introduction or conclusion to this post, only landing sections: reading this post without its context is not not recommended. You should start with the main post and come back here for more details.

Tuesday, July 9, 2019

MySQL Master Replication Crash Safety Part #4: benchmarks of high and low durability

This is a follow-up post in the MySQL Master Replication Crash Safety series.  In the three previous posts, we explored the consequence of reducing durability on masters (including setting sync_binlog to a value different from 1).  But so far, I only quickly presented why a DBA would run MySQL with such configuration.  In this post, I present actual benchmark results.  I also present a fundamental difference between on-premise servers and cloud virtual machines as my tests are done in Google Cloud Platform (GCP).  But before going further, let's summarise the previous posts.

MySQL Master Replication Crash Safety part #4: benchmarks (under the hood)

This post is a sister post to MySQL Master Replication Crash Safety Part #4: benchmarks of high and low durability.  There are no introduction or conclusion to this post, only landing sections: reading this post without its context is not recommended. You should start with the main post and come back here for more details.

Tuesday, June 25, 2019

Getting past 2400 threads with sysbench and dbdeployer for benchmarking MySQL 5.7.26

I am currently running tests with sysbench and dbdeployer on a Google Cloud Platform Debian 9 instance.  I ran into an interesting limit and lifting it was not straightforward.  I hope that by sharing this, you can avoid losing too much time.

Wednesday, May 8, 2019

Not enforcing SSL on CloudSQL, really !

When creating a MySQL CloudSQL instance, SSL connections are not enforced by default and you get below in the Connections tab of the Google Cloud Platform console.  Is this a problem ?  Some people might think no, but I do not agree with them.  And if I am writing this post, you can probably guess that there is a lot to say about this subject.  Read on for the details.

Wednesday, May 1, 2019

Tuesday, February 26, 2019

MySQL Master High Availability and Failover: more thoughts

Some months ago, Shlomi Noach published a series about Service Discovery.  In his posts, Shlomi describes many ways for an application to find the master.  He also gives detail on how these solutions cope with failing-over to a slave, including their integration with Orchestrator.

This is a great series, and I recommend its reading for everybody implementing master failover, with or without Orchestrator, even if you are not fully automating the process yet.  Taking a step back, I realized that service discovery is only one of the five parts of a full MySQL Master Failover Strategy; this post is about these five parts.  In some follow-up posts, I might analyze some deployments using the framework presented in this post.

Tuesday, February 12, 2019

MySQL Master Replication Crash Safety Part #3: GTID

This is a follow-up post in the MySQL Master Replication Crash Safety series.  In the two previous posts, we explored the consequence of reducing durability on masters (including setting sync_binlog to a value different than 1) when slaves are using legacy file+position replication.  In this post, we cover GTID replication.  This introduces a new inconsistency scenario with a potential replication breakage that depends on transaction execution on the master and timing on the slave.  Before discussing this violation of ACID, we start with some reminders about the last posts and with some explanations about GTIDs.

Tuesday, January 8, 2019

Care when changing MASTER_DELAY with CHANGE MASTER TO (delayed replication)

A few days ago, I stepped into a trap !  This made me lose time for fixing things (and even more for writing this post...).  In the hope that you will avoid my mistake, I am sharing this war story.  I also obviously opened a bug, more about this below.

TL&DR: be careful when using CHANGE MASTER TO MASTER_DELAY = N: it might wipe your relay logs !

Thursday, November 15, 2018

MySQL Master Replication Crash Safety Part #2: lagging slaves

This is Part #2 of the MySQL Master Replication Crash Safety series.  In the previous post, we explored the consequence of reducing durability on masters with slaves using legacy file+position replication.  The consequences are data inconsistencies with a clear warning sign: the slaves stop replicating and report an error.  In this post, we extend our understanding of the impact of running a master with sync_binlog != 1 by considering lagging slaves that are using file+position replication.  This introduces a new inconsistency scenario without replication breakage, which is trickier to detect.  But let's start with some reminders.

Tuesday, November 13, 2018

How to install Percona Server 5.7 on Debian/Ubuntu without a root password prompt

In the last few months, I had to install Percona Server 5.7 (PS5.7) on Debian a few times.  I was regularly annoyed by apt-get -y install percona-server-server-5.7 prompting me for a password.  But that annoyance did not push me to investigate the subject in detail: it was always a quick manual fix and Googling did not lead to a straightforward solution.  However in the last days, I had to install PS5.7 on many servers, so it was worth finding how to do this.  I am writing this post in the hope that it will help some people...

Tuesday, October 30, 2018

On the consequences of sync_binlog != 1 (part #1)

A well-known performance booster in MySQL is to set sync_binlog to 0.  However, this configuration alone comes with serious consequences on consistency and on durability (the C and D of ACID); I explore those in this series.  In this post, I give some background on the sync_binlog parameter and I explain part of the problem with setting it to 0 (or to a value different from 1).  The other problems — including the behaviour with GTIDs (I am limiting the current scope to legacy file + position replication) — and some solutions are the subject of the upcoming posts.

Tuesday, September 11, 2018

Unforeseen use case of my GTID work: replicating from AWS Aurora to Google CloudSQL

A colleague brought an article to my attention.  I did not see it on Planet MySQL where I get most of the MySQL news (or it did not catch my eye there).  As it is interesting replication stuff, I think it is important to bring it to the attention of the MySQL Community, so I am writing this short post.

The surprising part for me is that it uses my 4-year-old work for online migration to GTID with MySQL 5.6.  This is a completely unforeseen use case of my work as I never thought that my hack would be useful after Oracle include an online migration path to GTID in MySQL 5.7 (Percona did something similar for MySQL 5.6).

Monday, August 27, 2018

Another Post on the Percona Community Blog, Bug Activities on Replication Crash Safety, and Percona Live Europe

I published another article on the Percona Community Blog.  This time, it is about Semi-Synchronous Replication.  You can read the post here:
I previously wrote about my motivation to publish on the Percona Community Blog.  Things have not changed: I still believe it is a great community initiative that I want to encourage.  You can learn more about the Percona Community Blog in their Hello World and in the Story So Far posts.

Thursday, June 28, 2018

JFG Posted on the Percona Community Blog - A Nice Feature in MariaDB 10.3: no InnoDB Buffer Pool in Core Dumps

I just posted an article on the Percona Community Blog.  You can access it following this link:
I do not know if I will stop publishing posts on my personal blog or use both, I will see how things go.  In the rest of this post, I will share why I published there and how things went in the process.

Thursday, April 19, 2018

Some bugs and spring pilgrimage to Percona Live Santa Clara 2018

I am now in an airport, waiting for one of the four flights that will bring me to Percona Live Santa Clara 2018.  This is a good time to write some details about my tutorial on parallel replication.  But before talking about Percona Live, I will share thoughts on MySQL/MariaDB bugs that caught my attention in the last weeks/months (Valeriy: you clearly have an influence on me).

Saturday, January 27, 2018

Next week in Brussels: Parallel Replication at the MySQL Pre-FOSDEM Day

FOSDEM is next weekend and I am talking about Parallel Replication on Friday, February 2nd at the MySQL Pre-FOSDEM Day (there might be tickets left in case of cancellation, attendance is free of charge).  During this talk, I will show benchmark results of MySQL 8.0 parallel replication on Booking.com real production environments.  I thought I could share a few things before the talk so here it is.

Thursday, January 11, 2018

More Write Set in MySQL: Group Replication Certification

This is the third post in the series on Write Set in MySQL.  In the first post, we explore how Write Set allows to get better parallel replication in MySQL 8.0.  In the second post, we saw how the MySQL 8.0 improvement is an extension of the work done in MySQL 5.7 to avoid replication delay/lag in Group Replication.  In this post, we will see how Write Set is used in Group Replication to detect conflicts in multi-writer mode during certification.  We will also see the impacts, on conflict detection, of the Write Set bug that I presented in the first post.

Monday, January 8, 2018

Write Set in MySQL 5.7: Group Replication

In my previous post, I write that Write Set is not only in MySQL 8.0 but also in MySQL 5.7 though a little hidden.  In this post, I describe Write Set in 5.7 and this will bring us in the inner-working of Group Replication.  I am also using this opportunity to explain and show why members of a group can replicate faster than a standard slave.  We will also see the impacts, on Group Replication, of the Write Set bug that I presented in my last post.

Wednesday, January 3, 2018

An update on Write Set (parallel replication) bug fix in MySQL 8.0

In my MySQL Parallel Replication session at Percona Live Santa Clara 2017, I talked about a bug in Write Set tracking for parallel replication (Bug#86078).  At the time, I did not fully understand what was going wrong but since then, we (Engineers at Oracle and me) understood what happened and the bug is supposed to be fixed in MySQL 8.0.4.  This journey thought me interesting MySQL behavior and bug reporting practices.  In this post, I am sharing both in addition to some insight on Write Set tracking for parallel replication.

Tuesday, November 28, 2017

Here is the CREATE TABLE of death

In a previous post, I talked about the existence of a CREATE TABLE that is crashing MySQL up to versions 5.5.58, 5.6.38 and 5.7.20, and MariaDB up to version 5.5.57, 10.0.32, 10.1.26 and 10.2.7.  I hope you upgraded (or can mitigate this problem in another way) as I am now publishing the CREATE TABLE of death.

Thursday, October 19, 2017

A crashing bug in MySQL: the CREATE TABLE of death (more fun with InnoDB Persistent Statistics)

I ended one of my last posts - Fun with InnoDB Persistent Statistics - with a cryptic sentence: there is more to say about this but I will stop here for now.  What I did not share at the time is the existence of a crashing bug somehow related to what I found.  But let's start with some context.