Tuesday, February 26, 2019

MySQL Master High Availability and Failover: more thoughts

Some months ago, Shlomi Noach published a series about Service Discovery.  In his posts, Shlomi describes many ways for an application to find the master.  He also gives detail on how these solutions cope with failover to a slave, including their integration with Orchestrator.

This is a great series, and I recommend reading it to everybody implementing master failover, with or without Orchestrator, even if you are not fully automating the process yet.  Taking a step back, I realized that service discovery is only one of the five parts of a full MySQL Master Failover Strategy, and this post is about these five parts.  In some follow-up posts, I might analyze some deployments using the framework presented in this post.

Tuesday, February 12, 2019

MySQL Master Replication Crash Safety Part #3: GTID

This is a follow-up post in the MySQL Master Replication Crash Safety series.  In the two previous posts, we explored the consequence of reducing durability on masters (including setting sync_binlog to a value different than 1) when slaves are using legacy file+position replication.  In this post, we cover GTID replication.  This introduces a new inconsistency scenario with a potential replication breakage that depends on transaction execution on the master and timing on the slave.  Before discussing this violation of ACID, we start with some reminders about the last posts and with some explanations about GTIDs.

Tuesday, January 8, 2019

Care when changing MASTER_DELAY with CHANGE MASTER TO (delayed replication)

A few days ago, I stepped into a trap !  This cost me time fixing things (and even more writing this post...).  In the hope that you will avoid my mistake, I am sharing this war story.  I also obviously opened a bug, more about this below.

TL;DR: be careful when using CHANGE MASTER TO MASTER_DELAY = N: it might wipe your relay logs !

Thursday, November 15, 2018

MySQL Master Replication Crash Safety Part #2: lagging slaves

This is Part #2 of the MySQL Master Replication Crash Safety series.  In the previous post, we explored the consequence of reducing durability on masters with slaves using legacy file+position replication.  The consequences are data inconsistencies with a clear warning sign: the slaves stop replicating and report an error.  In this post, we extend our understanding of the impact of running a master with sync_binlog != 1 by considering lagging slaves that are using file+position replication.  This introduces a new inconsistency scenario without replication breakage, which is trickier to detect.  But let's start with some reminders.

Tuesday, November 13, 2018

How to install Percona Server 5.7 on Debian/Ubuntu without a root password prompt

In the last few months, I had to install Percona Server 5.7 (PS5.7) on Debian a few times.  I was regularly annoyed by apt-get -y install percona-server-server-5.7 prompting me for a password.  But that annoyance did not push me to investigate the subject in detail: it was always a quick manual fix and Googling did not lead to a straightforward solution.  However, in the last few days, I had to install PS5.7 on many servers, so it was worth finding out how to do this.  I am writing this post in the hope that it will help some people...

Tuesday, October 30, 2018

On the consequences of sync_binlog != 1 (part #1)

A well-known performance booster in MySQL is to set sync_binlog to 0.  However, this configuration alone comes with serious consequences on consistency and on durability (the C and D of ACID); I explore those in this series.  In this post, I give some background on the sync_binlog parameter and I explain part of the problem with setting it to 0 (or to a value different from 1).  The other problems — including the behaviour with GTIDs (I am limiting the current scope to legacy file + position replication) — and some solutions are the subject of the upcoming posts.

Tuesday, September 11, 2018

Unforeseen use case of my GTID work: replicating from AWS Aurora to Google CloudSQL

A colleague brought an article to my attention.  I did not see it on Planet MySQL where I get most of the MySQL news (or it did not catch my eye there).  As it is interesting replication stuff, I think it is important to bring it to the attention of the MySQL Community, so I am writing this short post.

The surprising part for me is that it uses my 4-year-old work for online migration to GTID with MySQL 5.6.  This is a completely unforeseen use case of my work as I never thought that my hack would be useful after Oracle included an online migration path to GTID in MySQL 5.7 (Percona did something similar for MySQL 5.6).

Monday, August 27, 2018

Thursday, June 28, 2018

JFG Posted on the Percona Community Blog - A Nice Feature in MariaDB 10.3: no InnoDB Buffer Pool in Core Dumps

I just posted an article on the Percona Community Blog.  You can access it following this link:
I do not know if I will stop publishing posts on my personal blog or use both; I will see how things go.  In the rest of this post, I will share why I published there and how things went in the process.

Thursday, April 19, 2018

Some bugs and spring pilgrimage to Percona Live Santa Clara 2018

I am now in an airport, waiting for one of the four flights that will bring me to Percona Live Santa Clara 2018.  This is a good time to write some details about my tutorial on parallel replication.  But before talking about Percona Live, I will share thoughts on MySQL/MariaDB bugs that caught my attention in the last weeks/months (Valeriy: you clearly have an influence on me).

Saturday, January 27, 2018

Next week in Brussels: Parallel Replication at the MySQL Pre-FOSDEM Day

FOSDEM is next weekend and I am talking about Parallel Replication on Friday, February 2nd at the MySQL Pre-FOSDEM Day (there might be tickets left in case of cancellation, attendance is free of charge).  During this talk, I will show benchmark results of MySQL 8.0 parallel replication on Booking.com real production environments.  I thought I could share a few things before the talk so here it is.

Thursday, January 11, 2018

More Write Set in MySQL: Group Replication Certification

This is the third post in the series on Write Set in MySQL.  In the first post, we explored how Write Set allows better parallel replication in MySQL 8.0.  In the second post, we saw how the MySQL 8.0 improvement is an extension of the work done in MySQL 5.7 to avoid replication delay/lag in Group Replication.  In this post, we will see how Write Set is used in Group Replication to detect conflicts in multi-writer mode during certification.  We will also see the impacts, on conflict detection, of the Write Set bug that I presented in the first post.

Monday, January 8, 2018

Write Set in MySQL 5.7: Group Replication

In my previous post, I wrote that Write Set is not only in MySQL 8.0 but also in MySQL 5.7, though a little hidden.  In this post, I describe Write Set in 5.7, and this will bring us into the inner workings of Group Replication.  I am also using this opportunity to explain and show why members of a group can replicate faster than a standard slave.  We will also see the impacts, on Group Replication, of the Write Set bug that I presented in my last post.

Wednesday, January 3, 2018

An update on Write Set (parallel replication) bug fix in MySQL 8.0

In my MySQL Parallel Replication session at Percona Live Santa Clara 2017, I talked about a bug in Write Set tracking for parallel replication (Bug#86078).  At the time, I did not fully understand what was going wrong but since then, we (engineers at Oracle and I) understood what happened and the bug is supposed to be fixed in MySQL 8.0.4.  This journey taught me about interesting MySQL behavior and bug reporting practices.  In this post, I am sharing both in addition to some insight on Write Set tracking for parallel replication.

Tuesday, November 28, 2017

Here is the CREATE TABLE of death

In a previous post, I talked about the existence of a CREATE TABLE that is crashing MySQL up to versions 5.5.58, 5.6.38 and 5.7.20, and MariaDB up to versions 5.5.57, 10.0.32, 10.1.26 and 10.2.7.  I hope you upgraded (or can mitigate this problem in another way) as I am now publishing the CREATE TABLE of death.

Thursday, October 19, 2017

A crashing bug in MySQL: the CREATE TABLE of death (more fun with InnoDB Persistent Statistics)

I ended one of my last posts - Fun with InnoDB Persistent Statistics - with a cryptic sentence: there is more to say about this but I will stop here for now.  What I did not share at the time is the existence of a crashing bug somehow related to what I found.  But let's start with some context.

Wednesday, August 16, 2017

The danger of no Primary Key when replicating in RBR (and a partial protection with MariaDB 10.1)

TL;DR: unless you know what you are doing, you should always have a primary key on your tables when replicating in RBR (and maybe even all the time).

TL;DR2: MariaDB 10.1 has an interesting way to protect against missing a primary key (innodb_force_primary_key) but it could be improved.

A few weeks ago, I was called off hours because replication delay on all the slaves of a replication chain was high and growing.  It was not the first time this happened on that chain, so I thought right away that this was probably an UPDATE or DELETE of many rows on a table without a primary key.  Let's see what the problem with this is; to understand it, we have to talk about binary log formats.
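
To check whether a server has tables exposed to this problem, one option is to query information_schema for base tables that have no PRIMARY KEY constraint.  The sketch below is only an illustration and not taken from the post: the function name list_tables_without_pk is hypothetical, and it assumes a mysql client that can connect with default credentials.

```shell
# Hypothetical helper (not from the post): lists base tables that have
# no PRIMARY KEY constraint, skipping the usual system schemas.
# Assumes the mysql client can connect with default credentials.
list_tables_without_pk() {
    mysql -N <<'SQL'
SELECT t.table_schema, t.table_name
  FROM information_schema.tables t
  LEFT JOIN information_schema.table_constraints c
    ON c.table_schema = t.table_schema
   AND c.table_name = t.table_name
   AND c.constraint_type = 'PRIMARY KEY'
 WHERE t.table_type = 'BASE TABLE'
   AND c.constraint_name IS NULL
   AND t.table_schema NOT IN
       ('mysql', 'information_schema', 'performance_schema', 'sys');
SQL
}
```

Calling list_tables_without_pk on a master prints one schema and table name per line; an empty result means every user table has a primary key.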

Monday, August 14, 2017

More Details about InnoDB Compression Levels (innodb_compression_level)

In one of my previous posts, I shared InnoDB table compression statistics for a read-only dataset using the default value of innodb_compression_level (6).  In it, I claimed, without giving much detail, that using the maximum value for the compression level (9) would not make a big difference.  In this post, I will share more details about this claim.

TL;DR: tuning innodb_compression_level is not very useful for my dataset.

Thursday, August 10, 2017

Why we still need MyISAM (for read-only tables)

TL;DR: we still need MyISAM and myisampack because they use less space on disk (half of compressed InnoDB) !

In the previous post, I shared my experience with InnoDB table compression on a read-only dataset.  In it, I claimed, without giving much detail, that using MyISAM and myisampack would result in a more compact storage on disk.  In this post, I will share more details about this claim.

Monday, August 7, 2017

An Adventure in InnoDB Table Compression (for read-only tables)

In my last post about big MySQL deployments, I quickly mentioned that InnoDB compression allows dividing disk usage by about 4.3 on a 200+ TiB dataset.  In this post, I will give more information about this specific use case of InnoDB table compression and I will share some statistics and learnings on this system and subject.  Note that I am not covering InnoDB page compression, which is a new feature of MySQL 5.7 (also known as hole punching).

Wednesday, July 19, 2017

InnoDB Basics - Compaction: when and when not

This is old news for MySQL/MariaDB experts, but people who are starting to use InnoDB do not always know that disk space is not automatically released when deleting data from a table.  To explain and demonstrate that, I will take two real-world examples: table1 and table2.

Wednesday, July 5, 2017

Fun with InnoDB Persistent Statistics

Something interesting happened to me in the last days, and it is worth sharing.  I was upgrading MariaDB (MySQL also impacted) to a new major version and mysql_upgrade showed something like this:
[...]
Phase 4/7: Running 'mysql_fix_privilege_tables'
ERROR 1062 (23000) at line 586: Duplicate entry 'schema-table_name#P#partition_name_truncated' for key 'PRIMARY'
ERROR 1062 (23000) at line 590: Duplicate entry 'schema-table_name#P#partition_name_truncated' for key 'PRIMARY'
ERROR 1062 (23000) at line 593: Duplicate entry 'schema-table_name#P#partition_name_truncated' for key 'PRIMARY'
FATAL ERROR: Upgrade failed

Monday, May 22, 2017

Better Replication when running both InnoDB and MyRocks (or other Storage-Engines)

Kristian Nielsen is working on a new feature for MariaDB 10.3 and he published very interesting results.  This feature is MDEV-12179: Per-engine mysql.gtid_slave_pos tables.  He writes about replicating twice as fast in the worst case when using two storage engines (InnoDB and MariaRocks in his tests, but could also be InnoDB and TokuDB or TokuDB and MyRocks).  I will let you read all the details on his blog about Improving replication with multiple storage engines.

Why am I posting this here ?  Mostly because I want to share with you that:
  • I am also involved in this project,
  • I am working closely with Kristian on this feature,
  • and that Booking.com is financing Kristian's time on this development.
If you are also interested in this, feel free to comment in the JIRA MDEV, to leave a comment below, or on Kristian's post.

Tuesday, April 18, 2017

Sunday, April 16, 2017

Booking.com talks at Percona Live Santa Clara 2017

In a week, some Booking.com colleagues and I will be in Santa Clara for Percona Live.

Booking.com is sponsoring the conference and we will be present at the Monday Evening Reception.  You do not need a tutorial pass to attend the dinner (even if it is on the tutorial day): any valid pass will do.  If you do not have your ticket yet, it is time to register (you can use the discount code “SeeMeSpeak” for a 10% discount on the registration fees).

Tuesday, April 11, 2017

Many thanks Oracle for implementing RESET MASTER TO

MySQL 8.0.1 is out and it includes an implementation of my feature request (Bug #77438).  This extension to RESET MASTER simplifies master promotion with Binlog Servers.  Let's see how it works:
# mysql -N <<< "SHOW MASTER STATUS"
binlog.027892   3006935
# mysql -N <<< "RESET MASTER TO 12345; DO sleep(rand()*10); SHOW MASTER STATUS"
binlog.012345   92773
# mysql -N <<< "RESET MASTER TO 12345678; DO sleep(rand()*10); SHOW MASTER STATUS"
binlog.12345678 24795
# mysql -N <<< "RESET MASTER TO 1234567890; DO sleep(rand()*10); SHOW MASTER STATUS"
binlog.1234567890       13987
# mysql -N <<< "RESET MASTER TO 12345678901; DO sleep(rand()*10); SHOW MASTER STATUS"
ERROR 3567 (HY000) at line 1: The requested value '12345678901' for the next
binary log index is out of range. Please use a value between '1' and '2147483647'.
# mysql -N <<< "RESET MASTER TO $RANDOM; DO sleep(rand()*10); SHOW MASTER STATUS"
binlog.013529   89880
# mysql -N <<< "RESET MASTER TO $RANDOM; DO sleep(rand()*10); SHOW MASTER STATUS"
binlog.000831   22961
# mysql -N <<< "RESET MASTER TO $RANDOM; DO sleep(rand()*10); SHOW MASTER STATUS"
binlog.023089   107764
# mysql -N <<< "RESET MASTER TO $RANDOM; DO sleep(rand()*10); SHOW MASTER STATUS"
binlog.003433   67903
Many thanks Oracle for implementing my feature request, and a special mention to Daniël van Eeden for providing a patch in the bug report.
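
Judging only from the output above, the resulting file name appears to be the binary log base name followed by the index zero-padded to six digits, with indexes outside 1 to 2147483647 rejected.  The sketch below reproduces that naming rule in plain shell so it can be checked without a server; binlog_name_for_index is a hypothetical helper, not a MySQL command, and the padding rule is inferred from the output rather than from the server code.

```shell
# Hypothetical helper (not part of MySQL): predicts the file name that
# SHOW MASTER STATUS would report after RESET MASTER TO <index>, based
# only on the behaviour visible in the transcript above.
binlog_name_for_index() {
    bn_base=$1
    bn_index=$2
    # Accept digits only.
    case $bn_index in
        ''|*[!0-9]*) echo "not a number: $2" >&2; return 1 ;;
    esac
    # Strip leading zeros so printf does not read the value as octal.
    while [ "${bn_index#0}" != "$bn_index" ]; do
        bn_index=${bn_index#0}
    done
    # Enforce the range quoted in the error message (1 to 2147483647).
    if [ -z "$bn_index" ] || [ "${#bn_index}" -gt 10 ] ||
       [ "$bn_index" -gt 2147483647 ]; then
        echo "out of range (1 to 2147483647): $2" >&2
        return 1
    fi
    # Zero-pad to six digits; longer indexes are printed in full.
    printf '%s.%06d\n' "$bn_base" "$bn_index"
}

binlog_name_for_index binlog 12345        # prints binlog.012345
binlog_name_for_index binlog 12345678     # prints binlog.12345678
binlog_name_for_index binlog 12345678901 || echo "rejected"
```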

Monday, March 6, 2017

Better InnoDB Crash Recovery in MariaDB 10.1


Recently, I had to go through crash recovery of a large MariaDB 10.1.21 instance.  After starting MariaDB, I started tailing the error logs expecting to wait many minutes while InnoDB was scanning ibd files.  I was surprised (and actually delighted) with this:

Wednesday, February 8, 2017

A Metric for Tuning Parallel Replication in MySQL 5.7

MySQL 5.7 introduced the LOGICAL_CLOCK type of multi-threaded slave (MTS).  When using this type of parallel replication (and when slave_parallel_workers is greater than zero), slaves use information from the binary logs (written by the master) to run transactions in parallel.  However, enabling parallel replication on slaves might not be enough to get a higher replication throughput (VividCortex blogged about such a situation recently in Solving MySQL Replication Lag with LOGICAL_CLOCK and Calibrated Delay).  To get a faster slave with parallel replication, some tuning is needed on the master.
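
As a reminder of the slave-side part, the sketch below shows one way to switch a MySQL 5.7 slave to LOGICAL_CLOCK with a non-zero number of workers.  The function name enable_logical_clock_mts and the default of 4 workers are arbitrary choices made for illustration, and a mysql client able to connect with default credentials is assumed; the master-side tuning mentioned above is not covered here.

```shell
# Hypothetical helper (not from the post): enables LOGICAL_CLOCK
# parallel replication on a MySQL 5.7 slave.
# The default of 4 workers is an arbitrary example value.
enable_logical_clock_mts() {
    elc_workers=${1:-4}
    # The SQL thread is stopped first and restarted afterwards so that
    # the new settings are picked up.
    mysql <<SQL
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
SET GLOBAL slave_parallel_workers = ${elc_workers};
START SLAVE SQL_THREAD;
SQL
}
```

For example, enable_logical_clock_mts 8 would request eight applier workers.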

Friday, January 20, 2017

How upgrading MariaDB Server failed because 50M warnings were ignored

This post is part of the series "please do not ignore warnings in MySQL/MariaDB".  The previous post of the series can be found here.

In this post, I will present why ignoring warnings made me lose time in upgrading MariaDB Server.  I think this war story is entertaining to read and it is also worth presenting to people claiming that ignoring warnings is no big deal.

Wednesday, January 18, 2017

Why I wrote "please do not ignore warnings" and "to always investigate/fix warnings" (in MySQL/MariaDB)

In a recent post, I wrote the two following sentences:
  • please do not ignore warnings
  • always investigate/fix warnings
I realized that without context, this might be hard to understand.  In this post, I want to give more background about these two sentences.

Friday, January 13, 2017

Oracle MySQL and the funny replication breakage of Friday, January 13

In my previous post, I talked about a funny replication breakage that I experienced with MariaDB.  So what about different versions of MySQL...

Funny replication breakage of Friday, January 13

A funny replication breakage kept me at the office longer than expected today (Friday 13 is not kind to me).

So question of the day: can you guess what the below UPDATE statement does (or what is wrong with it) ?

Monday, October 3, 2016

Last Details about the Percona Live Amsterdam Community Dinner 2016

Here are the last details about the Percona Live 2016 Community Dinner hosted at Booking.com:
Booking.com will also do a short talk.  The subject is a surprise !

Friday, September 30, 2016

Percona Live Amsterdam and MariaDB Developer Meeting 2016: Tip to Stay Dry

Or should I say "to avoid getting soaked"...

The Amsterdam weather forecast for next week is out and, even if it looks good for the Percona Live MySQL and (No)SQL Conference (Monday to Wednesday) and for the MariaDB Developer Meeting (Thursday to Saturday), it could still change (or you might suffer the rain during the weekend):


Thursday, September 15, 2016

Please register to the Percona Live Amsterdam Community Dinner 2016

Percona Live Amsterdam is in 3 weeks, and on the evening of the second day of the conference (Tuesday October 4th 2016), there is the traditional Community Dinner.

As last year, Booking.com is hosting the event and as last year, canal boats will bring attendees from the conference venue to Booking.com headquarters.

This event involves some planning: Percona needs to arrange for canal boats and Booking.com needs to order food and drinks and plan for catering staff.  In both cases, too much or not enough is bad for obvious reasons, and this is why we ask you to register.

The normal sales end soon.  After that, commitments need to be made and we will have less flexibility for accommodating more people.

So please help us make the event a success by registering to the Percona Live Community Dinner.

And as a reminder, you can see a picture from last year below, and more pictures from the event and boat trip in Percona's Facebook album.

Saturday, August 20, 2016

A discussion about sync-master-info and other replication parameters

Some time ago, feedback was requested on new replication defaults after MySQL 5.7.  Some of the suggested defaults are:
I agree on the suggestions for relay-log-info-repository and relay-log-recovery: they are needed for crash safe replication, and having crash safe replication enabled by default is a good thing.  I have doubts about master-info-repository: I do not see what benefits are introduced by this change.  I have much bigger doubts about sync-master-info and sync-relay-log: those changes bring an illusion of safety, which is bad.  Let's dive into more detail about those parameters.

Saturday, July 16, 2016

Understanding Bulk Index Creation in InnoDB (and innodb_sort_buffer_size)

In a previous post, I presented an Unexpected Memory Consumption for Bulk Index Creation in InnoDB.  This was triggered by an increased innodb_sort_buffer_size and as stated in another post: "the sorting algorithm does not scale well with large sort buffers".  In this post, I will present why it does not scale well and I will suggest solutions.

Tuesday, July 5, 2016

Let's meet at Percona Live Amsterdam

I am very happy that my talk, MySQL Parallel Replication: inventory, use-cases and limitations, is included in the Sneak Peek of Percona Live Amsterdam.  As a member of the Conference Committee, I knew this was being discussed, but I refrained from commenting on discussion about my talk and the submissions of my colleagues from Booking.com.

Monday, April 11, 2016

MySQL Parallel Replication and more Booking.com talks at Percona Live (April 2016)

In a few days, I will be flying to San Francisco and then making my way to Santa Clara to attend the Percona Live Conference.  On the last day of the conference (Thursday), I will speak about MySQL Parallel Replication.  I hope to see you there and I will be happy to answer questions you might have (on this subject and others):

Sunday, January 24, 2016

Replication crash safety with MTS in MySQL 5.6 and 5.7: reality or illusion?

Reminder: MTS = Multi-Threaded Slave.

Update 2017-04-17: since the publication of this post, many things happened:
  • the procedure for fixing a crashed slave has been automated (Bug#77496)
  • Bug#80103 has been closed at the same time as Bug#77496
  • but I still think there are unfixed things, see Bug#81840
Update 2018-08-20: it was brought to my attention that MySQL 5.6 with GTID enabled and with sync_binlog != 1 is not replication crash safe, even with single threaded replication.  This is reported in Bug#70659 - Make crash safe slave work with gtid + less durable settings.  Thanks Valeriy for mentioning, in your last blog post, the bug from Sveta Smirnova (Bug#90997) which pointed me to Yoshinori Matsunobu's bug.  To my knowledge, MySQL 5.7 and 8.0 are also affected by this bug, even if this should not be the case.  I will soon open a new bug report about this and will put the bug number below.
End of updates.

I will be talking about parallel replication at FOSDEM in Brussels on January 30th and at Percona Live Santa Clara in April (link to the talk descriptions here and here).  Come to one (or both) of those talks to learn more about this subject.

Thursday, December 3, 2015

JFG proposed sessions for Percona Live Santa Clara (and Community voting)

This year, Percona introduced Community Voting for Percona Live submissions.  This is what you can read on the conference website:
In an effort to involve the larger community in the selection of speaking sessions for the 2016 Percona Live Data Performance Conference, we’ve implemented a community voting process. After a speaker submits a proposal we encourage sharing to the community and social networks for a vote. The more highly ranked proposals will continue onto the next phase of the voting process with the conference committee.

Saturday, October 17, 2015

Binlog Servers for Simplifying Point in Time Recovery

A common way to implement point in time recovery capability is:
  1. to regularly do a full backup of a database,
  2. and to save the binary logs of that database (or from its master if doing backups on a slave).
When point in time recovery is required you need to:
  a. restore a backup,
  b. and apply the binary logs up to the point of recovery.
(Steps #2 and #b above are the ones that will be simplified by using Binlog Servers.)
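
To make the two recovery steps concrete, here is a dry-run sketch that only prints the commands instead of running them.  It rests on assumptions that are not in the post: the full backup is a mysqldump file, the saved binary logs are local files, and pitr_plan is a hypothetical helper name.  The --stop-datetime option of mysqlbinlog is what bounds the replay at the recovery point.

```shell
# Hypothetical dry-run helper (not from the post): prints the commands
# for a point in time recovery without executing anything.
# Assumptions: the full backup is a mysqldump file, and the binary logs
# saved since that backup are passed as the remaining arguments.
pitr_plan() {
    pp_backup=$1
    pp_stop=$2
    shift 2
    # Restore the full backup.
    printf 'mysql < %s\n' "$pp_backup"
    # Replay the saved binary logs up to the recovery point.
    printf 'mysqlbinlog --stop-datetime="%s" %s | mysql\n' "$pp_stop" "$*"
}

pitr_plan backup.sql "2015-10-17 12:00:00" binlog.000001 binlog.000002
```

In a real recovery, the replay would also need to start from the binary log coordinates recorded with the backup (for example with mysqlbinlog --start-position), which this sketch leaves out.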