Showing posts with label GTID. Show all posts

Saturday, December 5, 2020

At my FOSDEM talk earlier this year, I gave a trick for fixing a crashed GTID replica. I never blogged about it, so now is a good time. What is pushing me to write about it today is my talk at MinervaDB Athena 2020 this Friday, where I will present more details about MySQL replication crash safety. So you know what to do if you want to learn more about this subject. Now let's talk about voodoo.

Monday, January 27, 2020
A Legacy Behavior of MySQL Corrupting Restored Backups (replicate-same-server-id = OFF)
In my previous post (Puzzled by MySQL Replication), I described a weird, but completely documented, behavior of replication that had me scratching my head for hours because it was causing data corruption. I did not give many details then, as I also wanted to let you scratch your head if you wished. In this post, I describe this behavior in more detail.
Thursday, January 9, 2020
Puzzled by MySQL Replication (War Story)
Recently, I was puzzled by MySQL replication! Some weird, but completely documented, behavior of replication had me scratching my head for hours. I am sharing this war story so you can avoid losing time as I did (and also maybe avoid corrupting your data when restoring a backup). The exact explanation will come in a follow-up post, so you can also scratch your head trying to understand what I faced. So let's dive in.
Tuesday, February 26, 2019
MySQL Master High Availability and Failover: more thoughts
Some months ago, Shlomi Noach published a series about Service Discovery. In his posts, Shlomi describes many ways for an application to find the master. He also gives details on how these solutions cope with failing over to a slave, including their integration with Orchestrator.
This is a great series, and I recommend reading it to everybody implementing master failover, with or without Orchestrator, even if you are not fully automating the process yet. Taking a step back, I realized that service discovery is only one of the five parts of a full MySQL Master Failover Strategy; this post is about these five parts. In some follow-up posts, I might analyze some deployments using the framework presented in this post.
Tuesday, February 12, 2019
MySQL Master Replication Crash Safety Part #3: GTID
This is a follow-up post in the MySQL Master Replication Crash Safety series. In the two previous posts, we explored the consequences of reducing durability on masters (including setting sync_binlog to a value different from 1) when slaves use legacy file+position replication. In this post, we cover GTID replication. This introduces a new inconsistency scenario, with a potential replication breakage that depends on transaction execution on the master and on timing on the slave. Before discussing this violation of ACID, we start with some reminders about the previous posts and some explanations about GTIDs.
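As a reminder, the fully durable master settings that this series contrasts with the reduced-durability ones can be sketched in a minimal my.cnf fragment (these are the two MySQL variables the series revolves around):

```ini
# Fully durable master settings:
# fsync the binary log after every transaction (group).
sync_binlog = 1
# Flush and sync the InnoDB redo log at each transaction commit.
innodb_flush_log_at_trx_commit = 1
```

Reducing durability means relaxing one or both of these, which is where the crash-safety scenarios of the series come from.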
Tuesday, September 11, 2018
Unforeseen use case of my GTID work: replicating from AWS Aurora to Google CloudSQL
A colleague brought an article to my attention. I did not see it on Planet MySQL where I get most of the MySQL news (or it did not catch my eye there). As it is interesting replication stuff, I think it is important to bring it to the attention of the MySQL Community, so I am writing this short post.
The surprising part for me is that it uses my 4-year-old work for online migration to GTID with MySQL 5.6. This is a completely unforeseen use case of my work, as I never thought that my hack would be useful after Oracle included an online migration path to GTID in MySQL 5.7 (Percona did something similar for MySQL 5.6).
Thursday, October 15, 2015
Do not run those commands with MariaDB GTIDs - part # 2
Update 2016-01-30: restarting the IO_THREAD might be considered useful in some situations (avoiding MDEV-9138). Look for "in contrast, if the IO thread was also stopped first" in MDEV-6589 for more information.
In a previous post, I listed some sequences of commands that you should not run on a MariaDB slave that is lagging and using the GTID protocol. Those are the following (do not run them, it's a trap):
- "STOP SLAVE; START SLAVE UNTIL ...;",
- or "STOP SLAVE; START SLAVE;" (to remove an UNTIL condition as an example),
- or "STOP SLAVE; SET GLOBAL slave_parallel_threads=...; START SLAVE;",
- and maybe others.
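Written out as they would be typed on the slave, the sequences above look like this (the UNTIL position and thread count are hypothetical example values, not recommendations):

```sql
-- Do NOT run these on a lagging MariaDB slave using the GTID protocol.

-- Sequence 1: stop, then restart with an UNTIL condition.
STOP SLAVE;
START SLAVE UNTIL master_gtid_pos = '0-1-100';

-- Sequence 2: stop, change the parallel applier setting, restart.
STOP SLAVE;
SET GLOBAL slave_parallel_threads = 4;
START SLAVE;
```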
Monday, October 12, 2015
Do not run those commands with MariaDB GTIDs - part # 1
In the spirit of sharing war stories and helping others avoid the same mistakes I made, here are some sequences of commands that you should avoid running on a MariaDB slave that is lagging and using the GTID protocol. Remember, do not run those because...
Thursday, April 23, 2015
Self-Critic and Slides of my PLMCE Talks
The link to the slides of my talks can be found at the end of this post but first, let me share some thoughts about PLMCE.
Talking with people, I was surprised to be criticized for presenting only the good sides of my solution without giving credit to the good sides of the alternative solutions. More than surprised, I was also a little shocked, as I want to be perceived as being as objective as possible. Let me try to fix that:
Wednesday, April 8, 2015
Even Easier Master Promotion (and High Availability) for MySQL (no need to touch any slave)
Dealing with the failure of a MySQL master is not simple. The most common solution is to promote a slave as the new master, but in an environment where you have many slaves, the asynchronous implementation of replication gets in your way. The problem is that each slave might be in a different state:
- some could be very close to the dead master,
- some could be missing the latest transactions,
- and some could be far behind (lagging, delayed slaves, or slaves in maintenance).
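To see how far each slave is, its GTID sets can be compared against the others; a minimal sketch with MySQL GTIDs (the exact fields available depend on your MySQL version):

```sql
-- On each slave, compare GTID sets to find the most up-to-date candidate.
SHOW SLAVE STATUS\G
-- Relevant fields: Retrieved_Gtid_Set (transactions fetched from the master)
-- and Executed_Gtid_Set (transactions already applied locally).

-- The executed set is also available as a global variable:
SELECT @@GLOBAL.gtid_executed;
```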
Wednesday, March 25, 2015
Follow up on MySQL 5.6 GTIDs: Evaluation and Online Migration
One year ago, I blogged about Evaluation and Online Migration of MySQL 5.6 GTIDs. At that time, we set up the following test environment where:
- A is a production master with GTIDs disabled,
- D to Z are standard slaves with GTIDs disabled,
- B is an intermediate master running my recompiled version of MySQL implementing the ANONYMOUS_IN-GTID_OUT mode (see the details in my previous post),
- C is a slave with GTID enabled.