Friday, November 12, 2021

Duplicate Entry in ALTER TABLE (and in OPTIMIZE TABLE)

A few weeks ago, and in MySQL 5.7, I had an ALTER TABLE that failed with a duplicate entry error.  This is old news as it has been happening since MySQL 5.6, but I only saw it recently because I normally use online schema change from the Percona Toolkit (pt-osc) or GitHub's online schema migration (gh-ost).  I do not like this behavior and I am disappointed it has not been improved, so this post is partially a rant, but not only: I also explain why ALTER TABLE is failing and I give solutions to avoid the failure.

So ALTER TABLE can fail with a duplicate entry error; let's see this:

mysql [localhost:8027] {msandbox} (test_jfg) > ALTER TABLE t DROP COLUMN v2;
ERROR 1062 (23000): Duplicate entry '1' for key 't.v'

As you can see, this is not a complicated ALTER.  The above fails with MySQL 8.0.27, and I am also able to reproduce it with 5.7.36 and 5.6.51, but not with 5.5.62.  The full reproduction test case is at the end of this post.

I am not the first one to see this.  When googling "MySQL ALTER TABLE Duplicate entry", I got these results (and yes, OPTIMIZE TABLE has a similar problem):

In the links above, we can find a reference to the Online DDL Limitations section of the MySQL documentation:

When running an in-place online DDL operation, the thread that runs the ALTER TABLE statement applies an online log of DML operations that were run concurrently on the same table from other connection threads. When the DML operations are applied, it is possible to encounter a duplicate key entry error (ERROR 1062 (23000): Duplicate entry), even if the duplicate entry is only temporary and would be reverted by a later entry in the online log. This is similar to the idea of a foreign key constraint check in InnoDB in which constraints must hold during a transaction.

It looks like the online DDL improvements of MySQL 5.6 introduced this behavior.  In my humble opinion, this is a regression: a perfectly working ALTER in 5.5 now throws an error in 5.6, 5.7 and 8.0.  From my understanding, the error condition is the following:

While the ALTER is running,
anything causing a duplicate entry, even a "silent" one,
makes the ALTER fail in the "log apply" phase.

An example of a "silent" duplicate entry is one "hidden" by INSERT ... ON DUPLICATE KEY UPDATE (IODKU).  Yes, the ALTER fails with a duplicate entry error if an IODKU ends up doing an UPDATE while the table is altered (what the documentation calls a temporary duplicate entry)!  And the ALTER does not fail at the moment of the succeeding IODKU (or at the moment of the failing INSERT); it fails much later, maybe hours later, at the log apply phase.  This is a terrible default behavior that is very annoying!
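
To make this concrete, here is a minimal sketch of the scenario, using the test table of the reproduction at the end of this post (where the column v has a UNIQUE KEY):

-- Session 1: start a long-running online ALTER on a big table.
ALTER TABLE t DROP COLUMN c02;

-- Session 2, while the ALTER above is still running: a row with v = 1 already
-- exists, so this statement silently becomes an UPDATE and succeeds.
INSERT INTO t (v) VALUES (1) ON DUPLICATE KEY UPDATE v2 = 0;

-- Much later, when Session 1 reaches the log apply phase, the ALTER fails with:
-- ERROR 1062 (23000): Duplicate entry '1' for key 't.v'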

Before ranting even more, let me tell you how to avoid this error.  The first solution is to use a blocking DDL by adding LOCK=EXCLUSIVE to the ALTER.  A non-blocking solution, which is also friendlier to replication (an online ALTER might be partially non-blocking on the primary, but it will block the SQL thread on replicas, which might cause lag), is to use pt-osc or gh-ost.  Finally, another solution is to configure MySQL not to use online DDL by setting the variable old_alter_table to ON, which can be set at the session or at the global level (this is the only solution I found for OPTIMIZE TABLE).
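
As a minimal sketch of these two server-side workarounds, again using the test table of the reproduction section:

-- Workaround 1: make the ALTER blocking; with no concurrent DML, there is no
-- online log to apply, so no spurious duplicate entry error.
ALTER TABLE t DROP COLUMN c02, LOCK=EXCLUSIVE;

-- Workaround 2: fall back to the old table-copy behavior for this session.
-- This also covers OPTIMIZE TABLE, which accepts no LOCK clause.
SET SESSION old_alter_table = ON;
ALTER TABLE t DROP COLUMN c03;
OPTIMIZE TABLE t;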

If you are not interested in the rant, you should stop reading now.

The first annoyance about ALTER failing with a duplicate entry error is the misleading error message.  The first time I saw this, I asked myself "How can this be?", "Did I do something wrong?", "Is my data corrupted?", and I was worried.  To be more exact and less worrying, the error message should be something like "Online ALTER failed during the log apply phase because of a concurrent duplicate entry (or INSERT ... ON DUPLICATE KEY UPDATE)", and ideally, it should suggest a solution to this problem (add LOCK=EXCLUSIVE to the ALTER, or use SET old_alter_table = ON).  And for this, I opened Bug#104608 - Improve error message for ALTER failing because concurrent Duplicate entry.

The second annoyance is the unfriendly behavior.  For me, this failure mode should not be the default.  I like that MySQL is implementing online DDL, but online DDL that could fail should not be the default behavior.  I would prefer, when introducing an improvement that could fail, that benefiting from the improvement be an explicit option (in this case, LOCK=NONE), with the previous behavior staying the default.  I understand that it is now a little late to go back to the MySQL 5.5 behavior because it has become expected that some DDLs are online, but I wish we were not where we are today on this.  This is now part of the Dark Side of MySQL until it is fixed (and this is why I am calling this situation the ONLINE DDL purgatory in the next bug report).

A third annoyance about this behavior is that it has not been improved in 8 years (MySQL 5.6 went GA in February 2013)!  Improving the error message as I suggested above is one thing; removing this failure condition is another.  I realize the latter might be complicated, but let's start with the error message.  Once the message is better, I do not think it is unrealistic to hope for something more.  And for this, I opened Bug#104626 - Remove failure of Online ALTER because concurrent Duplicate entry.

That is all I have for today.  If you have historical context that might change my mind about this behavior being unfriendly, or if you have other thoughts on the subject, feel free to leave a comment below.


Reproducing the Duplicate Entry Error During ALTER TABLE with dbdeployer


# Create a sandbox:
dbdeployer deploy single mysql_8.0.27

# Initialize the test environment:
./use <<< "
  CREATE DATABASE test_jfg;
  CREATE TABLE test_jfg.t(
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    v INT, v2 INT, UNIQUE KEY(v))"
 
# Make rows big to be able to quickly generate a big table (each row is close to 8 KB):
seq -f "ALTER TABLE t add column c%02.0f CHAR(255) DEFAULT '';" 1 30 | ./use test_jfg

# Make commit fast so loading is not too long:
./use <<< "set global sync_binlog = 0, global innodb_flush_log_at_trx_commit = 0"

# Fill the table with about 1 GB of data:
seq -f "INSERT INTO t (v) VALUES (%0.0f);" 1 $((1024*1024/8)) | ./use test_jfg

# Generate the error with a failing INSERT:
./use -N <<< "SELECT VERSION()" && 
  ( sleep 1; ./use test_jfg <<< "INSERT INTO t (v) VALUES (1)" || echo "INSERT KO $(date)"; ) & \
  ./use test_jfg <<< "ALTER TABLE t DROP COLUMN c01" && echo "ALTER OK $(date)" || echo "ALTER KO $(date)"
[1] 23767
8.0.27
ERROR 1062 (23000) at line 1: Duplicate entry '1' for key 't.v'
INSERT KO Mon Nov  8 21:56:29 UTC 2021
ERROR 1062 (23000) at line 1: Duplicate entry '1' for key 't.v'
[1]+  Done                    ./use -N <<< "SELECT VERSION()" && ( sleep 1; ./use test_jfg <<< "INSERT INTO t (v) VALUES (1)" || echo "INSERT KO $(date)" )
ALTER KO Mon Nov  8 21:57:23 UTC 2021

# In above, the first duplicate entry is from the INSERT and the 2nd is from the ALTER.
#   We can see that the INSERT completes before the ALTER (because of the online DDL),
#   which is not the case with 5.5.62 below (INSERT blocked and ALTER succeeding).
./use -N <<< "SELECT VERSION()" && 
  ( sleep 1; date; ./use test_jfg <<< "INSERT INTO t (v) VALUES (1)" || echo "INSERT KO $(date)"; ) & \
  ./use test_jfg <<< "ALTER TABLE t DROP COLUMN c01" && echo "ALTER OK $(date)" || echo "ALTER KO $(date)"
[1] 10052
5.5.62
Mon Nov  8 23:31:45 UTC 2021
ERROR 1062 (23000) at line 1: Duplicate entry '1' for key 'v'
ALTER OK Mon Nov  8 23:32:07 UTC 2021
INSERT KO Mon Nov  8 23:32:07 UTC 2021
[1]+  Done                    ./use -N <<< "SELECT VERSION()" && ( sleep 1; date; ./use test_jfg <<< "INSERT INTO t (v) VALUES (1)" || echo "INSERT KO $(date)" )

# And we have a similar error with OPTIMIZE TABLE:
./use -N <<< "SELECT VERSION()" && 
  ( sleep 1; ./use test_jfg <<< "INSERT INTO t (v) VALUES (1)" || echo "INSERT KO $(date)"; ) & \
  ./use test_jfg <<< "OPTIMIZE TABLE t"; echo "OPTIMIZE $(date)"
[1] 31753
8.0.27
ERROR 1062 (23000) at line 1: Duplicate entry '1' for key 't.v'
INSERT KO Mon Nov  8 22:05:48 UTC 2021
Table   Op      Msg_type        Msg_text
test_jfg.t      optimize        note    Table does not support optimize, doing recreate + analyze instead
test_jfg.t      optimize        error   Duplicate entry '1' for key 't.v'
test_jfg.t      optimize        status  Operation failed
[1]+  Done                    ./use -N <<< "SELECT VERSION()" && ( sleep 1; ./use test_jfg <<< "INSERT INTO t (v) VALUES (1)" || echo "INSERT KO $(date)" )
OPTIMIZE Mon Nov  8 22:06:53 UTC 2021

# For completeness, generate the error with an INSERT ON DUPLICATE KEY UPDATE:
./use -N <<< "SELECT VERSION()" && 
  ( sleep 1; ./use test_jfg <<< "INSERT INTO t (v) VALUES (1) ON DUPLICATE KEY UPDATE v2=0" && echo "IODKU OK $(date)"; ) & \
  ./use test_jfg <<< "ALTER TABLE t DROP COLUMN c02" && echo "ALTER OK $(date)" || echo "ALTER KO $(date)"
[1] 2356
8.0.27
IODKU OK Mon Nov  8 22:08:58 UTC 2021
ERROR 1062 (23000) at line 1: Duplicate entry '1' for key 't.v'
[1]+  Done                    ./use -N <<< "SELECT VERSION()" && ( sleep 1; ./use test_jfg <<< "INSERT INTO t (v) VALUES (1) ON DUPLICATE KEY UPDATE v2=0" && echo "IODKU OK $(date)" )
ALTER KO Mon Nov  8 22:10:06 UTC 2021

6 comments:

  1. Sorry, my bad. I designed and implemented the online ADD INDEX (MySQL WL#5526) and online table rebuild (MySQL WL#6255) in MySQL 5.6.

    The reason for the duplicate key error is that all changes are logged as they are made. That is, BEGIN;INSERT;ROLLBACK; would be logged as an INSERT followed by a DELETE.

    In https://jira.mariadb.org/browse/MDEV-16329 you can find some notes about this. I hope that together with https://jira.mariadb.org/browse/MDEV-11675 it will reduce the need for external online schema change tools. When a future MariaDB Server version implements MDEV-16329, I would think that you could simply write
    ALTER TABLE t DROP COLUMN v, ALGORITHM=COPY;
    to have it rebuilt online, without any duplicate key errors.

    The logging for the InnoDB native ALTER TABLE implementation would still have to be improved.

    For what it is worth, MariaDB Server included a number of improvements to ALTER TABLE some years ago, in the 10.4 release. Among other things, DROP COLUMN could be executed instantly, without changing the data format; as a consequence, for example, a dropped INT NOT NULL column would continue to waste some space in each new row. Such waste can be disabled by setting innodb_instant_alter_column_allowed=add_last (to allow instant ADD COLUMN) or innodb_instant_alter_column_allowed=never. This should allow you to import tablename.ibd files to older servers.

  2. Jean-François, I see that you had posted comments to https://bugs.mysql.com/bug.php?id=98600 and https://bugs.mysql.com/bug.php?id=76895 about this same problem.

    I found that https://jira.mariadb.org/browse/MDEV-15250 had been closed due to lack of feedback from the reporter. I just reopened that ticket and posted an idea how to fix it.

  3. Hi JF, I didn't realize this issue was a problem with MySQL 8 as well, probably because most of us resort to one of the online schema change tools. However, it makes me rather happy that this will be taken up by the best and that the issue will finally see a conclusion.

  4. MariaDB 10.6.8, 10.7.4, 10.8.3 with the fix of MDEV-15250 are now available for download. Thanks to Thirunarayanan Balathandayuthapani for implementing my idea. Like my original implementation of MySQL online ALTER TABLE in 2012, this rewrite was tested by Matthias Leich using the Random Query Generator (RQG). Unlike back then, debugging was much easier, thanks to AddressSanitizer and https://rr-project.org. When the server is run under rr, we get debuggable execution traces leading to failures, which saves a lot of guesswork that would be necessary if we only got core dumps.
