Wednesday, November 22, 2023

Thoughts on the October 2023 MySQL Releases

A few days ago, Oracle released three new MySQL GA versions: 8.2.0, 8.0.35 and 5.7.44.  I skimmed the release notes (8.2.0, 8.0.35 and 5.7.44), and I am not impressed.  I guess that I would be even less impressed / more disappointed if I had checked in greater detail, and if I had reviewed the 8.1.0, 8.0.34 and 5.7.43 release notes.  The subject of my disappointment is Oracle not fixing bugs in ALL of the LTS releases, sometimes only fixing them in the latest Innovation Release.  This post summarizes my findings and thoughts.

This post is written in the context of my work on MySQL at Aiven: check our careers page or blog, including our recent acquisition of EverSQL.

In this post, I do not cover new features because David Stokes does this well in his Quick Peek series on the Percona Blog: 8.2 and 8.0.35, 8.0.34 and 8.1.0, and 8.0.33.

While writing this post, I asked colleagues for bugs they would have liked fixed in 8.0.35 and 5.7.44.  Someone from Oracle contacted me asking for my list, so it looks as if they realize they could have done better and want to improve.  I am still publishing this post because I think my findings and thoughts benefit from being written in public.

The Long-Term Support and Innovation Release Models

Oracle looks committed to the new Long-Term Support (LTS) and Innovation Release (IR) models.  On the IR branch, 8.2.0 came out a few days ago (October 2023) and 8.1.0 in July 2023.  On the LTS branch, 8.0.35 and 8.0.34 came out at the same time as 8.2.0 and 8.1.0.  I know a lot of people experienced pain with the new features introduced on the 8.0 branch: hopefully, the last breaking changes were in 8.0.33 (and it looks as if Percona believes the same, as they removed the version check in XtraBackup 8.0.34).  I am happy that we can now take advantage of new features on the IR branch, or stay stable and up-to-date on the LTS branch, but I am surprised by the release cadence on the IR branch.  A new version every 3 months is a very aggressive feature release cadence (maybe even too aggressive, I will come back to this below).  That is it for the praise; now come the more problematic findings and thoughts.

Rushed 8.2.0

For me, the latest Innovation Release, 8.2.0, looks a little rushed.  A colleague found a missing package (Bug #112949: missing mysql-community-libs-compat rpms) and I think at least 2 deprecation notices have not been fully thought through (the option binlog_transaction_dependency_tracking and the table INFORMATION_SCHEMA.PROCESSLIST).  It is OK to deprecate features, but a clear path to supported features should be provided at the same time as the deprecation, which was not done in this case.  And this is why I wrote above that there should be fewer releases on the IR branch: one every 6 months is probably more than enough, once a year might even be suitable.  This would allow for better-thought-out deprecations and features (I have more to say on the release schedule, and I come back to it in the conclusion).

Below, I give more details about why I think more work was deserved for the two deprecations.  If you are not interested in the details, feel free to skip to the next section: Bugs Fixed in 8.2.0 but Not in 8.0.35.

In the case of the table I_S.PROCESSLIST, there is an option for SHOW PROCESSLIST to use the supported table p_s.processlist (performance_schema_show_processlist), but its default value still uses the deprecated information schema table.  IMHO, this shows a rushed deprecation without a well-thought-out path to the supported feature.  For that, I opened Bug #112872: Consider setting performance_schema_show_processlist to ON by default.
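
As an illustration (not taken from the bug report), here is a minimal sketch of opting in to the supported implementation by hand.  It assumes a server where performance_schema_show_processlist exists and an account allowed to change global system variables; the column list in the last query is just a convenient subset.

```sql
-- OFF (the default discussed above) means SHOW PROCESSLIST still uses
-- the deprecated Information Schema implementation.
SELECT @@GLOBAL.performance_schema_show_processlist;

-- Switch SHOW PROCESSLIST to the Performance Schema implementation
-- for the running server.
SET GLOBAL performance_schema_show_processlist = ON;

-- The supported table can also be queried directly.
SELECT ID, USER, HOST, DB, COMMAND, TIME, STATE
  FROM performance_schema.processlist;
```

SET GLOBAL does not survive a restart; setting the variable in the configuration file, or using SET PERSIST instead, should make the change permanent.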

In the case of the binlog_transaction_dependency_tracking option, it is unclear what the behavior will be once the option is removed.  The current values are COMMIT_ORDER, WRITESET and WRITESET_SESSION.  Once the option is gone, the best behavior is clearly not the current default (COMMIT_ORDER), which again shows a rushed deprecation.  The value I recommend in my Parallel Replication talk is WRITESET, but WRITESET_SESSION is not a bad option (it might lead to better replica throughput combined with replica_preserve_commit_order set to OFF).  But I think WRITESET_SESSION should not be the new default nor the only supported behavior because it does not produce good multi-threaded replication binary logs on replicas (it does not work well for chained replication or new primary promotion).  I did not open a bug for this, and I am waiting for the official plan from Oracle (but so far, the information I have is not very helpful).
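
For reference, a minimal sketch of checking and changing the option on a source.  It assumes an 8.0 or 8.2 server where the option still exists, transaction_write_set_extraction left at its 8.0 default (write sets enabled), and an account allowed to set global variables.  WRITESET here is only my recommendation from above, not an official Oracle plan.

```sql
-- Current dependency tracking mode (COMMIT_ORDER is the default).
SELECT @@GLOBAL.binlog_transaction_dependency_tracking;

-- Track dependencies with write sets for the running server.
SET GLOBAL binlog_transaction_dependency_tracking = WRITESET;

-- Show any warning raised by the previous statement
-- (for example, a deprecation notice on recent versions).
SHOW WARNINGS;
```

With chained replication, the same setting would need to be applied on every intermediate replica that writes its own binary logs.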

Bugs Fixed in 8.2.0 but Not in 8.0.35

The 8.0 branch is LTS and supported until April 2026.  With this, I expect bugs fixed in 8.2.0 to also be fixed in 8.0.35, but this is not what I see (I know of at least 8 such bugs).  Obviously, exceptions apply, including omission / occasional mistake (but 8 mistakes are a lot), or fixes needing new features that do not make sense to back-port (examples are given in the 5.7 section below).  In both of these exceptions, I expect transparency in the public bug system, either by acknowledging a mistake or by explaining why a bug is not fixed in all LTS branches.  IMHO, a bug opened in 8.0 and closed with a fix in 8.2 without a comment about 8.0 does not show a lot of consideration for the bug reporter.

I have not checked the July releases, but I guess there are also bugs fixed in 8.1.0 that are not fixed in 8.0.34.  It would be interesting to check these and see if they are fixed in 8.0.35, but I am leaving this to someone else (feel free to add a comment to this post if you find something).

Below, I give more details about what I identified.  If I missed something, your list might benefit others, so feel free to add a comment to this post or share what you have privately: LinkedIn, Twitter, jfgagne on MySQL Community Slack, or email.  If you are not interested in the details, feel free to skip to the next section: Bugs Fixed in 8.2 or 8.0 but Not in 5.7.

One of these bugs is internal (Bug #35616015).  A quick test showed 8.0.35 is affected, so I opened Bug #112959: Wrong rows examined in P_S and slow log for Index Merge.

Another internal bug (Bug #34833913) got the attention of a colleague.  He opened a public bug (Bug #112979: binlog_transaction_compression_level_zstd bug not fixed in 8.0.35) which was closed as a duplicate of a private bug (Bug #109199).  I will let you judge if the patch is complex to back-port; maybe Percona will fix this (PS-8990).

I commented on some of these bugs, they are:

I did not comment on all of them: I gave up once I realized it was a recurring pattern.  Some of these bugs are:

There is even a bug which is advertised as fixed in the release notes, but has not been closed in the public bug system.  I commented on it (Bug #111355: Derived condition pushdown return wrong results).

Bugs Fixed in 8.2 or 8.0 but Not in 5.7

In the same way that some bugs are fixed in 8.2.0 and not in 8.0.35, some bugs are fixed in 8.2.0 (or 8.0.35) and not in 5.7.44.  I think this is a bigger problem than bugs not being fixed in 8.0.35: for 8.0, there is still a chance for a fix to make its way into the LTS branch, but because 5.7.44 is advertised as the last release of 5.7, these bugs will stay unfixed in 5.7.

It looks as if Oracle is not committed to fixing bugs in a previous LTS once a new LTS is out

This observation does not inspire confidence in the maintenance of 8.0 once the next LTS is out.  MySQL 5.7 was supposed to be supported until Oct 2023, but we have evidence that bugs were left unaddressed in 5.7.  Examples of bugs I expected to be fixed in 5.7.44 are:

Outside the context of 8.2.0 and 8.0.35, I have another example of a 5.7 bug which did not receive the expected attention: Bug #107059: Bad compression stats in I_S.INNODB_CMP.  This bug, a regression in 5.7.27 not affecting 8.0, has not yet been fixed in 5.7, but its sister bug (PS-8749) was fixed in Percona Server 5.7.42-45: thanks Percona for this.

As I described for bugs fixed in 8.2.0 and not in 8.0.35, exceptions also apply for 5.7.  One such exception is when fixing a bug implies a complete refactor of the code, or the introduction of a new feature (like Bug #199 needed the new Data Dictionary in 8.0).  This might also apply to Bug #109595: records_in_range does too many disk reads (8.0.31, 5.7.39 and 5.6.51).  But for exceptions, I still expect transparency in the public bug system, and this is not what we usually get, including in Bug #107574: MTR deadlocks when preserving commit order and changing read_only.

Again, if I missed something, your list might benefit others, so feel free to add a comment to this post or contact me in private: LinkedIn, Twitter, jfgagne on MySQL Community Slack, or email.

Conclusion

As I wrote above, Oracle contacted me asking for the list of bugs that were fixed in 8.2.0 and that should be fixed in 8.0.35.  So it looks as if they realize they could have done better and want to improve.  I still think publishing this post has value because it can be referenced if things derail again in the future or end up not evolving in the right direction.  Until we see things improve, we have Percona addressing bugs that do not receive attention from upstream, and I am happy they are there (I already mentioned PS-8749; another example, still unfixed in all Oracle MySQL versions, is PS-8785: Metrics not incremented for 1st iteration in buf_LRU_free_from_common_LRU_list).

I mentioned that I am surprised by the release cadence of the Innovation Release branch: one new IR every 3 months is very aggressive.  I am not sure customers need new features so often; having them every 6 months is more than enough.  What we need, though, is well-thought-out features, and for this, I can wait up to 12 months.  But we also need stability and bug fixes, so when skipping an IR release, Oracle could provide a minor release for the previous IR.  This would give customers an option to get critical bug fixes, and might increase the adoption of IRs as each could get one to three minor releases with bug fixes.

10 comments:

  1. Strongly disagree on the less frequent innovation releases being better. We, the community, have the option of ignoring them if they are too frequent for us, but we don't have the option of pulling the code somehow to public if they are too infrequent. Having no public real time source code branches, the frequent releases is the least bad option.

    1. Thanks for joining the discussion Laurynas. You bring an interesting angle to it, not that it significantly changes my opinion, but it is worth discussing.

      > Having no public real time source code branches, the frequent releases is the least bad option.

      If I understand correctly, what you like about the frequent release cadence of the IR branch is that it makes code available more often. As I wrote above, it is an interesting angle, but I think it mostly applies to the people maintaining a fork of MySQL, not the general user population.

      It is my belief, and I could change my mind, that the majority of users value stability and bug fixes over code availability. They also want new features, but these should also come with stability and bug fixes. And in that sense, the original 8.0 branch (pre LTS) was not matching the stability requirement because the new features introduced in it were breaking things more often.

      With the above established, we can infer that most users will want to stay on the LTS branches. And the main point of my post stands: Oracle should fix bugs in all LTS branches.

      We then arrive at the release cadence of the IR branch. You bring up code availability, I still stick to stability. In the current release model, once users want new features that are not in LTS, they need to sacrifice stability by jumping on the IR boat. I suggest a world where there would be fewer IRs, and some minor releases on IRs, which would lead to more stability.

      We could imagine a more complex world, combining code availability, stability, and better features: introducing release candidates in the IR model. For the October 2023 releases, this would have meant releasing bug fixes on 8.1 (8.1.1-ga) and an RC on 8.2 (8.2.0-rc). I like the RC model because it allows releasing new features, having them tested by the community, and eventually adapting things in a next RC or GA. The next releases (January 2024) could include an 8.1.2-ga, an 8.2.1 (rc or ga), and eventually an 8.3.0-rc. This could combine stability, the possibility of iterating on features, and code availability. But then, the burden is on Oracle to maintain more branches, which is something they will resist.

      It is a complex subject that can be approached from many angles. My first priority is stability and then new features. I forgot the code availability angle, thanks for shedding light on my blind spot. But it does not change my opinion on preferring less frequent and better-thought-out releases with a stability option, unless an RC model can be implemented (I might change my mind if my work brings me closer to maintaining a fork of MySQL, which is closer to your world as this was what you were doing at Percona).

  2. Thank you for this post, I'm currently trying to find the next go-to version considering our needs and had similar observations and concerns. For example, I was puzzled what should be done with binlog_transaction_dependency_tracking as here: https://dev.mysql.com/doc/refman/8.2/en/replication-options-replica.html#sysvar_replica_parallel_type we can read:

    "When the replication topology uses multiple levels of replicas, LOGICAL_CLOCK may achieve less parallelization for each level the replica is away from the source. To compensate for this effect, you should set binlog_transaction_dependency_tracking to WRITESET or WRITESET_SESSION on the source as well as on every intermediate replica to specify that write sets are used instead of timestamps for parallelization where possible."

    So it is deprecated, yet they still suggest to set it to some specific value...

    Another point is that even though Oracle has promised 8.0.34+ version to be bug fixes only, both 8.0.34 and 8.0.35 have some changes, which at least to me, don't look like just bug fixing ones. As an example - starting with 8.0.34 MySQL is spamming mysqld.log with warnings when mysql_native_password authentication is used... Maybe not a real change, yet, quite annoying (and "fixable" with log_error_suppression_list='MY-013360').

    1. > So it is deprecated, yet they still suggest to set it to some specific value...

      Nice find: I suggest you open a bug on this.

      > starting with 8.0.34 MySQL is spamming mysqld.log

      This is basically a deprecation back-port, I personally think it is ok in a minor version, but I know others disagree.

    2. I would say, it should be acceptable to say that some feature is deprecated, but logging warning for each login attempt is a bit too much for the version which is supposed to be: "MySQL 8.0.34+ will become bugfix only release (red)" ;) (as written https://dev.mysql.com/blog-archive/introducing-mysql-innovation-and-long-term-support-lts-versions)

    3. > logging warning for each login attempt

      I agree with you: a better way of doing this would be to warn, on startup, that mysql_native_password is enabled but deprecated, and only warn on the first login of a user with mysql_native_password. Do you want to open a bug about this?

    4. Not really, if we decide to go with this version we will use log_error_suppression_list if needed.

  3. Other bugs were brought to my attention via private message.

    Bug #112035 [1]: Materializing performance_schema.data_locks can lead to excessive mem usage/OOM.

    Bug #112946 [2]: INSTANT algo <= 8028 on table in mysql schema leads to corruption post upgrade.

    Bug #113050 [3]: Table Inaccessible after upgrade with in 8.0 and upgrade failures from 5.7 to 8.

    [1]: https://bugs.mysql.com/bug.php?id=112035
    [2]: https://bugs.mysql.com/bug.php?id=112946
    [3]: https://bugs.mysql.com/bug.php?id=113050

    Bug #112035 is advertised as fixed in 8.3, but nothing is said about 8.0. Hopefully, we caught this one in time and it will end up being fixed in 8.0.36 (I reported it to my Oracle contact who asked for my bug list).

    Bug #112946 is about a bug fixed in 8.2 but not fixed in 8.0. I will let you judge if the fix is too complex to be back-ported [4].

    [4]: https://github.com/mysql/mysql-server/commit/68da9a26c56c1f522483fd7b2da810c3e4ccc88f

    Bug #113050 is about a rushed deprecation in 8.0.29: a feature was deprecated, but upgrading makes tables using this feature unusable.

    1. And I brought Bug #112946 to the attention of Percona [1], maybe they will backport / downstream the fix in their 8.0.35.
      [1]: https://jira.percona.com/browse/PS-9011

  4. Thx for the analysis. My guess is that most of the missing bugs are either mistakes or decisions that a bug fix was more dangerous than the bug it fixes. I would give Oracle a few releases to learn the new way of handling bug fixes. Any organisation with hundreds of developers will require some time when the ship changes direction. This is most likely why they are interested in your list.
