Wednesday, May 1, 2019

Care with using the max_connections beta database flag on CloudSQL...

War story of the day: do not use the max_connections beta database flag on CloudSQL, or be very careful when using it, because it has many bugs.

I was hit by this today: we set the max_connections flag to 8000 on a primary server a few days ago, we had a failover last night, and the flag was not set on the replica (bug #1).

Update 2019-05-03: there is another bug (bug #1.5).  If you set max_connections to 8000 and restart the instance, the value goes back to its default of 4030.  So the flag is not persisted across restarts.  There is a public issue about this: https://issuetracker.google.com/131813062.

Also, if you set the max_connections flag to some value (8000 in my case) and then un-set it, the value goes back to 151 (the MySQL 5.7 default) instead of the CloudSQL default of 4030 (bug #2).

Update 2019-05-03 bis: there is a public issue about bug #2: https://issuetracker.google.com/issues/131819924.
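Since the value can change silently after a failover, a restart, or an un-set, one cheap safeguard is to compare the live value against the one you expect. Here is a minimal sketch, assuming a Python DB-API cursor from whatever MySQL driver you use and that rows come back as plain tuples; the function names are mine, not part of any CloudSQL tooling.

```python
def read_max_connections(cursor):
    """Return the server's live max_connections using a DB-API cursor."""
    cursor.execute("SHOW GLOBAL VARIABLES LIKE 'max_connections'")
    # Assumption: the row is a (Variable_name, Value) tuple.
    # int() copes with the value arriving as a string or as a number.
    row = cursor.fetchone()
    return int(row[1])


def check_max_connections(cursor, expected):
    """Return None if the live value matches, else a warning message."""
    actual = read_max_connections(cursor)
    if actual == expected:
        return None
    return f"max_connections is {actual}, expected {expected}"
```

Run from a cron job or a monitoring probe with expected=8000, this would have flagged the drop to 4030 or 151 before the connections ran out.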

And I can tell you that MySQL with 3000 open connections does not like it when max_connections drops to 151!  Also, good luck killing those connections: even if you have a root session open to the database, you cannot kill connections belonging to another user (you are not SUPER in CloudSQL).  And because you are not SUPER, you cannot increase max_connections.  And because all connections to the database are taken, you cannot:
  • increase the value of the database max_connections flag via the Google Cloud Platform console,
  • restart the instance,
  • or fall back to the failover replica.
Update 2019-05-08: there is a public issue about not being able to restart an instance that reached max_connections: https://issuetracker.google.com/issues/132204816.

Update 2019-05-15: there is a public feature request for enabling failover when the master is unreachable: https://issuetracker.google.com/issues/132662912.

Your only option is to gracefully shut down the application using all the connections (from the client side) to go below 151 and get back control over CloudSQL.
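To give an idea of the scale of that shutdown, here is the arithmetic as a small sketch. The helper name is made up for illustration, and it assumes "going below" the limit means ending with strictly fewer open connections than max_connections.

```python
def connections_to_close(open_connections, max_connections):
    """Number of client connections to close to end up below the limit."""
    # Ending strictly below the limit means keeping at most max_connections - 1.
    excess = open_connections - max_connections + 1
    return max(0, excess)


# With the numbers from this post: 3000 open connections, limit back at 151.
print(connections_to_close(3000, 151))  # prints 2850
```

In other words, almost all of the application had to go away before CloudSQL became manageable again.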

Hopefully, this will be fixed quickly (I have a support case open on this).

Update 2019-05-06: a somewhat related public issue (gcloud says that the instance will be restarted after changing max_connections): https://issuetracker.google.com/issues/132035806.

And hopefully, sharing this will prevent you from hitting the same problems I did.

2 comments:

  1. Thanks for sharing this JF. This is a very terrible bug indeed!

  2. That is a steenky one... Hope they fix it soon.
