Wednesday, May 1, 2019

Care with using the max_connections beta database flag on CloudSQL...

War story of the day: do not use the max_connections beta database flag on CloudSQL (or be very careful if you do), because it has many bugs.

I was hit by this today: we set the max_connections flag to 8000 on a primary server a few days ago, we had a failover last night, and the flag was not set on the replica (bug #1).

Update 2019-05-03: there is another bug (bug #1.5). If you set max_connections to 8000 and restart the instance, the value goes back to its default of 4030. So the flag is not persisted across restarts. There is a public issue about this: https://issuetracker.google.com/131813062.

Also, if you set the max_connections flag to some value (8000 in my case) and then unset it, the value goes back to 151 (the MySQL 5.7 default) instead of the CloudSQL default of 4030 (bug #2).

Update 2019-05-03 bis: there is a public issue about bug #2: https://issuetracker.google.com/issues/131819924.
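All three bugs above show up as the same symptom: the running value of max_connections is not the one you configured. Here is a minimal sketch of a drift check that could run after every failover or restart. It assumes a Python DB-API cursor from whatever MySQL driver you use; the names EXPECTED_MAX_CONNECTIONS, read_max_connections and check_drift are made up for this example, and 8000 is simply the value from my case.

```python
# Hypothetical drift check: compare the running max_connections with the
# value we believe we configured. All names here are illustrative.
EXPECTED_MAX_CONNECTIONS = 8000  # assumption: the value set via the flag


def read_max_connections(cursor):
    """Read the running global max_connections through a DB-API cursor."""
    cursor.execute("SELECT @@global.max_connections")
    row = cursor.fetchone()
    return int(row[0])


def check_drift(observed, expected=EXPECTED_MAX_CONNECTIONS):
    """Return None if the value is as expected, else a warning message."""
    if observed == expected:
        return None
    return f"max_connections is {observed}, expected {expected}"
```

Feeding the result of read_max_connections into check_drift from a cron job or a monitoring probe would have flagged both the 4030 and the 151 cases before the connection count got anywhere near the limit.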

And I can tell you that MySQL with 3000 open connections does not like it when max_connections goes to 151! Also, good luck with killing those connections, because even if you have a root session open to the database, you cannot kill connections belonging to another user (you are not SUPER in CloudSQL). And because all connections to the database are taken, you cannot:
  • increase the value of the database max_connections flag via the Google Cloud Platform console,
  • restart the instance,
  • or fall back to the failover replica.
Update 2019-05-08: there is a public issue about not being able to restart an instance that reached max_connections: https://issuetracker.google.com/issues/132204816.

Update 2019-05-15: there is a public feature request for enabling failover when the master is unreachable: https://issuetracker.google.com/issues/132662912.

Your only option is to gracefully shut down the application using all the connections (from the client side), to go below 151 and regain control over CloudSQL.
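To get a feel for how much of the application has to be stopped, here is a rough sketch of the arithmetic. It assumes the server refuses new connections once the number of open connections reaches max_connections, so at least one slot must be free before you (or the CloudSQL tooling) can get in; connections_to_close is a name I made up for this example.

```python
def connections_to_close(open_connections, max_connections):
    """How many client connections must go away before a new one fits.

    Assumption: a new connection is accepted only while the number of
    open connections is strictly below max_connections.
    """
    return max(0, open_connections - max_connections + 1)


# With the numbers from this post: 3000 open connections, limit of 151.
print(connections_to_close(3000, 151))  # 2850
```

In other words, almost the whole application has to be shut down, not just a few workers.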

Hopefully, this will be fixed quickly (I have a support case open on this).

Update 2019-05-06: a somewhat related public issue: gcloud says that the instance will be restarted after changing max_connections (https://issuetracker.google.com/issues/132035806).

And hopefully, sharing this will prevent you from hitting the same problems as I did.

2 comments:

  1. Thanks for sharing this JF. This is a very terrible bug indeed!

  2. That is a steenky one... Hope they fix it soon.
