Wednesday, April 15, 2015

MaxScale Binlog Server HOWTO: Operations (including Chaining)

In the Install and Configure HOWTO, we learned how to install and configure a MaxScale Binlog Server.  In this HOWTO, I will present the common operations that you might need to perform when using this software.  Those operations include:
  • Purging Binary Logs,
  • Chaining Binlog Servers,
  • Saving Binary Log Files in the Non-Default Directory,
  • Downloading Binary Logs other than First,
  • Listing Connected Slaves,
  • Disconnecting one or all Slaves,
  • Differentiating a MaxScale from a MySQL Server,
  • Getting More Information about Slaves (and more),
  • Recovering After a Crash.



Purging Binary Logs


There is no build-in mechanism to purge/expire binary logs from the MaxScale Binlog Router.  You need to write a cron job for that.


Chaining Binlog Servers


Chaining Binlog Server is essential to many scenario presented in the original Binlog Server article.  In the Install and Configure HOWTO, we learn how to configure the following replication topology:
-----     / \     -----
| A | -> / X \ -> | B |
-----    -----    -----
From this, you should be able to build the following topology.  You will need it for the rest of this section, so start by setting it up in your environment.
-----
| A |
-----
  |
  +---------+
  |         |
 / \       / \
/ X \     / Y \
-----     -----
  |         |
-----     -----
| B |     | C |
-----     -----
I will now describe how to go from the topology above to the topology below.
-----
| A |
-----
  |
 / \       / \
/ X \ --> / Y \
-----     -----
  |         |
  |         |
-----     -----
| B |     | C |
-----     -----
It is actually very simple (but there is a trap explained in the next paragraph): once you have both X and Y replicating from A, you only have to edit the configuration file replacing A by X and restart MaxScale on Y.

However, do not try to skip the step of making Y replicating from A.  If you skip that step, you will not be able to bootstrap a Binlog Router.  The reason bootstrapping a Binlog Router replicating from another Binlog Router does not work is that MaxScale does not know how to serve the user list to a client.  Remember that MaxScale needs to authenticate client connections and to do so, it downloads the user list from the master.  As MaxScale does not know how to serve this list, if you try to initialize Y replicating from X, Y will not be able to learn the user list from X and will not be able to accept client connections.

This allows me to introduce the notion of the Binlog Router cache.  You might have realized that there is a .cache directory under /usr/local/mariadb-maxscale/Binlog_Service/.  This cache directory stores all the information the MaxScale Binlog Router needs to serve client without being connected to the master.  Without that cache, if the MaxScale Binlog Router is not able to connect to the master, it cannot serve client as it does not know:
  • the version of the master,
  • the server-id of the master,
  • the UUID of the master,
  • max_allowed_packet from the master,
  • the user list from the master,
  • and many others.
So this cache is very useful, including for chaining: as Y is not able to get the user list from X, it authenticates client connection using its cache.  However, it brings a small problem: if a new user is created on A while Y is replicating from X, Y will not be able to learn the existence of this new user (X will learn its existence as it will regularly refresh its user list, including when an unknown user establishes a connection).  To avoid this problem, Y should be connected back to A regularly to refresh its cache (or when new users are created on the master).

Saving Binary Log Files in the Non-Default Directory


As previously discussed, the binary logs from the master are saved by default in the directory /usr/local/mariadb-maxscale/Binlog_Service.  This directory name correspond to the Binlog Router section in the MaxScale configuration file ([Binlog_Service] in our case) and its location correspond to the MAXSCALE_HOME environment variable set in the /etc/init.d/maxscale script (/usr/local/mariadb-maxscale in our case).

If you do not want to store your binary logs in this directory, you can create a symbolic link pointing to your favorite directory.  Those commands in bash will move the binary log directory in /var/lib/maxscale/my_binlogs/:
sudo service maxscale stop;
sudo mkdir /var/lib/maxscale;
cd /usr/local/mariadb-maxscale/;
sudo mv Binlog_Service /var/lib/maxscale;
sudo mv /var/lib/maxscale/{Binlog_Service,my_binlogs};
sudo ln -s /var/lib/maxscale/my_binlogs Binlog_Service;
sudo service maxscale start;
Even if the symbolic link trick works well, it is not elegant.  A nicer solution is a router parameter, called binlogdir.  By adding the following at the end of the router_options line, we should get the same result as the symbolic link trick (the coma is important):
,binlogdir=/var/lib/maxscale/my_binlogs
However, there is a known bug with binlogdir around the .cache directory (the cache directory is explained above).  This bug is that the MaxScale Binlog Router does not use the parameter binlogdir for creating the cache directory: this directory is always created in the default location like if binlogdir was not set.  This should be fixed in a future version.

Downloading Binary Logs other than First


By default, the MaxScale Binlog Router starts downloading binary logs from file # 1.  However, if this file does not exist on the master or if starting at this file is not suitable (lot of binary logs files and we are in a hurry), the default must be modified.  In the Install and Configure HOWTO, we saw how to use the initialfile router option.  However, an initial file number is not something that should stay in a configuration file (it will not make sense in 6 months from now) and if you want to avoid editing the configuration file after bootstrapping MaxScale, there is a more elegant way.

This other way relies on the startup behavior of the Binlog Router: on initialization, the binary log directory is scanned for the latest binary log file: the resulting file number and size are used to start downloading the binary logs from the master.  By putting the right file in the binary log directory before starting MaxScale, we trick the router to download the file we want.  However, you should not put an empty binary log file there as MaxScale will try downloading this file at position zero from the master, which does not make sense.

A binary log file start by 4 bytes (magic number) and those are never requested from a master.  So the solution consists in creating a 4 bytes file with the right content.  You can do so using the following command:
xxd -r <<< "0000000: fe62 696e" > $your_binlog_file;
To convince you from the validity of this solution, you can run the following on any existing binary log file on the master:
head -c 4 $an_existing_binlog_file | xxd;

Listing Connected Slaves


To get the list of slaves connected on a MaxScale Binlog Server, you can connect to MaxScale using the MySQL client and issue the command "SHOW SLAVE HOSTS;":
# Edit the below line.
maxscale="<hostname or ip address of maxscale>"; 
cmd="SHOW SLAVE HOSTS;";
mysql -h $maxscale -u repl -pslavepass <<< "$cmd";
Note: if 127.0.0.1 or localhost does not work for the variable maxscale above, try the complete IP address of the server as returned by the following command:
maxscale="$(ip a s dev eth0 | grep "inet " |
              awk '{print $2}' | cut -d "/" -f 1)";
or, to be able to user 127.0.0.1 with MaxScale, add the following line after the line router=binlogrouter in the MaxScale configuration file:
localhost_match_wildcard_host=1

Disconnecting one or all Slaves


Disconnecting one, some, or all slaves from a MaxScale Binlog Server might be useful in some situations.  The use-case I have in mind is the following:
  • two MaxScale Binlog Servers are deployed (X and Y),
  • both MaxScale Binlog Servers are replicating from the same master,
  • slaves are connecting to MaxScale using a DNS entry including both IP addresses (X and Y).
This way, if one of X or Y fails, the slaves will reconnect to the other thanks to DNS.  (Note that you can achieve the same result with 2 lines in the /etc/hosts file with different IP addresses but with the same hostname.)

In the situation described above, after a failure (or a restart) of a MaxScale, all the slaves are connected to the same Binlog Server.  This is not a problem in itself but one might want to rebalance the load on the 2 nodes.  To do that, the command "DISCONNECT ALL;" can be issued on a MaxScale using the MySQL client (see the previous section for an example of running a command on a MaxScale using the MySQL client ).

It is also possible to disconnect a single slave running the command "DISCONNECT SERVER <server-id>;".

Note that for MaxScale 1.1.0, there is no GRANT management for the DISCONNECT command.  Anybody able to connect to MaxScale will be able to disconnect slaves.  It should be improved in a next version.

Differentiating a MaxScale Binlog Server from a MySQL Server


When using automation (or a MySQL client), it could be useful to know if you are connected to a MaxScale Binlog Router or a "standard" MySQL Server.  For that, you can issue the command "SHOW VARIABLES LIKE 'MAXSCALE%';".

A MySQL Server does not return any lines to that command but a MaxScale Binlog Router returns a line with the MaxScale version:
> $ mysql -h 127.0.0.1 -u repl -pslavepass \
>     2> /dev/null <<< "SHOW VARIABLES LIKE 'MAXSCALE%';";
> Variable_name   value
> MAXSCALE_VERSION        1.1.0
Note that the Binlog Router is not very flexible in the syntax of this command: you must run exactly "SHOW VARIABLES LIKE 'MAXSCALE%';", all letters mush match without error.  Adding an underscore after MAXSCALE or removing the last E will return an error.

Getting More Information about Slaves (and more)


Sometimes, the list of slaves (SHOW SLAVE HOSTS;) or the download position of MaxScale in the master binary logs (ls -l /usr/local/mariadb-maxscale/Binlog_Service) is not enough.  Then, you must use on the administration interface of MaxScale to get more information.  Running the following command will show you probably more information that you need:
/usr/local/mariadb-maxscale/bin/maxadmin \
   -pmariadb show services;
Replacing "show services" by "help" will give you all the possible commands of the administration interface.  Read the MaxScale documentation for more information.

Using a password on the command line (-pmariadb, the default password for the administration interface) is not very nice.  You can put that password in a .maxadmin file in your home directory:
printf "passwd=mariadb" > ~/.maxadmin;
/usr/local/mariadb-maxscale/bin/maxadmin show services;

Recovering After a Crash


As presented in the Install and Configure HOWTO, the MaxScale Binlog Router is not crash-safe.  On startup, the module scans the binary log directory and starts downloading binary log from the master at the most recent file and at the position corresponding to the size of that file.  If the size of the file is not align to a binary log event (partial write before a crash), things will go wrong.

There are 3 solutions to that, listed below in order if increasing complexity.  One of them should to be applied after a crash (a crash is easy to detect, see further down):
  1. Delete the latest binary log file,
  2. Truncate the latest binary log file to 4 bytes,
  3. Truncate the latest binary log file at the position of the last complete event.
With solution # 1, the Binlog Router will try downloading the binary log from the master at the size of the the previous binary log file.  This will lead to the creation of the next file.  The drawback of this solution is that if MaxScale is crashing often, we might drop all our binary logs one after the other, which would be bad.

The solution # 2 is the one I am using at the moment.  As presented above, the 4 first bytes of a binary log file is a magic number identifying the file as a MySQL binary log.  By truncating the file at this position, you are guaranty to be aligned with an event: the next restart of MaxScale will redownload this file from the beginning.  But if you have large binary log files, this might waist bandwidth (which I am willing to accept at this time for simplicity).

The solution # 3 is the smartest: using mysqlbinlog, one can find the ending position of the last complete event in a binary log file.  Truncating the file at this position will minimize bandwidth usage.

But how can we know that MaxScale crashed ?  Easy: when MaxScale starts, it creates a sub-directory in /usr/local/mariadb-maxscale/data/, and it deletes that directory when it stops cleanly.  So if /usr/local/mariadb-maxscale/data/ is not empty, it means that MaxScale was not cleanly stopped and that Binlog Router crash recovery must be applied.

The Binlog Router crash recovery is not yet included in the current implementation: a startup script must take care of that.  This might make its way in a future release of the Binlog Router.



We are now done with presenting the common operations on a MaxScale Binlog Server.  I hope this helps.

Known Issues:

  • The Binlog Router does not serve GRANTs: bug number to come soon... (until implemented, connect your chained Binlog Router with the upstream master regularly),
  • Parameter binlogdir does not work with cache: bug number to come soon... (until fixed, use the symbolic link trick),
  • The Binlog Router does not handle zero size binlog file: bug number to come soon... (until implemented, use the 4 bytes file creation trick with xxd),
  • Manage Grants for DISCONNECT command: bug number to come soon...
  • The Binlog Router is not crash-safe:  bug number to come soon... (until implemented, do manual crash recovery as described above).

2 comments:

  1. Great post.
    Until now, there is no mechanism of maxscale allow auto purge or expire binary logs.
    As you said above, we need to write a cron job for this task?
    Can you explain more detail? Do you just need to "find" for old binlogs and "rm" them or anything we need to care?

    ReplyDelete
    Replies
    1. A find/rm would work. You can also find/gzip for a start and then find/rm. A more complex solution would be to save to HDFS or S3.

      Delete