Friday, November 13, 2009

How to install MySQL Enterprise Monitor agents in a failover environment

MySQL Enterprise Monitor is a tool to watch and analyze multiple MySQL environments from a single web based dashboard. More information is available on the MySQL homepage. Each MySQL instance is monitored by a small agent that connects to the MySQL instance and reads statistics that is sent to the MySQL Enterprise Monitor (MEM) Server.
That setup is very easy. But if the MySQL server is in a cluster failover configuration, there are some things to consider when installing the MEM agent:

What do you want?

Do you want to have two entries in the MEM dashboard for both physical servers?
This is good because:
  • You can monitor them separately, you can define different rules for both servers in case they offer different capabilities.
  • You can immediately see, which physical server runs the MySQL instance. The other entry will always report either "MySQL server down" or "MEM agent not reachable".
This is not so good because:
  • You cannot watch the data if a failover occurred. E.g. you can only see graphs for a specific physical host.
  • You get red alarms because the agent on the passive node cannot reach the MySQL instance. This alarm is harmless. But it will train you to ignore red alarms. Not good.
If you like this approach here is the description how to install.

Do you want to see only one entry in the MEM dashboard that displays the data no matter, which physical server is running the instance at the moment?
This is good because:
  • You have continuous data even if a failover occurred during the period. E.g. you can watch the cache hit rate graph of your MySQL server and you will see only a dip where the failover took place.
  • You will not see false events like "MySQL server is down on the passive node". Of course it is down. That's why it is called the passive node ;-)
This is not so good because:
  • It's not so easy to see which physical server currently runs the MySQL instance: In the meta info on the dashboard screen you see the physical hostname that runs the MySQL instance.
  • You need to apply the same rules to both physical servers. This may be a problem if they have different capabilities, e.g. if the backup node is smaller.
If you prefer this approach here is the description how to install.
By the way: I recommend this way, as the disadvantages are really small and the installation is even a little easier. Both procedures work for all common HA frameworks like SunCluster, OpenHA Cluster, Linux Heartbeat, Microsoft Windows Cluster.

Installing MEM agent on a cluster on the logical host

The goal is to have only one entry in the Enterprise Monitor Dashboard that shows the status of the MySQL instance, no matter on which physical server in runs. There are two ways to achieve this:
  • You can install the agent on both physical nodes
  • You can install the agent on a shared storage.
In either case you have to make sure, that only one agent runs at a time. You have to integrate the agent into your cluster framework. I will not describe how this works, as it is highly dependant on your cluster framework.

The following description assumes, that you will install the agent on both physical nodes.

  1. Install the agent but DO NOT START the agent yet.

  2. Edit the [agent-installdir]/etc/mysql-monitor-agent.ini
    In the [mysql-proxy] section add the following line:
    agent-host-id=[logical hostname]
  3. Do steps 1. and 2. for the other cluster node as well.

  4. Include the agent in the cluster's failover group so that it will start automatically on that node, where the MySQL instance is running.

  5. Start the agent via the cluster framework.

  6. The entry in MEM will get the hostname of the first node that runs MySQL. Even after a failover this name will stay. You should change that name to the virtual hostname of the virtual ip adress of your MySQL service. Use the MEM BUI, choose the tab "Settings". From the secondary menu right below the tabs choose "Manage Servers". If you move the mouse over the server name a submenu will appear. Choose "Rename" and enter the name of the virtual IP.

Installing MEM agent in a cluster on the physical hosts

To install the MEM agent in a way that both physical servers are listed in the MEM dashboard, you have to install the agent on both physical nodes. But: Do not start the agent after the installation!
There are three different IDs in MEM: agent-uuid, mysql-uuid and host-id. Usually they are generated automatically and you will never notice these IDs. For more information about the meaning of the different IDs look at this very good explanation from Jonathon Coombes.
The agent stores the uuid and the hostid in a MySQL table called mysql.inventory. After a failover the other agent on the new node will notice "wrong" hostid and uuid entries in the inventory table. The agent will stop and ask you to TRUNCATE mysql.inventory. But with this procedure MEM creates a new instance, so all old data is lost. Not good for a failover environment.
So in case of a failover you have to provide the mysql.inventory table, that the agent expects.
And here is how you can achieve this:
  1. Install the MEM agent on the node that currently runs MySQL. Start the agent.

  2. Make a table that stores the hostid and uuid for every physical host:
    USE mysql;
    CREATE TABLE inventory_hostname LIKE inventory;
    DROP INDEX `PRIMARY` ON inventory_hostname;
    ALTER TABLE inventory_hostname ADD COLUMN (hostname VARCHAR(64));
    ALTER TABLE inventory_hostname ADD PRIMARY KEY (hostname,name);
    INSERT INTO inventory_hostname SELECT *,@@hostname FROM inventory;
    SELECT * FROM inventory_hostname;

    The newly created table should look like this:
    +--------+--------------------------------------+----------+
    | name | value | hostname |
    +--------+--------------------------------------+----------+
    | uuid | 96936e90-56bd-4eb1-aef3-e708d149a4cb | wclus-1 |
    | hostid | mac:{005056a138c10000} | wclus-1 |
    +--------+--------------------------------------+----------+

    (Notice that the hostid is based on the mac address in my case. Usually this is the public ssh host key.)

  3. Stop the agent

  4. Empty the inventory_table:
    TRUNCATE mysql.inventory;

  5. Failover the MySQL instance to the other node.

  6. Install and start the agent on the other node. It will save new values in inventory.

  7. Copy these new values to the inventory_hostname table:
    USE mysql;
    INSERT INTO inventory_hostname SELECT *,@@hostname FROM inventory;
    Both nodes should be visible in MEM dashboard right now.

    With every failover we need to populate the inventory table with the host-specific rows. Easiest way (and independent on the operating system and cluster framework) is to define an init-file:

  8. On both nodes create a file named [MySQL Basedir]/mysql_init_HA_MEM.sql with the following statements:
    USE mysql;
    REPLACE INTO inventory SELECT name,value FROM inventory_hostname WHERE hostname=@@hostname;

  9. On both nodes edit you my.cnf or my.ini file. In the section [mysqld] add the following line:
    init-file=[MySQL Basedir]/mysql_init_HA_MEM.sql
    (If you already have an init-file defined you can add the commands "USE mysql; REPLACE..." to you init-file.)

  10. Start all agents and try to failover the MySQL instance. Check that you init-file really modifies the inventory table.