Mablomy

Finding Tables without Primary Key

2019-05-04T10:15:00.000+01:00

Having a primary key defined for each user table is best practice for performance in InnoDB. And when using Group Replication or InnoDB Cluster for automatic high availability, it is (almost) mandatory. (For the full story see here.) So it is wise to check whether you have tables without a primary key. You can identify these tables by running the following query against information_schema:

SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.key_column_usage AS c
  ON (t.table_name = c.table_name AND
      c.constraint_schema = t.table_schema AND
      c.constraint_name = 'PRIMARY')
WHERE t.table_schema NOT IN ('mysql', 'information_schema',
                             'performance_schema', 'sys')
  AND c.constraint_name IS NULL
  AND t.table_type = 'BASE TABLE';

And if you want to make life even easier, you can add this as a report to the sys schema:

CREATE VIEW sys.schema_tables_without_pk AS
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.key_column_usage AS c
  ON (t.table_name = c.table_name AND
      c.constraint_schema = t.table_schema AND
      c.constraint_name = 'PRIMARY')
WHERE t.table_schema NOT IN ('mysql', 'information_schema',
                             'performance_schema', 'sys')
  AND c.constraint_name IS NULL
  AND t.table_type = 'BASE TABLE';

Such tables are easy to detect but a little more challenging to fix. You need to consider your application and the potential load when adding a new primary key. One solution is to add a new auto_increment column. In many cases this already helps. In other cases you already have a natural primary key in your table definition; it is just not defined as such.

To add auto_increment columns to all affected tables (which I do not recommend without thinking about it and testing first!), you can use the beauty of the Python mode in MySQL Shell:

$ mysqlsh root@localhost:33060 --py

MySQL Shell 8.0.16

Oracle is a registered trademark of Oracle Corporation and/or its affiliates.

Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.

Creating a session to 'root@localhost:39010'

Fetching schema names for autocompletion... Press ^C to stop.

Your MySQL connection id is 762 (X protocol)

Server version: 8.0.16-commercial MySQL Enterprise Server - Commercial

No default schema selected; type \use <schema> to set one.

MySQL localhost:33060+ ssl Py > l = session.get_schema("sys").get_table("schema_tables_without_pk").select().execute().fetch_all()

MySQL localhost:33060+ ssl Py > for val in l: session.sql("ALTER TABLE `"+val[0]+"`.`"+val[1]+"` ADD COLUMN (__id int unsigned auto_increment PRIMARY KEY)").execute()
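
Since blindly altering every table is risky, it can help to print the statements first and review them. The following is a minimal sketch in plain Python that needs no server connection. The helper names quote_identifier and add_pk_statement and the sample rows are hypothetical; only the ALTER TABLE text is taken from the loop above.

```python
def quote_identifier(name):
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def add_pk_statement(schema, table, column="__id"):
    """Build the ALTER TABLE statement that adds a surrogate primary key."""
    return (
        "ALTER TABLE {}.{} ADD COLUMN ({} INT UNSIGNED AUTO_INCREMENT PRIMARY KEY)"
        .format(quote_identifier(schema), quote_identifier(table),
                quote_identifier(column))
    )


# Hypothetical rows, shaped like the result of sys.schema_tables_without_pk.
rows = [("shop", "order log"), ("test", "t1")]
for schema, table in rows:
    print(add_pk_statement(schema, table))
```

Quoting the identifiers keeps the statement valid for schema or table names that contain spaces or other special characters.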

TTL - Perfect Accuracy by using an insertable VIEW

2019-03-09T07:29:00.000+00:00

One more comment regarding TTL in MySQL:
If you are looking for perfect accuracy and never want to access rows that are older than the defined TTL you can hide the table t (from my previous post) behind a view. This view will automatically select only rows within TTL lifespan:
CREATE VIEW ttl as SELECT id, content, created_at FROM t
WHERE created_at >= NOW() - INTERVAL 10 SECOND;
This view is insertable, so you can fully use this view and you are not distracted by the additional column "bucket".
INSERT INTO ttl VALUES (NULL, "This is a test", NULL);
You could even exclude the column "created_at" from the view definition if it were not for bug #94550; 'created_at' could then be handled entirely internally.
This view does not affect performance much. In my simple test it did not show any effect. You just get better usability and better accuracy of the TTL.

Limitations

You cannot use foreign keys with your ttl'ed table and view. This is because partitioning and foreign keys are mutually exclusive. If you need foreign keys go with the simple delete event procedure and forget about the view.
Due to bug #94550 you have to set explicit_defaults_for_timestamp to OFF and you always have to insert NULL into column 'created_at'.
In this whole setup the TTL is mentioned in four locations: In the partitioning definition, in the definition of the generated column 'bucket', in the cleaning event procedure and in the WHERE clause of the view. This makes it easier to screw up the setup. Make sure you use the same value everywhere. Same applies for the number of partitions in the table definition as well as the cleaning event procedure.

TTL - Time-to-Live in MySQL

2019-03-08T10:17:00.002+00:00

A customer recently asked for a TTL feature in MySQL. The idea is to automatically delete rows from a certain table after a defined lifespan, e.g. 60 seconds. This feature is common in many NoSQL databases, but it is not available in MySQL. However, MySQL offers all you need to implement it. And thanks to partitioning, this can be much more efficient than simply deleting rows. Let's test it.

tl;dr

Partition the table and truncate partitions in a regular event procedure. That does the trick and comes at a fraction of the cost of regularly deleting rows.

The test case

The table needs a column to keep track of row age. This can be either a "created_at" column or an "expires_at" column. ("expires_at" has the additional advantage that each row can have an individual lifespan. Not possible in many NoSQL solutions.)
So my table is
CREATE TABLE `t` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`content` varchar(42) DEFAULT NULL
);
I tested two variants to implement a 10 seconds TTL on my table "t":

The simple solution

Run an event every 10 seconds to delete rows that have been created more than 10s ago.
DELIMITER |
CREATE EVENT ttl_delete
        ON SCHEDULE EVERY 10 SECOND STARTS '2019-03-04 16:00:00' DISABLE
        DO BEGIN
                DELETE FROM t WHERE created_at < NOW() - INTERVAL 10 SECOND;
        END |
DELIMITER ;
An index on "created_at" might improve performance of the DELETE job. But in any case it is quite expensive to scan the table and remove roughly 50% of its rows, at least if the INSERT rate is high.

The efficient solution

Instead of DELETE we can use the much faster TRUNCATE operation. Obviously we do not want to TRUNCATE the whole table, but if we distribute the inserted rows into partitions it is safe to truncate any partition that contains only outdated rows. Let's define three partitions (or buckets): one that is currently being written to, one that holds the rows of the last 10 seconds and one that can be truncated because its rows are older than 10 seconds. The key is to calculate the bucket from the current time. This can be done with the expression FLOOR(TO_SECONDS(NOW())/10) % 3, or more generically FLOOR(TO_SECONDS(NOW())/ttl) % number_of_buckets.
Now we can partition the table by this expression. For that we add a generated column to calculate the bucket from the column "created_at" and partition the table by column "bucket". The table now looks like this:
CREATE TABLE `t` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`content` varchar(42) DEFAULT NULL,
`bucket` tinyint(4) GENERATED ALWAYS AS
       (floor(TO_SECONDS(`created_at`) / 10) % 3) STORED NOT NULL,
PRIMARY KEY (`id`,`bucket`)
) PARTITION BY LIST (`bucket`)
(PARTITION p0 VALUES IN (0),
PARTITION p1 VALUES IN (1),
PARTITION p2 VALUES IN (2));
And the event procedure is like this:
DELIMITER |
CREATE EVENT ttl_truncate
    ON SCHEDULE EVERY 10 SECOND STARTS '2019-03-04 16:00:00' DISABLE
    DO BEGIN
            CASE FLOOR(TO_SECONDS(NOW())/10)%3
            WHEN 0 THEN ALTER TABLE test.t TRUNCATE PARTITION p1;
            WHEN 1 THEN ALTER TABLE test.t TRUNCATE PARTITION p2;
            WHEN 2 THEN ALTER TABLE test.t TRUNCATE PARTITION p0;
            END CASE;
        END|
DELIMITER ;

Watching the rows come and go

To verify that the procedure works as expected I created a small monitor procedure that displays each second the number of rows per partition. Then it is easy to follow in which partition data is currently added and when a partition gets truncated.
DELIMITER |
CREATE PROCEDURE monitor()
BEGIN
WHILE 1=1 DO
   SELECT "p0" AS "part", count(*) FROM t PARTITION (p0)
         UNION SELECT "p1", count(*) FROM t PARTITION (p1)
           UNION SELECT "p2", count(*) FROM t PARTITION (p2);
   SELECT now() AS "NOW", floor(to_seconds(now())/10)%3 AS "Bucket";
   SELECT sleep(1);
END WHILE;
END|
DELIMITER ;
This procedure is not ideal. Too many count(*) will create quite some locking. But it is accurate. The alternative is to read data from INFORMATION_SCHEMA.partitions, but this does not give the exact row count, which I needed for verification.

Increasing Accuracy

If TTL is 10 seconds, deleting or truncating every 10 seconds means you have at least 10 seconds of rows available. In reality you will have 10 to 20 seconds worth of data, so on average 15 seconds (assuming a constant INSERT rate). If you run the cleaner job more often (say once per second) the average number of rows is 10.5 seconds worth of data. This comes at the cost of running the cleaning event more often. But it might be very beneficial to increase this accuracy because all other queries benefit from less data to operate on and less memory consumed by expired rows.
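The arithmetic behind these numbers is short enough to write down. A minimal sketch in Python, assuming a constant INSERT rate as stated above; the function name is hypothetical.

```python
def retained_seconds(ttl, interval):
    """Return (min, max, average) seconds of data kept in the table."""
    minimum = ttl             # right after a cleaning run
    maximum = ttl + interval  # right before the next cleaning run
    return minimum, maximum, (minimum + maximum) / 2


print(retained_seconds(10, 10))  # cleaner runs every 10 seconds
print(retained_seconds(10, 1))   # cleaner runs every second
```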
If you go with the simple solution of a regular DELETE event, it is sufficient to schedule the event more often.
If you prefer the TRUNCATE PARTITION option, it is necessary to increase the number of partitions or buckets to 12 (= 2 + TTL / how often to run the cleaning job).
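The expression for the calculated bucket column will be
`bucket` tinyint(4) GENERATED ALWAYS AS
       (floor(TO_SECONDS(`created_at`) / interval) % #buckets) STORED NOT NULL
where interval is the number of seconds between two runs of the cleaning job (in the setup above it is the same as the TTL of 10 seconds). The partitioning needs to be adapted as well.
And the CASE construct in the cleaner event must be extended with one branch for each bucket/partition, using the same divisor and wrapping around from the last partition to p0:
        WHEN n THEN ALTER TABLE test.t TRUNCATE PARTITION p(n+1);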
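Writing twelve CASE branches by hand invites typos, so generating them is an option. This is a minimal sketch in Python; the function names are hypothetical, the table name test.t is taken from the event above, and the wrap-around from the last partition to p0 is made explicit with a modulo.

```python
def buckets_needed(ttl, interval):
    """2 + TTL / cleaning interval, both given in whole seconds."""
    if ttl % interval != 0:
        raise ValueError("interval is assumed to divide ttl evenly")
    return 2 + ttl // interval


def case_branches(buckets, table="test.t"):
    """Render one WHEN branch per bucket; the last bucket wraps around to p0."""
    return [
        "WHEN {} THEN ALTER TABLE {} TRUNCATE PARTITION p{};".format(
            n, table, (n + 1) % buckets)
        for n in range(buckets)
    ]


for line in case_branches(buckets_needed(ttl=10, interval=1)):
    print(line)
```

For three buckets this reproduces exactly the three branches of the ttl_truncate event shown earlier.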

What happens if...

... the event stops?
Then you keep all your rows which will likely create some follow-up problems. As always: Proper monitoring is key. Think about MySQL Enterprise Monitor for example.

... the event procedure runs at inaccurate timing due to overall system load?
No big problem. It will never run too early. So it will never remove rows too early. If it runs too late it will clean rows too late so you have more garbage in your table which might affect other queries. The real TTL is increased if this happens.

Performance Considerations

By no means am I able to run proper performance tests. I am running on a Win10 laptop with VirtualBox and Oracle Linux, and MySQL runs inside a Docker container. So there are plenty of reasons for bad numbers. But it should be sufficient to compare the two implementations.
I have extended the cleaner events to report the time needed to execute the event procedure. Here is the example of the simple cleaner job:
DELIMITER |
CREATE EVENT ttl_delete
ON SCHEDULE EVERY 10 SECOND STARTS '2019-03-04 16:00:00' DISABLE
DO BEGIN
    DECLARE t1,t2 TIME(6);
    SET t1=current_time(6);
    DELETE FROM t WHERE created_at < NOW() - INTERVAL 10 SECOND;
    SET t2=current_time(6);
    INSERT INTO ttl_report VALUES ("DELETE simple", now(),
                                   timediff(t2,t1));
END|
DELIMITER ;
The load was generated by mysqlslap, which only inserted rows in the table. Each test run starts the respective cleaner event, runs the mysqlslap load and then stops the cleaner event.
mysql -h 127.0.0.1 -uroot -pXXX -e \
        "USE test; ALTER event ttl_delete ENABLE;"

mysqlslap -h 127.0.0.1 -uroot -pXXX --create-schema=test \
   --concurrency=5 --iterations=20 --number-of-queries=10000 \
   --query="INSERT INTO test.t (created_at, content) VALUES
            (NULL,md5(id));"

mysql -h 127.0.0.1 -uroot -pXXX -e \
    "USE test; ALTER event ttl_delete DISABLE;"

mysql -h 127.0.0.1 -uroot -pXXX -e \
      "USE test; ALTER event ttl_truncate ENABLE;"

mysqlslap -h 127.0.0.1 -uroot -pXXX --create-schema=test \
   --concurrency=5 --iterations=20 --number-of-queries=10000 \
   --query="INSERT INTO test.t (created_at, content) VALUES
            (NULL,md5(id));"

mysql -h 127.0.0.1 -uroot -pXXX -e \
    "USE test; ALTER event ttl_truncate DISABLE;"

The results are clearly in favor of truncating partitions. And the difference should be even higher the higher the INSERT rate gets. My poor setup achieved only less than 1000 inserts per second...

select who, avg(how_long) from ttl_report GROUP BY who;
+---------------+--------------------+
| who           |avg(how_long)       |
+---------------+--------------------+
| DELETE simple | 1.1980474444444444 |
| truncate      | 0.0400163333333333 |
+---------------+--------------------+
3 rows in set (0.0014 sec)
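To put the two averages into relation, here is a quick calculation in Python. The values are copied from the result above; the unit cancels out in the ratio.

```python
avg_delete = 1.1980474444444444    # "DELETE simple" row
avg_truncate = 0.0400163333333333  # "truncate" row

ratio = avg_delete / avg_truncate
print("DELETE takes about {:.0f}x as long as TRUNCATE PARTITION".format(ratio))
```

So in this (poor) setup the DELETE event needed roughly 30 times as long per run as the TRUNCATE PARTITION event.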

Side note

You might wonder why my test load is
INSERT INTO test.t (created_at, content) VALUES (NULL,'foo');
Why do I mention the column "created_at" but then store NULL to give it the default of current_timestamp? If I omit the created_at column in this INSERT statement I get an error from the generated column due to bug #94550. Setting explicit_defaults_for_timestamp to OFF and then mentioning the timestamp column during INSERT is a workaround.

Node.js and MySQL on the Oracle Cloud

2017-07-28T14:30:00.001+01:00

Let's explore how to deploy a node.js app with MySQL backend on the Oracle Cloud. I want to cover several aspects:

How to deploy and initialize a MySQL instance
How to adapt the source code
How to deploy the application
How to scale the application

There are different ways to configure this. I tested the easiest deployment with MySQL Cloud Service and the Application Container Cloud for the node.js part. All configurations are done via the cloud web GUI. There is also a REST interface available. But let's keep that for later.
If you don't have access to the Oracle Cloud you can get trial access here.

How to deploy a MySQL instance

Once you have logged in to the Oracle Cloud you can create new instances from the dashboard. The following screenshots describe the process.

On the next screen we upload the public key for admin access to the instance. Either upload your own public key or generate a new key pair. (If you generate a new key pair you need to download the private key to your local machine.)

I skipped the backup and monitoring configurations for this demo. Let's focus on the application instead. After creating the instance (approx. 10 min) you can navigate via the dashboard to this instance and get the IP address. This is needed for the next step.
To initialize the database I ran this little script, which connects to the instance via ssh (using the private key), switches to the user "oracle" and then calls the MySQL CLI to run a few SQL statements.

How to adapt the source code

The Application Container Cloud sets a few environment variables that should be used inside the application to adapt to the environment. In my case these are the following variables:

PORT is the port number that the application should listen on
MYSQLCS_USER_NAME is the MySQL user name for the database backend
MYSQLCS_USER_PASSWORD is the corresponding password
MYSQLCS_CONNECT_STRING is the hostname and port of the database backend
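
Reading these four variables follows the same pattern in any language. Here is a minimal sketch in Python for illustration only (the application itself is written in Node.js). The function name, the fallback defaults and the assumed "host:port" layout of the connect string, optionally followed by "/schema", are assumptions and not taken from the platform documentation.

```python
import os


def database_settings(env=os.environ):
    """Collect listener port and MySQL connection settings from the environment."""
    # Assumption: the connect string looks like "host:port", optionally
    # followed by "/schema"; the defaults are meant for local testing only.
    connect = env.get("MYSQLCS_CONNECT_STRING", "127.0.0.1:3306")
    host_port, _, schema = connect.partition("/")
    host, _, port = host_port.partition(":")
    return {
        "listen_port": int(env.get("PORT", "8080")),
        "user": env.get("MYSQLCS_USER_NAME", "root"),
        "password": env.get("MYSQLCS_USER_PASSWORD", ""),
        "host": host,
        "port": int(port or 3306),
        "schema": schema or None,
    }


# Hypothetical values, as the platform might provide them.
example = {
    "PORT": "8089",
    "MYSQLCS_USER_NAME": "anota",
    "MYSQLCS_USER_PASSWORD": "secret",
    "MYSQLCS_CONNECT_STRING": "db-host:3306/anota",
}
print(database_settings(example))
```

Falling back to local defaults keeps the same code runnable on a developer machine where the variables are not set.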

I could have hardcoded the database connection parameters but that is inflexible if the architecture changes. So let's use these variables. The Node.js code looks like this:

How to deploy the application

Two simple steps are needed: creating an application and defining service bindings. In my case the only service binding is the MySQL backend. But one step after the other. Let's create the application first. For that you need a manifest.json file that describes the application. Here is mine:

Ideally you create a zip archive with the source code, the manifest.json file and all other resources that your application needs. If you want to use my zip archive, feel free. You find it on GitHub.

From the Oracle Cloud Dashboard click on "create instance -> application container" and then select "Create Instance" and "Node.js". (Java SE, Java EE, Python, Ruby and PHP are available as well.)

On the pop-up you define the application artifacts, number of application instances and the memory per instance. After you click "create" the application is deployed automatically within a few minutes.
The last step is to connect the application service with the database backend. To achieve that, click on the application in the application overview page. Here you find the URL under which your application will be available. And on the left hand side you see three tabs:

Overview, Deployments and Administration. Click on "Deployments". Here you can add the service binding as described in the following screenshot:

After modifying the service bindings you have to click "Apply changes". This will restart the application instances. Obviously needed because now the environment variables for the database backend are set correctly.
That's it. We have an application. The URL to access the new app is listed in the application overview tab. Because this URL is not so nice for offering a short url service, I registered a new domain and forwarded that to the anota application. Maybe it is still running? Check here.

How to scale the application

This is really easy. On the application overview tab you can just increase the number of instances and the memory per instance. After applying the changes, the Application Container Cloud platform will deploy new instances, stop spare instances or reconfigure the existing instances. If you use my ANOTA application, go to the report page. The last line prints the hostname of the application server. Requests are automatically load balanced between the available application instances.

Summary

Only minor changes to the application are needed to run it on the Oracle Cloud Platform: reading the port and the database connection parameters from the provided environment variables. That's it. Deployment is really easy via the GUI. And scaling is simple now that the full Oracle Cloud Platform is available and can be provisioned within minutes.

MySQL Shell - Easy scripting

2017-06-01T09:13:00.000+01:00

With the introduction of MySQL InnoDB Cluster we also got the MySQL Shell (mysqlsh) interface. The shell offers scripting in JavaScript (default), SQL or Python. This offers a lot more options for writing scripts on MySQL. For example, it is much easier now to use multiple server connections in a single script.
A customer recently asked for a way to compare the transaction sets between servers. That is useful when setting up replication or identifying the server that has most transactions applied already. So I wrote this little script which can be executed from the OS shell:

#!/usr/bin/mysqlsh -f
// it is important to connect to the X protocol port,
// usually it is the traditional port + "0"
//
var serverA="root:root@localhost:40010"
var serverB="root:root@localhost:50010"
shell.connect(serverA)
var gtidA=session.sql("SELECT @@global.gtid_executed").execute().fetchOne()[0]
shell.connect(serverB)
var gtidB=session.sql("SELECT @@global.gtid_executed").execute().fetchOne()[0]
//
// If you want to use pure X DevAPI the former statements should be
//
// gtid = session.getSchema("performance_schema").global_variables.select("VARIABLE_VALUE").where("VARIABLE_NAME='gtid_executed'").execute().fetchOne()[0]
//
println(" ")
println ("Transactions that exist only on "+serverA)
println (session.sql("SELECT gtid_subtract('"+gtidA+"','"+gtidB+"')").execute().fetchOne()[0])
println(" ")
println ("Transactions that exist only on "+serverB)
println (session.sql("SELECT gtid_subtract('"+gtidB+"','"+gtidA+"')").execute().fetchOne()[0])
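
To illustrate what gtid_subtract() computes, here is a toy model in plain Python. It is a sketch under simplifying assumptions: it only handles the classic "uuid:first-last:..." notation, it expands every interval into a set (fine for small examples, not for production-size GTID sets), and the function names are hypothetical. The server-side function remains the authoritative implementation.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:3' into {uuid: set of transaction numbers}."""
    parsed = {}
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = parsed.setdefault(uuid.lower(), set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            numbers.update(range(int(first), int(last or first) + 1))
    return parsed


def gtid_subtract(set_a, set_b):
    """Return the transactions in set_a that are missing from set_b."""
    a, b = parse_gtid_set(set_a), parse_gtid_set(set_b)
    difference = {}
    for uuid, numbers in a.items():
        remaining = numbers - b.get(uuid, set())
        if remaining:
            difference[uuid] = sorted(remaining)
    return difference


# Made-up example: server A has applied three transactions more than server B.
server_a = "3e11fa47-71ca-11e1-9e33-c80aa9429562:1-10"
server_b = "3e11fa47-71ca-11e1-9e33-c80aa9429562:1-7"
print(gtid_subtract(server_a, server_b))
print(gtid_subtract(server_b, server_a))
```

An empty result in one direction means that server has nothing the other one is missing.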

Query Rewrite Plugin and Binlog for Replication

2016-04-13T14:57:00.000+01:00

Starting with MySQL 5.7 we introduced the Query Rewrite Plugin. That tool is really useful for changing queries. Of course the best location to modify the query is the source code of the application, but this is not always possible. Either the application is not under your control or queries are generated from a framework like Hibernate and sometimes it is hard to change the query generation.
If you are interested in details about the Query Rewrite Plugin, I recommend this blog post from MySQL Engineering: http://mysqlserverteam.com/the-query-rewrite-plugins/
Recently I was asked how this works in replication environments. Which query goes into the binlog?

If you are using the Rewriter plugin that comes with MySQL 5.7, the answer is easy: This plugin only supports rewriting SELECT queries. SELECT queries don't get into the binlog at all. Simple.

But you might write your own preparse or postparse plugin. In that case you can define the behavior with the server option --log-raw. See documentation here: https://dev.mysql.com/doc/refman/5.7/en/server-options.html#option_mysqld_log-raw
You can either bring the original query to the binlog or the rewritten query. So all flexibility you need. However be aware that --log-raw also affects logging of passwords in the general log file. With --log-raw passwords are written in plain text to the log files. So consider this side effect when switching --log-raw on or off.

MySQL 5.7: Optimizer finds best index by expression

2016-04-04T08:42:00.003+01:00

The optimizer in MySQL 5.7 leverages generated columns. Generated columns will physically store data in two cases: Either the column is defined as STORED or you create an index on a virtual column. The optimizer will leverage such an index automatically if it encounters the same expression in a statement. Let's see an example:

mysql> DESC squares;
+-------+------------------+------+-----+---------+-------+
| Field | Type             | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+-------+
| dx    | int(10) unsigned | YES |     | NULL    |       |
| dy    | int(10) unsigned | YES |     | NULL    |       |
+-------+------------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

mysql> SELECT COUNT(*) FROM squares;
+----------+
| COUNT(*) |
+----------+
| 2097152 |
+----------+
1 row in set (0.77 sec)

We have a large table with 2 million rows. Selecting rows by the surface area of squares can hardly leverage an index on dx or dy:

mysql> EXPLAIN SELECT * FROM squares WHERE dx*dy=221\G
*************************** 1. row ***************************
           id: 1
select_type: SIMPLE
        table: squares
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2092860
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)
Now let's add an index over a generated, virtual column that defines the area:

mysql> ALTER TABLE squares ADD COLUMN (area INT AS (dx*dy));
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> ALTER TABLE squares ADD INDEX (area);
Query OK, 0 rows affected (5.24 sec)
Records: 0 Duplicates: 0 Warnings: 0

Now we can run the query again:

mysql> EXPLAIN SELECT * FROM squares WHERE dx*dy=221\G
*************************** 1. row ***************************
           id: 1
select_type: SIMPLE
        table: squares
   partitions: NULL
         type: ref
possible_keys: area
          key: area
      key_len: 5
          ref: const
         rows: 18682
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)
I did not change the query! The WHERE condition is still dx*dy. Nevertheless the optimizer finds the generated column, sees the index and decides to leverage that.
So you can add complex indexes and without changing the application code you can benefit from these indexes. That makes life much easier.

One limitation though: It seems the optimizer recognizes expressions only in the WHERE clause. It will not use the generated column and index for the SELECT expression:

mysql> EXPLAIN SELECT SUM(dx*dy) FROM squares\G
*************************** 1. row ***************************
           id: 1
select_type: SIMPLE
        table: squares
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2092860
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

mysql> EXPLAIN SELECT SUM(area) FROM squares\G
*************************** 1. row ***************************
           id: 1
select_type: SIMPLE
        table: squares
   partitions: NULL
         type: index
possible_keys: NULL
          key: area
      key_len: 5
          ref: NULL
         rows: 2092860
     filtered: 100.00
        Extra: Using index
1 row in set, 1 warning (0.00 sec)

CHECK constraint for MySQL - NOT NULL on generated columns

2016-04-04T07:59:00.004+01:00

Update: Starting with MySQL 8.0.16 we do have CHECK constraints implemented in SQL! See here.

During our recent TechTour event the idea came up to implement JSON document validation not necessarily via foreign keys (as I have shown here) but to define the generated column as NOT NULL. The generation expression must be defined in a way that it returns NULL for invalid data.
DISCLAIMER: This has already been explored by yoku0825 in his blogpost. He deserves all credit!

Let's do a short test:

mysql> CREATE TABLE checker (
    i int,
    i_must_be_between_7_and_12 BOOLEAN
         AS (IF(i BETWEEN 7 AND 12, true, NULL))
         VIRTUAL NOT NULL);
Query OK, 0 rows affected (0.04 sec)

mysql> INSERT INTO checker (i) VALUES (11);
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO checker (i) VALUES (12);
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO checker (i) VALUES (13);
ERROR 1048 (23000): Column 'i_must_be_between_7_and_12' cannot be null

As you can see I used the column name to create a meaningful error message when inserting invalid data. It is perfectly possible to add a generated validation column for each data column so that you run several check constraints.
Or you can even check a combination of columns:

mysql> CREATE TABLE squares (
     dx DOUBLE,
     dy DOUBLE,
     area_must_be_larger_than_10 BOOLEAN
           AS (IF(dx*dy>10.0,true,NULL)) NOT NULL);
Query OK, 0 rows affected (0.05 sec)

mysql> INSERT INTO squares (dx,dy) VALUES (7,4);
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO squares (dx,dy) VALUES (2,4);
ERROR 1048 (23000): Column 'area_must_be_larger_than_10' cannot be null

As generated columns are virtual by default this costs no extra storage. Data volume is the same. The expression is evaluated when inserting or updating data.
If you add a validation column to an already existing table and want to verify all existing rows, you could define the validation column as STORED (instead of the default VIRTUAL). This will fail if there are any invalid rows in your existing data set. However, in normal operation a virtual column seems more appropriate for performance reasons. So I recommend always using VIRTUAL validation columns and checking pre-existing data separately with a small procedure.

Looking for the smallest possible MySQL Footprint

2016-02-02T12:21:00.002+00:00

UPDATE: Starting with MySQL 8.0.16 we have introduced the new minimal tar ball distribution. Take a look here.

MySQL is known and famous for its simplicity and small size, especially compared to other RDBMSs. But what if you want to deploy on tiny hardware? I mean something even smaller than a Raspberry Pi?

I tested three steps to make the MySQL footprint as small as possible. All my tests were compiled for Oracle Linux 7 on x64 platform. I did not test any ARM cross compile. And these are the steps:

Compile my own binary
Remove all unnecessary tools/files
Strip symbol information from binary

Let’s take a closer look at the three steps.

Compile my own binary

MySQL is available as a source release. Using that you can configure the make process. That is documented pretty well in the Reference Manual. By switching off some options I was able to reduce the binary size from 240MB to 216MB. I switched off some performance_schema features, removed some storage engines that are irrelevant in most environments anyway (like ARCHIVE, NDB, EXAMPLE, …) and I removed all options for profiling. The final CMAKE statement is at the bottom of this post.

Remove unnecessary tools

I removed scripts and binaries from the distribution. Ted has written an interesting blog post about this. The remaining share directory contains some SQL scripts for installing additional tools. You need these at most once during setup and never again. So let’s remove these. If you are happy to live without textual error messages you can also remove the errmsg-utf8.txt file and all translations in the country-specific subdirectories. And if you can live with reduced charset support, you can even remove the rest of the share directory. You are then running essentially with only a mysqld binary.

Strip symbol information from binary

All compilations are done with extended diagnostic information in the binary. This symbol data helps if you want to analyze a core dump, for example. Symbols are included by default in the MySQL binaries, and they take a surprisingly large amount of space. You can remove these symbols from the binary with the tool strip(1). After stripping, the binary size came down to 24MB, which is only 10% of the initial size.

More ideas

There are some more options to use either system libraries or the libraries that come with the source code. Using existing libraries from the system might help save a few bytes.

Summary

It is possible to make MySQL very lean for your (embedded) system. Despite all the functionality that we added to MySQL in the releases since MySQL 5.1 you get a full featured RDBMS with only a handful of MB. Here are my final results:

MySQL 5.6, minimal features: 79MB, stripped 13MB
MySQL 5.7, default features: 240MB, stripped 24MB
MySQL 5.7, minimal features: 216MB, stripped 24MB (removing features brings minimal savings only)

Addendum

This is the CMAKE statement I used to compile MySQL 5.7 on Oracle Linux 7:

cmake . -DCMAKE_INSTALL_PREFIX=/home/testy/TQ/dist-mysql-5.7.10/        \
        -DDOWNLOAD_BOOST=1                                              \
        -DWITH_BOOST=/home/testy/TQ/boost/                              \
        -DDISABLE_PSI_COND=1   \
        -DDISABLE_PSI_FILE=1   \
        -DDISABLE_PSI_IDLE=1   \
        -DDISABLE_PSI_MEMORY=1 \
        -DDISABLE_PSI_METADATA=1 \
        -DDISABLE_PSI_MUTEX=1 \
        -DDISABLE_PSI_RWLOCK=1 \
        -DDISABLE_PSI_SOCKET=1 \
        -DDISABLE_PSI_SP=1     \
        -DDISABLE_PSI_STAGE=1 \
        -DDISABLE_PSI_STATEMENT=1 \
        -DDISABLE_PSI_STATEMENT_DIGEST=1    \
        -DDISABLE_PSI_TABLE=1 \
-DWITH_ARCHIVE_STORAGE_ENGINE=0 \
-DWITH_BLACKHOLE_STORAGE_ENGINE=0 \
-DWITH_EXAMPLE_STORAGE_ENGINE=0 \
-DWITH_FEDERATED_STORAGE_ENGINE=0 \
-DWITH_PARTITION_STORAGE_ENGINE=0 \
-DWITH_PERFSCHEMA_STORAGE_ENGINE=0 \
-DENABLED_PROFILING=0 \
-DENABLE_DEBUG_SYNC=0 \
-DENABLE_DTRACE=0 \
-DENABLE_GCOV=0 \
-DENABLE_GPROF=0 \
-DOPTIMIZER_TRACE=0 \
-DWITH_CLIENT_PROTOCOL_TRACING=0 \
-DWITH_DEBUG=0 \
-DWITH_INNODB_EXTRA_DEBUG=0

JSON memory consumption

2015-11-26T14:21:00.000+00:00

I got some more questions on the new JSON data type and functions during our TechTours. I would like to summarize the answers in this blog post.

Memory consumption

The binary format of the JSON data type should consume more memory. But how much? I did a little test by comparing a freshly loaded 25,000 row dataset stored as JSON and stored as TEXT. Each JSON document has seven top-level attributes. The average JSON_DEPTH is 5.9. Let's see:

mysql> DESC data_as_text;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id    | int(11) | NO   | PRI | NULL    |       |
| doc   | text    | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)

mysql> SELECT COUNT(*),AVG(JSON_LENGTH(doc)) FROM data_as_text;
+----------+-----------------------+
| COUNT(*) | AVG(JSON_LENGTH(doc)) |
+----------+-----------------------+
|    25359 |                7.0000 |
+----------+-----------------------+
1 row in set (0.81 sec)

mysql> DESC data_as_json;
+-------+---------+------+-----+---------+----------------+
| Field | Type    | Null | Key | Default | Extra          |
+-------+---------+------+-----+---------+----------------+
| id    | int(11) | NO   | PRI | NULL    | auto_increment |
| doc   | json    | NO   |     | NULL    |                |
+-------+---------+------+-----+---------+----------------+
2 rows in set (0.00 sec)

mysql> SELECT COUNT(*),AVG(JSON_LENGTH(doc)) FROM data_as_json;
+----------+-----------------------+
| COUNT(*) | AVG(JSON_LENGTH(doc)) |
+----------+-----------------------+
|    25359 |                7.0000 |
+----------+-----------------------+
1 row in set (0.08 sec)

mysql> select name,allocated_size/1024/1024 AS "size in MB" from information_schema.innodb_sys_tablespaces where name like "%temp%";
+-------------------+-------------+
| name              | size in MB  |
+-------------------+-------------+
| temp/data_as_json | 23.00390625 |
| temp/data_as_text | 22.00390625 |
+-------------------+-------------+
2 rows in set (0.00 sec)

The JSON table is 1 MB larger than the 22 MB TEXT table, an increase of 1/22 or roughly 4.5%. At the same time you see the benefit: the full table scan with a JSON operation (the JSON_LENGTH queries above) took 0.08 sec instead of 0.81 sec, a 90% reduction in runtime when using the JSON data type.
Don't take these numbers too literally. Of course they depend on the number of JSON attributes, the character set and other factors. This is just a rough indication. If you want all the details, look at the JSON architecture in WL#8132.
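
To make the arithmetic explicit, here is a small Python sketch that recomputes both figures from the numbers in the listings above. The function names are made up for this illustration; only the input values come from the test.

```python
# Tablespace sizes in MB, copied from the innodb_sys_tablespaces output above.
SIZE_JSON_MB = 23.00390625
SIZE_TEXT_MB = 22.00390625

# Runtimes in seconds of the two JSON_LENGTH queries above.
RUNTIME_TEXT_S = 0.81
RUNTIME_JSON_S = 0.08


def relative_overhead(json_mb: float, text_mb: float) -> float:
    """Extra space used by the JSON table, as a fraction of the TEXT table."""
    return (json_mb - text_mb) / text_mb


def runtime_reduction(before_s: float, after_s: float) -> float:
    """Fraction of the runtime saved when going from before_s to after_s."""
    return (before_s - after_s) / before_s


if __name__ == "__main__":
    print(f"space overhead:    {relative_overhead(SIZE_JSON_MB, SIZE_TEXT_MB):.1%}")
    print(f"runtime reduction: {runtime_reduction(RUNTIME_TEXT_S, RUNTIME_JSON_S):.0%}")
```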

Document validation of JSON columns in MySQL

2015-11-23T09:21:00.000+00:00

Starting with the new release MySQL 5.7 there is support for storing JSON documents in a column. During our recent Tech Tour events we got questions about document validation, that is, ensuring that a JSON document has a certain structure. (Funny. It all started with the idea to be schema-free. Now people seem to need schema enforcement.)
I have two ideas how to implement a schema validation for JSON columns. The first one is by leveraging generated columns together with a foreign key. The second idea is by implementing a trigger. Today I want to focus on the generated columns and foreign keys.
When defining foreign keys with generated columns there are two limitations we need to be aware of:

Foreign keys require indexes. JSON columns cannot be indexed. We need to leverage other types.
Only STORED generated columns are supported for foreign keys.

So here is an example of an address table that leverages JSON to define an arbitrary number of phone number entries per row. In fact I use a mixed model of relational (e.g. to enforce a strict model for name NOT NULL) and document so that phone numbers are more free to define.

 CREATE TABLE `people` (  
 `name` varchar(30) NOT NULL,  
 `firstname` varchar(30) DEFAULT NULL,  
 `birthdate` date DEFAULT NULL,  
 `phones` json DEFAULT NULL,  
 `phonekeys` varchar(30) GENERATED ALWAYS AS (json_keys(phones)) STORED,  
 KEY `phonekeys` (`phonekeys`));

The generated column phonekeys is a string that includes the types of phone numbers for each row. Some sample data:

 mysql> INSERT INTO people (name,firstname,birthdate,phones)
 VALUES ("Plumber", "Joe, the", "1972-05-05",'{"work": "+1(555)24680"}');
 Query OK, 1 row affected (0.00 sec)
 ...some more inserts...
 mysql> SELECT * FROM people;
 +---------+-----------+------------+--------------------------------------------------------+-----------------------+
 | name    | firstname | birthdate  | phones                                                 | phonekeys             |
 +---------+-----------+------------+--------------------------------------------------------+-----------------------+
 | Doe     | John      | 1995-04-17 | {"mobile": "+491715555555", "private": "+49305555555"} | ["mobile", "private"] |
 | Dian    | Mary      | 1963-12-12 | {"work": "+43987654321"}                               | ["work"]              |
 | Plumber | Joe, the  | 1972-05-05 | {"work": "+1(555)24680"}                               | ["work"]              |
 +---------+-----------+------------+--------------------------------------------------------+-----------------------+
 3 rows in set (0.00 sec)

The column phonekeys gets populated automatically.
To check that we use "correct" attributes in our JSON object we can now create a table that contains the valid JSON keys:

 CREATE TABLE `valid_keys` (
 `keylist` varchar(30) NOT NULL,
 PRIMARY KEY (`keylist`)
 ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
 ... after some inserts...
 mysql> SELECT * FROM valid_keys;
 +-------------------------------+
 | keylist                       |
 +-------------------------------+
 | ["mobile", "private", "work"] |
 | ["mobile", "private"]         |
 | ["work"]                      |
 +-------------------------------+
 3 rows in set (0.00 sec)

Now we can define a foreign key with the people table as a child table:
mysql> alter table people add foreign key (phonekeys) references valid_keys (keylist);

That should enforce that inserted JSON documents in the people table must have a list of attributes that matches an entry in the valid_keys table. Let's try:

mysql> INSERT INTO people (name,phones) VALUES ("me", JSON_OBJECT("work","12243"));
Query OK, 1 row affected (0.01 sec)

mysql> INSERT INTO people (name,phones) VALUES ("my friend", JSON_OBJECT("home","12243"));
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`mario`.`people`, CONSTRAINT `people_ibfk_1` FOREIGN KEY (`phonekeys`) REFERENCES `valid_keys` (`keylist`))

Works fine. "home" is not an allowed attribute. I can leverage the foreign key to make sure my phone numbers match a certain attribute list. However, it is not perfectly simple to use: valid_keys needs one row for every key list that JSON_KEYS can produce. If the keys could appear in arbitrary order, five allowed attributes would mean every ordered selection of zero to five of them ("not defining an attribute" is also an option), which is 1 + 5 + 20 + 60 + 120 + 120 = 326 rows. MySQL sorts the keys of a JSON object when it stores the document, so each combination should come back in one consistent order, which leaves at most 2^5 = 32 rows. (Check the order on your version; it is not guaranteed to stay the same across releases.) But it is a first start. For more complex examples the idea with triggers might be more favorable.
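
As a rough check of how quickly valid_keys grows, here is a small Python sketch that enumerates the possible key lists for five allowed attributes. The attribute names are hypothetical and the helper functions are made up for this illustration. It counts two cases: every ordered selection (needed only if the key order inside phonekeys could vary) and every unordered combination (enough if JSON_KEYS returns the keys in one consistent order, which is what MySQL's key sorting suggests but which is worth verifying on your version).

```python
from itertools import combinations, permutations

# Hypothetical list of allowed phone attributes, for illustration only.
ALLOWED = ["work", "mobile", "private", "fax", "pager"]


def ordered_key_lists(attributes):
    """Every ordered selection of 0..n attributes."""
    return [p for k in range(len(attributes) + 1)
            for p in permutations(attributes, k)]


def key_combinations(attributes):
    """Every unordered combination of 0..n attributes."""
    return [c for k in range(len(attributes) + 1)
            for c in combinations(attributes, k)]


if __name__ == "__main__":
    print("rows if key order varied:  ", len(ordered_key_lists(ALLOWED)))
    print("rows if key order is fixed:", len(key_combinations(ALLOWED)))
```

Either way the table grows with every attribute you allow, which is the practical drawback of the foreign key approach.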

Secondary Indexes on XML BLOBs in MySQL 5.7

2015-04-09T16:09:00.001+01:00

When storing XML documents in a BLOB or TEXT column there was no way to create indexes on individual XML elements or attributes. With the new auto generated columns in MySQL 5.7 (1st Release Candidate available now!) this has changed! Let me give you an example. Let's work on the following table:

 mysql> SELECT * FROM country\G  
 *************************** 1. row ***************************  
 docid: 1  
  doc: <country>  
     <name>Germany</name>  
     <population>82164700</population>  
     <surface>357022.00</surface>  
     <city name="Berlin"><population></population></city>  
     <city name="Frankfurt"><population>643821</population></city>  
     <city name="Hamburg"><population>1704735</population></city>  
 </country>  
 *************************** 2. row ***************************  
 docid: 2  
  doc: <country>  
     <name>France</name>  
     <surface></surface>  
     <city name="Paris"><population>445452</population></city>  
     <city name="Lyon"></city>  
     <city name="Brest"></city>  
     <population>59225700</population>  
 </country>  
 *************************** 3. row ***************************  
 docid: 3  
  doc: <country>  
     <population>10236000</population>  
     <name>Belarus</name>  
     <city name="Brest"><population></population></city>  
 </country>  
 *************************** 4. row ***************************  
 docid: 4  
  doc: <country>  
     <name>Pitcairn</name>  
     <population>52</population>  
 </country>  
 4 rows in set (0,00 sec)

The table has only two columns: docid and doc. Since MySQL 5.1 it is possible to extract the population value thanks to the XML functions like ExtractValue(...). But sorting the documents by the population of a country was impossible because population is not a dedicated column in the table. Starting with MySQL 5.7.6 DMR we can add an auto generated column that contains only the population. Let’s create that column:

 mysql> ALTER TABLE country ADD COLUMN population INT UNSIGNED AS (CAST(ExtractValue(doc,"/country/population") AS UNSIGNED INTEGER)) STORED;
  Query OK, 4 rows affected (0,21 sec)   
  Records: 4 Duplicates: 0 Warnings: 0   
  mysql> ALTER TABLE country ADD INDEX (population);   
  Query OK, 0 rows affected (0,22 sec)   
  Records: 0 Duplicates: 0 Warnings: 0   
  mysql> SELECT docid FROM country ORDER BY population ASC; 
  +-------+   
  | docid |   
  +-------+   
  |     4 |   
  |     3 |   
  |     2 |   
  |     1 |   
  +-------+   
  4 rows in set (0,00 sec)

The population value is extracted automatically from each document, stored in a dedicated column and the index is maintained. Really simple now. Note that the population value of the cities is NOT extracted.

What happens if we want to look for city names? Each document may contain several city names. First let’s extract the city names with the XML function and store it in an auto generated column again:

 mysql> ALTER TABLE country ADD COLUMN cities TEXT AS (ExtractValue(doc,"/country/city/@name")) STORED;  
 Query OK, 4 rows affected (0,62 sec)  
 Records: 4 Duplicates: 0 Warnings: 0  
 mysql> SELECT docid,cities FROM country;  
 +-------+--------------------------+  
 | docid | cities                   |  
 +-------+--------------------------+  
 |     1 | Berlin Frankfurt Hamburg |  
 |     2 | Paris Lyon Brest         |  
 |     3 | Brest                    |  
 |     4 |                          |  
 +-------+--------------------------+  
 4 rows in set (0,01 sec)

The XML function ExtractValue extracts the name attribute of all cities and concatenates these with whitespace. That makes it easy for us to leverage the FULLTEXT index in InnoDB:

 mysql> ALTER TABLE country ADD FULLTEXT (cities);  
 mysql> SELECT docid FROM country WHERE MATCH(cities) AGAINST ("Brest");  
 +-------+  
 | docid |  
 +-------+  
 |     2 |  
 |     3 |  
 +-------+  
 2 rows in set (0,01 sec)
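
If you want to see what the two ExtractValue expressions pick out of a document, the same selection can be mimicked outside the database. This is only a sketch in Python with the standard xml.etree.ElementTree module, not how MySQL implements it; the function names are made up, and returning 0 for a missing or empty population is a choice made for this sketch.

```python
import xml.etree.ElementTree as ET

# First document from the country table above.
DOC = """<country>
    <name>Germany</name>
    <population>82164700</population>
    <surface>357022.00</surface>
    <city name="Berlin"><population></population></city>
    <city name="Frankfurt"><population>643821</population></city>
    <city name="Hamburg"><population>1704735</population></city>
</country>"""


def country_population(doc: str) -> int:
    """Mimics /country/population: only the direct child, not the cities."""
    root = ET.fromstring(doc)
    text = root.findtext("population")
    return int(text) if text and text.strip() else 0


def city_names(doc: str) -> str:
    """Mimics /country/city/@name: all city names, joined by a space."""
    root = ET.fromstring(doc)
    return " ".join(city.get("name", "") for city in root.findall("city"))


if __name__ == "__main__":
    print(country_population(DOC))
    print(city_names(DOC))
```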

All XML calculations are done automatically when storing data. Let’s add another XML document and query again:

 mysql> INSERT INTO country (doc) VALUES ('<country><name>USA</name><city name="New York"/><population>278357000</population></country>');  
 Query OK, 1 row affected (0,00 sec)  
 mysql> SELECT * FROM country WHERE MATCH(cities) AGAINST ("New York");  
 +-------+----------------------------------------------------------------------------------------------+------------+----------+  
 | docid | doc                                                                                          | population | cities   |  
 +-------+----------------------------------------------------------------------------------------------+------------+----------+  
 |     5 | <country><name>USA</name><city name="New York"/><population>278357000</population></country> |  278357000 | New York |  
 +-------+----------------------------------------------------------------------------------------------+------------+----------+  
 1 row in set (0,00 sec)

Does this also work with JSON documents? There are JSON functions available in a labs release. These functions are currently implemented as user defined functions (UDF) in MySQL. UDFs are not supported in auto generated columns. So we have to wait until JSON functions are built into MySQL.
UPDATE: See this blogpost. There is a first labs release to use JSON functional indexes.

What did we learn? tl;dr

With MySQL 5.7.6 it is possible to automatically create columns from XML elements or attributes and maintain indexes on this data. Search is optimized, MySQL is doing all the work for you. And Brest is not only in France but also a city in Belarus.

Profiling Stored Procedures in MySQL 5.7

2015-03-24T19:11:00.000+00:00

With the changes to performance_schema in the MySQL 5.7 Development Milestone Release it is now possible to analyze and profile the execution of stored programs. This is highly useful if you develop more complex stored procedures and try to find the bottlenecks. The "old" performance_schema up to MySQL 5.6 only reported a CALL statement with a runtime, but no information on statements that were executed WITHIN the stored procedure. Now let's try this in the latest MySQL 5.7.6 DMR release. After creating a test table and a test stored procedure we need to activate the events_statements_history_long consumer, which is OFF by default:

mysql> UPDATE setup_consumers SET ENABLED="YES"
           WHERE NAME = "events_statements_history_long";

Then let's call the stored procedure that we want to inspect:

mysql> CALL test.massinsert(400,405);

To keep the following queries from overwriting data in the events_statements_history_long table, let's deactivate that consumer ASAP. If you have some concurrent load running on your system, it may be wise to leverage the filter options in performance_schema like setup_actors and/or setup_objects.

mysql> UPDATE setup_consumers SET ENABLED="NO"
          WHERE NAME = "events_statements_history_long";

Next step is to find our CALL statement in the events_statements_history_long table:

mysql> SELECT event_id,sql_text,
              CONCAT(TIMER_WAIT/1000000000,"ms") AS time
       FROM events_statements_history_long
       WHERE event_name="statement/sql/call_procedure";
+----------+-------------------------------+-----------+
| event_id | sql_text                      | time      |
+----------+-------------------------------+-----------+
| 144      | call massinsert(100,105)      | 0.2090ms  |
| 150      | call massinsert(100,105)      | 79.9659ms |
| 421      | CALL test.massinsert(400,405) | 74.2078ms |
+----------+-------------------------------+-----------+
3 rows in set (0,03 sec)

You see: I tried this stored procedure three times. The one I want to inspect in detail is event_id 421. Let's look at all nested statement events that came from 421:

 mysql> SELECT EVENT_NAME, SQL_TEXT,   
        CONCAT(TIMER_WAIT/1000000000,"ms") AS time   
     FROM events_statements_history_long   
     WHERE nesting_event_id=421 ORDER BY event_id;   
 +--------------------------+-----------------------------------+-----------+   
 | EVENT_NAME               | SQL_TEXT                          | time      |  
 +--------------------------+-----------------------------------+-----------+  
 | statement/sp/stmt        | SET @i = first                    | 0.0253ms  |   
 | statement/sp/stmt        | SET @i = @i + 1                   | 0.0155ms  |   
 | statement/sp/stmt        | INSERT INTO a VALUES (@i,MD5(@i)) | 45.6425ms |   
 | statement/sp/jump_if_not | NULL                              | 0.0311ms  |   
 | statement/sp/stmt        | SET @i = @i + 1                   | 0.0297ms  |   
 | statement/sp/stmt        | INSERT INTO a VALUES (@i,MD5(@i)) | 4.9695ms  |   
 | statement/sp/jump_if_not | NULL                              | 0.0726ms  |   
 | statement/sp/stmt        | SET @i = @i + 1                   | 0.0365ms  |   
 | statement/sp/stmt        | INSERT INTO a VALUES (@i,MD5(@i)) | 6.8518ms  |   
 | statement/sp/jump_if_not | NULL                              | 0.0343ms  |   
 | statement/sp/stmt        | SET @i = @i + 1                   | 0.0316ms  |   
 | statement/sp/stmt        | INSERT INTO a VALUES (@i,MD5(@i)) | 9.9633ms  |   
 | statement/sp/jump_if_not | NULL                              | 0.0309ms  |   
 | statement/sp/stmt        | SET @i = @i + 1                   | 0.0274ms  |   
 | statement/sp/stmt        | INSERT INTO a VALUES (@i,MD5(@i)) | 5.6235ms  |   
 | statement/sp/jump_if_not | NULL                              | 0.0308ms  |
 +--------------------------+-----------------------------------+-----------+  
 16 rows in set (0,06 sec)

Now we have the statements that were executed in the stored procedure "massinsert(400,405)" with their individual execution times and in order of execution. All other information is available as well, not only the execution time: number of rows affected, SQL error text, used algorithms, and everything else that performance_schema offers for statement events. This is a great way to analyze your stored procedures, find the most costly statements and improve the performance of your stored programs. That is really a great enhancement to performance_schema.
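
To find the most costly statements at a glance, the nested events can be grouped by statement text. Here is a minimal Python sketch that does this with the rows from the listing above. The function names are made up for this example; the only fact taken from performance_schema is that TIMER_WAIT is reported in picoseconds, which is why the queries divide by 1000000000 to get milliseconds.

```python
PICOSECONDS_PER_MS = 1_000_000_000  # TIMER_WAIT is reported in picoseconds


def to_ms(timer_wait_ps: int) -> float:
    """Convert a TIMER_WAIT value to milliseconds."""
    return timer_wait_ps / PICOSECONDS_PER_MS


# (event_name, sql_text, time in ms), copied from the nested events of event 421.
NESTED_EVENTS = [
    ("statement/sp/stmt", "SET @i = first", 0.0253),
    ("statement/sp/stmt", "SET @i = @i + 1", 0.0155),
    ("statement/sp/stmt", "INSERT INTO a VALUES (@i,MD5(@i))", 45.6425),
    ("statement/sp/jump_if_not", None, 0.0311),
    ("statement/sp/stmt", "SET @i = @i + 1", 0.0297),
    ("statement/sp/stmt", "INSERT INTO a VALUES (@i,MD5(@i))", 4.9695),
    ("statement/sp/jump_if_not", None, 0.0726),
    ("statement/sp/stmt", "SET @i = @i + 1", 0.0365),
    ("statement/sp/stmt", "INSERT INTO a VALUES (@i,MD5(@i))", 6.8518),
    ("statement/sp/jump_if_not", None, 0.0343),
    ("statement/sp/stmt", "SET @i = @i + 1", 0.0316),
    ("statement/sp/stmt", "INSERT INTO a VALUES (@i,MD5(@i))", 9.9633),
    ("statement/sp/jump_if_not", None, 0.0309),
    ("statement/sp/stmt", "SET @i = @i + 1", 0.0274),
    ("statement/sp/stmt", "INSERT INTO a VALUES (@i,MD5(@i))", 5.6235),
    ("statement/sp/jump_if_not", None, 0.0308),
]


def time_per_statement(events):
    """Sum the time per statement text, most expensive first."""
    totals = {}
    for event_name, sql_text, ms in events:
        # Events without SQL text (NULL) are grouped by their event name.
        label = sql_text if sql_text is not None else event_name
        totals[label] = totals.get(label, 0.0) + ms
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for label, ms in time_per_statement(NESTED_EVENTS):
        print(f"{ms:9.4f} ms  {label}")
```

In this run the INSERT statements account for almost all of the time of the CALL, with the first INSERT being by far the slowest.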

Auto Generated Columns in MySQL 5.7: Two Indexes on one Column made easy

2015-03-13T17:59:00.000+00:00

One of my customers wants to search for names in a table. Sometimes the search should be case insensitive, at other times case sensitive. The index on that column is always created with the collation of the column, and if you search with a different collation, you end up with a full table scan. Here is an example:

The problem

 mysql> SHOW CREATE TABLE City\G  
 *************************** 1. row ***************************  
 Table: City  
 Create Table: CREATE TABLE `City` (  
 `ID` int(11) NOT NULL AUTO_INCREMENT,  
 `Name` char(35) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,  
 `CountryCode` char(3) NOT NULL DEFAULT '',  
 `District` char(20) NOT NULL DEFAULT '',  
 `Population` int(11) NOT NULL DEFAULT '0',  
 PRIMARY KEY (`ID`),  
 KEY `CountryCode` (`CountryCode`),  
 KEY `Name` (`Name`)  
 ) ENGINE=InnoDB AUTO_INCREMENT=4080 DEFAULT CHARSET=latin1  
 1 row in set (0,00 sec)

The collation of the column `Name` is utf8_bin, so case sensitive. Let's search for a City:

 mysql> SELECT Name,Population FROM City WHERE Name='berlin';  
 Empty set (0,00 sec)  
 mysql> EXPLAIN SELECT Name,Population FROM City WHERE Name='berlin';  
 +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+  
 | id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |  
 +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+  
 | 1  | SIMPLE      | City  | NULL       | ref  | Name          | Name | 106     | const |  1   |  100.00  | NULL  |  
 +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+  
 1 row in set, 1 warning (0,00 sec)

Very efficient statement, using the index. But unfortunately it did not find the row as the search is based on the case sensitive collation.
Now let's change the collation for the WHERE clause:

 mysql> SELECT Name,Population FROM City WHERE Name='berlin' COLLATE utf8_general_ci;  
 +--------+------------+  
 | Name   | Population |  
 +--------+------------+  
 | Berlin |    3386667 |  
 +--------+------------+  
 1 row in set (0,00 sec)  
 mysql> EXPLAIN SELECT Name,Population FROM City WHERE Name='berlin' COLLATE utf8_general_ci;  
 +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+  
 | id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |  
 +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+  
 | 1  | SIMPLE      | City  | NULL       | ALL  | Name          | NULL | NULL    | NULL | 4108 |  10.00   | Using where |  
 +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+  
 1 row in set, 3 warnings (0,00 sec)

The result is what we wanted but the query creates a full table scan. Not good. BTW: The warnings point you to the fact that the index could not be used.

The solution

Now let's see how auto generated columns in the new MySQL 5.7 Development Milestone Release can help us. First let's create a copy of the Name column but with a different collation:

 mysql> ALTER TABLE City ADD COLUMN Name_ci char(35) CHARACTER SET utf8 AS (Name) STORED;  
 Query OK, 4079 rows affected (0,50 sec)  
 Records: 4079 Duplicates: 0 Warnings: 0

"AS (Name) STORED" is the new stuff: In the brackets is the expression to calculate the column value. Here it is a simple copy of the Name column. The keyword STORED means that the data is physically stored and not calculated on the fly. This is necessary to create the index now:

 mysql> ALTER TABLE City ADD INDEX (Name_ci);  
 Query OK, 0 rows affected (0,13 sec)  
 Records: 0 Duplicates: 0 Warnings: 0

As utf8_general_ci is the default collation with utf8, there is no need to specify this with the new column. Now let's see how to search:


mysql> SELECT Name, Population FROM City WHERE Name_ci='berlin';
+--------+------------+
| Name   | Population |
+--------+------------+
| Berlin |    3386667 |
+--------+------------+
1 row in set (0,00 sec)
mysql> EXPLAIN SELECT Name, Population FROM City WHERE Name_ci='berlin';
+----+-------------+-------+------------+------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+---------+---------+-------+------+----------+-------+ 
|  1 | SIMPLE      | City  | NULL       | ref  | Name_ci       | Name_ci | 106     | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+---------+---------+-------+------+----------+-------+ 
1 row in set, 1 warning (0,00 sec)

Now we can search case sensitive (...WHERE Name=...) and case insensitive (WHERE Name_ci=...) and leverage indexes in both cases.
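
The idea of keeping two indexes over the same value can be pictured with two lookup tables. The following Python sketch is only an analogy with made-up function names: str.casefold() stands in for a case insensitive collation, but it is not identical to utf8_general_ci, which has its own comparison rules.

```python
from collections import defaultdict


def build_indexes(names):
    """Build one exact-match index and one case-folded index over the same names."""
    exact = defaultdict(list)   # plays the role of the index on Name (utf8_bin)
    folded = defaultdict(list)  # plays the role of the index on Name_ci
    for row_id, name in enumerate(names, start=1):
        exact[name].append(row_id)
        folded[name.casefold()].append(row_id)
    return dict(exact), dict(folded)


def find_exact(index, name):
    """Case sensitive lookup."""
    return index.get(name, [])


def find_ci(index, name):
    """Case insensitive lookup: fold the search term the same way as the index."""
    return index.get(name.casefold(), [])


if __name__ == "__main__":
    exact, folded = build_indexes(["Berlin", "Hamburg"])
    print(find_exact(exact, "berlin"))  # no match, like WHERE Name='berlin'
    print(find_ci(folded, "berlin"))    # match, like WHERE Name_ci='berlin'
```

Both lookups are direct index accesses; the price is that the value is stored twice, just like the STORED generated column.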

tl;dr

Use auto generated columns in MySQL 5.7 to create an additional index with a different collation. Now you can search based on different indexes.

MySQL Cluster on Raspberry Pi - Sub-second failover

2014-03-19T09:48:00.001+00:00

MySQL Cluster claims to achieve sub-second failover without any data loss for committed transactions. I always wanted to show this in a demo, and now we have finally created that demo. See Mark's blog and Keith's blog for setting up MySQL Cluster on Raspberry Pi.
The nice thing about the RPis is that you can easily pull the plug to test failover. Ok, that is only one possible failure scenario but for sure the most obvious and more impressive than "kill -9".

That demo application is constantly using the database for storing new lines, removing old lines and reading all line data for the graphical view. There is no caching. It uses JDBC directly.
To document the setup here is the config.ini file for MySQL Cluster:

[ndb_mgmd]
hostname=192.168.0.101
NodeId=1

[ndbd default]
diskless=1
noofreplicas=2
DataMemory=2M
IndexMemory=1M
DiskPageBufferMemory=4M
StringMemory=5
MaxNoOfConcurrentOperations=1K
MaxNoOfConcurrentTransactions=500
SharedGlobalMemory=500K
LongMessageBuffer=512K
MaxParallelScansPerFragment=16
MaxNoOfAttributes=100
MaxNoOfTables=20
MaxNoOfOrderedIndexes=20
HeartbeatIntervalDbDb=10

[ndbd]
hostname=192.168.0.6
datadir=/home/pi/mysql/ndb_data
NodeId=3

[ndbd]
hostname=192.168.0.11
datadir=/home/pi/mysql/ndbd_data
NodeId=4

[mysqld]
NodeId=50

I made the cluster diskless so it will not write any logs and table spaces to disk. The SD card performance was not great, and running diskless does not affect failover behavior.
I also reduced HeartbeatIntervalDbDb so that nodes detect almost immediately (well, within 10ms) if a heartbeat is missed. After a few missed heartbeats the cluster reconfigures, the remaining node takes over responsibility and service continues.
BTW: Pulling the plug is nice, but every now and then I had to manually fsck the root-fs during reboot.

New in 5.6: --innodb_read_only, running MySQL from DVD

2013-03-12T14:22:00.000+00:00

I recently met two distinct customers who want to use MySQL as a read-only database, stored on a read-only medium like a DVD. With MyISAM this was easily possible. Starting with MySQL 5.6 it is also possible to run this setup with InnoDB based databases. There is a new option:

--innodb_read_only

See details in the reference manual. This option opens all tablespaces and InnoDB log files as read-only. A small test showed that you need to set some more options in the my.cnf to avoid any write access to @@datadir:

innodb-read-only

log-error=/tmp/mysql-ro.err

pid-file=/tmp/mysql-ro.pid

event-scheduler=disabled

I was a bit surprised that I had to disable the event scheduler. But on the other hand, what use is a regularly running statement that cannot store any data? After all, your database is read-only ;-)

And together with the new fulltext indexes in InnoDB and maybe compressed table spaces you can now deploy catalogue applications or reference manuals on DVD.

What is ndb doing?

2012-05-29T08:06:00.001+01:00

In MySQL Cluster each SQL statement is translated inside the NDB storage engine into the low-level NDB protocol that is sent to the data nodes. For performance it is most interesting how much data is moved between the data nodes and MySQL. To monitor this, there are several NDB status variables. See this link for more documentation.

(There are also the NDBINFO tables that reflect cluster status. But these are only global values. The status variables also show session status. More about NDBINFO is here.)

To easily report the NDB status on an individual SQL statement, I wrote a little script that gives you the ndb status variables and automatically calculates the diffs before and after the statement in question:

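#!/bin/bash
# Usage: ./ndb_prof.sh "<SQL statement>"
# Runs the statement and prints the NDB API session counters that it changed.

./mysql -t -h olga64 test <<EOF
CREATE TEMPORARY TABLE tmp_before LIKE INFORMATION_SCHEMA.SESSION_STATUS;
CREATE TEMPORARY TABLE tmp_after LIKE INFORMATION_SCHEMA.SESSION_STATUS;
-- Snapshot of the counters before the statement
INSERT INTO tmp_before SELECT * FROM INFORMATION_SCHEMA.SESSION_STATUS WHERE VARIABLE_NAME LIKE 'ndb_api%session%';
-- The statement under test, passed as the first argument of the script
$1
-- Snapshot of the counters after the statement
INSERT INTO tmp_after SELECT * FROM INFORMATION_SCHEMA.SESSION_STATUS WHERE VARIABLE_NAME LIKE 'ndb_api%session%';
-- Report only the counters that changed
SELECT tmp_before.VARIABLE_NAME, tmp_after.VARIABLE_VALUE - tmp_before.VARIABLE_VALUE AS 'VALUE' FROM tmp_after INNER JOIN tmp_before USING (VARIABLE_NAME) WHERE tmp_after.VARIABLE_VALUE <> tmp_before.VARIABLE_VALUE;
EOF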
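
The core of the script is a plain before/after comparison of counters. As a minimal sketch of that logic outside SQL, here it is in Python. The function name and the sample values are made up for this illustration and are not real measurements.

```python
def changed_counters(before: dict, after: dict) -> dict:
    """Return {name: after - before} for counters in both snapshots that changed."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


if __name__ == "__main__":
    # Hypothetical snapshots of two session counters.
    before = {"NDB_API_READ_ROW_COUNT_SESSION": 10,
              "NDB_API_TABLE_SCAN_COUNT_SESSION": 4}
    after = {"NDB_API_READ_ROW_COUNT_SESSION": 12,
             "NDB_API_TABLE_SCAN_COUNT_SESSION": 4}
    print(changed_counters(before, after))
```

Like the SQL version, it drops counters that did not move, so the report only shows what the statement actually touched.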

And here are some small examples of understanding what NDB does:

[root@olga64 bin]# ./ndb_prof.sh "SELECT COUNT(*) FROM t;"
+----------+
| COUNT(*) |
+----------+
|    32771 |
+----------+
+--------------------------------------------+----------+
| VARIABLE_NAME                              | VALUE    |
+--------------------------------------------+----------+
| NDB_API_WAIT_SCAN_RESULT_COUNT_SESSION     |        3 |
| NDB_API_WAIT_META_REQUEST_COUNT_SESSION    |        2 |
| NDB_API_WAIT_NANOS_COUNT_SESSION           | 19495775 |
| NDB_API_BYTES_SENT_COUNT_SESSION           |      132 |
| NDB_API_BYTES_RECEIVED_COUNT_SESSION       |      280 |
| NDB_API_TRANS_START_COUNT_SESSION          |        1 |
| NDB_API_TRANS_CLOSE_COUNT_SESSION          |        1 |
| NDB_API_TABLE_SCAN_COUNT_SESSION           |        1 |
| NDB_API_SCAN_BATCH_COUNT_SESSION           |        2 |
| NDB_API_READ_ROW_COUNT_SESSION             |        2 |
| NDB_API_TRANS_LOCAL_READ_ROW_COUNT_SESSION |        2 |
| NDB_API_ADAPTIVE_SEND_FORCED_COUNT_SESSION |        1 |
+--------------------------------------------+----------+

So SELECT COUNT(*) only reads two rows (NDB_API_READ_ROW_COUNT_SESSION), probably one row per fragment (I have a two-node cluster). COUNT(*) is optimized! But if you add a WHERE condition:

[root@olga64 bin]# ./ndb_prof.sh "SELECT COUNT(*) FROM t WHERE a<100;"
+----------+
| COUNT(*) |
+----------+
|       99 |
+----------+
+--------------------------------------------+----------+
| VARIABLE_NAME                              | VALUE    |
+--------------------------------------------+----------+
| NDB_API_WAIT_SCAN_RESULT_COUNT_SESSION     |        3 |
| NDB_API_WAIT_META_REQUEST_COUNT_SESSION    |        2 |
| NDB_API_WAIT_NANOS_COUNT_SESSION           | 18267962 |
| NDB_API_BYTES_SENT_COUNT_SESSION           |      140 |
| NDB_API_BYTES_RECEIVED_COUNT_SESSION       |     3248 |
| NDB_API_TRANS_START_COUNT_SESSION          |        1 |
| NDB_API_TRANS_CLOSE_COUNT_SESSION          |        1 |
| NDB_API_RANGE_SCAN_COUNT_SESSION           |        1 |
| NDB_API_SCAN_BATCH_COUNT_SESSION           |        2 |
| NDB_API_READ_ROW_COUNT_SESSION             |       99 |
| NDB_API_TRANS_LOCAL_READ_ROW_COUNT_SESSION |       99 |
| NDB_API_ADAPTIVE_SEND_FORCED_COUNT_SESSION |        1 |
+--------------------------------------------+----------+

'a' is the primary key. This statement sends all 99 rows to mysqld (well, only the primary keys), and mysqld does the counting. This can get more expensive, depending on the row count and primary key size. Look at NDB_API_BYTES_RECEIVED_COUNT_SESSION: 3248 bytes received. That is roughly 33 bytes per row (3248 / 99).
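
For reference, the per-row figure is just the received byte count divided by the rows read. Here is a tiny Python sketch with a made-up function name that uses the two counters from the second run. Note that the result includes protocol overhead, so it is an upper bound on the real payload per row.

```python
def bytes_per_row(bytes_received: int, rows_read: int) -> float:
    """Average number of bytes received by mysqld per row read from the data nodes."""
    return bytes_received / rows_read


if __name__ == "__main__":
    # NDB_API_BYTES_RECEIVED_COUNT_SESSION and NDB_API_READ_ROW_COUNT_SESSION
    # from the run with the WHERE condition.
    print(f"{bytes_per_row(3248, 99):.1f} bytes per row")
```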

Why should I consider memcached plugin?

2012-03-12T12:11:00.001+00:00

My last post explained what to expect from memcached plugin in MySQL 5.6 (labs release). But I want to take a step back and think about "why" first. Why is memcached plugin of interest at all? What can I gain from using this instead of plain SQL?

First: I don't see this as a replacement for memcached. If you want memory caching with memcached then use memcached.

But the memcached plugin to MySQL is a replacement or addition to the SQL interface to MySQL. So instead of using SQL queries in your application to persist or retrieve data from MySQL you can use the memcached interface. And what are the benefits?

Much higher performance
Easier scalability via sharding
Simpler application coding

1. Performance

Performance is always good. But there are two different aspects of performance: Latency (or runtime) of a specific query and throughput. And the relation of these two measures is not easy and simple to explain. So we need to discuss both:

1.1 Latency

Latency is the time it takes to execute an individual query. If using SQL, the server needs to parse that SQL query (which is not so easy because SQL is quite a complex language), then optimize the execution plan of the query and finally execute it. The memcached interface is much simpler, so parsing is close to nothing, and accessing a single row by an index does not require any optimization. Only the real query execution is the same. I did a test in a VM with one virtual disk. (Not the best benchmark environment ;-) I inserted single rows in multiple threads via the C API (using SQL) and via libmemcached (using the new plugin). The mean latency for a single query was 20%-30% higher for SQL queries. Considering that my IO configuration was bad, in a better environment the share of query execution would be smaller, so the benefit of the memcached plugin (no parsing, no optimization) would be even bigger.

1.2 Throughput

Usually lower latency helps improve throughput. If each query is finished earlier, the server can already start working on the next query. This is true only if the bottleneck is really the CPU or RAM. If latency is mainly due to disk IO, it is not so helpful. During my test I achieved up to 50% higher throughput with the memcached plugin. Again, a suitable IO config can probably improve this advantage even further.

2. Scalability

If a single server is no longer enough one solution may be sharding. With SQL you usually have two options: Switch to MySQL cluster, which offers auto-sharding. The other alternative is to implement sharding on your own in your application, which requires some effort.

With memcached plugin it is easier: For connecting to the database you would usually use some library like libmemcached. And many of these libraries offer ... a sharding feature. This is a natural feature for memcached libraries as the goal of memcached is to offer a distributed cache. Depending on the key value (or a hash of it) the library determines which instance to talk to. There is no need for you to extend your application code and manage different database connections. In libmemcached you can simply define your connectstring as "SERVER=donald:11211 SERVER=dagobert:11211 SERVER=daisy:11211". And that's it. Libmemcached will automatically shard your data to all three database servers.
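
To illustrate the principle of client-side sharding, here is a deliberately simplified Python sketch. It is not libmemcached's actual algorithm (the library has its own configurable hash functions and distribution strategies); hashing the key with MD5 and taking it modulo the number of servers is an assumption made for this example, as is the function name.

```python
import hashlib

# Server list from the connect string above.
SERVERS = ["donald:11211", "dagobert:11211", "daisy:11211"]


def server_for(key: str, servers=SERVERS) -> str:
    """Pick a server from a hash of the key (simple modulo distribution)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[bucket]


if __name__ == "__main__":
    for key in ("user:1", "user:2", "user:3", "user:4"):
        print(key, "->", server_for(key))
```

The same key always maps to the same server, so the application can read a value back without knowing anything about the distribution. A plain modulo scheme remaps most keys when a server is added or removed, which matters much more for a persistent store than for a cache.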

Well, one thing must be mentioned: If you want to query your data via SQL and do aggregation, you have to aggregate in your application. The SQL interface is still not auto-sharded. If you need this kind of functionality, you should consider MySQL Cluster. Cluster will shard your data automatically, whether it comes in via the memcached API or SQL. See more in Andrew's blog. And for MySQL Cluster the memcached interface is already recommended for production use in MySQL Cluster 7.2.

3. Ease of use

The typical architecture is very often application - memcache layer - MySQL database. And for read access the application talks to memcache first, and if this results in a "miss", the application will re-do the query but with a different protocol to the MySQL server. You must implement both protocols: memcache API and SQL. If you use the memcached plugin, you can now access the database directly with the same API functions that you already use for memcache. That makes development much easier.
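The classic read path described above (ask the cache first, fall back to the database on a miss) can be sketched like this. It is only an illustration: `read_through` and `load_from_db` are hypothetical names, and the cache is a plain dict standing in for a memcached client.

```python
def read_through(key, cache, load_from_db):
    """Return the value for key, asking the cache first and the database on a miss."""
    value = cache.get(key)
    if value is None:
        # Cache miss: repeat the lookup against the database (second protocol).
        value = load_from_db(key)
        if value is not None:
            # Remember the result so the next read is served from the cache.
            cache[key] = value
    return value
```

With the memcached plugin the fallback branch can use the same memcached API, so the application no longer needs a second protocol for simple key lookups.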

The sharding feature in the memcached library will make it easier to scale the database. You do not have to implement it on your own.

"Ease of use" only applies if your application is happy with doing only low-level data access to a key-value store. Once you also need higher-level data access like JOINs or aggregation queries, you will again use the memcached API and SQL in the same application. So nothing changes compared to before.

4. Availability?

Not a huge argument. By using the memcached plugin you essentially work with MySQL and you will leverage all possible HA options that are available for MySQL to make your data highly available. It is worth mentioning that memcached plugin supports replication. You have to define daemon_memcached_enable_binlog = 1. That's it.

Code examples

Just for your reference, here is the shortened version of the code that I used to test SQL and the memcached API when connecting to the MySQL server.

C-API

#include <mysql.h>

...

MYSQL *mysql;

char stmt[256];

int key = 0;

mysql = mysql_init(NULL);

if (mysql_real_connect(mysql,"localhost","user","password",

"schema",3306,NULL,0L) == NULL) {

goodbye();

}

for (;;) {

sprintf (stmt, "INSERT INTO kvstore VALUES('key%d','%s',0,0,0)",

key++,value);

if (mysql_query (mysql, stmt) != 0) {goodbye();}

}

Libmemcached

#include <libmemcached/memcached.h>

...

char *config_string = "--SERVER=localhost:11211";

memcached_st *memc;

memcached_return_t rc;

memc = memcached(config_string, strlen(config_string));

if (memc == NULL) {goodbye();}

for (;;) {

rc= memcached_set(memc, key, strlen(key), value, strlen(value),

(time_t)0, (uint32_t)0);

if (rc != MEMCACHED_SUCCESS) goodbye();

}

Playing with Memcached Plugin

2012-01-18T13:13:00.002+00:00

I am currently playing a lot with the new memcached interface to MySQL. Making MySQL a "NoSQL" solution.

Why should I access the same data via SQL and noSQL protocol?

A simple noSQL protocol like memcached only has lightweight access methods like "set" or "get". No complex parsing, no optimizing, no execution of expressions. And this additional overhead in SQL can be tremendous: I ran a set of SELECT statements based only on the primary key. The buffer cache is 50% of the table size. With ENGINE=InnoDB it takes 7.6 seconds to read the data. If I switch to the BLACKHOLE engine it takes 6.4 seconds! BLACKHOLE has no code in the storage engine. So queries on the BLACKHOLE engine create load only on the parser and optimizer but nothing in the storage engine. Running on InnoDB adds only about 1.2 seconds, or roughly 16% of the total runtime. Obviously the main part of the execution time is spent outside the storage engine. Erkan found the same behaviour here. See page 12. The memcached interface accesses the storage engine directly. So it bypasses all the computing in parser, optimizer, expression evaluation and so on. Of course with INSERT statements the share of InnoDB gets bigger as there is more I/O work to do. But nevertheless there can be huge performance gains in using a simpler protocol. If you only want a glass of milk, don't buy a cow.
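For anyone who wants to redo the arithmetic, here is the calculation as a few lines of Python. The two runtimes are the ones measured above; the variable names are just for illustration.

```python
# Runtimes measured in the test above, in seconds.
innodb_seconds = 7.6
blackhole_seconds = 6.4

# BLACKHOLE does no storage engine work, so the difference approximates
# the time spent inside InnoDB.
engine_seconds = innodb_seconds - blackhole_seconds

# Share of the total InnoDB runtime that is spent in the storage engine.
engine_share = engine_seconds / innodb_seconds
```

This works out to about 1.2 seconds in the storage engine, which is roughly 16% of the total. The rest is parsing, optimizing and everything else above the engine.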

Why not use memcached directly? Why that plugin to MySQL?

Memcached is an in-memory cache, not a persistent data store. The memcached plugin for MySQL stores the data in an InnoDB table. So you get persistence and nearly unlimited table size, and you can still access your data through SQL if you want more complex stuff like COUNT(*) WHERE pkey LIKE "page:%"; This would not be possible in memcached. But with the memcached plugin you can store data with memcached SET and report on your data with SQL queries.

How can I test it?

Download the labs release from http://labs.mysql.com, do the usual install and then follow the README-innodb_memcached file. It is very simple. You only need to execute a small SQL script and you are ready to test memcached. BUT: If you SET data via telnet to memcached you can retrieve it via GET, but you will probably not see the data in the table via SQL. This was a bit confusing, to me at least. The secret is the variable daemon_memcached_w_batch_size, which is set to 32 by default. The memcached plugin will commit data changes only after 32 SET commands (or INCR, DECR, ...). This batching is good for performance. In fact you currently cannot set daemon_memcached_w_batch_size to values lower than 32. Only bigger values are possible. One exception is replication: If you enable replication for the plugin, daemon_memcached_w_batch_size is set to 1. See below.

If you want to see memcached data changes immediately, even before a commit, you can set your session to SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
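The visibility effect of write batching is easier to see in a toy model. The following Python sketch is purely an illustration of the behaviour described above, not how the plugin is implemented; `BatchedStore` and its methods are hypothetical names.

```python
class BatchedStore:
    """Toy model: writes become visible to ordinary SQL readers only after a batch commit."""

    def __init__(self, batch_size=32):
        self.batch_size = batch_size
        self.uncommitted = {}  # what GET and READ UNCOMMITTED sessions can see
        self.committed = {}    # what an ordinary SQL session can see
        self.writes_since_commit = 0

    def set(self, key, value):
        self.uncommitted[key] = value
        self.writes_since_commit += 1
        if self.writes_since_commit >= self.batch_size:
            # One commit for the whole batch of writes.
            self.committed.update(self.uncommitted)
            self.uncommitted.clear()
            self.writes_since_commit = 0

    def get(self, key):
        """Memcached-style read: sees uncommitted data as well."""
        if key in self.uncommitted:
            return self.uncommitted[key]
        return self.committed.get(key)

    def sql_select(self, key):
        """Ordinary SQL read: sees committed data only."""
        return self.committed.get(key)
```

After 31 calls to `set`, `get` returns the data but `sql_select` does not. The 32nd call commits the batch and everything becomes visible.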

What about performance?

This is a more complex issue. I will write a separate blog post with some performance discussions. But the summary:

The main parameter is daemon_memcached_w_batch_size. By default it batches 32 statements of the memcached protocol into one transaction on the InnoDB side.

My first tests showed that (as expected) access via the memcached protocol offers nearly twice the throughput of SQL for writing. This is especially useful if you have only a few user connections. If there are many simultaneous connections, IO becomes the bottleneck and not the SQL processing.

What about replication and memcached interface?

This works seamlessly. You only have to specify innodb_direct_access_enable_binlog=1. Currently this variable is not (yet?) dynamic. So best to put it into my.cnf/my.ini. The aforementioned daemon_memcached_w_batch_size is set to 1 in this case which means each SET operation is committed separately. As binlog group commit is not implemented in this labs release, this affects performance very badly. But binlog group commit is already in another labs release and this would probably solve a lot of this performance issue.

More about replication and memcached can be found here: http://blogs.innodb.com/wp/2011/10/innodb-memcached-with-binlog-capability/

What if I want more than one value column?

Memcached only knows one value column. That's it. But the plugin to MySQL can help. Look at the configuration table innodb_memcache.config_options:

mysql> SELECT * FROM innodb_memcache.config_options;

+-----------+-------+

| name | value |

+-----------+-------+

| separator | | |

+-----------+-------+

1 row in set (0.00 sec)

This is the magic character to separate different columns. Take another look at my innodb_memcache.containers table:

mysql> SELECT * FROM innodb_memcache.containers;

+------+-----------+----------+-------------+-----------------+-------+------------+--------------------+------------------------+

+------+-----------+----------+-------------+-----------------+-------+------------+--------------------+------------------------+

+------+-----------+----------+-------------+-----------------+-------+------------+--------------------+------------------------+

What is a single value to the memcached protocol is split into three different columns on the MySQL side. And the separator is the "|" pipe character. Let's try:

[root@olga ~]# telnet localhost 11211

Trying ::1...

Connected to localhost.

Escape character is '^]'.

set mykey 0 0 14

abcd|1234|WXYZ

STORED

get mykey

VALUE mykey 0 14

abcd|1234|WXYZ

END

And in MySQL we find the following data:

mysql> SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM kvstore WHERE `key`="mykey";

+-------+-------+-------+------+---------+------+------+

| key | value | flags | cas | expires | val2 | val3 |

+-------+-------+-------+------+---------+------+------+

| mykey | abcd | 0 | 1 | 0 | 1234 | WXYZ |

+-------+-------+-------+------+---------+------+------+

1 row in set (0.00 sec)

Don't forget to set transaction isolation level to read-uncommitted. Due to write batching the last SET statements may not be visible otherwise.
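The mapping from one memcached value to several columns can be sketched in Python. This only mirrors the behaviour shown in the example above; `split_value` is a hypothetical helper, and how the plugin treats values with too few or too many separators is not covered here.

```python
def split_value(raw, columns=("value", "val2", "val3"), separator="|"):
    """Split one memcached value into the configured value columns."""
    parts = raw.split(separator)
    # Pair each column name with the matching part of the value.
    return dict(zip(columns, parts))
```

For the value from the telnet session, `split_value("abcd|1234|WXYZ")` gives `value` = "abcd", `val2` = "1234" and `val3` = "WXYZ". The length 14 in the "set" command is simply the number of bytes in the whole string, separators included.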

What happens with Foreign Keys?

NoSQL usually means that data is unstructured and the knowledge about data is no longer in the database but in the application. So I am not sure if foreign keys are relevant to traditional key value store applications. But if you distribute your value into different columns in the innodb table, it might be interesting. So let's test it:

CREATE TABLE `kvstore` (

`key` varchar(32) NOT NULL DEFAULT '',

`value` varchar(1024) DEFAULT NULL,

`flags` int(11) DEFAULT NULL,

`cas` bigint(20) unsigned DEFAULT NULL,

`expires` int(11) DEFAULT NULL,

`val2` varchar(32) DEFAULT NULL,

`val3` varchar(32) NOT NULL,

PRIMARY KEY (`key`),

KEY `val3` (`val3`),

CONSTRAINT `kvstore_ibfk_1` FOREIGN KEY (`val3`) REFERENCES `refdata` (`val3`)

) ENGINE=InnoDB;

mysql> SELECT * FROM refdata;

+------+---------------------+

| val3 | somefield |

+------+---------------------+

| ABCD | Another good val3 |

| WXYZ | This entry is valid |

+------+---------------------+

2 rows in set (0.00 sec)

So we can add ABCD or WXYZ as the last field via memcached. Let's see what happens:

set mykey3 0 0 14

abcd|1234|ABCD

STORED

get mykey3

VALUE mykey3 0 14

abcd|1234|ABCD

END

set mykey4 0 0 14

efgh|5678|EFGH

NOT_STORED <--- That is cool! Memcached plugin appreciates foreign keys!

get efgh

END

get mykey4

END

So foreign key constraints are enforced with the memcached plugin.

How can I access multiple tables via memcached?

Memcached does not know anything about tables. There is only one data store. Usually memcached programmers use a trick to simulate different data stores: They include the table name in the key. So the key is something like "user:4711" or "page:/shop/home" or "session:fh5hjk543bjk". But still all data is in a single table in MySQL. If you want to report via SQL on only one type of data like "session:" you can add ... WHERE pkey LIKE "session:%" to your query. To make it comfortable you can also define different views:

mysql> SELECT * FROM kvstore;

+----------------------+-------------------------------+-------+------+---------+

| key | value | flags | cas | expires |

+----------------------+-------------------------------+-------+------+---------+

| session:grcn34r834cn | 2012-01-12 08:32|4711 | 0 | 0 | 0 |

| session:k35jnjkj56ff | 2012-01-14 23:11|4713 | 0 | 0 | 0 |

| user:4711 | dummy|secret|Berlin | 0 | 0 | 0 |

| user:4712 | superman|unkown|London | 0 | 0 | 0 |

| user:4713 | wonderwoman|dontknow|New York | 0 | 0 | 0 |

+----------------------+-------------------------------+-------+------+---------+

5 rows in set (0.00 sec)

mysql> CREATE VIEW user AS SELECT RIGHT(`key`,4) AS userID, value FROM kvstore WHERE `key` LIKE "user%";

Query OK, 0 rows affected (0.37 sec)

mysql> SELECT * FROM user;

+--------+-------------------------------+

| userID | value |

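+--------+-------------------------------+

| 4711 | dummy|secret|Berlin |

| 4712 | superman|unkown|London |

| 4713 | wonderwoman|dontknow|New York |

+--------+-------------------------------+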

3 rows in set (0.00 sec)
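On the application side the same prefix convention is usually handled with a tiny helper. A minimal sketch in Python, where `split_key` is a hypothetical name and the colon separator is simply the convention used in the keys above:

```python
def split_key(key):
    """Split a key like 'user:4711' into its type prefix and the remaining identifier."""
    # Only the first colon separates the prefix; the rest may contain anything.
    namespace, _, identifier = key.partition(":")
    return namespace, identifier
```

So "user:4711" becomes ("user", "4711") and "page:/shop/home" becomes ("page", "/shop/home"), which is the same split that the view does with RIGHT() on the SQL side.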

Summary

The memcached plugin is easy to enable and offers a very lightweight access protocol to InnoDB data.

You can store values into different columns. Foreign key relationships are enforced.

You can also replicate data that is stored via memcached plugin to slave servers.

The most important tuning parameter is daemon_memcached_w_batch_size, which is 32 by default, 1 if replicating.

I will add a more advanced use case of the memcached configuration in another post that should show the benefits of using the same data via the memcached protocol and SQL at the same time.

MySQL is so slow on Windows... Really?

2010-03-29T21:20:00.000+01:00

Last week a customer called me and reported that MySQL was 30 times slower than MS SQL server. Oooops. That's a lot. No way to argue or throw smoke bombs. 30 times slower!

It was a standard installation of MySQL (typical install option) on plain Windows 7 and the same for MS SQL Server 2008. The test run was a batch of 30,000 INSERT commands in an SQL script. Runtime was 1 minute on MSSQL and 30 minutes on MySQL.

Some tests later we found out that it was only bad on InnoDB. MyISAM was as fast as MSSQL. (I didn't care which one was a bit faster. I didn't care as long as InnoDB was 30 times slower) Finally we nailed the problem down to one parameter in MySQL:

innodb_flush_log_at_trx_commit

Each INSERT statement is a single transaction (autocommit mode). By default MySQL is configured very conservatively and ensures that each transaction is really stored on disk. This is necessary for ACID compliance: the D in ACID stands for 'durability'. To store data durably, at least the log file has to be written physically. That's why MySQL, when a transaction commits, forces the operating system to flush its buffers and even forces the disk cache to flush its buffer. That's the meaning of innodb_flush_log_at_trx_commit = 1 in the my.ini or my.cnf file.
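The difference between flushing on every commit and flushing only occasionally can be sketched in a few lines of Python. This is not how InnoDB writes its log; `write_log` is a hypothetical helper that only shows where the expensive fsync() call sits in each mode.

```python
import os


def write_log(path, records, flush_each_commit=True):
    """Append records to a log file, optionally forcing each one to disk."""
    with open(path, "wb") as log:
        for record in records:
            log.write(record + b"\n")
            if flush_each_commit:
                # Durable but slow: one physical flush per "transaction".
                log.flush()
                os.fsync(log.fileno())
        # Relaxed mode ends up here with a single flush for the whole batch.
        log.flush()
        os.fsync(log.fileno())
```

Both modes produce the same file in the end. The difference is how much can be lost in a crash, and how many times the disk has to confirm a physical write along the way.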

MSSQL is much more relaxed with your data. It writes the data to disk device. But it may stay in the disk cache, and MSSQL does not care. If you have a crash, your data is not up-to-date on the physical disk and you may lose data. This is definitely not ACID compliant. Microsoft documented this here:

http://support.microsoft.com/default.aspx?scid=kb;en-us;234656

By default, the disk cache is enabled. Use the 'Disk Properties', Hardware tab to access the 'Properties', 'Policy' tab to control the disk cache setting. (Note Some drives do not honor this setting. These drives require a specific manufacturer utility to disable cache.)
...
Disk caching should be disabled in order to use the drive with SQL Server.

So to have a fair comparison between MSSQL and MySQL, either

set innodb_flush_log_at_trx_commit = 0 (or 2)
This forces the flush to disk only once per second and brings good performance but data is not 100% safe on disk (unless you have a battery backed write cache)
disable the disk cache in Windows 7
This will force MSSQL to write physically to disk. And then MSSQL is 30 times slower than before. ;-)

Lessons learned:

Think a lot about how to do a fair comparison.
Run either unsafe and fast, or safe and slow, or safe and fast and expensive with a battery-backed write cache controller.
Read the manual for MSSQL. There may be important news on page 3647+x.

The funny thing that confused me a lot: On my Mac InnoDB was at around 1 minute. And even MySQL on Windows 7 in VirtualBox on my Mac was at about 1 minute. The reason is: MySQL on Mac issues an fsync() that does not flush the disk cache. It flushes only the buffer cache of the operating system. And Windows 7 in VirtualBox: The disk is a plain file on the host OS. And this file is of course buffered in the host OS. So virtualization may destroy your ACID compliance...

If you need more info on the different cache levels for file IO here is a very good link:

http://smallvoid.com/article/hard-disk-cache.html

"How to find the source of queries in MySQL Query Analyzer" or "SQL comments in Query Analyzer"

2010-01-11T11:46:00.000+00:00

MySQL Enterprise Monitor offers a tool called "Query Analyzer" (QuAn). QuAn sits between any client app and the MySQL server and logs every query and its runtime statistics. A very cool tool for analyzing your SQL. More information is available here.

If you identify a query that needs some improvement, it is sometimes hard to identify the source of that query as well. With hundreds of different PHP scripts, for example, it is not easy to know which one issued the query that you want to modify.

A good way to achieve this is adding C-style SQL comments. Let's look at an example:

SELECT * FROM mytable /*main.php*/;

Query Analyzer will strip that comment off before archiving the query. This is ok, because QuAn wants to consolidate all similar queries and this comment is irrelevant for the query.

But you can use version-specific comments in MySQL. QuAn cannot ignore these comments, because they may be relevant for query execution. And this is the solution for our problem:

SELECT * FROM mytable /*!99999 main.php*/;

This SQL comment would only be executed on MySQL version 9.99.99 and later. But this version does not exist. So in reality the comment is never executed. Currently we are at MySQL 5.1.42, so it will take some time before we reach MySQL 9.99.99 ;-)

The comment is in QuAn and will be logged. So you see the comment when monitoring the query and you know immediately, where that query came from.
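If you build your queries in application code, a small helper can add the tag for you. A minimal sketch in Python; `tag_query` is a hypothetical name and the version number 99999 is the one from the example above.

```python
def tag_query(sql, source):
    """Append a version-specific comment that names the source of the query."""
    if "*/" in source:
        # The tag must not close the comment early.
        raise ValueError("source must not contain '*/'")
    # Drop trailing whitespace and a trailing semicolon before adding the tag.
    statement = sql.rstrip().rstrip(";")
    return "%s /*!99999 %s */;" % (statement, source)
```

For example, `tag_query("SELECT * FROM mytable;", "main.php")` returns `SELECT * FROM mytable /*!99999 main.php */;`.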

How to install MySQL Enterprise Monitor agents in a failover environment

2009-11-13T18:59:00.000+00:00

MySQL Enterprise Monitor is a tool to watch and analyze multiple MySQL environments from a single web-based dashboard. More information is available on the MySQL homepage. Each MySQL instance is monitored by a small agent that connects to the MySQL instance and reads statistics that are sent to the MySQL Enterprise Monitor (MEM) Server.

That setup is very easy. But if the MySQL server is in a cluster failover configuration, there are some things to consider when installing the MEM agent:

What do you want?

Do you want to have two entries in the MEM dashboard for both physical servers?

This is good because:

You can monitor them separately, you can define different rules for both servers in case they offer different capabilities.
You can immediately see, which physical server runs the MySQL instance. The other entry will always report either "MySQL server down" or "MEM agent not reachable".

This is not so good because:

You cannot watch the data if a failover occurred. E.g. you can only see graphs for a specific physical host.
You get red alarms because the agent on the passive node cannot reach the MySQL instance. This alarm is harmless. But it will train you to ignore red alarms. Not good.

If you like this approach here is the description how to install.

Do you want to see only one entry in the MEM dashboard that displays the data no matter, which physical server is running the instance at the moment?
This is good because:

You have continuous data even if a failover occurred during the period. E.g. you can watch the cache hit rate graph of your MySQL server and you will see only a dip where the failover took place.
You will not see false events like "MySQL server is down on the passive node". Of course it is down. That's why it is called the passive node ;-)

This is not so good because:

It's not so easy to see which physical server currently runs the MySQL instance: In the meta info on the dashboard screen you see the physical hostname that runs the MySQL instance.
You need to apply the same rules to both physical servers. This may be a problem if they have different capabilities, e.g. if the backup node is smaller.

If you prefer this approach here is the description how to install.
By the way: I recommend this way, as the disadvantages are really small and the installation is even a little easier. Both procedures work for all common HA frameworks like SunCluster, OpenHA Cluster, Linux Heartbeat, Microsoft Windows Cluster.

Installing MEM agent on a cluster on the logical host

2009-11-13T18:40:00.000+00:00

The goal is to have only one entry in the Enterprise Monitor Dashboard that shows the status of the MySQL instance, no matter on which physical server it runs. There are two ways to achieve this:

You can install the agent on both physical nodes
You can install the agent on a shared storage.

In either case you have to make sure that only one agent runs at a time. You have to integrate the agent into your cluster framework. I will not describe how this works, as it is highly dependent on your cluster framework.

The following description assumes that you will install the agent on both physical nodes.

1. Install the agent but DO NOT START the agent yet.
2. Edit [agent-installdir]/etc/mysql-monitor-agent.ini. In the [mysql-proxy] section add the following line:
agent-host-id=[logical hostname]
3. Do steps 1 and 2 for the other cluster node as well.
4. Include the agent in the cluster's failover group so that it will start automatically on the node where the MySQL instance is running.
5. Start the agent via the cluster framework.
6. The entry in MEM will get the hostname of the first node that runs MySQL. Even after a failover this name will stay. You should change that name to the virtual hostname of the virtual IP address of your MySQL service. Use the MEM BUI, choose the tab "Settings". From the secondary menu right below the tabs choose "Manage Servers". If you move the mouse over the server name a submenu will appear. Choose "Rename" and enter the name of the virtual IP.
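If you have to make this change on several nodes, the edit to the ini file can be scripted. Here is a minimal sketch in Python using the standard configparser module. It assumes that the agent's ini file is plain enough for configparser to read, and `set_agent_host_id` is a hypothetical helper. Note that configparser drops comments when writing the file back, so keep a copy of the original.

```python
import configparser
import io


def set_agent_host_id(ini_text, logical_hostname):
    """Return the ini text with agent-host-id set in the [mysql-proxy] section."""
    # Interpolation is switched off so that '%' characters in values are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("mysql-proxy"):
        parser.add_section("mysql-proxy")
    parser.set("mysql-proxy", "agent-host-id", logical_hostname)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

Run it once per node with the same logical hostname, so that both agents report under the same host ID.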

Installing MEM agent in a cluster on the physical hosts

2009-11-13T15:17:00.000+00:00

To install the MEM agent in a way that both physical servers are listed in the MEM dashboard, you have to install the agent on both physical nodes. But: Do not start the agent after the installation!

There are three different IDs in MEM: agent-uuid, mysql-uuid and host-id. Usually they are generated automatically and you will never notice these IDs. For more information about the meaning of the different IDs look at this very good explanation from Jonathon Coombes.

The agent stores the uuid and the hostid in a MySQL table called mysql.inventory. After a failover the other agent on the new node will notice "wrong" hostid and uuid entries in the inventory table. The agent will stop and ask you to TRUNCATE mysql.inventory. But with this procedure MEM creates a new instance, so all old data is lost. Not good for a failover environment.

So in case of a failover you have to provide the mysql.inventory table that the agent expects.

And here is how you can achieve this:

Install the MEM agent on the node that currently runs MySQL. Start the agent.
Make a table that stores the hostid and uuid for every physical host:
USE mysql;
CREATE TABLE inventory_hostname LIKE inventory;
DROP INDEX `PRIMARY` ON inventory_hostname;
ALTER TABLE inventory_hostname ADD COLUMN (hostname VARCHAR(64));
ALTER TABLE inventory_hostname ADD PRIMARY KEY (hostname,name);
INSERT INTO inventory_hostname SELECT *,@@hostname FROM inventory;
SELECT * FROM inventory_hostname;
The newly created table should look like this:
```
+--------+--------------------------------------+----------+
| name   | value                                | hostname |
+--------+--------------------------------------+----------+
| uuid   | 96936e90-56bd-4eb1-aef3-e708d149a4cb | wclus-1  |
| hostid | mac:{005056a138c10000}               | wclus-1  |
+--------+--------------------------------------+----------+
```
(Notice that the hostid is based on the MAC address in my case. Usually this is the public SSH host key.)
Stop the agent
Empty the inventory table:
TRUNCATE mysql.inventory;
Failover the MySQL instance to the other node.
Install and start the agent on the other node. It will save new values in inventory.
Copy these new values to the inventory_hostname table:
USE mysql;
INSERT INTO inventory_hostname SELECT *,@@hostname FROM inventory;
Both nodes should be visible in MEM dashboard right now.

With every failover we need to populate the inventory table with the host-specific rows. The easiest way (and independent of the operating system and cluster framework) is to define an init-file:
On both nodes create a file named [MySQL Basedir]/mysql_init_HA_MEM.sql with the following statements:
USE mysql;
REPLACE INTO inventory SELECT name,value FROM inventory_hostname WHERE hostname=@@hostname;
On both nodes edit your my.cnf or my.ini file. In the section [mysqld] add the following line:
init-file=[MySQL Basedir]/mysql_init_HA_MEM.sql
(If you already have an init-file defined you can add the commands "USE mysql; REPLACE..." to your init-file.)
Start all agents and try to fail over the MySQL instance. Check that your init-file really modifies the inventory table.
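To see what the init-file does on each failover, here is a small simulation in Python. It uses an in-memory SQLite database as a stand-in for the mysql schema, passes the hostname as a parameter instead of @@hostname, and uses placeholder IDs. The node name wclus-2 and the helpers `make_demo_db` and `restore_inventory` are assumptions for illustration only.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory stand-in for inventory and inventory_hostname."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE inventory (name TEXT PRIMARY KEY, value TEXT);
        CREATE TABLE inventory_hostname (
            name TEXT, value TEXT, hostname TEXT,
            PRIMARY KEY (hostname, name)
        );
        """
    )
    # Placeholder IDs; the real values are generated by the agent on each node.
    conn.executemany(
        "INSERT INTO inventory_hostname VALUES (?, ?, ?)",
        [
            ("uuid", "uuid-node-1", "wclus-1"),
            ("hostid", "hostid-node-1", "wclus-1"),
            ("uuid", "uuid-node-2", "wclus-2"),
            ("hostid", "hostid-node-2", "wclus-2"),
        ],
    )
    return conn


def restore_inventory(conn, hostname):
    """Mimic the init-file: copy the rows of this host into inventory."""
    conn.execute(
        "REPLACE INTO inventory (name, value) "
        "SELECT name, value FROM inventory_hostname WHERE hostname = ?",
        (hostname,),
    )
    conn.commit()
    return dict(conn.execute("SELECT name, value FROM inventory"))
```

Calling `restore_inventory` first for wclus-1 and then for wclus-2 shows that inventory always ends up with exactly the two rows of the node that is currently active, which is what the agent on that node expects to find.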

Installing MySQL in Solaris 10 zones / containers

2009-05-14T09:09:00.000+01:00

Now that installing MySQL in Solaris zones is even officially supported by the MySQL support group (see http://www.mysql.com/about/legal/supportpolicies/policies-06.html#q03), the question is: What is the right way of installing MySQL in a zone? Of course this depends on what you want to achieve. The following description is based on Solaris 10. On OpenSolaris this is different (somewhat easier, as there are no more sparse root zones).

If you run a local zone as a whole root zone, you can easily install MySQL from tarball or the package installer.

If you run a local zone as a sparse root zone, there are different options:

First you cannot use the package installer, as this procedure will copy binaries to /usr/bin. But /usr/bin is inherited from the global zone and write protected. You have to use the tarball installation.

1. Make /usr/local/mysql writable
The tarball will install in /usr/local/mysql. You can create a symbolic link in your GLOBAL ZONE:
> ln -s /localsoftware/mysql /usr/local/mysql
This link points to a directory that is not inherited. So in every zone /usr/local/mysql will point to a dedicated directory with write permission. You untar the software in the zone in /localsoftware.

2. Make /usr/local writable
If you do not want software from /usr/local in the global zone to be available in the local zones, you can apply solution #1 to /usr/local itself. In every zone, /usr is inherited and write protected, but /usr/local points to a directory that is writable. This is my personal favorite:

In the global zone:

> mv /usr/local /LOCAL
> ln -s /LOCAL /usr/local
and in the local zone
> mkdir /LOCAL
> cd /usr/local
> gtar -xzf /anywhere/mysql.tgz

3. Install in different location
You can untar the MySQL tarball in any other location that is not inherited from the global zone. Maybe /opt/ ? Check with mount which directories are loopback mounted from the global zone with the read-only flag.

4. Install MySQL globally for all zones
You can untar MySQL in the global zone's /usr/local. Then every local zone has a MySQL installation as well. But then it is important that all write access in MySQL goes to a writable directory, like /var/lib/mysql. Otherwise MySQL in the local zone will stop, because it cannot write its logfile/errorfile/datafiles in /usr/local/mysql/data.
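Before you decide on a location, it can help to check from inside the local zone which candidate directories are actually writable. A minimal sketch in Python; `writable_dirs` is a hypothetical helper and the candidate paths are only examples taken from the text.

```python
import os


def writable_dirs(paths):
    """Return those paths that exist as directories and are writable for the current user."""
    return [p for p in paths if os.path.isdir(p) and os.access(p, os.W_OK)]


# Example candidates from the options above; adjust to your own zone layout.
CANDIDATES = ["/usr/local", "/usr/local/mysql", "/opt", "/var/lib/mysql"]
```

Run `writable_dirs(CANDIDATES)` as the user that will own the MySQL installation. Directories that are inherited read-only from the global zone should be missing from the result.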