Saturday, May 4, 2019

Finding Tables without Primary Key

Having a primary key defined for each user table is best practice for performance in InnoDB. And when using Group Replication or InnoDB Cluster for automatic high availability it is (almost) mandatory. (The full story see here.) So it is wise to check if you have tables running without primary key. You can identify these tables by running:
SELECT t.table_schema, t.table_name FROM tables AS t 
       LEFT JOIN key_column_usage AS c  
       ON (t.table_name = c.table_name AND 
           c.constraint_schema = t.table_schema AND 
           c.constraint_name = 'PRIMARY' ) 
   WHERE t.table_schema NOT IN ("mysql", "information_schema", 
                                "performance_schema", "sys") 
         AND c.constraint_name IS NULL
         AND t.table_type = "BASE TABLE"; 

And if you want to make life even easier, you can add this as a report to the sys schema:
CREATE VIEW sys.schema_tables_without_pk AS
SELECT t.table_schema, t.table_name FROM tables AS t 
       LEFT JOIN key_column_usage AS c  
       ON (t.table_name = c.table_name AND 
           c.constraint_schema = t.table_schema AND 
           c.constraint_name = 'PRIMARY' ) 
   WHERE t.table_schema NOT IN ("mysql", "information_schema", 
                                "performance_schema", "sys") 
         AND c.constraint_name IS NULL
         AND t.table_type = "BASE TABLE"; 

It is easy to detect but a little more challenging to solve. You need to consider your application and potential load when adding a new primary key. One solution is to add a new auto_increment column. In many cases this might help already. In other cases you already have a natural primary key in your table definition, It's just not defined as such.

To add auto_increment columns to all affected tables (which I do not recommend without thinking about it and testing first!), you can use the beauty of the Python mode in MySQL Shell:

$ mysqlsh root@localhost:33060 --py
MySQL Shell 8.0.16

Copyright (c) 2016, 2019, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
Creating a session to 'root@localhost:39010'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 762 (X protocol)
Server version: 8.0.16-commercial MySQL Enterprise Server - Commercial
No default schema selected; type \use <schema> to set one.

MySQL  localhost:33060+ ssl  Py >l=session.get_schema("sys").get_table("schema_tables_without_pk").select().execute().fetch_all()
MySQL  localhost:33060+ ssl  Py >for val in l: session.sql("ALTER TABLE "+val[0]+"."+val[1]+" ADD COLUMN (__id int unsigned auto_increment PRIMARY KEY)");

Saturday, March 9, 2019

TTL - Perfect Accuracy by using an insertable VIEW

One more comment regarding TTL in MySQL:
If you are looking for perfect accuracy and never want to access rows that are older than the defined TTL you can hide the table t (from my previous post) behind a view. This view will automatically select only rows within TTL lifespan:
CREATE VIEW ttl as SELECT id, content, created_at FROM t 
       WHERE created_at >= NOW() - INTERVAL 10 SECOND;
This view is insertable, so you can fully use this view and you are not distracted by the additional column "bucket".
INSERT INTO ttl VALUES (NULL, "This is a test", NULL);
You could even exclude column "created_at" from the view definition, if there was not bug #94550. 'created_at' could be fully handled internally.
This view does not affect performance much. In my simple test it did not show any affect. Just better usability and better accuracy of TTL.

Limitations

You cannot use foreign keys with your ttl'ed table and view. This is because partitioning and foreign keys are mutually exclusive. If you need foreign keys go with the simple delete event procedure and forget about the view.
Due to bug #94550 you have to set explicit_defaults_for_timestamp to OFF and you always have to insert NULL into column 'created_at'.
In this whole setup the TTL is mentioned in four locations: In the partitioning definition, in the definition of the generated column 'bucket', in the cleaning event procedure and in the WHERE clause of the view. This makes it easier to screw up the setup. Make sure you use the same value everywhere. Same applies for the number of partitions in the table definition as well as the cleaning event procedure.

Friday, March 8, 2019

TTL - Time-to-Live in MySQL

A customer recently asked for a TTL feature in MySQL. The idea is to automatically delete rows from a certain table after a defined lifespan, e.g. 60 seconds. This feature is common in many NoSQL databases, but it is not available in MySQL. However MySQL offers all you need to implement this. And due to partitioning much more efficient than only deleting rows. Let's test it.

tl;dr

Partition the table and truncate partitions in a regular event procedure, that does the trick and comes at a fraction of the cost of regularly deleting rows.

The test case

The table needs a column to keep track of row age. This can be either a "created_at" column or an "expires_at" column. ("expires_at" has the additional advantage that each row can have an individual lifespan. Not possible in many NoSQL solutions.)
So my table is
CREATE TABLE `t` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `content` varchar(42) DEFAULT NULL,
);
I tested two variants to implement a 10 seconds TTL on my table "t":

The simple solution

Run an event every 10 seconds to delete rows that have been created more than 10s ago.
DELIMITER |
CREATE EVENT ttl_delete
        ON SCHEDULE EVERY 10 SECOND STARTS '2019-03-04 16:00:00' DISABLE
        DO BEGIN
                DELETE FROM t WHERE created_at < NOW() - INTERVAL 10 SECOND;
        END |
DELIMITER ;
And index on "created_at" might improve performance for the DELETE job. But in any case it is quite expensive to scan the table and remove roughly 50% of the rows of a table, at least if the INSERT rate is high.

The efficient solution

Instead of DELETing we can use the much faster TRUNCATE operation. Obviously we do not want to TRUNCATE the whole table but if we distribute the inserted rows into partitions it is safe to truncate any partition that contains outdated rows. Let's define three partitions (or buckets): One that is currently being written to, one that holds rows of the last 10 seconds and one partition that can be truncated because the rows are older than 10 seconds. Key is to calculate the bucket from the current time. This can be done with the expression FLOOR(TO_SECONDS(NOW()/10)) % 3, or more generic FLOOR(TO_SECONDS(NOW()/ttl))% number_of_buckets 
Now we can partition the table by this expression. For that we add a generated column to calculate the bucket from the column "created_at" and partition the table by column "bucket". The table now looks like this:
CREATE TABLE `t` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `content` varchar(42) DEFAULT NULL,
  `bucket` tinyint(4) GENERATED ALWAYS AS     

       (floor(TO_SECONDS(`created_at`) / 10) % 3) STORED NOT NULL,
  PRIMARY KEY (`id`,`bucket`)
)  PARTITION BY LIST (`bucket`)
(PARTITION p0 VALUES IN (0),
 PARTITION p1 VALUES IN (1),
 PARTITION p2 VALUES IN (2));

And the event procedure is like this:
DELIMITER |
CREATE EVENT ttl_truncate
    ON SCHEDULE EVERY 10 SECOND STARTS '2019-03-04 16:00:00' DISABLE
    DO BEGIN
            CASE FLOOR(TO_SECONDS(NOW())/10)%3
            WHEN 0 THEN ALTER TABLE test.t TRUNCATE PARTITION p1;
            WHEN 1 THEN ALTER TABLE test.t TRUNCATE PARTITION p2;
            WHEN 2 THEN ALTER TABLE test.t TRUNCATE PARTITION p0;
            END CASE;
        END|
DELIMITER ;

Watching the rows come and go

To verify that the procedure works as expected I created a small monitor procedure that displays each second the number of rows per partition. Then it is easy to follow in which partition data is currently added and when a partition gets truncated.
DELIMITER |
CREATE PROCEDURE monitor()
BEGIN  
  WHILE 1=1 DO
   SELECT "p0" AS "part", count(*) FROM t PARTITION (p0)
           UNION SELECT "p1", count(*) FROM t PARTITION (p1)
           UNION SELECT "p2", count(*) FROM t PARTITION (p2);

   SELECT now() AS "NOW", floor(to_seconds(now())/10)%3 AS "Bucket";
   SELECT sleep(1);
  END WHILE;
END|
DELIMITER ;

This procedure is not ideal. Too many count(*) will create quite some locking. But it is accurate. The alternative is to read data from INFORMATION_SCHEMA.partitions, but this does not give the exact row count, which I needed for verification. 

Increasing Accuracy

If TTL is 10 seconds, deleting or truncating every 10 seconds means you have at least 10 seconds of rows available. In reality you will have 10 to 20 seconds worth of data, so on average 15 seconds (assuming a constant INSERT rate). If you run the cleaner job more often (say once per second) the average number of rows is 10.5 seconds worth of data. This comes at the cost of running the cleaning event more often. But it might be very beneficial to increase this accuracy because all other queries benefit from less data to operate on and less memory consumed by expired rows.
If you go with the simple solution of a regular DELETE event, it is sufficient to schedule the event more often.
If you prefer the TRUNCATE PARTITION option, it is necessary to increase the number of partitions or buckets to 12 (= 2 + TTL / how often to run the cleaning job).
The expression for the calculated bucket column will be
`bucket` tinyint(4) GENERATED ALWAYS AS     
       (floor(TO_SECONDS(`created_at`) / TTL) % #buckets) STORED NOT NULL
and the partitioning needs to be adapted as well.
And the CASE construct in the cleaner event must be extended for each newly existing bucket/partition:
        WHEN n THEN ALTER TABLE test.t TRUNCATE PARTITION p(n+1);

What happens if...

... the event stops?
Then you keep all your rows which will likely create some follow-up problems. As always: Proper monitoring is key. Think about MySQL Enterprise Monitor for example.

... the event procedure runs at inaccurate timing due to overall system load?
No big problem. It will never run too early. So it will never remove rows too early. If it runs too late it will clean rows too late so you have more garbage in your table which might affect other queries. The real TTL is increased if this happens.

Performance Considerations

 By no means I am able to run proper performance tests. I am running on a Win10 laptop, VirtualBox with Oracle Linux and MySQL runs inside a Docker container. So plenty of reason to achieve bad numbers. But to compare the two implementations it should be sufficient.
I have extended the cleaner events to report the time needed to execute the event procedure. Here the example of the simple cleaner job:
CREATE EVENT ttl_delete
  ON SCHEDULE EVERY 10 SECOND STARTS '2019-03-04 16:00:00' DISABLE
  DO BEGIN
    DECLARE t1,t2 TIME(6);
    SET t1=current_time(6);

    DELETE FROM t WHERE created_at < NOW() - INTERVAL 10 SECOND;

    SET t2=current_time(6);
    INSERT INTO ttl_report VALUES ("DELETE simple", now(),
                                   timediff(t2,t1));

  END|
DELIMITER ;

The load was generated by mysqlslap, which only inserted rows in the table. Each test run starts the respective cleaner event, runs the mysqlslap load and then stops the cleaner event.
mysql -h 127.0.0.1 -uroot -pXXX -e \
        "USE test; ALTER event ttl_delete ENABLE;"
 

mysqlslap -h 127.0.0.1 -uroot -pXXX --create-schema=test \   
   --concurrency=5 --iterations=20 --number-of-queries=10000 
   --query="INSERT INTO test.t (created_at, content) VALUES 
            (NULL,md5(id));"
 

mysql -h 127.0.0.1 -uroot -pXXX -e \
    "USE test; ALTER event ttl_delete DISABLE;"

mysql -h 127.0.0.1 -uroot -pXXX -e \

      "USE test; ALTER event ttl_truncate ENABLE;"
 

mysqlslap -h 127.0.0.1 -uroot -pXXX --create-schema=test 
   --concurrency=5 --iterations=20 --number-of-queries=10000 
   --query="INSERT INTO test.t (created_at, content) VALUES 
           (NULL,md5(id));"
mysql -h 127.0.0.1 -uroot -pXXX -e \

    "USE test; ALTER event ttl_truncate DISABLE;"

The results are clearly in favor of truncating partitions. And the difference should be even higher the higher the INSERT rate gets. My poor setup achieved only less than 1000 inserts per second... 

select who, avg(how_long) from ttl_report GROUP BY who;
+---------------+--------------------+
| who           |avg(how_long)       |
+---------------+--------------------+
| DELETE simple | 1.1980474444444444 |

| truncate      | 0.0400163333333333 |
+---------------+--------------------+
3 rows in set (0.0014 sec)

 

Side note

You might wonder why my test load is
  INSERT INTO test.t (created_at, content) VALUES (NULL,'foo');"
Why do I mention the column "created_at" but then store NULL to give it the default of current_timestamp? If I omit the created_at column in this INSERT statement I get an error from the generated column due to bug #94550. Setting explicit_defaults_for_timestamp to OFF and then mentioning the timestamp column during INSERT is a workaround.