Tuesday, March 24, 2015

Profiling Stored Procedures in MySQL 5.7

With the changes to performance_schema in MySQL 5.7 Development Milestone Release it is now possible to analyze and profile the execution of stored programs. This is highly useful if you develop more complex stored procedures and try to find the bottlenecks. The "old" performance_schema up to MySQL 5.6 only reported a CALL statement with a runtime, but no information on statements that were executed WITHIN the stored procedure. Now let's try this in the latest MySQL 5.7.6 DMR release. After creating some test table and a test stored procedure we need to activate the events_statements_history_long consumer, which is OFF by default:

mysql> UPDATE setup_consumers SET ENABLED="YES" 
           WHERE NAME = "events_statements_history_long"; 

Then let's call the stored procedure that we want to inspect:

mysql> CALL test.massinsert(400,405); 

To avoid that we overwrite data from the events_statements_history_long table with the following queries, let's deactivate that consumer ASAP. If you have some concurrent load running on your system, it may be wise to leverage the filter options in performance_schema like setup_actors and/or setup_objects.

mysql> UPDATE setup_consumers SET ENABLED="NO" 
          WHERE NAME = "events_statements_history_long"; 

Next step is to find our CALL statement in the events_statements_history_long table:

mysql> SELECT event_id,sql_text, 
              CONCAT(TIMER_WAIT/1000000000,"ms") AS time 
                 FROM events_statements_history_long 
       WHERE event_name="statement/sql/call_procedure"; +----------+-------------------------------+-----------+ 
| event_id | sql_text                      | time      | +----------+-------------------------------+-----------+ 
| 144      | call massinsert(100,105)      | 0.2090ms  |
| 150      | call massinsert(100,105)      | 79.9659ms | 
| 421      | CALL test.massinsert(400,405) | 74.2078ms | +----------+-------------------------------+-----------+ 
3 rows in set (0,03 sec) 

You see: I tried this stored procedure three times. The one I want to inspect in detail is event_id 421. Let's look at all nested statement events that came from 421:

 mysql> SELECT EVENT_NAME, SQL_TEXT,   
        CONCAT(TIMER_WAIT/1000000000,"ms") AS time   
     FROM events_statements_history_long   
     WHERE nesting_event_id=421 ORDER BY event_id;   
 +--------------------------+-----------------------------------+-----------+   
 | EVENT_NAME               | SQL_TEXT                          | time      |  
 +--------------------------+-----------------------------------+-----------+  
 | statement/sp/stmt        | SET @i = first                    | 0.0253ms  |   
 | statement/sp/stmt        | SET @i = @i + 1                   | 0.0155ms  |   
 | statement/sp/stmt        | INSERT INTO a VALUES (@i,MD5(@i)) | 45.6425ms |   
 | statement/sp/jump_if_not | NULL                              | 0.0311ms  |   
 | statement/sp/stmt        | SET @i = @i + 1                   | 0.0297ms  |   
 | statement/sp/stmt        | INSERT INTO a VALUES (@i,MD5(@i)) | 4.9695ms  |   
 | statement/sp/jump_if_not | NULL                              | 0.0726ms  |   
 | statement/sp/stmt        | SET @i = @i + 1                   | 0.0365ms  |   
 | statement/sp/stmt        | INSERT INTO a VALUES (@i,MD5(@i)) | 6.8518ms  |   
 | statement/sp/jump_if_not | NULL                              | 0.0343ms  |   
 | statement/sp/stmt        | SET @i = @i + 1                   | 0.0316ms  |   
 | statement/sp/stmt        | INSERT INTO a VALUES (@i,MD5(@i)) | 9.9633ms  |   
 | statement/sp/jump_if_not | NULL                              | 0.0309ms  |   
 | statement/sp/stmt        | SET @i = @i + 1                   | 0.0274ms  |   
 | statement/sp/stmt        | INSERT INTO a VALUES (@i,MD5(@i)) | 5.6235ms  |   
 | statement/sp/jump_if_not | NULL                              | 0.0308ms  |
 +--------------------------+-----------------------------------+-----------+  
 16 rows in set (0,06 sec)   
 
Now we have the statements that were executed in the stored procedure "massinsert(400,405)" with their individual execution times and in order of execution. We have all other information available as well, not only execution time. We can access number of rows affected, sql error text, used algorithms, ... All information that performance_schema offers for statement events. This is a great way to analyze your stored procedures. find the most costly statements and improve performance of your stored programs. That is really a great enhancement to performance_schema.

Friday, March 13, 2015

Auto Generated Columns in MySQL 5.7: Two Indexes on one Column made easy

One of my customers wants to search for names in a table. But sometimes the search is case insensitive, next time search should be done case sensitive. The index on that column always is created with the collation of the column. And if you search with a different collation in mind, you end up with a full table scan. Here is an example:

The problem

 mysql> SHOW CREATE TABLE City\G  
 *************************** 1. row ***************************  
 Table: City  
 Create Table: CREATE TABLE `City` (  
 `ID` int(11) NOT NULL AUTO_INCREMENT,  
 `Name` char(35) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,  
 `CountryCode` char(3) NOT NULL DEFAULT '',  
 `District` char(20) NOT NULL DEFAULT '',  
 `Population` int(11) NOT NULL DEFAULT '0',  
 PRIMARY KEY (`ID`),  
 KEY `CountryCode` (`CountryCode`),  
 KEY `Name` (`Name`),  
 ) ENGINE=InnoDB AUTO_INCREMENT=4080 DEFAULT CHARSET=latin1  
 1 row in set (0,00 sec)  

The collation of the column `Name` is utf8_bin, so case sensitive. Let's search for a City:
 mysql> SELECT Name,Population FROM City WHERE Name='berlin';  
 Empty set (0,00 sec)  
 mysql> EXPLAIN SELECT Name,Population FROM City WHERE Name='berlin';  
 +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+  
 | id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |  
 +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+  
 | 1  | SIMPLE      | City  | NULL       | ref  | Name          | Name | 106     | const |  1   |  100.00  | NULL  |  
 +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+  
 1 row in set, 1 warning (0,00 sec)  


Very efficient statement, using the index. But unfortunately it did not find the row as the search is based on the case sensitive collation.
Now let's change the collation for the WHERE clause:
 mysql> SELECT Name,Population FROM City WHERE Name='berlin' COLLATE utf8_general_ci;  
 +--------+------------+  
 | Name   | Population |  
 +--------+------------+  
 | Berlin |    3386667 |  
 +--------+------------+  
 1 row in set (0,00 sec)  
 mysql> EXPLAIN SELECT Name,Population FROM City WHERE Name='berlin' COLLATE utf8_general_ci;  
 +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+  
 | id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |  
 +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+  
 | 1  | SIMPLE      | City  | NULL       | ALL  | Name          | NULL | NULL    | NULL | 4108 |  10.00   | Using where |  
 +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+  
 1 row in set, 3 warnings (0,00 sec)  


The result is what we wanted but the query creates a full table scan. Not good. BTW: The warnings point you to the fact that the index could not be used.

The solution

Now let's see how auto generated columns in the new MySQL 5.7 Development Milestone Release can help us. First let's create a copy of the Name column but with a different collation:
 mysql> ALTER TABLE City ADD COLUMN Name_ci char(35) CHARACTER SET utf8 AS (Name) STORED;  
 Query OK, 4079 rows affected (0,50 sec)  
 Records: 4079 Duplicates: 0 Warnings: 0  

"AS (Name) STORED" is the new stuff: In the brackets is the expression to calculate the column value. Here it is a simple copy of the Name column. The keyword STORED means that the data is physically stored and not calculated on the fly. This is necessary to create the index now:
 mysql> ALTER TABLE City ADD INDEX (Name_ci);  
 Query OK, 0 rows affected (0,13 sec)  
 Records: 0 Duplicates: 0 Warnings: 0  

As utf8_general_ci is the default collation with utf8, there is no need to specify this with the new column. Now let's see how to search:


mysql> SELECT Name, Population FROM City WHERE Name_ci='berlin';
+--------+------------+
| Name   | Population |
+--------+------------+
| Berlin |    3386667 |
+--------+------------+
1 row in set (0,00 sec)
mysql> EXPLAIN SELECT Name, Population FROM City WHERE Name_ci='berlin';nbsp;
+----+-------------+-------+------------+------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+---------+---------+-------+------+----------+-------+ 
|  1 | SIMPLE      | City  | NULL       | ref  | Name_ci       | Name_ci | 106     | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+---------+---------+-------+------+----------+-------+ 
1 row in set, 1 warning (0,00 sec)  

Now we can search case sensitive (...WHERE Name=...) and case insensitive (WHERE Name_ci=...) and leverage indexes in both cases.

tl;dr

Use auto generated columns in MySQL 5.7 to create an additional index with a different collation. Now you can search based on different indexes.