I have two ideas how to implement a schema validation for JSON columns. The first one is by leveraging generated columns together with a foreign key. The second idea is by implementing a trigger. Today I want to focus on the generated columns and foreign keys.
When defining foreign keys with generated columns there are two limitations we need to be aware of:
- Foreign keys require indexes. JSON columns cannot be indexed. We need to leverage other types.
- Only STORED generated columns are supported for foreign keys.
CREATE TABLE `people` (
`name` varchar(30) NOT NULL,
`firstname` varchar(30) DEFAULT NULL,
`birthdate` date DEFAULT NULL,
`phones` json DEFAULT NULL,
`phonekeys` varchar(30) GENERATED ALWAYS AS (json_keys(phones)) STORED,
KEY `phonekeys` (`phonekeys`));
The generated column phonekeys is a string that includes the types of phone numbers for each row. Some sample data:
mysql> INSERT INTO people (name,firstname,birthdate,phones)
VALUES ("Plumber", "Joe, the", "1972-05-05",'{"work": "+1(555)24680"}');
Query OK, 1 row affected (0.00 sec)
...some more inserts...
mysql> SELECT * FROM people;
+---------+-----------+------------+--------------------------------------------------------+-----------------------+
| name | firstname | birthdate | phones | phonekeys |
+---------+-----------+------------+--------------------------------------------------------+-----------------------+
| Doe | John | 1995-04-17 | {"mobile": "+491715555555", "private": "+49305555555"} | ["mobile", "private"] |
| Dian | Mary | 1963-12-12 | {"work": "+43987654321"} | ["work"] |
| Plumber | Joe, the | 1972-05-05 | {"work": "+1(555)24680"} | ["work"] |
+---------+-----------+------------+--------------------------------------------------------+-----------------------+
3 rows in set (0.00 sec)
The column phonekeys gets populated automatically.
To check that we use "correct" attributes in our JSON object we can now create a table that contains the valid JSON keys:
CREATE TABLE `valid_keys` (
`keylist` varchar(30) NOT NULL,
PRIMARY KEY (`keylist`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+------------+--------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
... after some inserts...
mysql> SELECT * FROM valid_keys;
+-------------------------------+
| keylist |
+-------------------------------+
| ["mobile", "private", "work"] |
| ["mobile", "private"] |
| ["work"] |
+-------------------------------+
3 rows in set (0.00 sec)
Now we can define a foreign key with the people table as a child table:
mysql> alter table people add foreign key (phonekeys) references valid_keys (keylist);
That should enforce that inserted JSON documents in the people table must have a list of attributes that matches any entry in the valid_keys table. Let's try:
mysql> INSERT INTO people (name,phones) VALUES ("me", JSON_OBJECT("work","12243"));
Query OK, 1 row affected (0.01 sec)
mysql> INSERT INTO people (name,phones) VALUES ("my friend", JSON_OBJECT("home","12243"));
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`mario`.`people`, CONSTRAINT `people_ibfk_1` FOREIGN KEY (`phonekeys`) REFERENCES `valid_keys` (`keylist`))
mysql>
Works fine. "home" is not an allowed attribute. I can leverage the foreign keys to make sure my phone numbers match a certain attribute list. However it is not perfectly simple to use. With five different allowed attributes in an arbitrary order you would have to add all permutations to the valid_keys table. With five attributes you end up with 6! permutations ("not defining an attribute" is also an option, hence six), which results in 720 rows for valid_keys. But it is a first start. For more complex explamples the ideas with triggers might be more favorable.
No comments:
Post a Comment