For a very long time, the EXPLAIN command remained the same in the MariaDB and MySQL world. It was good enough in most cases, but the community welcomed the implementation of extensions and variants. However, MariaDB and MySQL improved it in different ways. So, if you’re familiar with one of these two databases, and you happen to use the other, you might not be aware of which options you have. And if you come from a totally different technology, you might not know anything about MariaDB and MySQL EXPLAIN. This article is a quick summary.
Basic EXPLAIN
The common implementation of EXPLAIN that we had for centuries shows a table with basic information about how a table is accessed, which indexes could have been used, which are actually used, and extra information. Notes:
- The
Extra
column shouldn’t be overlooked, as it can contain some very important information: for example, it tells us whether the results are materialised for a two-step ordering. - In a
JOIN
, the order of rows should represent the order in which tables are read. But to be sure, check theref
column. - Sometimes
EXPLAIN
generates warnings or errors that are actually generated by the query itself, not byEXPLAIN
.
MySQL allows to specify a FOR DATABASE
clause. If the query doesn’t explicitly mention a database name, the tables will be assumed to be in the database mentioned in FOR SCHEMA
. This might look like a very minor usability enhancement, but it’s very useful when developing tools that consume the slow log and test the queries. Without this feature, you’d have to modify the query itself, which is not simple if you want to take account of all the syntax edge cases.
Query execution vs approximation
Normally, EXPLAIN
runs a query without executing it. This means that some information, like the count of read rows and filtered rows, is approximated, based on the index / table / column statistics. These statistics are usually accurate enough, but since they’re based on random dives into data and indexes, they are occasionally wrong. Under some circumstances, they can also become obsolete over time.
But it’s also possible to execute the query and obtain accurate numbers. To do this, we can run ANALYZE
with MariaDB, or EXPLAIN ANALYZE
with MySQL.
Very old versions of MySQL (up to 5.5, I believe?) used to run subqueries contained in EXPLAIN
. This was a problem for slow subqueries that were executed many times.
Explaining a running query
Some queries are occasionally slow. This means that the optimiser sometimes chooses a good plan, but sometimes it doesn’t. This can depend on at least two factors:
- The query parameters. For example,
WHERE date < '2020-01-01'
might be much slower thanWHERE date < '2019-12-01'
. A possible reason is that it chooses to use a full table scan even if it should use an index, or the other way around. - But it’s also possible that the same query, with the same parameters, is slow on some situations and fast in others. This might happen because the data change often, and for some reason the statistics are not changed frequently enough.
As a consequence, it might be useless to run the EXPLAIN
at a later time. The plan it shows might be different from the one the query actually followed.
To solve this problem:
- MariaDB first implemented
SHOW EXPLAIN FOR <connection_id>
; - MySQL implemented
EXPLAIN ... FOR CONNECTION <connection_id>
to create an incompatibility, but this syntax was later added by MariaDB, too.
In both cases, the connection id is the same you can get from SHOW PROCESSLIST
or various system tables.
Partitions
In MariaDB, and up to MySQL 5.7, a PARTITIONS
option was supported. It added a column with information about which partitions were read.
In MySQL 8 this column is always included in the result set.
Query rewriting
In MariaDB, and up to MySQL 5.7, an EXTENDED
option was supported. It had the following effects:
- A
filtered
column is added. This column shows an estimated percentage of the rows that are filtered by theWHERE
condition. - A warning is produced. This warning contains the query internally rewritten by MariaDB or MySQL. I rarely use it to check if MariaDB applied a subquery optimisation, and probably for no other reason.
In MySQL 8 the filtered
column and the warning are always generated.
An example, with MariaDB:
> EXPLAIN EXTENDED SELECT * FROM library.author a LEFT JOIN library.book b ON a.id = b.author_id WHERE b.id IS NULL;
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------------------+
| 1 | SIMPLE | a | ALL | NULL | NULL | NULL | NULL | 14 | 100.00 | |
| 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 30 | 100.00 | Using where; Not exists; Using join buffer (flat, BNL join) |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+-------------------------------------------------------------+
> SHOW WARNINGS \G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: select `library`.`a`.`id` AS `id`,`library`.`a`.`name` AS `name`,`library`.`b`.`id` AS `id`,`library`.`b`.`title` AS `title`,`library`.`b`.`author_id` AS `author_id`,`library`.`b`.`genre_id` AS `genre_id` from `library`.`author` `a` left join `library`.`book` `b` on(`library`.`b`.`author_id` = `library`.`a`.`id`) where `library`.`b`.`id` is null
Alternative formats
As we’ll discuss in a moment, both MariaDB and MySQL support a JSON format, and MySQL supports a Tree format.
With MySQL, rather than specifying the FORMAT
clause for every EXPLAIN
command, you can set the explain_format
system variable.
Alternative formats: JSON
Both MariaDB and MySQL support a JSON format, but the way information is formatted is quite different. MariaDB’s JSON format have properties that match the traditional table format’s column, though it nests JOIN
ed tables, UNION
ed tables and subqueries in an array.
In both cases, however, the JSON format adds information that is not available in the table format. For example:
- Cost estimations;
- Number of scans for a joined table;
- Number of rows per scan;
- Used columns in an index.
A MariaDB example:
{
"query_block": {
"select_id": 1,
"cost": 0.2229144,
"const_condition": "1",
"nested_loop": [
{
"table": {
"table_name": "a",
"access_type": "ALL",
"loops": 1,
"rows": 14,
"cost": 0.0131368,
"filtered": 100
}
},
{
"block-nl-join": {
"table": {
"table_name": "b",
"access_type": "ALL",
"loops": 14,
"rows": 30,
"cost": 0.2097776,
"filtered": 100
},
"buffer_type": "flat",
"buffer_size": "6Kb",
"join_type": "BNL",
"attached_condition": "trigcond(b.author_id = a.`id`)"
}
}
]
}
}
A MySQL example, for the same query:
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "39.50"
},
"nested_loop": [
{
"table": {
"table_name": "a",
"access_type": "ALL",
"rows_examined_per_scan": 13,
"rows_produced_per_join": 13,
"filtered": "100.00",
"cost_info": {
"read_cost": "0.25",
"eval_cost": "1.30",
"prefix_cost": "1.55",
"data_read_per_join": "5K"
},
"used_columns": [
"id",
"name"
]
}
},
{
"table": {
"table_name": "b",
"access_type": "ALL",
"rows_examined_per_scan": 29,
"rows_produced_per_join": 377,
"filtered": "100.00",
"using_join_buffer": "hash join",
"cost_info": {
"read_cost": "0.26",
"eval_cost": "37.70",
"prefix_cost": "39.51",
"data_read_per_join": "300K"
},
"used_columns": [
"id",
"title",
"author_id",
"genre_id"
],
"attached_condition": "<if>(is_not_null_compl(b), (`library`.`b`.`author_id` = `library`.`a`.`id`), true)"
}
}
]
}
}
Alternative formats: Tree
This format is only supported by MySQL. It resembles Oracle’s format. It includes all essential information and it’s easy to read for humans, especially when JOIN
s are involved. An example:
-> Left hash join (b.author_id = a.id) (cost=38.2 rows=377)
-> Table scan on a (cost=1.55 rows=13)
-> Hash
-> Table scan on b (cost=0.243 rows=29)
EXPLAIN in the slow query log
In MariaDB and Percona Server, the slow log can contain the results of EXPLAIN
for slow queries. This is not done by default, as this extra content will make the slow log grow. To log the output of EXPLAIN, setlog_slow_verbosity
to full
, or make sure it is a comma-separated list that contains explain
. See Logging all MariaDB and MySQL queries into the Slow Log for more information.
Conclusions
Both MariaDB and MySQL implemented EXPLAIN extensions, though MySQL made this statement a bit more usable. You can use this page as a reference if you work with both MariaDB and MySQL.
Federico Razzoli
0 Comments