MariaDB released a new Long Term Support version as Generally Available: 11.8. I always review MariaDB LTS versions once they’re GA, so it’s time to write a new review.
Support and timeline
MariaDB 11.8 is the latest Long Term Support version (LTS). The first Generally Available release is 11.8.2, released on 4 June 2025.
This branch will be supported til June 2028, unless the support is extended. The previous LTS branch was 11.4 (see my MariaDB 11.4 review), which will be supported til 29 May 2029. This means that support for LTS versions has been reduced to 3 years.
All versions released between 10.11 and 11.4 are Short Term Support (STS) versions. This means that their End Of Life is set one year after their first GA release. So, while some users prefer to wait some time before upgrading to the latest LTS, you should make sure that you are using a supported version.
- For supported systems, see MariaDB Deprecation Policy.
- For plugins maturity level, you can query the
information_schema.ALL_PLUGINS
table. - MariaDB 10.11 can be used as a replica of older MariaDB versions. However, the use of removed or modified features might cause replication outages or data inconsistencies.
The 11.8 branch includes the enhancements introduced by recent Short Term Support versions: 11.5, 11.6, 11.7, as well as new enhancements not previously published.
References
For this review, I used the following sources of information:
- Various parts of the MariaDB documentation.
- MariaDB JIRA.
- MariaDB Foundation blog.
Key Improvements
If you don’t have the time to go through the whole article, here is a list of improvements that I find particularly interesting:
- Parallel backups with mariadb-dump.
- Limiting temporary files.
slave_abort_blocking_timeout
.- PARSEC auth plugin.
- user_statistics improvements.
CHECK TABLE EXTENDED
to find foreign key inconsistencies.- UUIDv4 and UUIDv7.
- Default values for stored procedures arguments.
11.8 Improvements in Details
Data Types and Related Functions
Strings
All types that contain texts allow us to specify:
- A character set – The set of characters that can be used. For example, latin1 roughly contains characters used by Western Europe alphabets.
- A collation – The rules to determine whether two characters are equal and, if not, which character is greater. For example,
'a'
and'à'
are equal for some collations and not for others. This affectsORDER BY
, comparisons,UNIQUE
indexes, etc.
With MariaDB 11.8, a historical change happened: MariaDB changed the default character set from latin1
to utf8mb4
.
The default collation is now uca1400_ai_ci
. It adheres to latest standards and handles multilingual sorting better.
TIMESTAMP
TIMESTAMP
was extended. The previous max value was '2038-01-19 03:14:07 UTC
, the new value is '2106-02-07 06:28:15 UTC'
. Interestingly, the type size didn’t change: 4 bytes. The reason is that the first byte was previously used for the sign, even if negative values were not allowed.
While physical storage is normally not affected, the max value change is very important for system-versioned tables. For ROW END
columns, the value must be set to the highest possible TIMESTAMP
for current row versions. Upgrading tables to the new format might be slow, depending on the tables size. Applications are not affected: they may not set ROW START
or ROW END
values explicitly.
The UUID
type was introduced in version 10.11. We recommend to use it for primary keys. However, in the early 10.11 versions, it was supposed to only store UUIDv1 values. Since that variant contains timestamp bits but is not chronologically ordered at a physical level, MariaDB changes the order of the bits on disk to make it more efficient for indexes. This is the reason why it wasn’t suitable for versions other than 1.
MDEV-29959, released in 10.11.5, fixed this problem by checking the UUID variant, and only shuffling the bits for versions older than 6.
UUID
and Related Functions
MariaDB 11.8 comes with more functions to generate UUIDs:
UUID()
, to explicitly generate v1 UUIDs.- UUID_v4(), which is unique but predictable. Task: MDEV-11339.
UUID_v7()
, the best version. Task: MDEV-32637.
The best UUID variant to use depends on your requirements. If low latency is important, you might want to go for UUIDv1, unless your instance runs on a CPU with sufficiently fast random number generator and has good entropy. For example, avoid 4 and 7 on small EC2 t2 instances.
All UUID variants take 128 bits. We can have different UUID variants in the same UUID
column, which makes upgrades smoother.
An important UUID bug was fixed: MDEV-35468. It occasionally caused UUIDs to incorrectly fail WHERE
conditions matching when multiple UUID variants were stored in the same column.
GIS Functions
GIS functions can be used with geospatial data types, like POINT
, LINE
, POLYGON
, etc, Several new GIS functions have been introduced. Pasted from the 11.8.0 release notes:
- ST_Validate (MDEV-34137)
- MBRCoveredBy (MDEV-34138)
- ST_Simplify (MDEV-34141)
- ST_GeoHash (MDEV-34158)
- ST_LatFromGeoHash (MDEV-34159)
- ST_LongFromGeoHash (MDEV-34160)
- ST_PointFromGeoHash (MDEV-34277)
- ST_IsValid (MDEV-34276)
- ST_Collect (MDEV-34278)
SQL
Vectors
For many people, vectors are probably the most important feature in this LTS version. The VECTOR data type and vector-similarity search functions enable AI in general, and RAG in particular.
The flow is the following:
- You have data in MariaDB, or in other data sources.
- You generate vectors from this data, using an embedder. The choice of an embedder depends on which LLM plan to use, and some other factors like the GPUs you can count on. You write the resulting embeds into MariaDB.
- You get an input from the user, for example a question in natural language. You use the same embedder to generate a vector.
- You need to find the most similar vector(s) in a MariaDB table. You can use a vector distance function for this.
Actually, the AI you plan to use doesn’t have to be an LLM. For example, you can use time2vec to turn a time series into vectors, and then use this information to find patterns or to forecast future metrics.
MariaDB vectors are faster than pgvector, according to Mark Callaghan’s benchmarks. But there are some caveats here. If we only care about performance, the biggest problem is that MariaDB apparently decided to never implement stored procedures in languages other than SQL. This means that the embedding process must happen outside of MariaDB, normally in another server, even if the original data is in MariaDB. With PostgreSQL, you can do everything in Postgres itself, with incremental batches or (for low-writes workloads) in real-time with triggers.
We will discuss MariaDB vectors in a separate article. Stay tuned!
System-Versioned Tables
System-versioned tables contain versioned rows, with two timestamps that indicate the period of validity of that row. A valid row has a ROW_START
timestamp that indicated when the row was created and a ROW_END
that is the highest legal TIMESTAMP
value. When the row is DELETE
d, the ROW_END
timestamp will be changed to the current timestamp. The same happens when a valid row is UPDATE
d, but in that case a new row is also created.
The ROW_START
and ROW_END
columns might be specified explicitly, in which case we can mention them in our queries. This allows to use plain SQL, and run complex analyses. But they can also be implicit, in which case the only way to see historical data is to use specific SQL syntax that probably no ORM supports.
Before MariaDB 11.8, a system-versioned table created with implicit timestamp columns could not converted to a table using explicit columns. There was no acceptable workaround for this: data could be copied to a new table, but all temporal information would be lost.
Fortunately, MariaDB 11.8 allows us to add explicit ROW_START
and ROW_END
columns to a system-versioned table that doesn’t have them:
> CREATE OR REPLACE TABLE customer_contact (
-> uuid UUID DEFAULT UUID() PRIMARY KEY,
-> user_uuid UUID,
-> contact_type VARCHAR(20) NOT NULL,
-> contact_value VARCHAR(50) NOT NULL
-> )
-> WITH SYSTEM VERSIONING
-> ;
> INSERT INTO customer_contact (user_uuid, contact_type, contact_value)
-> VALUES
-> ('dd1bc9c2-49fb-11f0-a71d-06e7dafdc5e9', 'phone', '12345'),
-> ('dd1bc9c2-49fb-11f0-a71d-06e7dafdc5e9', 'email', 'xyzzy@colossal-cave.com')
-> ;
> SET SESSION system_versioning_alter_history = KEEP;
]> ALTER TABLE customer_contact
-> ADD COLUMN valid_since TIMESTAMP(6) AS ROW START,
-> ADD COLUMN valid_until TIMESTAMP(6) AS ROW END,
-> ADD PERIOD FOR SYSTEM_TIME (valid_since, valid_until)
-> ;
> SET SESSION system_versioning_alter_history = ERROR;
> SELECT * FROM customer_contact \G
*************************** 1. row ***************************
uuid: bc664ce2-49fe-11f0-a71d-06e7dafdc5e9
user_uuid: dd1bc9c2-49fb-11f0-a71d-06e7dafdc5e9
contact_type: phone
contact_value: 12345
valid_since: 2025-06-15 15:38:07.982490
valid_until: 2106-02-07 06:28:15.999999
*************************** 2. row ***************************
uuid: bc664e74-49fe-11f0-a71d-06e7dafdc5e9
user_uuid: dd1bc9c2-49fb-11f0-a71d-06e7dafdc5e9
contact_type: email
contact_value: xyzzy@colossal-cave.com
valid_since: 2025-06-15 15:38:07.982490
valid_until: 2106-02-07 06:28:15.999999
Unfortunately, columns created in this way can’t be invisible, and can’t even be made invisible later. Usually, unexpected columns are not a problem for an application. But if it’s an issue in your case, keep an eye on MDEV-37008.
Task: MDEV-27293.
Sequences
Sequences are not to be confused with the SEQUENCE storage engine. They are simple serieses of numbers that can be extracted one at the time, and they automatically advance each time. They can be used as multi-table id’s, or as more structured alternatives to AUTO_INCREMENT
. My suggestion is to use UUIDs as primary keys for the general case, but sequences have their use cases, including compatibility with various other DBMSs.
Sequences have been extended. We can now specify their datatype. Also, we have a new SEQUENCES
table in the information_schema
database.
> CREATE OR REPLACE SEQUENCE s_tiny AS TINYINT;
Query OK, 0 rows affected (0.022 sec)
MariaDB [test]> CREATE OR REPLACE SEQUENCE s_big_s AS BIGINT;
Query OK, 0 rows affected (0.019 sec)
MariaDB [test]> CREATE OR REPLACE SEQUENCE s_big_u AS BIGINT UNSIGNED;
Query OK, 0 rows affected (0.020 sec)
MariaDB [test]> SELECT SEQUENCE_NAME, MINIMUM_VALUE, MAXIMUM_VALUE
-> FROM information_schema.SEQUENCES;
+---------------+---------------+----------------------+
| SEQUENCE_NAME | MINIMUM_VALUE | MAXIMUM_VALUE |
+---------------+---------------+----------------------+
| s_big_s | 1 | 9223372036854775806 |
| s_big_u | 1 | 18446744073709551614 |
| s_tiny | 1 | 126 |
+---------------+---------------+----------------------+
As you can see, even with signed types the minimum value and the start value are 1 by default. But nothing prevents us from creating a negative or mixed sequence:
> CREATE OR REPLACE SEQUENCE s_nega
AS TINYINT SIGNED
MINVALUE = -100
MAXVALUE = 100
;
> SELECT NEXT VALUE FOR s_nega;
+-----------------------+
| NEXT VALUE FOR s_nega |
+-----------------------+
| -100 |
+-----------------------+
Sequences are implemented as tables. For this reason, ANALYZE TABLE
used to work with a sequence. However, collecting statistics for sequence data is not useful for the query optimiser, so this command now does nothing with views:
> ANALYZE TABLE s_nega \G
*************************** 1. row ***************************
Table: test.s_nega
Op: analyze
Msg_type: note
Msg_text: The storage engine for the table doesn't support analyze
Tasks: MDEV-28152, MDEV-33938.
Correlation List for Derived Tables
A derived table is a subquery in a FROM
or JOIN
clause (see 3 ways to select rows from a query result in SQL).
Derived tables, just like any SELECT
, can set results column names with the AS
syntax. However, the SELECT
list can be long and hard to read. MariaDB 11.8 supports a more readable syntax that is supported by several other DBMSs: correlation lists.
SELECT *
FROM (
SELECT MIN(b), MIN(c)
FROM t
GROUP BY a
) v (b, c)
WHERE v.b IN (SELECT b FROM t2)
;
v
is the table alias, which is mandatory. (b, c)
is the correlation list: a list of columns aliases.
Task: MDEV-31466.
DELETE
with alias
DELETE
queries can now always use table aliases, even when deleting data from a single table. This way of specifying tables was already implemented for UPDATE
. These kinds of inconsistencies are unexpected and force users to spend time on long debugging activities. Syntax examples:
-- AS alias
DELETE FROM user AS u WHERE u.id = 24;
-- AS is optional
DELETE FROM user u WHERE u.id = 24;
Task: MDEV-33988.
Function: FORMAT_BYTES()
The FORMAT_BYTES()
function, previously implemented in MySQL 8.0, has been introduced in MariaDB. It formats a size in bytes as a human-readable string, using the most appropriate unit. Some examples:
> SELECT FORMAT_BYTES(1000), FORMAT_BYTES(1024), FORMAT_BYTES(NULL);
+--------------------+--------------------+--------------------+
| FORMAT_BYTES(1000) | FORMAT_BYTES(1024) | FORMAT_BYTES(NULL) |
+--------------------+--------------------+--------------------+
| 1000 bytes | 1.00 KiB | NULL |
+--------------------+--------------------+--------------------+
> SELECT FORMAT_BYTES(12345678912345678900.11);
+---------------------------------------+
| FORMAT_BYTES(12345678912345678900.11) |
+---------------------------------------+
| 10.71 EiB |
+---------------------------------------+
FORMAT_BYTES()
accepts a DOUBLE argument but, as far as I understand, the decimal part can’t affect the result.
Checking Foreign Key Inconsistencies
Foreign keys inconsistencies should never happen while foreign keys are enabled. However, if you temporarily disable foreign keys and modify data, you might introduce inconsistencies.
CHECK TABLE ... EXTENDED
is now able to find foreign key inconsistencies. Task: MDEV-34309.
Security
In MariaDB, many authentication methods are supported. They are implemented as authentication plugins. To list available authentication plugins:
SELECT * FROM information_schema.PLUGINS WHERE PLUGIN_TYPE = 'AUTHENTICATION';
information_schema.USERS table
information_schema
is a standard schema with metadata about all other schemas. It has row-level security, it can’t be modified and it has (approximately) the same structure for all DBMSs. Metadata should usually be queried from there.
Now information_schema has a USERS table with limited information about MariaDB accounts:
> SELECT * FROM information_schema.USERS WHERE USER LIKE SFORMAT('%@{0}', QUOTE('localhost'));
+---------------------------+-----------------+--------------------------+
| USER | PASSWORD_ERRORS | PASSWORD_EXPIRATION_TIME |
+---------------------------+-----------------+--------------------------+
| 'mariadb.sys'@'localhost' | 0 | 0000-00-00 00:00:00 |
| 'mysql'@'localhost' | NULL | NULL |
| 'root'@'localhost' | NULL | NULL |
+---------------------------+-----------------+--------------------------+
Tasks: MDEV-23729, MDEV-32218.
Authentication Plugin Added: PARSEC
The PARSEC
authentication plugin has been added in 11.8. PARSEC
(Password Authentication using Response Signed with Elliptic Curve) is a secure authentication method that protects access even if an attacker steals a password. This is possible because the protocol allows users and servers to verify each other’s identity without ever transmitting the actual password over the network, using it to generate a private key instead. See Wikipedia’s page.
Task: MDEV-32618.
Authentication Plugin Improved: UNIX_SOCKET
In previous versions, the UNIX_SOCKET
authentication plugin allowed an OS user to connect MariaDB if its OS username and its MariaDB username matched. This delegates authentication to the system, and allows to avoid having a MariaDB password at all. UNIX_SOCKET
is used by default for root
.
In 11.8, UNIX_SOCKET
can still be used in this way, but it also allows to specify an authentication string. When specified, the OS username needs to match the authentication string, rather MariaDB username. This is done in this way:
CREATE USER leonard_nimoy IDENTIFIED VIA UNIX_SOCKET AS 'spock';
Task: MDEV-33479.
SET USER Privilege
Stored procedures and functions, triggers, events, and views have a DEFINER clause that can be used to specify their owner. The owner’s privileges determine what the routine, trigger or view is allowed to do. So, for example, a stored procedure that has root
as its DEFINER
can typically drop a table even if the user who calls it (the invoker) doesn’t have sufficient privileges.
By default, the definer is the user who creates the object. A different definer can be specified:
CREATE DEFINER=reviewer VIEW AS (SELECT * FROM article WHERE state = 'DRAFT');
However, historically one needed the SUPER
privilege to set the DEFINER
clause. This was excessive, because only root
should posses this privilege. Now the SET USER
privilege is sufficient:
GRANT SET USER TO schema_migrator@'app-%.vettabase.com';
Stored Routines
MariaDB supports stored procedures and functions written in its procedural SQL language, or in Oracle’s PL/SQL dialect. See Stored Procedures in MariaDB: Smarter, Easier and More Powerful.
UPDATE
Triggers Columns
We can now define BEFORE
or AFTER UPDATE
triggers on a list of columns. These triggers are only activated when one of the specified columns is updated.
> CREATE OR REPLACE TABLE user (
-> uuid UUID DEFAULT UUID() PRIMARY KEY,
-> username VARCHAR(20) NOT NULL UNIQUE,
-> password VARCHAR(50) NOT NULL DEFAULT NOW(),
-> created_on TIMESTAMP
-> );
Query OK, 0 rows affected (0.020 sec)
> DELIMITER ||
MariaDB [test]> CREATE OR REPLACE TRIGGER bu_user
-> BEFORE UPDATE OF created_on
-> ON user FOR EACH ROW
-> BEGIN
-> SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT='created_on cannot be updated';
-> END ||
Query OK, 0 rows affected (0.013 sec)
> DELIMITER ;
> INSERT INTO user (username, password) VALUES ('kirk', '1701');
Query OK, 1 row affected (0.001 sec)
> UPDATE user SET username = 'j_kirk' WHERE username = 'kirk';
Rows matched: 1 Changed: 1 Warnings: 0
> UPDATE user SET created_on = NOW() WHERE username = 'j_kirk';
ERROR 1644 (45000): created_on cannot be updated
Note that the trigger is activated even when a column is set to the value it already holds (UPDATE ... SET created_on = created_on
).
Task: MDEV-34551.
Note that information_schema
doesn’t expose the column list. See MDEV-36996.
DEFAULT
values
Stored routines can now have DEFAULT
values:
> DELIMITER ||
> CREATE OR REPLACE PROCEDURE p(a INTEGER, b INTEGER DEFAULT 1, c INTEGER DEFAULT 1)
-> DETERMINISTIC
-> BEGIN
-> SELECT a, b, c;
-> END ||
> DELIMITER ;
> CALL p(1);
+------+------+------+
| a | b | c |
+------+------+------+
| 1 | 1 | 1 |
+------+------+------+
> CALL p(1, 2);
+------+------+------+
| a | b | c |
+------+------+------+
| 1 | 2 | 1 |
+------+------+------+
> CALL p(1, 2, 3);
+------+------+------+
| a | b | c |
+------+------+------+
| 1 | 2 | 3 |
+------+------+------+
Optional parameters and arguments must be aligned on the right. In the above example, there is no way to specify the first and third argument but not the second. However, you may want to follow MDEV-35965, which is about passing arguments by name as we can do in Oracle (or Python).
Task: MDEV-12252.
RETURN ROW
You can see the ROW
type as a PostgreSQL composite type, or a C struct. It was already partly implemented, and usable in stored routines bodies.
Now it can also be returned by stored functions:
> DELIMITER ||
> CREATE OR REPLACE FUNCTION pair(a VARCHAR(8), b INTEGER)
-> RETURNS ROW(a VARCHAR(8), b INTEGER)
-> BEGIN
-> RETURN ROW(a, b);
-> END ||
> CREATE OR REPLACE FUNCTION snd(p ROW(a VARCHAR(8), b INTEGER))
-> RETURNS INTEGER
-> BEGIN
-> RETURN p.b;
-> END ||
> DELIMITER ;
> SELECT 'It works!' AS result WHERE pair('k1', 24) = ('k1', 24);
+-----------+
| result |
+-----------+
| It works! |
ROW
s can be returned by functions and be used by other functions or routines. However, keep in mind that its use outside of routines is very limited: it cannot be returned by a SELECT
, and it cannot be assigned to a session variables (the @var
syntax). As far as I understand, it can only be compared to row constructors, as I did in the example above.
Task: MDEV-12252.
A similar syntax was implemented for the Oracle-compatibility mode: TYPE IS RECORD
. Task: MDEV-34317.
InnoDB
InnoDB is the default storage engine in MariaDB. If you don’t know which storage engine you use or don’t even know what storage engines are, you use InnoDB. This is normally a great choice, because InnoDB performance and reliability are excellent. Other storage engines should be used in corner cases, or to leverage particular features.
Some InnoDB improvements are mentioned in the Observability and Backups sections.
innodb_adaptive_hash_index
performance problem fixed (MDEV-35049). Performance used to degrade when hash indexes in the same table were used concurrently. I recommend to disable this feature unless you see evidence that it brings performance benefits in your case. In my experience, it rarely happens.
High Availability: Galera
Galera is the most advanced replication method in MariaDB, though it’s not suitable for all use cases. It’s a synchronous, multi-master replication. It’s implemented as a plugin that comes with MariaDB. Galera was produced by Codership, but the company was acquired by MariaDB in May 2025.
When a new node joins a Galera cluster, it receives a complete backup of the data from a donor node, via a process that is called SST (State Snapshot Transfer). The same process takes place when a node has been unreachable for too much time to be able to only receive incremental changes. An SST can use several methods, depending on MariaDB configuration:
- mariabackup;
- mysqldump;
- rsync.
The last method only involves copying files. But for Mariabackup and mysqldump, a user credentials need to be used.
Before 11.8, these credentials had to be created manually. With 11.8, the credentials are created manually when needed, and deleted after the backup has been restored.
While this technique is overall more secure, I recommend you make sure that no unwanted node can join the cluster. To do this, set wsrep_allowlist
.
High Availability: Replication
Replication refers to the oldest type of built-in replication: asynchronous, master-replica data replication based on the binary log. The binary log is sent to the slave, written as relay log, and then the data changes are applied from the relay log to the data.
Seconds_behind_master
has changed
The historical meaning of Seconds_behind_master is not as straight-forward as it may seem. It depends on many details, like the use of parallel replication and delayed replication.
The meaning is simpler starting from version 11.8. It’s always the difference between two precise timestamps:
- The latest event completely written into the relay log. The master might have completed more transaction, but the replica doesn’t know about them yet.
- The latest write completely applied to the replica’s data.
Task: MDEV-33856.
slave_abort_blocking_timeout
A common problem with replicas is that SELECT
s run manually by data analysts or jobs can block replication. SELECT
s are often considered non-blocking, but there are two notable exceptions:
SELECT
s always acquire a metadata lock, which blocks statements likeALTER TABLE
orDROP TABLE
.SELECT ... IN SHARE MODE
andSELECT ... FOR UPDATE
acquire row locks that last until the end of the current transaction.
On a replica, both these locks types can block the whole replication stream.
To make things worse, Python MariaDB and MySQL connectors use autocommit=0
by default. This means that metadata locks and row locks might remain open for a long time, because a data analyst wasn’t aware that the transaction needs be committed explicitly.
MariaDB 11.8 comes with a solution to this problem: the slave_abort_blocking_timeout
variable.
Task: MDEV-34857.
Observability
user_statistics plugin improvements
The user_statistics plugin provides many useful information about our workloads. Such information is written in information_schema
. The plugin is installed but not enables by default. To enable it, add this line to the configuration file:
userstat = 1
The INDEX_STATISTICS
table has now a new QUERIES
column that indicates how many queries used each index:
> SELECT * FROM information_schema.INDEX_STATISTICS WHERE TABLE_NAME = 't';
+--------------+------------+------------+-----------+---------+
| TABLE_SCHEMA | TABLE_NAME | INDEX_NAME | ROWS_READ | QUERIES |
+--------------+------------+------------+-----------+---------+
| test | t | idx_a_b | 33 | 1 |
+--------------+------------+------------+-----------+---------+
> SELECT a FROM t WHERE a = 2 AND b = 2;
...
> SELECT * FROM information_schema.INDEX_STATISTICS WHERE TABLE_NAME = 't';
+--------------+------------+------------+-----------+---------+
| TABLE_SCHEMA | TABLE_NAME | INDEX_NAME | ROWS_READ | QUERIES |
+--------------+------------+------------+-----------+---------+
| test | t | idx_a_b | 66 | 2 |
+--------------+------------+------------+-----------+---------+
The column counts all queries, not just unique queries. It is (correctly) not incremented by EXPLAIN
and not incremented when an index is considered but not used.
Task: MDEV-33152.
The TABLE_STATISTICS
. USER_STATISTICS
and CLIENT_STATISTICS
tables have new columns called KEY_READ_HITS
and KEY_READ_MISSES
, which help us understand if indexes are not cached enough, and which use cases are affected.
> SELECT TABLE_NAME, KEY_READ_HITS, KEY_READ_MISSES FROM information_schema.TABLE_STATISTICS
-> WHERE TABLE_NAME = 't';
+------------+---------------+-----------------+
| TABLE_NAME | KEY_READ_HITS | KEY_READ_MISSES |
+------------+---------------+-----------------+
| t | 21 | 0 |
+------------+---------------+-----------------+
> SELECT CLIENT, KEY_READ_HITS, KEY_READ_MISSES FROM information_schema.CLIENT_STATISTICS WHERE CLIENT = 'localhost';
+-----------+---------------+-----------------+
| CLIENT | KEY_READ_HITS | KEY_READ_MISSES |
+-----------+---------------+-----------------+
| localhost | 21 | 0 |
+-----------+---------------+-----------------+
> SELECT USER, KEY_READ_HITS, KEY_READ_MISSES FROM information_schema.USER_STATISTICS WHERE USER = 'root';
+------+---------------+-----------------+
| USER | KEY_READ_HITS | KEY_READ_MISSES |
+------+---------------+-----------------+
| root | 21 | 0 |
+------+---------------+-----------------+
Task: MDEV-33151.
InnoDB async reads and writes
InnoDB has aysnchronous IO, but in the previous LTS no information about these operations was exposed. Now we have a set of status variables that can be monitored:
> SHOW GLOBAL STATUS LIKE 'inno%async%';
+------------------------------------+----------+
| Variable_name | Value |
+------------------------------------+----------+
| Innodb_async_reads_pending | 0 |
| Innodb_async_reads_tasks_running | 0 |
| Innodb_async_reads_total_count | 5207 |
| Innodb_async_reads_total_enqueues | 0 |
| Innodb_async_reads_queue_size | 0 |
| Innodb_async_reads_wait_slot_sec | 0.000000 |
| Innodb_async_writes_pending | 0 |
| Innodb_async_writes_tasks_running | 0 |
| Innodb_async_writes_total_count | 268493 |
| Innodb_async_writes_total_enqueues | 0 |
| Innodb_async_writes_queue_size | 0 |
| Innodb_async_writes_wait_slot_sec | 0.000000 |
+------------------------------------+----------+
See the documentation. Task: MDEV-32841.
Performance: Optimiser
The biggest change to the optimiser was implemented with version 11.0, which implemented a cost-based optimiser. Version 11.8, however, comes with several optimiser improvements.
As a general rule, MariaDB and other DBMSs can’t use indexes to find the results of an expression. For example, even if you have an index on date
, DBMSs can’t use an index to optimise WHERE YEAR(date) = 1994
. This is because the optimisers know nothing about functions, and can’t make assumptions about what they will return.
In recent versions, MariaDB implemented several exceptions to this rule. Functions like UPPER()
, DATE()
and YEAR()
can now use indexes. However, it is not reasonable to expect that any developer remembers which functions can use indexes, and starting from which MariaDB version. I recommend to optimise your queries as if no function could use an index.
To optimise queries with functions, we can build indexes on functions. MariaDB doesn’t allow to do this directly, but we can create a generated column (a column based on an SQL expression) and build an index on it. For example, we can create a virtual column on the expression DAYOFMONTH(date)
, and then build an index on this column. With older versions of MariaDB this was possible and useful, but not practical: one had to explicitly mention the generated column in the query. But starting with 11.8, we can simply use the expression. In this example, 11.8 can use an index to optimise WHERE
.DAYOFMONTH
(date) = 1
Index condition pushdown is an optimisation used for WHERE
clauses that can only partly make use of indexes. It simply consists in using the index to check part of the condition and, only where this condition part is satisfied, lookup the table rows. I explained this in Query Optimisation: Using indexes for WHERE with multiple conditions. While this optimisation existed since forever, only with 11.8 it became usable with partitioned tables.
UPDATE
and DELETE
with subqueries can use some of the subquery optimisations that SELECT
s can use. Starting from 11.8, the choice between these optimisation is cost-based.
Performance: General
Temporary Files
MariaDB can create temporary tables or files to store the intermediate results of a query. For example, this happens when an ORDER BY
doesn’t use any index, or when GROUP BY
and ORDER BY
use different columns, requiring a two-steps sorting. When possible, MariaDB creates in-memory temporary tables for this purpose. However, when in-memory tables exceed a certain threshold or they use types not supported by the MEMORY storage engine, MariaDB has to create on-disk temporary Aria tables in the tmpdir. This can occasionally create problems, especially on replicas used by data analysts that don’t have enough disk space to handle the peaks of additional storage caused by certain simultaneous queries. With 11.8, the size of on-disk temporary tables can be limited in two ways:
max_tmp_session_space_usage
is a per-user limit.max_tmp_total_space_usage
is a global limit.
To know when you’re reaching the global limit, monitor these status variables:
tmp_space_used
is the currently used temporary space.max_tmp_space_used
is the maximum peak since the last MariaDB restart.
Task: MDEV-9101.
Other Performance Improvements
A long transaction no longer locks the binary log, locking other transactions. Task: MDEV-32014.
Backups
mariadb-dump is now capable of creating dumps using concurrent parallel threads, similarly to mydumper (which is not fully compatible with MariaDB, tho). mariadb-import can restore such backups. Both these tools will work with parallel dumps if we specify a --dir
option. The specified path will contain subdirectories for each database. Each of them will contain DDLs and tab-separated data. Tasks: MDEV-33625, MDEV-33627.
mariadb-dump now uses --no-autocommit
by default, which makes backups faster.
mariadb-import has a new option, --innodb-optimize-keys
, which makes restores faster.
Removals, incompatibilities and usability enhancements
The following variables were deprecated:
alter_algorithm
binlog_optimize_thread_scheduling
Galera variables:
SPIDER storage engine variables:
OQGRAPH storage engine variables:
What was left out
Some features have been under development for years. They didn’t make it to version 11.8, but we’ll see them at some point in the future.
Catalogs
Many talks have been dedicated to this feature, especially from Monty. It is also already documented.
In practice, catalogs will be the equivalent for other DBMSs’ databases. What MariaDB calls a database is actually a schema. In MariaDB, databases and schemas are synonymous – both in the documentation and in the SQL syntax. While MariaDB terminology is confusing, catalog is actually the term used by SQL standard.
The intended use case for catalogs is to enable multi-tennant applications. But I believe that catalogs will be useful for many other use cases. See MariaDB Catalogs: some use cases.
The main task for catalogs is MDEV-31542.
Foreign Keys
Currently foreign keys are only implemented in the InnoDB and RocksDB storage engines. This causes many limitations, for example:
- Other storage engines cannot have foreign keys.
- Multi-engine foreign keys are not supported, not even between InnoDB and RocksDB.
- The SQL layers and foreign keys can’t communicate enough. So, for examples, foreign keys do not activate triggers, and foreign keys can’t have an
ON DELETE SET DEFAULT
clause.
There is an ongoing effort to move foreign keys implementation to the SQL layer. This will solve the above mentioned limitations. I also see tasks to lift some other limitations, for example allowing foreign keys for partitioned tables and for system-versioned tables.
As far as I can see, there is no parent task about moving foreign keys implementation out of InnoDB.
Federico Razzoli
0 Comments