MariaDB 10.8 was released in May 2022 and I’m writing a review, as I did with MariaDB 10.6. I plan to keep doing this with every major release. Bear in mind that my reviews reflect my personal opinions.
10.8 is a short-term release, which means that the support expires after 1 year. It’s good to experiment with the new features, but shouldn’t be used in production. However, Vettabase supports short-term releases.
Most important changes in MariaDB 10.8
In my opinion, the features I’m describing here are the most important changes in MariaDB 10.8. I’ll discuss them in order of importance.
Lag free ALTER TABLE in replication
Task: MDEV-11675.
With older MariaDB versions and all MySQL versions, replicas can’t start replicating an ALTER TABLE
until it has successfully completed on the master. This means that a long running ALTER TABLE
on the master will normally be a bottleneck for replication, blocking other data changes from being applied until the master completes the operation.
With this change, replicas don’t wait for ALTER TABLE
completion on the master. This is the workflow applied starting from version 10.8:
- The master receives an
ALTER TABLE
statement. - The master logs the statement and starts to run it.
- The statement is received by the replicas, that start to run it without unnecessary delays (unless replication is already lagging for other reasons).
- If
ALTER TABLE
succeeds on the master:- The master logs a commit.
- The replicas that already completed the change will make it effective (applications see the new table definition).
- Other replicas will make it effective as soon as possible.
- If
ALTER TABLE
fails on the master:- The master logs a rollback.
- The replicas that already completed the change will cancel it. Applications will never see the new table definition.
- Other replicas cancel the operation before it completes.
InnoDB changes
Task: MDEV-14425.
The Redo Log format was modified again to reduce write amplification. See the task for detailed explanations.
This follows the changes made in version 10.5: with MDEV-14425 the Redo Log format was made more efficient, and thanks to MDEV-20907 only one (logically circular) Redo Log file exists now.
Task: MDEV-25342.
innodb_buffer_pool_chunk_size
is now allocated dynamically. The automatically set value is innodb_buffer_pool_size / 128
, or 128M, whatever is higher; this value may be increased so that it becomes a multiple of the OS page size (or hugepage size).
The previous default was 128M, to the value has increased for buffer pools bigger than 16G.
Faster Unicode collations
I admit that this part confuses me. The documentation points us to these tasks: MDEV-27266 and MDEV-27265, which are still work in progress. My guess, however, is that part of the code produced as a result of those tasks has been included in 10.8. I think so because I can clearly see improvements in the utf8mb3_unicode_ci
and utf8mb4_unicode_ci
collations. In my trivial, unprofessional tests, a comparison of two short strings shows more than 25% performance improvement:
SELECT BENCHMARK(1000000000, _utf8mb3'qwerty' = _utf8mb3'qwezxc');
10.6: 20.489 sec
10.8: 14.825 sec
SELECT BENCHMARK(1000000000, _utf8mb4'qwerty' = _utf8mb4'qwezxc');
10.6: 20.828 sec
10.8: 14.234 sec
SELECT BENCHMARK(100000000, _utf8mb4'qwerty' SOUNDS LIKE _utf8mb4'qwezxc');
What about the relative unicode collations, the less precise but faster counterparts? I couldn’t see any improvement here.
Note that I didn’t test longer strings. These “benchmarks” are only good for a first impression.
Descending indexes
Tasks: MDEV-13756, MDEV-26938, MDEV-26939, MDEV-26996.
This is a MySQL 8 feature, and MariaDB has finally implemented it. This feature solves the problem with ORDER BY
clauses that mention multiple columns in a different sort order, for example:
SELECT id, publishing_date, title
FROM book
ORDER BY publishing_date DESC, title ASC
;
Optimising a query like this was normally possible with indexed generated columns. For example, in this case a generated column could be defined in this expression: (DATEDIFF({ DATE '2100-01-01' }, publishing_date)
, which is the number of days to the year 2100 (more recent values are smaller). Then an index on (days_to_2100, title)
could be built, and the query could use: ORDER BY days_to_2100, title
.
So, if you really need to optimise such an ORDER BY
, you can do it, even with older versions. But you’ll need to use a bad hack that I wouldn’t like to see in a database I own, to be honest.
Notes:
- I understand that not all possible optimisations are supported yet, but I believe that the most practical case for this feature is the one demonstrated above (a simple
ORDER BY
, different sort directions). - Consider building DESC indexes when you index a column that is normally used in
ORDER BY c DESC
. This will bring a small improvement because of the InnoDB indexes internal structure (pages form a doubly linked list, records don’t).
JSON Histograms
Tasks: MDEV-21130, MDEV-26519.
This feature concerns optimiser statistics. These are the statistics about data distribution in tables, indexes and columns, intended for optimising queries (in which order the tables will be read, which indexes will be used, etc).
MariaDB implemented Histogram-Based statistics in version 10.4. Histogram-based statistics are collected for columns (not indexes) as engine-independent statistics. JSON histograms allow MariaDB to store more precise statistics about column data distribution. This feature was originally advertised as a 10.7 feature, but it was only merged into 10.8.
JSON histograms enable better query optimisation when we join tables, and we have WHERE
conditions on non-indexed columns.
Note that this feature is not used by default. Since version 10.4, engine-independent statistics are not collected by default. To collect them, we can use ANALYZE TABLE table_name PERSISTENT FOR ALL
, or ANALYZE TABLE table_name PERSISTENT FOR COLUMNS (column list)
. Once collected, engine-independents statistics are used for queriesby default. So, to use this feature, you can do the following:
- Use
ANALYSE TABLE PERSISTENT
on selected tables/columns; - Or change the value for
use_stat_tables
.
Usability enhancements
mariadb-binlog
options--start-position
and--end-position
now accept GTIDs.- Temporal tables partitioning has become easier. We can specify the desired number of partitions, and the rotation will be automatic. For example, if we type
CREATE TABLE ... PARTITION BY SYSTEM_TIME INTERVAL 1 YEAR PARTITIONS 3
a temporal table with 3 partitions will be created, one for current data and two for historical data. Partitions will be rotated once a year. - While creating a SPIDER table, we can now use the
REMOTE_SERVER
,REMOTE_DATABASE
andREMOTE_TABLE
options, instead of using theCOMMENT
clause.
Conclusions
I listed MariaDB 10.8’s most useful features (in my opinion). Most of those features are performance improvements.
Remember, however, that MariaDB 10.8 is a short-term support release, and it will be discontinued in one year since its first stable release (May 2023). While it is helpful for testing new features, most users shouldn’t use it in production. The latest long-term support release is 10.6.
If you want to take advantage of the recent and old MariaDB performance optimisations, consider MariaDB Health Check service from Vettabase.
Federico Razzoli
0 Comments