Schedule Meeting

a

How to compose strings in MariaDB

by | Dec 3, 2022 | MariaDB | 0 comments

MariaDB supports several ways to compose strings dynamically. Some of them might be well-known, others aren’t familiar to you. This article covers all the ways to compose strings.

CONCAT()

CONCAT() is a simple function to concatenate strings:

> SELECT CONCAT(user, '@', host) FROM mysql.user WHERE user = 'root';
+-------------------------+
| CONCAT(user, '@', host) |
+-------------------------+
| root@%                  |
| root@localhost          |
+-------------------------+

It accepts one or more arguments:

MariaDB [test]> SELECT CONCAT('1', '2', '3');
+-----------------------+
| CONCAT('1', '2', '3') |
+-----------------------+
| 123                   |
+-----------------------+

Note that, if any of the arguments is NULL, CONCAT() will return NULL:

> SELECT CONCAT('A', NULL);
+-------------------+
| CONCAT('A', NULL) |
+-------------------+
| NULL              |
+-------------------+

This is more or less consistent with the way SQL treats NULL, but in most cases this is not what you want.

CONCAT_WS()

CONCAT_WS() means concat with separator. It works like CONCAT(), except that:

  • the first argument is a separator;
  • there must be at least two arguments;
  • if any argument after the first is NULL, it is ignored.

In the example below, we are using CONCAT_WS() to produce a list:

> SELECT CONCAT_WS(', ', email1, email2, email3) AS concacts
       FROM user WHERE id = 1;
+-------------------------------------------------------------+
| concacts                                                    |
+-------------------------------------------------------------+
| john.doe@gmail.com, j_doe@hotmail.com, monster666@gmail.com |
+-------------------------------------------------------------+

CONCAT_WS() behaves like a NULL-safe CONCAT() when the first argument is an empty string. So it can be used to handle NULL values:

> INSERT INTO tab (a, b, c) VALUES ('1', NULL, '3');
Query OK, 1 row affected (0.016 sec)

> SELECT CONCAT_WS('', a, b, c) FROM tab;
+------------------------+
| CONCAT_WS('', a, b, c) |
+------------------------+
| 13                     |
+------------------------+

Concatenation operator

If the sql_mode variable contains the PIPES_AS_CONCAT flag, MariaDB allows to use the || concatenation operator. This also happens if sql_mode contains the ORACLE flag, that makes MariaDB accept many Oracle-specific syntaxes.

The concatenation operator behaves like CONCAT(), so it returns NULL if one of the operands is NULL:

> SELECT
    ('A' || 'B' || 'C') AS test,
    (NULL || 'B') AS test_with_null;
+------+----------------+
| test | test_with_null |
+------+----------------+
| ABC  | NULL           |
+------+----------------+

SFORMAT()

The SFORMAT() function is an easier way to compose strings. It is the equivalent of several functions existing in some programming languages, such as the Python fmt module which is based on the fmt library. SFORMAT() was introduced in MariaDB 10.7. As far as I know, it’s not supported in other DBMSs.

fmt works by string interpolation. Yet, it does much more than that. Its methods allow to declare the type of a variable, format its output, output to a file, etc. But fmt was designed for an entirely different context. That’s why SFORMAT() supports a subset of its capabilities. Unfortunately, at the time of its writing, the documentation page doesn’t help much, as it only describes a basic case. I’ll try to show some undocumented uses of SFORMAT() here.

Interpolation can be done. Optionally we can specify in which order the parameters should appear:

> SELECT SFORMAT('{} + {} = {}', 3, 5, (3 + 5)) AS str;
+-----------+
| str       |
+-----------+
| 3 + 5 = 8 |
+-----------+

> -- indexes start from 0
> SELECT SFORMAT('{2} - {1} = {0}', 3, 5, (3 + 5)) AS str;
+-----------+
| str       |
+-----------+
| 8 - 5 = 3 |
+-----------+

We can also declare a variable type:

> SELECT SFORMAT('{:d}th son of a {:d}th son', 7, 7) AS iron_maiden;
+----------------------+
| iron_maiden          |
+----------------------+
| 7th son of a 7th son |
+----------------------

If we pass a variable that is not of the declared type, the result will be NULL and we’ll get a warning:

> \W
Show warnings enabled.
> SELECT SFORMAT('{:d}th son of a {:d}th son', 7, 'seven')
      AS iron_maiden;
+-------------+
| iron_maiden |
+-------------+
| NULL        |
+-------------+
1 row in set, 1 warning (0.003 sec)

Warning (Code 4183): SFORMAT error: invalid type specifier

We can of course combine the above features:

> SELECT SFORMAT('{1:s}teen', 7, 'seven') AS age;
+-----------+
| age       |
+-----------+
| seventeen |
+-----------+

Certain specifiers don’t just specify a type, but format the output as well:

> SELECT SFORMAT('Dec: {0:d}, Oct: {0:#o}, Hex: {0:#X}', 10)
      AS num;
+-----------------------------+
| num                         |
+-----------------------------+
| Dec: 10, Oct: 012, Hex: 0XA |
+-----------------------------+

> SELECT SFORMAT('{:+f}', 2.1) AS num;
+-----------+
| num       |
+-----------+
| +2.100000 |
+-----------+

Fancy alignments:

> SELECT SFORMAT(
      '\n\n{:=^30}\n\n{:>30}\n{:>30}\n{:<30}\n{:<30}\n\n',
      'Jazz Musicians',
      'John Coltrane',
      'Ornette Coleman',
      'Max Roach',
      'Warren Baby Dolls') AS jazz \G
*************************** 1. row ***************************
jazz: 

========Jazz Musicians========

                 John Coltrane
               Ornette Coleman
Max Roach                     
Warren Baby Dolls        

Formatting dates doesn’t work at the time of this writing.

Quoting

Sometimes we may need to quote a value, and escape the quote character. If we need to use a single quote, we can use the QUOTE() function:

> SELECT QUOTE('Vettabase') AS vetta;
+-------------+
| vetta       |
+-------------+
| 'Vettabase' |
+-------------+

With any other character, we’ll need to concatenate strings in one of the ways shown above. To escape the quote character, we’ll use REPLACE():

> SELECT CONCAT('"', REPLACE('a"b', '"', '\\"'), '"') AS quote;
+--------+
| quote  |
+--------+
| "a\"b" |
+--------+

JSON

MariaDB has many JSON functions. Amongst other things, they help us compose JSON documents.

To compose a JSON array:

> SELECT JSON_ARRAY(1, 2, 'three') AS arr;
+-----------------+
| arr             |
+-----------------+
| [1, 2, "three"] |
+-----------------+

To create a JSON object:

> SELECT JSON_OBJECT('name', 'Tom', 'surname', 'Baker') AS obj;
+-------------------------------------+
| obj                                 |
+-------------------------------------+
| {"name": "Tom", "surname": "Baker"} |
+-------------------------------------+

These functions can of course be combined:

> SELECT JSON_OBJECT('numbers', JSON_ARRAY(1, 2, 3)) AS nest;
+------------------------+
| nest                   |
+------------------------+
| {"numbers": [1, 2, 3]} |
+------------------------+

We can do much more with JSON functions, but this post is not the right place to discuss JSON more in depth.

Aggregate functions

This paragraph explains how to concatenate values by column. For example, while CONCAT() can concatenate the values of the columns name and surname, GROUP_CONCAT() can concatenate the values in the column surname, from one or more rows. These functions are normally (but not necessarily) used with the GROUP BY clause.

If you are not familiar with this concept, take a look at GROUP BY and other aggregate functions in the MariaDB Knowledge Base.

GROUP_CONCAT()

GROUP_CONCAT() is the most generic aggregate function for concatenating strings. In practical cases, it can be used to compose any string. GROUP_CONCAT() ignores NULL values. Its clauses are the following:

  • DISTINCT can be used to remove duplicates from the list.
  • ORDER BY can be used to sort the elements of the list. This is different from the global ORDER BY that you can use outside of this function. The global order by sorts the rows, GROUP_CONCAT(... ORDER BY ...) sorts the lists contained in each row.
  • SEPARATOR specifies a string separator.
  • LIMIT sets the maximum number of elements in each list, and OFFSET skips some elements at the beginning of the list. These clauses don’t make any sense without an ORDER BY.

All these clauses are optional. If present, they must be specified in this order.

IMPORTANT NOTE. When DISTINCT or ORDER BY are used, and the table has no suitable index, MariaDB will order the strings with a 2 steps sort. This can be slow and consume a lot of memory. For this reason, the group_concat_max_len variable sets the maximum length in bytes for each list. When the value exceeds, the list will be truncated. Before MariaDB 10.2, this value was quite small for many practical cases. If truncated lists are a problem and you know that you’re not overloading your database, use:

SET STATEMENT group_concat_max_len = <high_number> FOR
SELECT ...

Let’s see a basic example. To get a comma-separated list of databases in the current MariaDB instance:

SELECT GROUP_CONCAT(
        SCHEMA_NAME
        ORDER BY SCHEMA_NAME
        SEPARATOR ','
    ) AS db_list
    FROM information_schema.SCHEMATA;

Sometimes it’s practical to pass GROUP_CONCAT() an expression, i.e. the concatenation of multiple columns. For example, to get a list of tables in each database, in the form of db_schema.db_table:

SET STATEMENT group_concat_max_len = 2097152 FOR
SELECT GROUP_CONCAT(
        CONCAT_WS('.', TABLE_SCHEMA, TABLE_NAME)
        ORDER BY TABLE_SCHEMA
        SEPARATOR ','
    ) AS table_list
    FROM information_schema.TABLES
    GROUP BY TABLE_SCHEMA;

Aggregating JSON data

Creating JSON data with GROUP_CONCAT() would be a bit too verbose. There are two specialised aggregate functions that do this easily:

  • JSON_ARRAYAGG() composes a JSON array;
  • JSON_OBJECTAGG() composes a key/value JSON object.

These functions were added in MariaDB 10.5.

There is no variable to limit JSON_ARRAYAGG() and JSON_OBJECTAGG() memory usage.

The clauses are the same as GROUP_CONCAT(), except that there is not SEPARATOR clause, so I won’t repeat the explanation. Let’s just see some examples.

To get a JSON array of all the tables in each database, where each database is a separate row:

SELECT
        TABLE_SCHEMA AS db_name,
        JSON_ARRAYAGG(
            TABLE_NAME
            ORDER BY TABLE_NAME
        ) AS table_list
    FROM information_schema.TABLES
    GROUP BY TABLE_SCHEMA;

Let’s turn this semi-relational result set in a single JSON document with this structure:

{
    "db1": ["table1", "table2", ...],
    "db2": ["table3", "table4", ...],
    ...
}

The query to obtain this is as follows:

SELECT
        JSON_DETAILED(
            JSON_OBJECTAGG(
                db_name,
                table_list
            )
        ) AS json
    FROM (
        SELECT
                TABLE_SCHEMA AS db_name,
                    JSON_ARRAYAGG(
                        TABLE_NAME
                        ORDER BY TABLE_NAME
                    ) AS table_list
            FROM information_schema.TABLES
            GROUP BY TABLE_SCHEMA
        ) v;

JSON_DETAILED() just makes the output easier to read for humans.

We can do much more with MariaDB JSON functions. See the JSON Functions section in the MariaDB Knowledge Base.

What about MySQL?

Some differences:

Other techniques explained here should work equally well on MySQL. Note that MariaDB does not aim to full compatibility with MySQL. There could always be small differences – especially when it comes to features implemented since version 10.0.

Federico Razzoli

All content in this blog is distributed under the Creative Commons Attribution 4.0 International license (CC BY 4.0). You can use it for your needs and even modify it, but please refer to Vettabase and the author of the original post. Read more about the terms and conditions: https://creativecommons.org/licenses/by/4.0/

[/et_pb_column]
About Federico Razzoli
Federico is Vettabase Ltd founder, and he's an expert database consultant specialised in the MariaDB and MySQL ecosystems.

Recent Posts

A summary of MariaDB 10.10: INET4 type, RANDOM_BYTES() and more

A summary of MariaDB 10.10: INET4 type, RANDOM_BYTES() and more

In my first post in 2023, I want to give you a summary of MariaDB 10.10. Some may argue that I'm a bit late because it was released last November, and versions 10.11 and 11.0 have been announced since then... All that is correct, but 10.10 is the latest stable...

Hints to optimise queries with a LIKE comparison

Hints to optimise queries with a LIKE comparison

In SQL, using the LIKE operator is a powerful way to find strings that match a certain pattern. It's suitable for most use cases, thanks to its two jolly characters: _ means any one character. % means any sequence of zero or more characters. However, many queries out...

Services

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *