Schedule Meeting

a

MariaDB SEQUENCE: a Simple Approach to Synthetic Data

by | Mar 24, 2025 | MariaDB, MariaDB Features

Need Help?  Click Here for Expert Support

MariaDB SEQUENCE is a storage engine that generates a sequence of positive integer numbers. However, in this article I will show you that it’s easy to use SEQUENCE to generate more complex sequences, that are not necessarily numeric. This is a very convenient way to generate relatively simple synthetic data.

Numeric sequences

SEQUENCE generates a numeric sequence, even though these numbers can be converted to something else. We specify a minimum, a maximum, and optionally a step (or increment). SEQUENCE tables are created on the fly when you query a non-existing table with a name that follows a certain pattern.

Simple sequences

Let’s see trivial sequence examples:

> SELECT * FROM seq_1_to_3;
+-----+
| seq |
+-----+
|   1 |
|   2 |
|   3 |
+-----+
> SELECT * FROM seq_10_to_30_step_5;
+-----+
| seq |
+-----+
|  10 |
|  15 |
|  20 |
|  25 |
|  30 |
+-----+

Including negative numbers

While SEQUENCE doesn’t natively generate negative numbers, we can obtain one in a simple way:

> SELECT -seq FROM seq_3_to_1;
+------+
| -seq |
+------+
|   -3 |
|   -2 |
|   -1 |
+------+

We can also generate a sequence containing both negative numbers, zero and positive numbers:

(SELECT -seq AS s FROM seq_3_to_0)
UNION
(SELECT seq AS s FROM seq_1_to_3)
ORDER BY s;
+----+
| s  |
+----+
| -3 |
| -2 |
| -1 |
|  0 |
|  1 |
|  2 |
|  3 |
+----+

FLOAT numbers

We can’t specify FLOAT boundaries, or a FLOAT increment. But again, some very simple arithmetic can do the trick:

> SELECT seq * 0.25 FROM seq_0_to_16;
+------------+
| seq * 0.25 |
+------------+
|       0.00 |
|       0.25 |
|       0.50 |
|       0.75 |
|       1.00 |
|       1.25 |
|       1.50 |
|       1.75 |
|       2.00 |
|       2.25 |
|       2.50 |
|       2.75 |
|       3.00 |
|       3.25 |
|       3.50 |
|       3.75 |
|       4.00 |
+------------+

Calculating the parameters is also simple:

  • seq must be multiplied by the desired step;
  • The upper bound is the number of values, rather than a maximum. It’s the maximum multiplied by the step.

Multiples

Given a number, we can generate a sequence of its multiples:

> SELECT seq
    FROM seq_1_to_20
    WHERE seq MOD 5 = 0
;
+-----+
| seq |
+-----+
|   5 |
|  10 |
|  15 |
|  20 |
+-----+

The MOD operator returns the rest of an integer division. When the left number is a multiple of the right number, there is no rest.

Similarly, we can can a sequence of numbers that are multiple of two numbers:

> SELECT seq
    FROM seq_1_to_50
    WHERE seq MOD 5 = 0 AND seq MOD 3 = 0
;
+-----+
| seq |
+-----+
|  15 |
|  30 |
|  45 |
+-----+

Powers

A sequence of squares:

> SELECT POW(seq, 2) FROM seq_1_to_5;
+-------------+
| POW(seq, 2) |
+-------------+
|           1 |
|           4 |
|           9 |
|          16 |
|          25 |
+-------------+

A sequence of powers of 2:

> SELECT POW(2, seq) FROM seq_1_to_5;
+-------------+
| POW(2, seq) |
+-------------+
|           2 |
|           4 |
|           8 |
|          16 |
|          32 |
+-------------+

We can easily adapt this idea to use other mathematical expressions.

Mathematical series

The following series are used in mathematics and science for many purposes.

Fibonacci series (using variables):

> SELECT
	seq,
	IF(seq = 0, 0, IF(seq = 1,
		(@a := 0) + (@b := 1) + (@s := 0),
		(@b := LAST_VALUE(LAST_VALUE(@s := (@a + @b), @a := @b), @s))
	)) AS fibo
  FROM seq_0_to_10;
+-----+------+
| seq | fibo |
+-----+------+
|   0 | 0    |
|   1 | 1    |
|   2 | 1    |
|   3 | 2    |
|   4 | 3    |
|   5 | 5    |
|   6 | 8    |
|   7 | 13   |
|   8 | 21   |
|   9 | 34   |
|  10 | 55   |
+-----+------+

Factorials series:

> SELECT IF(seq < 2, (@fact := 1), (@fact := @fact * seq)) AS fact FROM seq_0_to_10;
+---------+
| fact    |
+---------+
|       1 |
|       1 |
|       2 |
|       6 |
|      24 |
|     120 |
|     720 |
|    5040 |
|   40320 |
|  362880 |
| 3628800 |
+---------+

Triangular numbers:

> SELECT CAST(seq * (seq + 1) / 2 AS UNSIGNED) AS triangular
    FROM seq_1_to_10;
+------------+
| triangular |
+------------+
|          1 |
|          3 |
|          6 |
|         10 |
|         15 |
|         21 |
|         28 |
|         36 |
|         45 |
|         55 |
+------------+

Fermat numbers (we can go farther because the 6th number exceeds BIGINT):

> SELECT seq, POW(2, POW(2, seq)) + 1 AS fermat FROM seq_1_to_5;
+-----+------------+
| seq | fermat     |
+-----+------------+
|   1 |          5 |
|   2 |         17 |
|   3 |        257 |
|   4 |      65537 |
|   5 | 4294967297 |
+-----+------------+

Random Values and UUID’s

What if we want to generate multiple random values and/or multiple UUIDs? It’s a logical sequence, but not one that the SEQUENCE engine can generate. Normally we could generate a single UUID without specifying a FROM clause. If we have to call it many times, though, this method is inconvenient and slow. Instead, we can use SEQUENCE to control how many times these values are generated:

> SELECT UUID() FROM seq_1_to_5;
+--------------------------------------+
| UUID()                               |
+--------------------------------------+
| 1fc4789c-0726-11f0-83a4-0242c0a8d002 |
| 1fc478a9-0726-11f0-83a4-0242c0a8d002 |
| 1fc478ae-0726-11f0-83a4-0242c0a8d002 |
| 1fc478b2-0726-11f0-83a4-0242c0a8d002 |
| 1fc478b7-0726-11f0-83a4-0242c0a8d002 |
+--------------------------------------+

Strings

Generating sequences of ASCII characters is easy, and we can use any supported character set. If we only need very common characters, we can use ASCII.

1-character sequences, lowercase

To generate a sequence or all the lowercase letters:

SELECT CHAR(seq USING ascii) AS chr FROM seq_97_to_122;
+------+
| chr  |
+------+
| a    |
| b    |
| c    |
| d    |
| e    |
...
| z    |
+------+

This works because ASCII characters have a numeric code, which matches the alphabetical order. For lowercase letters, the codes are between 97 and 122. The CHAR() function accepts an integer and returns the corresponding ASCII character. Since ASCII is a subset of UTF8, we could have specified utf8mb4 instead.

1-character sequences, all simple Latin letters and digits

The following query returns a sequence with all the lowercase and uppercase ASCII letters and digits:

SELECT CHAR(seq) AS ch
    FROM (
                -- lowercase
                (SELECT seq FROM seq_97_to_122 l)
            UNION
                -- uppercase
                (SELECT seq FROM seq_65_to_90 u)
            UNION
                -- digits
                (SELECT seq FROM seq_48_to_57 d)
        ) v;

The result is too long to be shown here, but you can run this query on a MariaDB instance.

2-characters sequences

To generate a 2-characters sequence, we’ll use a CROSS JOIN between two identical subqueries:

SELECT CONCAT(ch1.ch1, ch2.ch2) AS ch
    FROM (
        (SELECT CHAR(seq) AS ch1
            FROM (
                        -- lowercase
                        (SELECT seq FROM seq_97_to_122 l1)
                    UNION
                        -- uppercase
                        (SELECT seq FROM seq_65_to_90 u1)
                    UNION
                        -- digits
                        (SELECT seq FROM seq_48_to_57 d1)
                ) s1
        )
    ) ch1
    CROSS JOIN (
        (SELECT CHAR(seq) AS ch2
            FROM (
                        -- lowercase
                        (SELECT seq FROM seq_97_to_122 l2)
                    UNION
                        -- uppercase
                        (SELECT seq FROM seq_65_to_90 u2)
                    UNION
                        -- digits
                        (SELECT seq FROM seq_48_to_57 d2)
                ) s2
        )
    ) ch2;

SEQUENCE is a vert fast engine, so a relatively small CROSS JOIN is still fast.

The above query can easily be modified, for example, to generate a lowercase letter as the first character and a digit as the second character.

Dates and Times

Suppose your company produces an app that allows customers to book some dates (for example, a hotel chain does this) or time slots during working hours (for example, for dentist appointments). In these cases, it can be useful to have a way to quickly generate suitable sequences of dates or times.

Calendar dates

Let’s generate a sequence of dates:

SELECT DATE ('2025-01-01' + INTERVAL (s.seq - 1) DAY) AS date
    FROM (SELECT seq FROM seq_1_to_30) s;
+------------+
| date       |
+------------+
| 2025-01-01 |
| 2025-01-02 |
| 2025-01-03 |
| 2025-01-04 |
| 2025-01-05 |
...
| 2025-01-30 |
+------------+

Generating these types of sequences is easy, because we can add a number to a date. It will be interpreted as a number of days.

Working days

Generating a sequence of working days is often a complex problem. You should have a table with working days. In a data warehouse, you typically have a calendar table with a column that indicates which rows represent working days. But if you only need an approximation, you could, for example, list the week days from Monday to Friday. Here’s an example:

SELECT DATE ('2014-01-01' + INTERVAL (s.seq - 1) DAY) AS date
    FROM (SELECT seq FROM seq_1_to_30) s
    -- exclude sunday (1) and saturday (7)
    WHERE DAYOFWEEK(DATE ('2014-01-01' + INTERVAL (s.seq - 1) DAY)) BETWEEN 2 AND 6;
+------------+
| date       |
+------------+
| 2014-01-01 |
| 2014-01-02 |
| 2014-01-03 |
| 2014-01-06 |
| 2014-01-07 |
...
| 2014-01-30 |
+------------+

Hours

To generate a sequence of hours in a day:

SELECT CAST('00:00:00' AS TIME) + INTERVAL (s.seq - 1) HOUR AS time
    FROM (SELECT seq FROM seq_1_to_24) s;
+----------+
| time     |
+----------+
| 00:00:00 |
| 01:00:00 |
| 02:00:00 |
...
| 23:00:00 |
+----------+

Time slots

A sequence of time slots of 30 minutes, in a day:

SELECT CAST('09:00:00' AS TIME) + INTERVAL 30 * (s.seq - 1) MINUTE AS time
    FROM (SELECT seq FROM seq_1_to_18) s;
+----------+
| time     |
+----------+
| 09:00:00 |
| 09:30:00 |
| 10:00:00 |
| 10:30:00 |
| 11:00:00 |
...
| 17:30:00 |
+----------+

Time slots in working days

And now, let’s combine the above queries to generate a sequence of 30 minutes time slots in working hours, in working days:

SELECT {dt '2013-01-01 01:00:00'} + INTERVAL (wd.d - 1) DAY + INTERVAL (dh.h - 1) HOUR AS wh
    FROM
        (
            -- working days in a month
            SELECT seq AS d FROM seq_1_to_30
            WHERE DAYOFWEEK(DATE ('2025-01-01' + INTERVAL (seq - 1) DAY)) BETWEEN 2 AND 6
        ) wd
    CROSS JOIN
        (
            -- daily working hours:
            -- 9-13, 14-18
            (SELECT seq AS h FROM seq_9_to_12)
            UNION
            (SELECT seq AS h FROM seq_14_to_17)
        ) dh
;

IPs

Suppose we need to generate a sequence of IP’s. We know the minimum IP and the maximum IP. First, we need to convert them to integers with the INET_ATON() function:

> SELECT INET_ATON('125.0.0.0') AS min, INET_ATON('125.5.0.255') AS max;
+------------+------------+
| min        | max        |
+------------+------------+
| 2097152000 | 2097479935 |
+------------+------------+

Now that we know those numbers, we can use them to compose the SEQUENCE table name. We’ll use the INET_NTOA() function to convert the sequence numbers to IPs:


> SELECT INET_NTOA(seq) AS ip FROM seq_2097152000_to_2097479935;
+-----------+
| ip        |
+-----------+
| 125.0.0.0 |
| 125.0.0.1 |
| 125.0.0.2 |
| 125.0.0.3 |
| 125.0.0.4 |
...

For IPv6, we can use INET6_ATON() and INET6_NTOA().

Complex sequences

Until now, we worked with single sequences. But we can also do something more complex, like combining sequences.

Combinations of sequences

Now, suppose we want all the combinations of any two numbers between 1 and 3. This kind of combinations are useful when you need to generate coordinates for a physical space, or a board game such as tic tac toe or chess. To do this, we’ll use a JOIN:

> SELECT a.seq AS x, b.seq AS y
    FROM seq_1_to_3 a
    CROSS JOIN seq_1_to_3 b
;
+---+---+
| x | y |
+---+---+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 1 | 2 |
| 2 | 2 |
| 3 | 2 |
| 1 | 3 |
| 2 | 3 |
| 3 | 3 |
+---+---+

The order is correct, because the storage engine generates the number in their requested order. I advocate the use of ORDER BY even when it is apparently useless, but in this case it would have the only effect of causing a resultset materialisation, to sort something that is already sorted.

One sequence per column

We can also have orthogonal sequences in your resultset, where each column represents a different sequence. For example, let’s generate a resultset containing squares and cubes of natural numbers up to 5:

> SELECT
    POW(seq, 2) AS square,
    POW(2, seq) AS power_of_2
    FROM seq_1_to_5
;
+--------+------------+
| square | power_of_2 |
+--------+------------+
|      1 |          2 |
|      4 |          4 |
|      9 |          8 |
|     16 |         16 |
|     25 |         32 |
+--------+------------+

Circulas sequences

A circular sequence is one that restarts after hitting its maximum. For example, let’s generate a circular sequence of 10 numbers between 1 and 4:

> SELECT IF(seq MOD 4 = 0, 4, seq MOD 4) AS seq
    FROM seq_1_to_10
;
+------+
| seq  |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    1 |
|    2 |
|    3 |
|    4 |
|    1 |
|    2 |
+------+

Once again, we use the MOD operator. But when the original number is 4, the integer division will return 0, so we have to replace 0 with 4.

The same trick can be used with any data type.

Conclusions

SEQUENCE is known for being able to generate a sequence of numbers. We saw that, with a reasonable effort, we can generate much more than that. But is it worth to write tricky SQL queries against SEQUENCE tables, rather than using a complete synthetic data generation solution? It depends. But for simple (but not necessarily trivial!) needs, SEQUENCE is enough. And it might save you from learning a new tool, or adding some complexity to your environment.

Federico Razzoli

All content in this blog is distributed under the CreativeCommons Attribution-ShareAlike 4.0 International license. You can use it for your needs and even modify it, but please refer to Vettabase and the author of the original post. Read more about the terms and conditions: https://creativecommons.org/licenses/by-sa/4.0/

About Federico Razzoli
Federico Razzoli is a database professional, with a preference for open source databases, who has been working with DBMSs since year 2000. In the past 20+ years, he served in a number of companies as a DBA, Database Engineer, Database Consultant and Software Developer. In 2016, Federico summarized his extensive experience with MariaDB in the “Mastering MariaDB” book published by Packt. Being an experienced database events speaker, Federico speaks at professional conferences and meetups and conducts database trainings. He is also a supporter and advocate of open source software. As the Director of Vettabase, Federico does business worldwide but prefers to do it from Scotland where he lives.

Recent Posts

Stored Procedures in MariaDB: Smarter, Easier and More Powerful

Stored Procedures in MariaDB: Smarter, Easier and More Powerful

MariaDB supports stored procedures written in procedural SQL. Which essentially means "SQL with IF's and loops". Most DBMSs do the same, but every project supports different, incompatible syntaxes. MariaDB implemented Oracle's dialect, called PL/SQL. The base of the...

MariaDB and the GROUP BY error

MariaDB and the GROUP BY error

Developers who are not familiar with SQL are often confused by MariaDB and MySQL's infamous GROUP BY error. From time to time, customers ask us to explain it, so it's time we publish an article on this topic. The error I'm talking about is the following: ERROR 1055...

Tuning MariaDB timeouts that affect applications

Tuning MariaDB timeouts that affect applications

MariaDB has many configurable timeouts. Some of them can only be configured by DBAs (database administrators). Others can be adjusted by developers in the application code, though DBAs can configure defaults. MariaDB timeouts can be grouped in these categories:...

Services

Need Help?  Click Here for Expert Support

2 Comments

  1. John C

    How does SEQUENCE handle concurrency? Are there any risks of contention or performance degradation when multiple sessions generate synthetic data simultaneously?

    Reply
    • Federico Razzoli

      Hi John! At its core, SEQUENCE is a trivial function that generates numbers. I don’t think that any contention is involved and I don’t think it can have performance problems in any realistic case.

      Reply

Submit a Comment

Your email address will not be published. Required fields are marked *