How do you represent information like this in a database?
- This event happened in 2015/06, but we don’t know in which day.
- This job is scheduled to happen on the first day of the month at 00:00:00, every month and every year.
- This never happened.
There are many ways to do that. The most common is to split dates in three different columns, each of which will be NULL when it doesn’t have a specific value.
But this makes date harder to validate for the database, it’s inpractical as it complicates SQL queries, and on top of this, NULL is error-prone.
A more practical and efficient way to store partial dates is to use zero-dates, and date with zero-components. In this article I’ll show you how to use them.
sql_mode and date validity
MariaDB has an sql_mode variable that affects the way SQL queries are interpreted. It’s useful to:
- Make error-checking looser, accepting values that are not entirely valid.
- Making error-checking stricter, making sure that invalid values are rejected.
- Increasing compatibility with some other DBMSs, at both syntax and semantic levels.
The sql_mode is a comma-separated list of flags. In this article, we’re interested in the following flags:
NO_ZERO_DATE allows the special date 0000-00-00.
NO_ZERO_IN_DATE allows the year, month, or day component to be zero, even if the rest of the date is not necessarily zero. For example: 0000-09-30.
ALLOW_INVALID_DATES has no effects on zero dates or partial dates. It simply makes the validity check more trivial: instead of checking the length of the month, keeping into account leap years, it will simply consider valid the day part when it doesn’t exceed 31.
To see the current sql_mode, run this query:
SELECT @@sql_mode;
The above flag affects the DATE and DATETIME data types. Don’t try to use this feature with TIMESTAMP, because only valid dates can be converted to UNIX timestamps.
For simplicity, in this article we’ll only use the DATE type.
Working with Zero Dates and Partial Dates
Let’s start by creating a normal table with a DATE column with a UNIQUE index on it:
CREATE OR REPLACE TABLE schedule (
id INT UNSIGNED AUTO_INCREMENT,
date TIMESTAMP NULL,
PRIMARY KEY (id),
UNIQUE unq_date (date)
);
Now let’s insert some values. As you can see, they include a zero date and somew partial dates of all types:
INSERT INTO schedule (date) VALUES
('0000-00-00')
, ('0000-00-01')
, ('0000-00-31')
, ('0000-12-01')
, ('0000-11-00')
, ('0000-12-00')
, ('0000-12-10')
, ('0000-12-21')
, ('2000-00-01')
, ('2000-00-11')
, ('2000-01-00')
, ('2000-01-01')
, ('2000-12-21')
, ('2001-00-00')
, ('2001-01-00')
, ('2001-02-02')
, ('2001-02-03')
;
We can now test the UNIQUE index. We know that UNIQUE indexes accept multiple NULLs, but we don’t expect zero dates or zero date components to be treated as NULLs:
> INSERT INTO schedule (date) VALUES ('0000-00-00');
ERROR 1062 (23000): Duplicate entry '0000-00-00' for key 'unq_date'
> INSERT INTO schedule (date) VALUES ('2001-01-00');
ERROR 1062 (23000): Duplicate entry '2001-01-00' for key 'unq_date'
Dates are ordered as expected, with the zeroes preceding other numbers:
> SELECT date FROM schedule ORDER BY 1;
+------------+
| date |
+------------+
| 0000-00-00 |
| 0000-00-01 |
| 0000-00-31 |
| 0000-11-00 |
| 0000-12-00 |
| 0000-12-01 |
| 0000-12-10 |
| 0000-12-21 |
| 2000-00-01 |
| 2000-00-11 |
| 2000-01-00 |
| 2000-01-01 |
| 2000-12-21 |
| 2001-00-00 |
| 2001-01-00 |
| 2001-02-02 |
| 2001-02-03 |
+------------+
We can also find zero years, zero months and zero dates by using the YEAR(), MONTH(), and DAYOFMONTH() SQL functions:
> SELECT date FROM schedule WHERE DAYOFMONTH(date) = 0;
+------------+
| date |
+------------+
| 0000-00-00 |
| 0000-11-00 |
| 0000-12-00 |
| 2000-01-00 |
| 2001-00-00 |
| 2001-01-00 |
+------------+
What happens if we ask for information that won’t make sense without a complete valid date, like the day of the week? In this case, we’ll obtain NULL:
> SELECT date, DAYOFWEEK(date) FROM schedule LIMIT 8;
+------------+-----------------+
| date | DAYOFWEEK(date) |
+------------+-----------------+
| 0000-00-00 | NULL |
| 0000-00-01 | NULL |
| 0000-00-31 | NULL |
| 0000-11-00 | NULL |
| 0000-12-00 | NULL |
| 0000-12-01 | 6 |
| 0000-12-10 | 1 |
| 0000-12-21 | 5 |
+------------+-----------------+
Partial Dates Validation
When the month is zero, MariaDB can’t decide the maximum day, but it still assumes that it can’t be more than 31:
> SELECT DATE '2000-00-31';
+-------------------+
| DATE '2000-00-31' |
+-------------------+
| 2000-00-31 |
+-------------------+
1 row in set (0.000 sec)
MariaDB [(none)]> SELECT DATE '2000-00-32';
ERROR 1525 (HY000): Incorrect DATE value: '2000-00-32'
Similarly, I’d expect MariaDB to accept 0000-02-29, because of leap years. This is not the case, so I reported the bug MDEV-38455.
That said, you can force MariaDB to only accept days <= 28, to only accept dates that are surely valid:
CREATE OR REPLACE TABLE schedule (
id INT UNSIGNED AUTO_INCREMENT,
date DATE NULL CHECK (DAYOFMONTH(date) <= 28),
PRIMARY KEY (id),
UNIQUE unq_date (date)
);
Zeroes versus NULL
As mentioned before, instead of '0000-00-00', you might use NULL. And instead of '2026-01-00' you might use three different columns, and set day to NULL.
I discourage the use of NULL when possible, for many reasons. For example, it is error-prone and semantically inconsistent: depending on the situation, it could mean non-applicable or unknown value. So 0+NULL=NULL, which makes sense if it’s an unknown value. But MAX(col) ignores NULLs, which only makes sense if it’s an absent value. However, in some situations it’s just so much more practical than the alternatives that, in practice, we don’t have an alternative.
In the case of dates, if you only need a value that means unknown or (XOR) a value that means not-applicable, my recommendation is to only use zero dates or partial dates.
But if you need both, it might be a good idea to use NULL and zero dates or partial dates. In this case, I’d use NULL as an unknown value, because its behaviour tends to be a bit more consistent with this interpretation.
MySQL Compatibility
The features described in this article were implemented in MySQL long before MariaDB existed. They should work without changes in all present and past versions (which currently means, up to 9.5). But the NO_ZERO_DATE and NO_ZERO_IN_DATE sql_mode flags have been deprecated for some years, and will be removed at some point. When it happens, zero dates and partial dates won’t be accepted anymore.
If you use these features but you use MySQL, consider migrating to MariaDB.
Conclusions
Not many people know that MariaDB supports zero dates and partial dates. And I’d be surprised to find out that some ORMs support it.
Nevertheless, this is a convenient feature and it has many practical uses. The most common alternative is using three different columns, which is less logical, more error-prone, and less efficient. Additional virtual columns can be created to build appropriate indexes, if needed, but that is only needed in some cases. You should be able to use a single date as a single column, even if some or all its components are set to zero.
If you want to know more about MariaDB specific features, consider Vettabase training courses.
Federico Razzoli






0 Comments