Schedule Meeting

a

Identify useless queries in MariaDB/MySQL

by | Sep 7, 2021 | MariaDB, MySQL | 0 comments

Do you want to improve your application’s and database server’s performance? Of course one of the most important things to do is to optimise the most impacting queries.

You have various ways to find them, depending on what exactly you mean by “most impacting”. You can have an approach oriented to your systems workload, or an approach oriented to your users experience. The former is what you typically do with tools like PMM, the latter is done with tools like New Relic. They are not mutually exclusive and they both are important, for different reasons.

But there is a third important approach: get rid of the queries that serve no purpose. Because they add work to the server, they probably make requests slower, they could lock rows, and they could consume enough CPU and memory to increase the cost of our servers.

MariaDB/MySQL performance_schema will help find those queries. It needs to be enabled.

Performance Schema and useless queries

I know, I know: you think you don’t have such queries. If query went to production, they must have a purpose. Well, chances are you’re wrong. In complex environments, normally in a production environment useless queries can be found. Maybe they were useful when they were written, but at some point they lost their purpose and no one removed them. Maybe they used to return data that don’t exist anymore, because a feature was removed or rows changed. Maybe they stopped working correctly after a table was modified. Maybe they never worked but no one ever noticed that because testing is insufficient.

Whatever the reason… let’s find them.

Queries that always return an error

Many queries return an error, but… only in certain situations. For example, some queries use the IN (1, 2, 3...) syntax, which is fine. But sometimes the list of values is empty, and the IN () syntax is not valid. Other examples include unquoted strings: if the string is a number, the query will work. But if it’s alphanumeric, it will fail. Of course those queries should be found and fixed.

But some queries always return an error. This happens if their syntax is not valid, and the syntax error does not depend on parameters. Or maybe parameters always cause errors (in the example above, maybe strings are always alphanumeric).

To find such queries, run:

SELECT
        DIGEST_TEXT,
        COUNT_STAR,
        FIRST_SEEN,
        LAST_SEEN
    FROM performance_schema.events_statements_summary_by_digest
    WHERE
        DIGEST_TEXT IS NOT NULL
        AND SUM_ERRORS = COUNT_STAR
        AND COUNT_STAR > 20
        AND LAST_SEEN > (NOW() - INTERVAL 1 MONTH)
    ORDER BY COUNT_STAR DESC
    LIMIT 10
G

Some notes:

  • We are looking for completely useless queries. But after that, you may want to see queries whose ratio between COUNT_STAR and SUM_ERRORS is bad. Or even queries that returned at least one error.
  • We put a lower limit to COUNT_STAR to avoid false positives – queries that were executed by mistake, perhaps manually, and will never run again.
  • The limit on LAST_SEEN is also important. We want to exclude queries that don’t run anymore, or don’t fail anymore because the conditions that caused the error have changed.
  • Queries executed more times are obviously more important to erase.
Running queries that always fail with an error is not much different from Sisyphus eternal effort. This picture is from Highgate Cemetery, in London.

Queries that never return rows

Believe it or not, I often see queries that always return an empty resultset. If you find such queries, you should at least suspect that they are useless and can be removed.

All queries consume some resources. But queries that don’t return rows are much more important to remove than queries that fail. The reason is that these queries are executed. They return nothing, but they could examine a big amount of rows, materialise them in memory, or keep them locked.

Find them in this way:

SELECT *
    FROM performance_schema.events_statements_summary_by_digest
    WHERE
        (
               TRIM(DIGEST_TEXT) LIKE 'SELECT%'
            OR TRIM(DIGEST_TEXT) LIKE 'CREATE%TABLE%SELECT%'
            OR TRIM(DIGEST_TEXT) LIKE 'DELETE%'
            OR TRIM(DIGEST_TEXT) LIKE 'UPDATE%'
            OR TRIM(DIGEST_TEXT) LIKE 'REPLACE%'
        )
        AND SUM_ROWS_SENT = 0
        AND SUM_ROWS_AFFECTED = 0
        AND COUNT_STAR > 20
        AND LAST_SEEN > (NOW() - INTERVAL 1 MONTH)
    ORDER BY SUM_ROWS_EXAMINED DESC
    LIMIT 10
G

Notes:

  • See notes about the previous query.
  • We need to filter by DIGEST_TEXT because we are not interested in CREATEDROPSET, commands, and many others.
  • Writes (UPDATE, etc) do not return rows, but they do affect some rows (usually). We exclude them with SUM_ROWS_AFFECTED = 0.
  • Here I ordered the results by the number of rows examined because I consider it very significant about the amount of work cause by the query. Depending on which problems you have, you may want to filter by something else, for example SUM_CREATED_TMP_TABLES if you have memory consumption problems.

Empty queries

TL;DR: you hardly care about this.

A last type of useless queries is… empty queries, or queries with no text. A typical way to run them is to add a semicolon (;) at the end of a query. by mistake:

SELECT * FROM information_schema.TABLES;;

You may think it only happens when running queries manually, but occasionally I saw many empty queries in production. I suspect it’s caused by certain ORMs bugs.

However, finding empty queries could be time consuming for developers, and it’s hardly worth the effort. You may try to ask them to do so if the number of such queries is high (like, they happen multiple times a second), but I’m mentioning empty queries mainly for the sake of completeness.

We can spot empty queries more easily with the user_statistics plugin, by Percona – but it’s not necessary:

SHOW GLOBAL STATUS LIKE 'Com_empty_query';
SET GLOBAL userstat := 1;
SELECT USER, EMPTY_QUERIES
    FROM USER_STATISTICS
    ORDER BY EMPTY_QUERIES DESC;

What to do next

Talk to developers. There could be reasons why a query must not be eliminated. If the syntax is wrong, it can be fixed. If it returns no rows, that could be expected for some reason. If that’s the case, a comment could be added to indicate that the query is not useless, so the next time you check you won’t waste your time on it.

Many high level developers hate to optimise queries. But this particular type of problems has more chances to attract their attention. Their code does something potentially useless or wrong, and they usually will want to know why and fix the problem.

See also

Reference:

Related articles:

Conclusions

When we put our hands on a new OLTP environment, we find these queries more often than not. Eliminating them helps reducing the resource usage and increase the speed of certain.

Finding queries that always fail or never return rows is part of our MariaDB Health Checks and MySQL Health Checks.

Federico Razzoli

All content in this blog is distributed under the Creative Commons Attribution 4.0 International license (CC BY 4.0). You can use it for your needs and even modify it, but please refer to Vettabase and the author of the original post. Read more about the terms and conditions: https://creativecommons.org/licenses/by/4.0/

[/et_pb_column]
About Federico Razzoli
Federico is Vettabase Ltd founder, and he's an expert database consultant specialised in the MariaDB and MySQL ecosystems.

Recent Posts

A summary of MariaDB 10.10: INET4 type, RANDOM_BYTES() and more

A summary of MariaDB 10.10: INET4 type, RANDOM_BYTES() and more

In my first post in 2023, I want to give you a summary of MariaDB 10.10. Some may argue that I'm a bit late because it was released last November, and versions 10.11 and 11.0 have been announced since then... All that is correct, but 10.10 is the latest stable...

How to compose strings in MariaDB

How to compose strings in MariaDB

MariaDB supports several ways to compose strings dynamically. Some of them might be well-known, others aren’t familiar to you. This article covers all the ways to compose strings.

Hints to optimise queries with a LIKE comparison

Hints to optimise queries with a LIKE comparison

In SQL, using the LIKE operator is a powerful way to find strings that match a certain pattern. It's suitable for most use cases, thanks to its two jolly characters: _ means any one character. % means any sequence of zero or more characters. However, many queries out...

Services

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *

How to compose strings in MariaDB

How to compose strings in MariaDB

MariaDB supports several ways to compose strings dynamically. Some of them might be well-known, others aren’t familiar to you. This article covers all the ways to compose strings.

read more

How to compose strings in MariaDB

How to compose strings in MariaDB

MariaDB supports several ways to compose strings dynamically. Some of them might be well-known, others aren’t familiar to you. This article covers all the ways to compose strings.

read more