Understanding tables usage with User Statistics on MariaDB and Percona Server

by Federico Razzoli | Aug 12, 2020 | MariaDB

Need Help? Click Here for Expert Support

User Statistics is a plugin for MySQL developed by Percona. It is distributed and enabled by default in Percona Server and in MariaDB. It should be possible to compile it against MySQL, but we did not try.

In this article we’ll see a few simple, yet useful queries to analyse the way tables are used, which of them are used most – and therefore, on what we need to focus on to optimise the server’s workload.

Contents hide

1 Enabling and disabling User Statistics

2 Use Cases

3 Queries

3.1 See how the workload is distributed over engines and databases

3.2 See index usage

3.3 Miscellaneous

4 Conclusions

Enabling and disabling User Statistics

To enable it, just run this SQL statement:

SET GLOBAL userstat := 1;

Set to 0 to disable. But its impact on performance is very low. And having more information about a server performance is generally more important than a small speed difference.

Use Cases

In this article, we want to investigate table usage. This is not necessarily about bad queries. For that purpose, we can use the slow log and tables from the information schema. Also, an older article shows how to find useless queries that are executed by a server.

Understanding tables usage, in this context, means:

Seeing which tables are read and written mostly. Then we can focus on optimising them and the queries that involve them.
Seeing which indexes are being used. Then we can check if the mostly used are as well-designed as they should be.
Find tables and indexes that are not used at all.

User Statistics creates some tables in the information_schema. For our purposes, we’ll use two of them:

TABLE_STATISTICS;
INDEX_STATISTICS.

These tables can also be monitored with some monitoring tool. PMM monitors TABLE_STATISTICS automatically, if User Statistics is enabled. This allows to:

Have a visual representation of how reads and writes are distributed among tables;
Easily find spikes in one table’s usage, that could match some server’s performance decrease;
Easily seeing if a table is being used a lot all of a sudden. When I was a DBA in companies where multiple microservices could access the same database servers, checking this graph after a performance degradation allowed me to immediately know which team did something wrong.

Queries

See how the workload is distributed over engines and databases

Workload by storage engine (where workload means reads and writes):

SELECT
        t.ENGINE,
        COUNT(*) AS table_count,
        COUNT(s.ROWS_READ > 0 OR s.ROWS_CHANGED > 0) AS used_table_count,
        SUM(IFNULL(s.ROWS_READ, 0)) AS SUM_ROWS_READ,
        SUM(IFNULL(s.ROWS_CHANGED, 0)) AS SUM_ROWS_CHANGED,
        SUM(IFNULL(s.ROWS_CHANGED_X_INDEXES, 0)) AS SUM_ROWS_CHANGED_X_INDEXES
    FROM information_schema.TABLES t
    LEFT JOIN information_schema.TABLE_STATISTICS s
        ON t.TABLE_SCHEMA = s.TABLE_SCHEMA AND t.TABLE_NAME = s.TABLE_NAME
    WHERE t.TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
    GROUP BY t.ENGINE
        WITH ROLLUP
;

Workload by database:

SELECT
        t.TABLE_SCHEMA,
        COUNT(*) AS table_count,
        COUNT(s.ROWS_READ > 0 OR s.ROWS_CHANGED > 0) AS used_table_count,
        SUM(IFNULL(s.ROWS_READ, 0)) AS SUM_ROWS_READ,
        SUM(IFNULL(s.ROWS_CHANGED, 0)) AS SUM_ROWS_CHANGED,
        SUM(IFNULL(s.ROWS_CHANGED_X_INDEXES, 0)) AS SUM_ROWS_CHANGED_X_INDEXES
    FROM information_schema.TABLES t
    LEFT JOIN information_schema.TABLE_STATISTICS s
        ON t.TABLE_SCHEMA = s.TABLE_SCHEMA AND t.TABLE_NAME = s.TABLE_NAME
    WHERE t.TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
    GROUP BY t.TABLE_SCHEMA
        WITH ROLLUP
;

Workload by database and engine:

SELECT
        t.TABLE_SCHEMA, t.ENGINE,
        COUNT(*) AS table_count,
        COUNT(s.ROWS_READ > 0 OR s.ROWS_CHANGED > 0) AS used_table_count,
        SUM(s.ROWS_READ) AS SUM_ROWS_READ,
        SUM(s.ROWS_CHANGED) AS SUM_ROWS_CHANGED,
        SUM(s.ROWS_CHANGED_X_INDEXES) AS SUM_ROWS_CHANGED_X_INDEXES
    FROM information_schema.TABLES t
    LEFT JOIN information_schema.TABLE_STATISTICS s
        ON t.TABLE_SCHEMA = s.TABLE_SCHEMA AND t.TABLE_NAME = s.TABLE_NAME
    WHERE t.TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
    GROUP BY t.TABLE_SCHEMA, t.ENGINE
        WITH ROLLUP
;

Workload by engine and database (same info, opposite approach):

SELECT
        t.ENGINE, t.TABLE_SCHEMA,
        COUNT(*) AS table_count,
        COUNT(s.ROWS_READ > 0 OR s.ROWS_CHANGED > 0) AS used_table_count,
        SUM(s.ROWS_READ) AS SUM_ROWS_READ,
        SUM(s.ROWS_CHANGED) AS SUM_ROWS_CHANGED,
        SUM(s.ROWS_CHANGED_X_INDEXES) AS SUM_ROWS_CHANGED_X_INDEXES
    FROM information_schema.TABLES t
    LEFT JOIN information_schema.TABLE_STATISTICS s
        ON t.TABLE_SCHEMA = s.TABLE_SCHEMA AND t.TABLE_NAME = s.TABLE_NAME
    WHERE t.TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
    GROUP BY t.ENGINE, t.TABLE_SCHEMA
        WITH ROLLUP
;

See index usage

Rows read with an index vs. read with a tablescan:

SELECT
        t.TABLE_SCHEMA, t.TABLE_NAME,
        t.ROWS_READ AS TOTAL_ROWS_READ,
        SUM(i.ROWS_READ) AS INDEX_ROWS_READ,
        t.ROWS_READ - SUM(i.ROWS_READ) AS TABLESCAN_ROWS_READ
    FROM information_schema.TABLE_STATISTICS t
    LEFT JOIN information_schema.INDEX_STATISTICS i
        ON t.TABLE_SCHEMA = i.TABLE_SCHEMA AND t.TABLE_NAME = i.TABLE_NAME
    WHERE t.TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
    GROUP BY t.TABLE_SCHEMA, t.TABLE_NAME, t.ROWS_READ
;

Note: for some small tables it’s ok to have a tablescan (eg: application configuration). For others it’s not, but if you dump them you have tablescans. This query still helps you to find tables largely read with tablescans where this is not expected.

Index usage with index type added (this allows to see, for example, FULLTEXT indexes usage; this query also includes unused indexes, which are not included in INDEX_STATISTICS):

SELECT
        s.TABLE_SCHEMA, s.TABLE_NAME,
        s.INDEX_NAME, s.INDEX_TYPE,
        IFNULL(i.ROWS_READ, 0) AS ROWS_READ
    FROM information_schema.STATISTICS s
    LEFT JOIN information_schema.INDEX_STATISTICS i
        ON s.TABLE_SCHEMA = i.TABLE_SCHEMA
            AND s.TABLE_NAME = i.TABLE_NAME
            AND s.INDEX_NAME = i.INDEX_NAME
    WHERE s.TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
;

Specifically list unused indexes:

SELECT st.TABLE_SCHEMA, st.TABLE_NAME, st.INDEX_NAME
    FROM information_schema.STATISTICS st
    LEFT JOIN information_schema.INDEX_STATISTICS idx
        ON  idx.INDEX_NAME    = st.INDEX_NAME
        AND idx.TABLE_NAME    = st.TABLE_NAME
        AND idx.TABLE_SCHEMA  = st.TABLE_SCHEMA
    WHERE
        st.TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
        AND idx.INDEX_NAME IS NULL
        AND st.NON_UNIQUE = 1
    ORDER BY st.TABLE_SCHEMA, st.TABLE_NAME, st.INDEX_NAME
;

Compose the ALTER TABLEs to drop unused indexes:

SELECT st.TABLE_SCHEMA, st.TABLE_NAME, st.INDEX_NAME
    FROM information_schema.STATISTICS st
    LEFT JOIN information_schema.INDEX_STATISTICS idx
        ON  idx.INDEX_NAME    = st.INDEX_NAME
        AND idx.TABLE_NAME    = st.TABLE_NAME
        AND idx.TABLE_SCHEMA  = st.TABLE_SCHEMA
    WHERE
        st.TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
        AND idx.INDEX_NAME IS NULL
        AND st.NON_UNIQUE = 1
    ORDER BY st.TABLE_SCHEMA, st.TABLE_NAME, st.INDEX_NAME
;

Miscellaneous

Find unused tables (since last server restart or User Statistics activation):

SELECT
        t.TABLE_SCHEMA, t.TABLE_NAME, t.ENGINE
    FROM information_schema.TABLES t
    LEFT JOIN information_schema.TABLE_STATISTICS s
        ON t.TABLE_SCHEMA = s.TABLE_SCHEMA AND t.TABLE_NAME = s.TABLE_NAME
    WHERE t.TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
        AND ROWS_READ IS NULL AND ROWS_CHANGED IS NULL
;

Conclusions

Here are the queries I occasionally use to get information from the User Statistics plugin. As a general rule, the performance_schema gives more in-depth information, but not all these data are available in it. User Statistics is also simpler to use. And it can be enabled at runtime.

These queries only concern the TABLE_STATISTICS and INDEX_STATISTICS tables. Other tables can also provide insightful information, and this can be a topic for one or more future articles, if there is interest.

Our MySQL Health Checks include table usage analysis, if User Statistics are enabled. If it is not enabled, the report includes some similar (but less specific) information obtained by querying the information_schema and performance_schema databases. This is meant to both give you a better understanding on your workload, and find some problems about your indexes that need to be fixed.

Federico Razzoli

All content in this blog is distributed under the CreativeCommons Attribution-ShareAlike 4.0 International license. You can use it for your needs and even modify it, but please refer to Vettabase and the author of the original post. Read more about the terms and conditions: https://creativecommons.org/licenses/by-sa/4.0/

About Federico Razzoli

Federico Razzoli is a database professional, with a preference for open source databases, who has been working with DBMSs since year 2000. In the past 20+ years, he served in a number of companies as a DBA, Database Engineer, Database Consultant and Software Developer. In 2016, Federico summarized his extensive experience with MariaDB in the “Mastering MariaDB” book published by Packt. Being an experienced database events speaker, Federico speaks at professional conferences and meetups and conducts database trainings. He is also a supporter and advocate of open source software. As the Director of Vettabase, Federico does business worldwide but prefers to do it from Scotland where he lives.

MariaDB 11.8 LTS: Parallel Dumps, PARSEC authentication, new SQL syntaxes, and more

Jun 18, 2025

MariaDB released a new Long Term Support version as Generally Available: 11.8. I always review MariaDB LTS versions once they're GA, so it's time to write a new review. Support and timeline MariaDB 11.8 is the latest Long Term Support version (LTS). The...

Query Optimisation: Using indexes for WHERE with multiple conditions

Jun 5, 2025

You have a beautiful application, but a page is slow. After some investigation, you find out that a query is unexpectedly slow, ruining user experience and causing frustration for you. It is a simple query with a WHERE clause and nothing else. You tried to build an...

Vettabase is a MariaDB Foundation sponsor!

Jun 2, 2025

Vettabase is a MariaDB Foundation partner. We contributed documentation about automating MariaDB deployments and configuration management.

Services



Email

Schedule Meeting

Phone

Understanding tables usage with User Statistics on MariaDB and Percona Server

Enabling and disabling User Statistics

Use Cases

Queries

See how the workload is distributed over engines and databases

See index usage

Miscellaneous

Conclusions

Recent Posts

MariaDB 11.8 LTS: Parallel Dumps, PARSEC authentication, new SQL syntaxes, and more

Query Optimisation: Using indexes for WHERE with multiple conditions

Vettabase is a MariaDB Foundation sponsor!

Services

Database Automation

Database Training

Database Health Check

Monthly DBA Time

Database Upgrade

0 Comments

Submit a Comment Cancel reply

Email

Schedule Meeting

Phone

Quick Links

Recent Posts

Why Your Database Deserves Consistent Names and Types

To BLOB or not to BLOB? The image storage dilemma

MariaDB 11.8 LTS: Parallel Dumps, PARSEC authentication, new SQL syntaxes, and more

Policies & Licenses

Follow Us on Social Media