Schedule Meeting

a

Can we shrink InnoDB Buffer Pool?

by | Jan 21, 2021 | MariaDB InnoDB, MariaDB, MySQL, MySQL InnoDB

Need Help?  Click Here for Expert Support

In my previous article, I wrote how to answer the question: Is InnoDB Buffer Pool big enough? But the buffer pool can also be too big. So you may wonder if you can shrink it.

Piazza del Popolo seen from a higher location called Il Pincio, in Rome.
Piazza del Popolo seen from il Pincio, Rome.
Is it oversized?

Reasons to shrink the Buffer Pool

Why would one do such a thing? A common reason is saving resources. Saving memory can be very important. For example, with AWS you may need a more expensive EC2 instance to have more memory. As a general rule, for db class instances, “more expensive” means that both the on demand cost and the potential effective cost double (at the moment of this writing).

It’s also possible that you need more memory to allocate in other ways. Maybe you have many concurrent users, or they run many complex JOINs, and session buffers need more memory.

And don’t forget that a big buffer pool can be slow. Giving your server an impressive amount of memory to allocate a huge buffer pool could slow down all your reads and your writes.

How to check if InnoDB Buffer Pool is oversized

Here I describe how I check if InnoDB buffer pool is oversized. I’m sure there are other ways.

Whichever method we use, let’s keep in mind that the buffer pool should always have some free pages.

Let’s also take growth into account. If the amount of data you frequently read is getting bigger over time, we should be prepared to handle foreseeable loads in the near future.

Number of free pages

Innodb_buffer_pool_pages_free is the number of free pages. If it’s always a large number, the buffer pool is bigger than the amount of data we use.

It’s hard to give numbers that anyone can use, because every workload is different. Let’s say that having 10% of the buffer pool made of free pages is very prudent, keeping this percentage even bigger is probably useless.

Hits and misses ratio

Innodb_buffer_pool_read_requests is the number of reads. Innodb_buffer_pool_reads is the number of disk reads, that occur when the desired data is not found in the buffer pool. The ratio between these variables tells us if the buffer pool is efficient.

Trying to shrink the buffer pool could reduce this ratio. If we can’t afford this, and the buffer pool is not clearly oversized, let’s leave things as they are.

Counting accesses to pages

In am oversized buffer pool that is not as big as the whole dataset, all the most frequently pages stay in memory all the time. Other pages, depending on the running queries, may stay there forever, or be regularly evicted to make room for other pages that are not frequently read (and will therefore be evicted). But how to check if this is what happens?

The information_schema database contains a table called INNODB_BUFFER_PAGE. It provides a list of the pages present in the buffer pool, and each page can be uniquely identified. My basic idea is to check for how much time the pages stay in memory.

For this purpose I wrote a stored procedure. It is based on a Morgan Tocker’s idea. What my procedure does is:

  • Scan information_schema.INNODB_BUFFER_PAGE for a specified number of times, at a specified interval.
  • Write each found page id’s into a temporary table.
  • After all observations, return the number of pages by frequency. Example:
+--------------+-------------+----------+
| occurrencies | page_number | bytes    |
+--------------+-------------+----------+
| 3            | 34          | 557056   |
| 4            | 57          | 933888   |
| 5            | 562         | 9207808  |
| <null>       | 687         | 11255808 |
+--------------+-------------+----------+

In this example 34 pages were found 3 times, 4 pages were found 57 times and 5 pages were found 562 times. The total number of found pages is 687. Looking at the size in bytes helps us understand if a certain number of pages is relevant or not.

The procedure is the following:

-- Example:
-- CALL _.show_working_set_size(5, 3);
CREATE PROCEDURE show_working_set_size(
        IN in_observations_count INT UNSIGNED,
        IN in_observation_interval INT UNSIGNED
    )
    MODIFIES SQL DATA
BEGIN
    DROP TEMPORARY TABLE IF EXISTS innodb_used_pages;
    CREATE TEMPORARY TABLE innodb_used_pages (
        block_id INT UNSIGNED NOT NULL COMMENT 'Memory block id',
        pool_id INT UNSIGNED NOT NULL COMMENT 'Bupper pool instance id',
        occurrencies INT UNSIGNED NOT NULL COMMENT 'How many times the page was found in buffer pool',
        -- block_id is probably not unique across instances
        PRIMARY KEY (block_id, pool_id)
    )
        ENGINE MEMORY
        COMMENT 'Stats on pages found in the buffer pool'
    ;

    WHILE in_observations_count > 0 DO
        INSERT IGNORE INTO innodb_used_pages
            SELECT pool_id, block_id, 1 AS occurrencies
                FROM information_schema.INNODB_BUFFER_PAGE
                WHERE PAGE_STATE NOT IN ('NOT_USED', 'READY_FOR_USE')
                AND PAGE_TYPE NOT IN ('IBUF_FREE_LIST', 'TRX_SYSTEM', 'UNDO_LOG')
            ON DUPLICATE KEY UPDATE occurrencies := occurrencies + 1;
        DO SLEEP(in_observation_interval);
        SET in_observations_count = in_observations_count - 1;
    END WHILE;
   
    SELECT
            occurrencies,
            COUNT(*) AS page_number,
            COUNT(*) * @@innodb_page_size AS bytes
        FROM innodb_used_pages
        GROUP BY occurrencies WITH ROLLUP;
    DROP TEMPORARY TABLE innodb_used_pages;
END;

Notes:

  • The procedure works on both MySQL and MariaDB. I expect Morgan’s procedure to run on them both too.
  • The above procedure can probably be improved, and if you do, you are very welcome to leave a comment here.
  • Queries against information_schema.INNODB_BUFFER_PAGE cause contention. Don’t use this procedure on busy servers.
  • Morgan’s solution identifies the pages by (block_id). I believe this was correct at the time, but not now we have multiple buffer pool instances, so I identify them by (block_id, pool_id). I hope I’m correct; if not, I don’t think that the impact on the results is relevant.
  • I try to ignore irrelevant pages, but I don’t understand all values of PAGE_TYPE and PAGE_STATE.

Reference

MySQL

MariaDB

See also

Related articles

Related courses

Conclusions

An oversized InnoDB buffer pool is waste of resources, and possibly a performance problem. Not only it can be slower than necessary, but it takes memory that could be needed by other buffers.

We discussed the methods I use to check if a buffer pool is oversized. We recommend not to use the third method yourself unless you are a MariaDB/MySQL professional: as mentioned, it can cause contention, and probably you don’t know how to check that.

Checking the size of the free list and the ratio between memory reads and disk reads is part of our MariaDB Health Checks or MySQL Health Checks. Consider using a health check if you have a doubt that you are wasting your memory. Also, if you find out that your buffer pool is oversized, in my experience this means that there are many other aspects to tune in your MySQL servers.

For an expert review of your MariaDB or MySQL configuration, consider our MariaDB Health Checks or MySQL Health Checks.

Federico Razzoli

Did you like this article?

All content in this blog is distributed under the CreativeCommons Attribution-ShareAlike 4.0 International license. You can use it for your needs and even modify it, but please refer to Vettabase and the author of the original post. Read more about the terms and conditions: https://creativecommons.org/licenses/by-sa/4.0/

About Federico Razzoli
Federico Razzoli is a database professional, with a preference for open source databases, who has been working with DBMSs since year 2000. In the past 20+ years, he served in a number of companies as a DBA, Database Engineer, Database Consultant and Software Developer. In 2016, Federico summarized his extensive experience with MariaDB in the “Mastering MariaDB” book published by Packt. Being an experienced database events speaker, Federico speaks at professional conferences and meetups and conducts database trainings. He is also a supporter and advocate of open source software. As the Director of Vettabase, Federico does business worldwide but prefers to do it from Scotland where he lives.

Recent Posts

How to query a REST API with MariaDB CONNECT engine

How to query a REST API with MariaDB CONNECT engine

Querying a remote REST API directly from the database: does it sound like pure madness? Some people will surely answer that yes, it does. So, first of all, let me spend a paragraph to show some great benefits of doing so. Then I'll show you how to query a REST API...

Coming up next, a ColumnStore webinar

Coming up next, a ColumnStore webinar

The 27th is fast approaching but you can still signup to our live webinar where we will be exploring the benefits of time series data using MariaDB ColumnStore. We will be jumping into a live demo with some example data and queries simulating high volume time series...

The benefits of MariaDB ColumnStore

The benefits of MariaDB ColumnStore

Last week Richard announced our projects on MariaDB ColumnStore. Take a look at his great post, if you didn't already. Since then, I've got some questions from customers, colleagues and friends: why did you guys decide to robustly invest into ColumnStore, and offer...

Services

Need Help?  Click Here for Expert Support

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *