Can we shrink InnoDB Buffer Pool?

Last updated on 26 Marzo 2021

In my previous article, I wrote how to answer the question: Is InnoDB Buffer Pool big enough? But the buffer pool can also be too big. So you may wonder if you can shrink it.

Piazza del Popolo seen from a higher location called Il Pincio, in Rome.
Piazza del Popolo seen from il Pincio, Rome.
Is it oversized?

Reasons to shrink the Buffer Pool

Why would one do such a thing? A common reason is saving resources. Saving memory can be very important. For example, with AWS you may need a more expensive EC2 instance to have more memory. As a general rule, for db class instances, “more expensive” means that both the on demand cost and the potential effective cost double (at the moment of this writing).

It’s also possible that you need more memory to allocate in other ways. Maybe you have many concurrent users, or they run many complex JOINs, and session buffers need more memory.

And don’t forget that a big buffer pool can be slow. Giving your server an impressive amount of memory to allocate a huge buffer pool could slow down all your reads and your writes.

How to check if InnoDB Buffer Pool is oversized

Here I describe how I check if InnoDB buffer pool is oversized. I’m sure there are other ways.

Whichever method we use, let’s keep in mind that the buffer pool should always have some free pages.

Let’s also take growth into account. If the amount of data you frequently read is getting bigger over time, we should be prepared to handle foreseeable loads in the near future.

Number of free pages

Innodb_buffer_pool_pages_free is the number of free pages. If it’s always a large number, the buffer pool is bigger than the amount of data we use.

It’s hard to give numbers that anyone can use, because every workload is different. Let’s say that having 10% of the buffer pool made of free pages is very prudent, keeping this percentage even bigger is probably useless.

Hits and misses ratio

Innodb_buffer_pool_read_requests is the number of reads. Innodb_buffer_pool_reads is the number of disk reads, that occur when the desired data is not found in the buffer pool. The ratio between these variables tells us if the buffer pool is efficient.

Trying to shrink the buffer pool could reduce this ratio. If we can’t afford this, and the buffer pool is not clearly oversized, let’s leave things as they are.

Counting accesses to pages

In am oversized buffer pool that is not as big as the whole dataset, all the most frequently pages stay in memory all the time. Other pages, depending on the running queries, may stay there forever, or be regularly evicted to make room for other pages that are not frequently read (and will therefore be evicted). But how to check if this is what happens?

The information_schema database contains a table called INNODB_BUFFER_PAGE . It provides a list of the pages present in the buffer pool, and each page can be uniquely identified. My basic idea is to check for how much time the pages stay in memory.

For this purpose I wrote a stored procedure. It is based on a Morgan Tocker’s idea. What my procedure does is:

  • Scan information_schema.INNODB_BUFFER_PAGE for a specified number of times, at a specified interval.
  • Write each found page id’s into a temporary table.
  • After all observations, return the number of pages by frequency. Example:
+--------------+-------------+----------+
| occurrencies | page_number | bytes    |
+--------------+-------------+----------+
| 3            | 34          | 557056   |
| 4            | 57          | 933888   |
| 5            | 562         | 9207808  |
| <null>       | 687         | 11255808 |
+--------------+-------------+----------+

In this example 34 pages were found 3 times, 4 pages were found 57 times and 5 pages were found 562 times. The total number of found pages is 687. Looking at the size in bytes helps us understand if a certain number of pages is relevant or not.

The procedure is the following:

-- Example:
-- CALL _.show_working_set_size(5, 3);
CREATE PROCEDURE show_working_set_size(
IN in_observations_count INT UNSIGNED,
IN in_observation_interval INT UNSIGNED
)
MODIFIES SQL DATA
BEGIN
DROP TEMPORARY TABLE IF EXISTS innodb_used_pages;
CREATE TEMPORARY TABLE innodb_used_pages (
block_id INT UNSIGNED NOT NULL COMMENT 'Memory block id',
pool_id INT UNSIGNED NOT NULL COMMENT 'Bupper pool instance id',
occurrencies INT UNSIGNED NOT NULL COMMENT 'How many times the page was found in buffer pool',
-- block_id is probably not unique across instances
PRIMARY KEY (block_id, pool_id)
)
ENGINE MEMORY
COMMENT 'Stats on pages found in the buffer pool'
;
WHILE in_observations_count > 0 DO
INSERT IGNORE INTO innodb_used_pages
SELECT pool_id, block_id, 1 AS occurrencies
FROM information_schema.INNODB_BUFFER_PAGE
WHERE PAGE_STATE NOT IN ('NOT_USED', 'READY_FOR_USE')
AND PAGE_TYPE NOT IN ('IBUF_FREE_LIST', 'TRX_SYSTEM', 'UNDO_LOG')
ON DUPLICATE KEY UPDATE occurrencies := occurrencies + 1;
DO SLEEP(in_observation_interval);
SET in_observations_count = in_observations_count - 1;
END WHILE;
SELECT
occurrencies,
COUNT(*) AS page_number,
COUNT(*) * @@innodb_page_size AS bytes
FROM innodb_used_pages
GROUP BY occurrencies WITH ROLLUP;
DROP TEMPORARY TABLE innodb_used_pages;
END;

Notes:

  • The procedure works on both MySQL and MariaDB. I expect Morgan’s procedure to run on them both too.
  • The above procedure can probably be improved, and if you do, you are very welcome to leave a comment here.
  • Queries against information_schema.INNODB_BUFFER_PAGE cause contention. Don’t use this procedure on busy servers.
  • Morgan’s solution identifies the pages by (block_id) . I believe this was correct at the time, but not now we have multiple buffer pool instances, so I identify them by (block_id, pool_id) . I hope I’m correct; if not, I don’t think that the impact on the results is relevant.
  • I try to ignore irrelevant pages, but I don’t understand all values of PAGE_TYPE and PAGE_STATE .

Reference

MySQL

MariaDB

See also

Related articles

Related courses

Conclusions

An oversized InnoDB buffer pool is waste of resources, and possibly a performance problem. Not only it can be slower than necessary, but it takes memory that could be needed by other buffers.

We discussed the methods I use to check if a buffer pool is oversized. We recommend not to use the third method yourself unless you are a MariaDB/MySQL professional: as mentioned, it can cause contention, and probably you don’t know how to check that.

Checking the size of the free list and the ratio between memory reads and disk reads is part of our MariaDB Health Checks or MySQL Health Checks. Consider using a health check if you have a doubt that you are wasting your memory. Also, if you find out that your buffer pool is oversized, in my experience this means that there are many other aspects to tune in your MySQL servers.

If you know more way to check if the buffer pool is oversized, please comment!

Federico Razzoli

A proposito di Federico Razzoli

Federico is Vettabase Ltd director, and he's an expert database consultant specialised in the MariaDB and MySQL ecosystems. Federico is also a free software supporter. He believes in information sharing and reciprocal help. He thinks that multiple organisations contributing the same repositoy and benefiting from each other's work are better than many companies re-inventing the wheel. He thinks this view is not just compatible with business, but it's today and tomorrow's way of doing business. Federico also loves beers from the North of Europe, wines from Italy, bagpipes, nature hiking and the history of computing.

Lascia un commento

Il tuo indirizzo email non sarà pubblicato. I campi obbligatori sono contrassegnati *

*