Hints to optimise queries with a LIKE comparison

by Federico Razzoli | Sep 15, 2022 | MariaDB, MySQL, PostgreSQL


In SQL, using the LIKE operator is a powerful way to find strings that match a certain pattern. It’s suitable for most use cases, thanks to its two wildcard characters:

_ means any one character.
% means any sequence of zero or more characters.
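
As a quick illustration of the two wildcards, here is a minimal sketch. The product table and its values are hypothetical, made up only for this example.

```sql
-- Hypothetical table and data, only for illustration
CREATE TABLE product (
    id INT PRIMARY KEY,
    code VARCHAR(20) NOT NULL
);
INSERT INTO product (id, code) VALUES
    (1, 'AB1'),
    (2, 'AB12'),
    (3, 'XAB1');

-- _ matches exactly one character: returns 'AB1' only
SELECT code FROM product WHERE code LIKE 'AB_';

-- % matches zero or more characters: returns 'AB1' and 'AB12'
SELECT code FROM product WHERE code LIKE 'AB%';
```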

However, many queries out there are slow because they use LIKE operations that can’t take advantage of an index. This article discusses how to optimise a LIKE operation.

Searching the beginning of a text

There are two ways to use an index for searching:

To search for a single value (WHERE id = 24).
To search for a range of values (a closed range: WHERE id BETWEEN 24 AND 42, or an open range: WHERE id < 24).

When we use LIKE to search the beginning of a text, in practice we perform a range search. In fact, with most collations these two queries return the same rows:

SELECT ... FROM ... WHERE name LIKE 'B%';
SELECT ... FROM ... WHERE name >= 'B' AND name < 'C';

An index can often be used to find the beginning of a text with LIKE.
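
To check whether a prefix search uses an index, you can look at the query plan. The sketch below assumes a hypothetical customer table; the plan you get depends on the DBMS, the data and the statistics, so treat it as a starting point. One PostgreSQL-specific detail: if the database doesn't use the C locale, a plain B-tree index is not used for LIKE, and the index needs a pattern operator class.

```sql
-- Hypothetical table, only for illustration
CREATE TABLE customer (
    id INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

-- MariaDB, MySQL, and PostgreSQL databases that use the C locale
CREATE INDEX idx_name ON customer (name);

-- PostgreSQL databases that use another locale need this instead:
-- CREATE INDEX idx_name_pattern ON customer (name varchar_pattern_ops);

-- Look for the index name (a range scan or an index scan) in the output
EXPLAIN SELECT id, name FROM customer WHERE name LIKE 'B%';
```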

Sometimes a LIKE search checks the beginning of a text but contains more wildcard characters. For example: name LIKE 'A%B%C'. Such searches can still use an index for the part of the pattern that precedes the first wildcard ('A' in this case), but keep reading to see if you can optimise them further.

Most queries are more complex than these examples, so they may combine several filters, an ORDER BY and a GROUP BY. To make sure that a complex query can use an index, you need to understand how indexes work.

Searching the middle of a text

A LIKE operation that searches the middle of a text cannot use an index. But let’s see if you can still do something to optimise your query.

Search the beginning of the text when possible

This may look like a dumb hint, but it isn’t. In the real world, I’ve seen many queries performing a '%middle of text%' search when they could have performed a 'beginning%of%text%' search. Probably the developers didn’t take the time to verify whether the beginning-of-text version was enough, or maybe they just made a mistake. Whatever the reason, it’s a good idea to double-check this kind of search.
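
As an example of this check, suppose that codes always start with a fixed prefix followed by a year, such as 'INV-2022-UK-0001'. This format and the invoice table are assumptions made up for this sketch.

```sql
-- Hypothetical table; codes look like 'INV-2022-UK-0001'
CREATE TABLE invoice (
    id INT PRIMARY KEY,
    code VARCHAR(30) NOT NULL
);
CREATE INDEX idx_code ON invoice (code);

-- Leading wildcard: a B-tree index on code can't narrow down the search
SELECT id, code FROM invoice WHERE code LIKE '%2022%UK%';

-- Same intent, anchored to the known prefix: a range search on 'INV-2022-'
SELECT id, code FROM invoice WHERE code LIKE 'INV-2022-%UK%';
```

The two queries return the same rows only if every code really follows that format, so verify it against the data first.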

Using the single-character wildcard

I have already mentioned that LIKE has two wildcard characters: _ and %.

Use _ or a sequence of _ rather than %, whenever possible. This will reduce the number of characters to search. Take the following examples:

SELECT ... FROM ... WHERE code LIKE '%x%';
SELECT ... FROM ... WHERE code LIKE '__x%';

Let’s suppose that our code has 20 characters. The first version of the query will search all of the characters. The second version will only check the third character. Even though neither query can use an index, the second one does less work, so it will be faster.
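
Note that the two patterns are not interchangeable in general: the second one is only correct when you know the position of the character. A small sketch with made-up data (the item table is hypothetical) shows the difference:

```sql
-- Hypothetical table and data, only for illustration
CREATE TABLE item (
    id INT PRIMARY KEY,
    code VARCHAR(20) NOT NULL
);
INSERT INTO item (id, code) VALUES
    (1, 'abx123'),
    (2, 'abcx12'),
    (3, 'abc123');

-- Matches rows 1 and 2: 'x' may be anywhere
SELECT id FROM item WHERE code LIKE '%x%';

-- Matches row 1 only: 'x' must be the third character
SELECT id FROM item WHERE code LIKE '__x%';
```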

Merging searches

Sometimes you need to search for multiple strings in the middle of a text, but you know in which order they appear if they are present. You should combine them into one search to make a query faster.

Let’s take the following query as an example:

SELECT ... FROM ... WHERE code LIKE '%XX%' AND code LIKE '%YY%';

If you know that “XX” cannot appear after “YY”, you can optimise the query in this way:

SELECT ... FROM ... WHERE code LIKE '%XX%YY%';

This may be an optimisation because the search for “YY” will only be performed after the first appearance of “XX”, and won’t happen at all if “XX” is not present.

You can optimise the query even more if you know how many characters separate the two strings, as an exact number or as a minimum:

-- exactly 3 characters in the middle (three underscores):
SELECT ... FROM ... WHERE code LIKE '%XX___YY%';
-- at least 3 characters in the middle:
SELECT ... FROM ... WHERE code LIKE '%XX___%YY%';
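
To see how these variants behave, here is a sketch on made-up data (the part table is hypothetical). It assumes that, when both strings are present, “XX” always comes first.

```sql
-- Hypothetical table and data, only for illustration
CREATE TABLE part (
    id INT PRIMARY KEY,
    code VARCHAR(20) NOT NULL
);
INSERT INTO part (id, code) VALUES
    (1, 'aXXbYYc'),      -- 1 character between XX and YY
    (2, 'aXXbcdYYe'),    -- 3 characters between XX and YY
    (3, 'aXXbcdefYY'),   -- 5 characters between XX and YY
    (4, 'aXXbcd');       -- no YY

-- Two separate searches: rows 1, 2 and 3
SELECT id FROM part WHERE code LIKE '%XX%' AND code LIKE '%YY%';

-- Merged search: rows 1, 2 and 3
SELECT id FROM part WHERE code LIKE '%XX%YY%';

-- Exactly three characters in between (three underscores): row 2
SELECT id FROM part WHERE code LIKE '%XX___YY%';

-- At least three characters in between: rows 2 and 3
SELECT id FROM part WHERE code LIKE '%XX___%YY%';
```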

SIMILAR TO

Some DBMSs, like PostgreSQL, support the SIMILAR TO syntax, which is more flexible than LIKE but less powerful than regular expressions (the REGEXP operator in MariaDB and MySQL, the ~ operator in PostgreSQL). Some argue that SIMILAR TO is useless, but we’re going to use it here for simplicity – nothing prevents you from using a regular expression, if you prefer to do so.

SIMILAR TO can occasionally be used to combine two LIKE conditions. See the example below:

SELECT ... FROM ... WHERE code LIKE '%XX%' OR code LIKE '%YY%';
SELECT ... FROM ... WHERE code SIMILAR TO '%(XX|YY)%';
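
If you prefer regular expressions, the same condition could be sketched as follows. The component table is hypothetical, and the operator depends on the DBMS. An unanchored regular expression matches anywhere in the text, so no wildcards are needed around the alternatives.

```sql
-- Hypothetical table, only for illustration
CREATE TABLE component (
    id INT PRIMARY KEY,
    code VARCHAR(20) NOT NULL
);

-- PostgreSQL: ~ is the case-sensitive regular expression match operator
SELECT id FROM component WHERE code ~ 'XX|YY';

-- MariaDB and MySQL use the REGEXP operator instead:
-- SELECT id FROM component WHERE code REGEXP 'XX|YY';
```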

Flags

Sometimes we use LIKE to search for text (probably codes) containing a certain flag. In this case, there is usually a small number of possible flags we may need to look for. This is very different from a LIKE search based on whatever our application users may type.

If the number of important flags is small (I’d say 3, just to give you a number), you may consider creating boolean columns that are TRUE if the flag is present, and FALSE otherwise. You may use generated columns to guarantee that the boolean value is always correct:

ALTER TABLE table_name
    ADD COLUMN is_tkk BOOLEAN
    AS (product_code LIKE '%tkk%')
    STORED;

This expression will be calculated when a row is INSERTed or the product_code column is UPDATEd.
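
PostgreSQL (version 12 or later) requires the longer GENERATED ALWAYS form for a stored generated column; to my knowledge, recent MariaDB and MySQL versions accept it too. A sketch of that definition, plus a query that uses the flag, could look like this. The catalogue table and the index name are placeholders, and whether an index on a boolean column helps depends on how selective the flag is.

```sql
-- Hypothetical table, only for illustration
CREATE TABLE catalogue (
    id INT PRIMARY KEY,
    product_code VARCHAR(50) NOT NULL
);

-- Stored generated column, in the syntax that PostgreSQL 12+ requires
ALTER TABLE catalogue
    ADD COLUMN is_tkk BOOLEAN
    GENERATED ALWAYS AS (product_code LIKE '%tkk%') STORED;

-- Optional: index the flag, mostly useful if few rows have it
CREATE INDEX idx_is_tkk ON catalogue (is_tkk);

-- The query no longer needs a LIKE condition
SELECT id, product_code FROM catalogue WHERE is_tkk;
```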

GIN indexes

I mentioned that the problem with LIKE searches in any position of the text is that no index can be used, even if a suitable index exists. Actually, that’s only true for B-tree indexes, because they are ordered data structures. Now, think about another ordered data structure familiar to everyone: a book index. You can quickly find the topic “Belgium”, or all topics starting with B, or with Be. But if you want to find all topics containing “giu”, you’ll have to search the whole index.

PostgreSQL supports many index types, and one of these types is GIN. An explanation of GIN indexes would be out of the scope of this article, but let’s just say that, combined with the trigram operator class from the pg_trgm extension, they are suitable for searches in the middle of a text.

CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_gin_code ON table_name USING gin (code gin_trgm_ops);

A common misconception is that GiST indexes are a valid alternative. But for a case like this, they’re not. I’ve seen situations where using them is slower than a table scan.

Searching the end of a text

A LIKE operation that searches the end of a text cannot use an index. However, there is a workaround.

If we reverse both the text in the column and the LIKE pattern, we’ll have a search at the beginning of the text. Here’s the example:

SELECT ... FROM ... WHERE code LIKE '%abc';
-- will become:
SELECT ... FROM ... WHERE code_rev LIKE REVERSE('%abc');

code_rev can either be populated by the application, or it can be a generated column. See the example below:

ALTER TABLE table_name
    ADD COLUMN code_rev VARCHAR(200)
        AS (REVERSE(code))
        STORED,
    ADD INDEX idx_code_rev (code_rev);
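
In PostgreSQL, an extra column isn't strictly necessary: an index on an expression can serve the same purpose. The following is a sketch under that assumption, with a hypothetical shipment table. The text_pattern_ops operator class is only needed if the database doesn't use the C locale.

```sql
-- PostgreSQL sketch; shipment is a hypothetical table
CREATE TABLE shipment (
    id INT PRIMARY KEY,
    code TEXT NOT NULL
);

-- Index the reversed text directly, without a dedicated column
CREATE INDEX idx_code_rev ON shipment (reverse(code) text_pattern_ops);

-- The query must use the same expression as the index
SELECT id, code FROM shipment WHERE reverse(code) LIKE reverse('%abc');
```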

If the codes aren’t short, we should ask ourselves if we need to have the whole reversed text in code_rev. To speed up writes and reduce the index size, we can reverse only the last N characters instead, like this: REVERSE(RIGHT(code, 10)). The searched suffix must then be no longer than N characters.
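
Since a suffix search looks at the end of the text, a partial reversed column has to be built from the last characters of the code. Here is a sketch in MariaDB/MySQL syntax, with a hypothetical parcel table and a length of 10 chosen arbitrarily:

```sql
-- Hypothetical table, only for illustration
CREATE TABLE parcel (
    id INT PRIMARY KEY,
    code VARCHAR(200) NOT NULL,
    -- reversed copy of the last 10 characters of code
    code_rev VARCHAR(10) AS (REVERSE(RIGHT(code, 10))) STORED,
    INDEX idx_code_rev (code_rev)
);

-- Works as long as the searched suffix is not longer than 10 characters
SELECT id, code FROM parcel WHERE code_rev LIKE REVERSE('%abc');
```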

Optimising regular expressions with LIKE

It’s worth mentioning that LIKE can also be added to a query with REGEXP, adding a redundant condition, to allow the DBMS to use an index. I called this technique the regexp + LIKE optimisation.
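
A minimal sketch of the idea, in MariaDB/MySQL syntax. The ticket table and the pattern are made up for this example. The LIKE condition has to be implied by the regular expression, otherwise the result changes.

```sql
-- Hypothetical table, only for illustration
CREATE TABLE ticket (
    id INT PRIMARY KEY,
    code VARCHAR(50) NOT NULL,
    INDEX idx_code (code)
);

-- The regular expression alone can't use idx_code to narrow down the rows
SELECT id, code FROM ticket WHERE code REGEXP '^AB[0-9]+X';

-- Every match starts with 'AB', so the LIKE condition is redundant,
-- but it lets the optimiser consider a range search on idx_code
SELECT id, code FROM ticket
WHERE code LIKE 'AB%' AND code REGEXP '^AB[0-9]+X';
```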

Conclusions

We discussed how to optimise a LIKE condition to make sure that it can take advantage of an index. To learn more about query optimisation, you may consider attending our MariaDB Query Optimisation or MySQL Query Optimisation training courses.

Federico Razzoli

All content in this blog is distributed under the CreativeCommons Attribution-ShareAlike 4.0 International license. You can use it for your needs and even modify it, but please refer to Vettabase and the author of the original post. Read more about the terms and conditions: https://creativecommons.org/licenses/by-sa/4.0/

About Federico Razzoli

Federico Razzoli is a database professional, with a preference for open source databases, who has been working with DBMSs since year 2000. In the past 20+ years, he served in a number of companies as a DBA, Database Engineer, Database Consultant and Software Developer. In 2016, Federico summarized his extensive experience with MariaDB in the “Mastering MariaDB” book published by Packt. Being an experienced database events speaker, Federico speaks at professional conferences and meetups and conducts database trainings. He is also a supporter and advocate of open source software. As the Director of Vettabase, Federico does business worldwide but loves to do it from Scotland, where he lives. Follow Federico on his personal blog: Federico's Thoughts.
