Schedule Meeting

a

To BLOB or not to BLOB? The image storage dilemma

by | Jul 6, 2025 | SQL Language

Need Help?  Click Here for Expert Support

Years ago, it was extremely common to store user-uploaded images and other multimedia contents in a database. This practice became less common over time, but I still hear questions like: would it be a bad idea to store images in our database? Or, more specifically: how bad would it be to store images in a database? In this article, we’ll discuss the pros and cons. We’ll briefly discuss the right ways to store user-uploaded contents. And we’ll discuss what you can do to mitigate problems, if you have to store images or big files in a database.

Pros

Yes, there are obvious pros in storing images in a database:

  • Simple code: You don’t have to implement the code to store images in some other technology.
  • Single transaction: You add a database row and the related images together. In this way, it’s not possible to have a partial write, where only the row is created, or only the images are created. The same applies to deletions and updates.
  • Referential integrity: Foreign keys and CHECK constraints apply, ensuring referential integrity between images and other database rows.
  • Security: Provided that your database is secure (and it should be for a number of reasons!), your images should be safe. Adding a new technology means to expand the attack surface of your IT stack.
  • Simple infrastructure: If you store images in another technology, even the simplest (AKA filesystem), your company will need to implement a lot of automated operations to handle it:
    • Monitoring.
    • Backups.
    • Redundancy and failover.
    • Periodical cleanup.

On a similar topic, see The costs of using multiple database technologies.

Cons

Despite the above-mentioned pros, storing images in a database is not optimal and (personal opinion) inelegant. Here’s the list of counter-arguments:

  • Database caches: Most databases have a data and index cache called buffer pool. PostgreSQL uses the OS caches for the same purpose. A server’s memory is not infinite, and by the way it has a cost. Polluting the buffer pool with images and other multimedia objects might prevent other contents from being cached, making the database generally slower.
  • External caches: If you use an external cache like Redis for databases, this cache will be used for images, too. Apparently this is not a problem, but you might want to set a separate TTL or a separate redundancy strategy for images. You don’t necessarily need a separate Redis instance: key prefixes are often sufficient.
  • Transaction logs: A database uses transaction logs to be able to repeat the data writes if the server crashes before all the relevant data files are updated (these files are written in a lazy way for performance reasons). MariaDB and MySQL have storage engine specific logs, plug a global binary log. Similar logs exist even for databases that don’t support transactions, to make them crash-safe. In some cases, a big log might fill the disk. In any case, writing images in this log will slow down writes.
  • Replication slowdown: Since transaction logs are bigger, replication will be slower. Asynchronous replication solutions might lag, and synchronous replication solutions might slow down the whole cluster.
  • Network bottlenecks: When an image is requested, it must be sent to the client. This takes some time, as the packet size is limited. While less frequent, if many concurrent connections request images, this can saturate the network bandwidth.
  • Backup overhead: Large images make backups slower to generate, slower to restore, larger and more expensive in the cloud.
  • Storage costs: If you use a managed database solution, the storage costs are higher than all alternatives.

The Alternatives

But which alternatives do we have? Well, this goes a bit beyond the scope of Vettabase consulting, but if you need at least a quick overview, here it is.

The simplest ways to make images accessible to web servers is to store them on a filesystem or on a CDN. It’s also possible to cache them with a technology like Redis, and make them available via HTTP through a web service.

This doesn’t mean that filesystems or CDNs have to be the primary storages for images. I think it’s a good idea to persist them into a more appropriate technology and, from there, send them to a filesystem or a CDN.

You might want to persist images into an S3 object storage. S3 is an open protocol originally designed for the AWS S3 service. Other cloud platforms include it, like GCP Cloud Storage. Open source implementations include MinIO and OpenShift Swift (with a middleware).

A less common alternative is NAS. NAS solutions are designed for file sharing within a company, but if we have one and it’s under-utilised, nothing prevents us from making it accessible from a web application for persistence of user-uploaded files. Since NAS servers typically come with their own light web server, we might be tempted to make images directly available from NAS. But I don’t recommend this, as NAS is not optimised for a real-time usage.

Mitigating the Performance Problems

Maybe you realised that you shouldn’t store images in a database, but for your project you’ll have to do so anyway. Or maybe you already store images in a database and you realise that this needs to change, but it will take time. There are some things that you can to do mitigate the performance problems cause by images in your database.

Normalise tables

Maybe you have an article column, with an image column. That’s the worse case for at least these reasons:

  1. Many developers and ORMs tende to use SELECT *, so the images will always be sent to the application, even when not needed (for example, even in the page that only shows the titles and dates).
  2. Applications change over time. At some point you might be told that a second image should be added, then a third, then a PDF. Even assuming that you use a DBMS that can effortlessly add a column, the table will become more problematic.
  3. If the same image is used in more places, it will be loaded multiple times.

You need to store images in a separate table, and probably you even need a “many to many” relationship. See the following diagram (click to zoom):

ER diagram showing these tables: "article", "image", and "article_image". The latter is a many to many relationship.

Use invisible columns

If you don’t normalise the table, you want a way to exclude big columns from SELECT *. Extracting the column will still be possible, but the queries will have to name the column explicitly. To achieve this, some DBMSs support a feature called invisible columns. The following syntax is for MariaDB:

> CREATE OR REPLACE TABLE article(
    ->     id UUID DEFAULT UUID(),
    ->     title VARCHAR(200) NOT NULL,
    ->     image BLOB INVISIBLE DEFAULT 'TEST',
    ->     PRIMARY KEY (id)
    -> );

> INSERT INTO article (title)
    -> VALUES ('LLMs prefer talking to cats');

> SELECT * FROM article;
+--------------------------------------+-----------------------------+
| id                                   | title                       |
+--------------------------------------+-----------------------------+
| 00032f2f-55b1-11f0-a71d-06e7dafdc5e9 | LLMs prefer talking to cats |
+--------------------------------------+-----------------------------+

> SELECT *, image FROM article;
+--------------------------------------+-----------------------------+-------+
| id                                   | title                       | image |
+--------------------------------------+-----------------------------+-------+
| 00032f2f-55b1-11f0-a71d-06e7dafdc5e9 | LLMs prefer talking to cats | TEST  |
+--------------------------------------+-----------------------------+-------+

As you can see, the image column is only shown if you name it explicitly. But you can still use *, so minimal code changes are required.

Use the correct type

Every DBMS has a binary-safe text type for binary data. With MariaDB and MySQL they are BLOB and similar types. With PostgreSQL, it’s BYTEA. For other DBMSs, consult the documentation. Avoid TEXT and VARCHAR, because they might be less optimised and they might not be binary-safe.

Exceptions

I gave you several reasons why you shouldn’t store images in databases.

Are there exceptions? Yes, sometimes we have to do something less than optimal:

  • Trivial websites. Maybe your website is trivial. Maybe it’s just a blog, it’s not visited by a lot of people, and it’s not particularly valuable. Sure, in theory storing images in a database is still not great, but in practice any alternative is not worth the effort.
  • Too much work. Maybe the system or devops teams are undersized. They just can’t add one more technology, especially if they’re not familiar with it. This is more common than not.
  • Prototypes. Maybe you’re just developing a prototype. This is a dangerous reason, because you are creating technical debt. Which is not necessarily bad, but will you ever repay this debt?

If you decide to keep storing images in your database, be aware of the risks and try to mitigate the performance problems as I described above. If you need help, keep in mind that you can contact us.

Federico Razzoli

All content in this blog is distributed under the CreativeCommons Attribution-ShareAlike 4.0 International license. You can use it for your needs and even modify it, but please refer to Vettabase and the author of the original post. Read more about the terms and conditions: https://creativecommons.org/licenses/by-sa/4.0/

About Federico Razzoli
Federico Razzoli is a database professional, with a preference for open source databases, who has been working with DBMSs since year 2000. In the past 20+ years, he served in a number of companies as a DBA, Database Engineer, Database Consultant and Software Developer. In 2016, Federico summarized his extensive experience with MariaDB in the “Mastering MariaDB” book published by Packt. Being an experienced database events speaker, Federico speaks at professional conferences and meetups and conducts database trainings. He is also a supporter and advocate of open source software. As the Director of Vettabase, Federico does business worldwide but prefers to do it from Scotland where he lives.

Recent Posts

Why managers should learn and use SQL

Why managers should learn and use SQL

SQL is the language of data. You write a query, a single sentence in a language that resembles English, and you obtain the data you need. It's supported by the majority of databases, and the language is pretty standard. If you have a query that works on Oracle, most...

Making the case for stored procedures

Making the case for stored procedures

MariaDB stored procedures and why they should be improved was the title of my talk at MariaDB Server Fest 2023 in Helsinki. My point was that an improved stored procedures implementation would bring benefits to individual users, the MariaDB community, and the two...

Services

Need Help?  Click Here for Expert Support

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *