Years ago, it was extremely common to store user-uploaded images and other multimedia contents in a database. This practice became less common over time, but I still hear questions like: would it be a bad idea to store images in our database? Or, more specifically: how bad would it be to store images in a database? In this article, we’ll discuss the pros and cons. We’ll briefly discuss the right ways to store user-uploaded contents. And we’ll discuss what you can do to mitigate problems, if you have to store images or big files in a database.
Pros
Yes, there are obvious pros in storing images in a database:
- Simple code: You don’t have to implement the code to store images in some other technology.
- Single transaction: You add a database row and the related images together. In this way, it’s not possible to have a partial write, where only the row is created, or only the images are created. The same applies to deletions and updates.
- Referential integrity: Foreign keys and
CHECK
constraints apply, ensuring referential integrity between images and other database rows. - Security: Provided that your database is secure (and it should be for a number of reasons!), your images should be safe. Adding a new technology means to expand the attack surface of your IT stack.
- Simple infrastructure: If you store images in another technology, even the simplest (AKA filesystem), your company will need to implement a lot of automated operations to handle it:
- Monitoring.
- Backups.
- Redundancy and failover.
- Periodical cleanup.
On a similar topic, see The costs of using multiple database technologies.
Cons
Despite the above-mentioned pros, storing images in a database is not optimal and (personal opinion) inelegant. Here’s the list of counter-arguments:
- Database caches: Most databases have a data and index cache called buffer pool. PostgreSQL uses the OS caches for the same purpose. A server’s memory is not infinite, and by the way it has a cost. Polluting the buffer pool with images and other multimedia objects might prevent other contents from being cached, making the database generally slower.
- External caches: If you use an external cache like Redis for databases, this cache will be used for images, too. Apparently this is not a problem, but you might want to set a separate TTL or a separate redundancy strategy for images. You don’t necessarily need a separate Redis instance: key prefixes are often sufficient.
- Transaction logs: A database uses transaction logs to be able to repeat the data writes if the server crashes before all the relevant data files are updated (these files are written in a lazy way for performance reasons). MariaDB and MySQL have storage engine specific logs, plug a global binary log. Similar logs exist even for databases that don’t support transactions, to make them crash-safe. In some cases, a big log might fill the disk. In any case, writing images in this log will slow down writes.
- Replication slowdown: Since transaction logs are bigger, replication will be slower. Asynchronous replication solutions might lag, and synchronous replication solutions might slow down the whole cluster.
- Network bottlenecks: When an image is requested, it must be sent to the client. This takes some time, as the packet size is limited. While less frequent, if many concurrent connections request images, this can saturate the network bandwidth.
- Backup overhead: Large images make backups slower to generate, slower to restore, larger and more expensive in the cloud.
- Storage costs: If you use a managed database solution, the storage costs are higher than all alternatives.
The Alternatives
But which alternatives do we have? Well, this goes a bit beyond the scope of Vettabase consulting, but if you need at least a quick overview, here it is.
The simplest ways to make images accessible to web servers is to store them on a filesystem or on a CDN. It’s also possible to cache them with a technology like Redis, and make them available via HTTP through a web service.
This doesn’t mean that filesystems or CDNs have to be the primary storages for images. I think it’s a good idea to persist them into a more appropriate technology and, from there, send them to a filesystem or a CDN.
You might want to persist images into an S3 object storage. S3 is an open protocol originally designed for the AWS S3 service. Other cloud platforms include it, like GCP Cloud Storage. Open source implementations include MinIO and OpenShift Swift (with a middleware).
A less common alternative is NAS. NAS solutions are designed for file sharing within a company, but if we have one and it’s under-utilised, nothing prevents us from making it accessible from a web application for persistence of user-uploaded files. Since NAS servers typically come with their own light web server, we might be tempted to make images directly available from NAS. But I don’t recommend this, as NAS is not optimised for a real-time usage.
Mitigating the Performance Problems
Maybe you realised that you shouldn’t store images in a database, but for your project you’ll have to do so anyway. Or maybe you already store images in a database and you realise that this needs to change, but it will take time. There are some things that you can to do mitigate the performance problems cause by images in your database.
Normalise tables
Maybe you have an article
column, with an image
column. That’s the worse case for at least these reasons:
- Many developers and ORMs tende to use
SELECT *
, so the images will always be sent to the application, even when not needed (for example, even in the page that only shows the titles and dates). - Applications change over time. At some point you might be told that a second image should be added, then a third, then a PDF. Even assuming that you use a DBMS that can effortlessly add a column, the table will become more problematic.
- If the same image is used in more places, it will be loaded multiple times.
You need to store images in a separate table, and probably you even need a “many to many” relationship. See the following diagram (click to zoom):

Use invisible columns
If you don’t normalise the table, you want a way to exclude big columns from SELECT *
. Extracting the column will still be possible, but the queries will have to name the column explicitly. To achieve this, some DBMSs support a feature called invisible columns. The following syntax is for MariaDB:
> CREATE OR REPLACE TABLE article(
-> id UUID DEFAULT UUID(),
-> title VARCHAR(200) NOT NULL,
-> image BLOB INVISIBLE DEFAULT 'TEST',
-> PRIMARY KEY (id)
-> );
> INSERT INTO article (title)
-> VALUES ('LLMs prefer talking to cats');
> SELECT * FROM article;
+--------------------------------------+-----------------------------+
| id | title |
+--------------------------------------+-----------------------------+
| 00032f2f-55b1-11f0-a71d-06e7dafdc5e9 | LLMs prefer talking to cats |
+--------------------------------------+-----------------------------+
> SELECT *, image FROM article;
+--------------------------------------+-----------------------------+-------+
| id | title | image |
+--------------------------------------+-----------------------------+-------+
| 00032f2f-55b1-11f0-a71d-06e7dafdc5e9 | LLMs prefer talking to cats | TEST |
+--------------------------------------+-----------------------------+-------+
As you can see, the image
column is only shown if you name it explicitly. But you can still use *
, so minimal code changes are required.
Use the correct type
Every DBMS has a binary-safe text type for binary data. With MariaDB and MySQL they are BLOB
and similar types. With PostgreSQL, it’s BYTEA
. For other DBMSs, consult the documentation. Avoid TEXT
and VARCHAR
, because they might be less optimised and they might not be binary-safe.
Exceptions
I gave you several reasons why you shouldn’t store images in databases.
Are there exceptions? Yes, sometimes we have to do something less than optimal:
- Trivial websites. Maybe your website is trivial. Maybe it’s just a blog, it’s not visited by a lot of people, and it’s not particularly valuable. Sure, in theory storing images in a database is still not great, but in practice any alternative is not worth the effort.
- Too much work. Maybe the system or devops teams are undersized. They just can’t add one more technology, especially if they’re not familiar with it. This is more common than not.
- Prototypes. Maybe you’re just developing a prototype. This is a dangerous reason, because you are creating technical debt. Which is not necessarily bad, but will you ever repay this debt?
If you decide to keep storing images in your database, be aware of the risks and try to mitigate the performance problems as I described above. If you need help, keep in mind that you can contact us.
Federico Razzoli
0 Comments