MariaDB stored procedures and why they should be improved was the title of my talk at MariaDB Server Fest 2023 in Helsinki. My point was that an improved stored procedures implementation would bring benefits to individual users, the MariaDB community, and the two organisations behind it: the Foundation and the Corporation. Some interesting discussions followed that talk. But human memory works like InnoDB buffer pool: it tends to evict the least accessed elements, like the interesting discussions that never had a follow-up. So I decided to write one or more articles to advocate the use of stored procedures, not just with MariaDB, but with all database technologies that support them.
This article is a list of the main reasons and the benefits of stored procedures.
Avoid data roundtrip
If your application processes data that is read from a database, that data is sent over a network. Usually this is not the problem, because modern hardware is fast and data read is usually small.
There are exceptions. Imagine this situation: all DBMSs implement the arithmetic mean, but not other types of averages, such as the harmonic mean of the quadratic mean. You need one of them, so you implement it in your application. Then you need to read one million values and calculate a mean. One million rows will travel on a network just to get a single aggregate value. This is a perfect use case for a stored procedure: run the calculations in your database server, and then send a single number over the network.
Short lock time
Some operations run in a transactions that involve multiple SQL statements, and acquire locks that are held until the end of the transaction. This is often inevitable, but for medium to long transactions locking is a problem, especially if it occasionally leads to deadlocks.
If queries are fast, the overhead of the communications between the database server and the application makes the difference. But if the whole transaction (or a relevant part of it) can run inside a stored procedure, the locking time might remarkably reduce. This will also reduce the chances of a deadlock.
Query and transaction optimisation
Every application has slow queries that need to be optimised. Often, optimising a query means adding indexes, so there’s no need to touch application code. But sometimes queries need to be modified. Other times it’s necessary to edit the transaction commands, set variables, and so on. But most developers use ORMs, and sometimes can’t easily make the necessary changes. This slows down performance improvements, and generally ends up worsening the relationships between DBAs and developers.
What if those who can optimise the queries, namely the DBAs, could edit the SQL by themselves? This would fix the above mentioned problems. Using stored procedures definitely makes it possible. A certain query needs to be modified in a way that is not permitted by the ORM? Optimising a certain transaction requires changing some session variables? DBAs can do this in a set of stored procedures.
Many developers don’t like this approach. Until they are told what it is: a plain simple API. The DBAs will maintain this library, and they’ll call the procedures they need. Even the smallest block of their code uses at least one API. They like it, because it’s a magic functionality that they use, without even wondering how it works. Which is exactly the way they want to work with databases.
Cross-application reusable code
Sometimes the same database is accessed by multiple microservices, or multiple applications. This is the case, for example, when a monolithic application is mostly replaced by microservices, but a monolithic core with a central database remains. Or if an ORM produces data that is also used by a custom application. And sometimes different applications use separate databases, but they work with data of the same types, like strings in a certain format, or certain JSON objects.
In all these cases, different applications will have to do the same things with the same types of data: extracting certain information, validating data, transforming data, and so on. Having different teams re-inventing exactly the same wheel, possibly in different languages, is an unjustified waste of resources and a huge source of bugs. If you’re in this situation, you should consider maintaining a stored procedure library that is shared across multiple applications, and possibly multiple databases. Validation, transformations and information lookup will be implemented only once.
More granular security
Most databases allow to set permissions at various levels: globally, at the database level, at the table level, or even for individual columns. Some databases even allow to set row-level security policies so that, for example, a manager may query an
employees table but will only see the employees who work in her department. Other databases, like MariaDB and MySQL, don’t support row-level security.
If your security model is complex, it might be worth building a library of stored procedures that exposes the operations that applications can legitimately perform, and nothing else. Each application will be granted the permission to execute some or all stored procedures in the library, but not to interact with the tables directly.
This approach has some advantages over implementing complex security policies and permissions:
- More complex logic can be implemented;
- The stored procedures act as a documentation of which operations the applications are allowed to perform;
- No need to use an exotic feature that most DBAs are not familiar with and most tools don’t support.
Leveraging open source
I already mentioned that it can be a good idea to write stored procedure libraries that can be used from several applications or microservices. This is useful to implement logic that is internal to your company. But don’t forget that many problems are common to many organisations, and that is where open source kicks in!
Other organisations might publish open source libraries to easily do what you need to do. If no one did, you might want to publish an open source library. In this way, external contributors might find and fix bugs, improving the library quality, or implement new features, making the library even more useful.
Examples of common problems that can be solved by open source stored procedure libraries include data validation, importing from or exporting to different formats, and data mining.
Feel free to comment to let us know your motivations to develop or use stored procedures. I might have missed some good reasons here. If I see good contributions in the comments, I’ll be happy to expand this article and credit everyone who provided ideas.