Nowadays, there is a common misconception that we don’t need to optimise the technologies we use, because we can just scale out or scale up. Scaling out means adding one or more instances of a technology; scaling up means running it on more powerful servers. The cloud makes both options easy.
But trying to scale a database that needs optimisation has many drawbacks, which I’m going to discuss here. Essentially:
- Scaling the number or size of instances might not improve performance;
- Adding instances can, in some cases, even make performance worse;
- Costs might skyrocket.
The drawbacks
This approach has several problems:
- Sometimes it just won’t work. You’ll have 2 slow servers instead of 1, or 4 CPUs with unreasonable IOWait time instead of 1.
- Costs will scale, too. I’ve heard many times that it’s ok, because instances are cheaper than human work hours. But is it true? Don’t forget that, when you upgrade an instance type, you might be increasing that cost for the whole lifetime of your company.
- Inefficiency is the difference between how an instance should perform, and how it actually performs because of poor configuration or improper usage. If the instances are configured to benefit from scaling, inefficiency might scale more than performance.
The “scale instead of optimising” approach is an exaggeration of the DevOps philosophy, which is a great methodology, but it works best for stateless technologies. This means that its drawbacks are biggest when you try to use it with databases – the most stateful technology of all!
Let’s look at some important examples of why scaling instead of optimising often doesn’t work with databases.
Queries don’t scale automatically
The general rule is that a slow query remains slow even if you add nodes to a cluster, or even if you add vCPUs to instances. This is especially true for OLTP databases.
Analytical queries are the type of query that can potentially be split over multiple nodes of a cluster, or over multiple CPU cores. For this reason, database technologies built for analytics tend to be able to distribute queries in these ways, so they can benefit from both scaling out and scaling up.
However, distributing a query over multiple nodes or cores has a cost that is reasonable for analytics, but often unreasonable for OLTP. A coordinator node or thread needs to split the query into multiple work units, send them to other nodes or threads, receive the partial results and store them in memory, and finally process the whole result set to remove duplicates, aggregate values or order rows. Also, communication between threads is fast, but network communication between nodes is slow. For analytical queries that read several big tables, the time saved can still be several hours, which is great for the poor human who ran the query and needs the results to continue her job. But most OLTP queries are supposed to take less than 0.01 seconds, and that goal can hardly be achieved if a coordinator thread or node has to perform the activities described above.
It’s also a good idea to identify and eliminate useless queries, because they consume server resources, even if they always return zero rows or always fail with an error.
For an OLTP database, before adding new nodes or upgrading instance types, you should optimise your queries. Adding nodes won’t make slow queries faster, and it will waste your budget.
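To make this concrete, here is a minimal sketch of what query optimisation often looks like in practice, in MariaDB/MySQL syntax; the orders table, its columns and the index name are hypothetical:

    -- Check how the server executes a frequent, slow query
    EXPLAIN SELECT id, total
    FROM orders
    WHERE customer_id = 42 AND status = 'shipped';

    -- If the plan shows a full table scan, a composite index may turn
    -- the query from seconds into milliseconds:
    ALTER TABLE orders ADD INDEX idx_customer_status (customer_id, status);

An index like this often does more for response times than any number of extra nodes, because each query still runs on a single node.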
Scaling your databases without optimising queries might not bring the benefits you expect. Typically, a query execution won’t span across multiple nodes.
Transactions don’t scale automatically
In a relational database, any read or write runs in a transaction. There is no such thing as a “base version” of the data: data is versioned, and every version was generated by a transaction commit. It’s exactly like files in a git repository.
Under the hood, the database performs a lot of activities to give us some guarantees summarised by the ACID acronym:
- Atomicity: a transaction either completely succeeds or completely rolls back;
- Consistency: a transaction is applied altogether, bringing the data from state A to state B: no intermediate states are admitted;
- Isolation: each transaction can only see the data that were committed when the transaction started, as if transactions were executed sequentially by a single thread;
- Durability: committed data cannot be lost, even in the event of a database crash.
Developers shouldn’t see transactions as black magic that can’t be understood, or that they don’t need to understand. The above guarantees are enforced by acquiring locks on rows, so that they can’t be modified by two different transactions in inconsistent ways, and by acquiring snapshots, which means that old versions of data must be preserved while there are active transactions that use them. These activities can remarkably slow down a database and can limit an application’s scalability. A special problem is deadlocks: two or more transactions lock rows needed by each other. To avoid clients waiting forever, the database kills at least one of the involved transactions.
Optimising transactions means choosing the proper isolation level (or consistency level), keeping transactions as short as they can be, and various tricks that can be used to avoid or reduce deadlocks.
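As a sketch of what this means in practice, the following transaction (MariaDB/MySQL syntax, with a hypothetical accounts table) sets an explicit isolation level, stays short, and touches rows in a consistent order to reduce the chance of deadlocks:

    -- Applies to the next transaction started in this session
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

    START TRANSACTION;
    -- Always update accounts in ascending id order, so that two concurrent
    -- transfers cannot lock the same rows in opposite orders (a classic deadlock)
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;
    COMMIT;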
Scaling your databases without optimising transactions might not bring benefits. Locks can be even more harmful in a database cluster.
Storage space waste
A typical production database wastes disk space by storing items that are not useful anymore, or were never useful. The most typical are:
- Obsolete tables and columns;
- Unnecessary historical data;
- Tables that are sparse matrices, with most columns unset for most rows;
- Unused indexes (see the sketch after this list);
- Redundant indexes, even if used (another existing index could be used instead);
- Oversized primary keys, which are especially costly for technologies that append the primary key to every secondary index (InnoDB and other engines that cluster tables by their primary key).
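As an example of how unused indexes can be identified, here is a minimal sketch for MariaDB/MySQL, assuming the sys schema is available (MySQL 5.7+ or a recent MariaDB); the orders table and the idx_obsolete index are hypothetical:

    -- Indexes that have not been used since the last server restart
    -- (only meaningful after the server has run a representative workload)
    SELECT * FROM sys.schema_unused_indexes;

    -- Hypothetical cleanup, only after verifying the index is really not needed:
    ALTER TABLE orders DROP INDEX idx_obsolete;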
Storage waste affects databases in many ways:
- Bigger indexes and tables are slower;
- Storage has a cost (a quick way to measure it is sketched after this list);
- The time to take a backup increases;
- The time to restore a backup increases, worsening MTTR and making the RTO harder to meet;
- Backup storage cost increases;
- Backup archiving cost increases;
- Data pipelines are usually affected, in terms of speed, reliability, and the cost of moving data between clouds.
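Before deciding what to clean up, it helps to know where the space goes. A minimal sketch for MariaDB/MySQL, using only information_schema (the 10-row limit is arbitrary):

    -- Approximate data and index footprint per table, biggest first
    SELECT table_schema, table_name,
           ROUND(data_length / 1024 / 1024) AS data_mb,
           ROUND(index_length / 1024 / 1024) AS index_mb
    FROM information_schema.tables
    ORDER BY data_length + index_length DESC
    LIMIT 10;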
You’ll be able to scale your databases even if you’re wasting storage space. But you will spend more money for nothing.
Suboptimal configuration
Every dataset and every workload is different, and has different performance characteristics. Some workloads require the strongest consistency settings, at the expense of performance if necessary. Some workloads require very high performance, at the price of small inconsistencies. Some datasets fit entirely in a cache. Many don’t, but the most frequently read data should still be cached. Sometimes complex joins on big tables require big buffers. Sometimes buffers should be small, to allow the highest possible number of very simple queries per second. A read-ahead buffer is important for rotating disks, but not for SSDs.
It is important to configure the database servers properly. The configuration that makes a server fast with its workload might be terrible for another server, which runs a different workload. To make things more complicated, datasets and workloads change over time. A configuration that proves good today might not work well in two months.
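As an example of what configuration tuning can look like, here is a minimal sketch for a recent MariaDB/MySQL server, where the InnoDB buffer pool can be resized online; the 8 GB value is purely hypothetical and depends on your dataset and available RAM:

    -- Current buffer pool size, and a rough measure of how often reads miss it
    SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_size';
    SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_requests';  -- logical read requests
    SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads';          -- reads that had to hit disk

    -- If the miss ratio is high and RAM is available, resize the buffer pool online
    -- (MySQL 5.7+ / MariaDB 10.2+; the value is rounded to the chunk size):
    SET GLOBAL innodb_buffer_pool_size = 8 * 1024 * 1024 * 1024;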
For databases, several technologies need to be configured properly:
- The database server itself and the underlying OS;
- Load balancers and the underlying OS;
- Service discovery tools, like Consul or Keepalived.
Without configuration tuning, scaling might bring little or no benefits. In any case, you’ll pay for more instances, or bigger instances, potentially for the whole life of your database. But you might simply need some configuration tuning.
Distributing a workload is not trivial
Finally, even if you optimise everything else, you can only benefit from scaling your databases if you can distribute the workload properly amongst the nodes of a cluster.
You should use proper load balancers, rather than relying on an application library. In the MariaDB and MySQL world, it could be ProxySQL. In the PostgreSQL world, it could be pgpool.
Your load balancer should be SQL-aware. It should be able to understand transactions, because all the queries that are part of a transaction must be directed to the same backend node. Similarly, it should understand things like temporary tables and user variables, because they will only exist in one node, but a client that creates them will probably try to use them in a later query.
As a general rule, a load balancer should send writes to the master and distribute reads evenly across the replicas. When a transaction starts or a temporary table is created, the load balancer doesn’t know whether one of the following queries will be a write, so it will send all of that client’s subsequent queries to the master.
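For instance, with ProxySQL a basic read/write split could look like the following sketch, run against the admin interface. Hostgroup 10 (the master) and hostgroup 20 (the replicas) are assumptions, and real setups usually need more rules:

    -- Send SELECT ... FOR UPDATE to the master, plain SELECTs to the replicas;
    -- queries matching no rule go to the user's default hostgroup (the master).
    INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
    VALUES (1, 1, '^SELECT .* FOR UPDATE$', 10, 1);
    INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
    VALUES (2, 1, '^SELECT', 20, 1);
    LOAD MYSQL QUERY RULES TO RUNTIME;
    SAVE MYSQL QUERY RULES TO DISK;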
Applications should be developed keeping these mechanisms in mind. Developers should follow some rules:
- Use transactions whenever it’s necessary, to avoid inconsistencies caused by the workload distribution.
- Try to keep transactions brief, splitting them when possible;
- Even better, when possible, merge multiple SQL statements into one (see the sketch after this list);
- Avoid ambiguous situations, where the load balancer has to send a sequence of queries to the master even if they don’t write anything.
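To illustrate the point about merging statements, here is a trivial sketch; the products table and the quantities are hypothetical:

    -- Two round trips, two statements...
    UPDATE products SET stock = stock - 1 WHERE id = 100;
    UPDATE products SET stock = stock - 2 WHERE id = 200;

    -- ...or a single statement that does the same work in one round trip:
    UPDATE products
    SET stock = stock - CASE id WHEN 100 THEN 1 WHEN 200 THEN 2 END
    WHERE id IN (100, 200);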
In most cases, you shouldn’t try to distribute writes. Writes involve all nodes, so they can’t scale. Normally, this is not a problem. When it is, you might want to consider a data sharding solution. We’ll not discuss sharding here, because it’s a complex topic, outside of the scope of this article.
When you scale, check if your workload is well distributed. It’s not uncommon to see some replicas, or even all replicas, mostly unused. In this case, scaling can’t bring benefits.
Conclusions
Is scaling out and/or up the right thing to do for you? It might or might not be, depending on many factors. But before considering scaling up or out, you should optimise your queries, your schemas and your configuration. After that, you might find out that you don’t need to pay for more or bigger instances. And even if you still need to scale, you can avoid paying for a crazy number of replicas, or for crazily big instances.
Federico Razzoli