Schedule Meeting

a

Cassandra: How to run a query with Maximum Consistency

by | Dec 8, 2022 | Cassandra

Need Help?  Click Here for Expert Support

Cassandra consistency levels are a tricky concept, until you familiarise with them. They’re based on a simple consideration: not all data and not all queries require the same level of correctness.

By default:

  • data changes are written to only one node for better performance;
  • queries involve only one node to improve performance;
  • so, queries only “see” the data changes sent to that particular node.

Sometimes this is not acceptable. That’s why we can change the consistency level for an individual read, for an individual write, or for the current session.

What if we want to make sure that some or all queries are completely correct, so that no stale data is returned? We can adjust the consistency levels. Read the rest of this article to learn how they can be configured.

Understanding the quorums and data centres

A quorum is the smallest majority of a group. It’s often described as 50% + 1. Or, if you like precise expressions, it should be described in this way:

floor(population / 2 + 1)

In a Cassandra cluster, a quorum is the 50% of the nodes + 1.

But we’re talking about logical nodes here. Because keyspaces have a replication factor. A replication factor of 3 means that every row must exist in 3 nodes. A higher replication factor increases data redundancy, but it slows down reads and writes that use a certain replication factor.

If you’re using a single data centre, the number of logical nodes is a keyspace replication factor. Let’s admit that you’re using multiple data centres, perhaps logical ones. Then the number of logical nodes is the sum of replication factors of all data centres.

The Maximum Consistency formula

What is the most intuitive way to obtain Maximum Consistency? Setting the ALL consistency level for all reads and writes. But this is the slowest option, as it implies a lot of unnecessary work under the hood.

Instead, you’ll need to consider:

  • the number of nodes affected by the writes you’re interested in;
  • the number of nodes participating in the read operation you’re interested in.

If you want to obtain Maximum Consistency, you need to make sure that the sum of these numbers is higher than the number of logical nodes.

Single data centre

Sacrificing write speed:

  • Writes consistency level: ALL
  • Reads consistency level: ONE

Sacrificing read speed:

  • Writes consistency level: ONE
  • Reads consistency level: ALL

Distributing the overhead between writes and reads:

  • Writes consistency level: QUORUM
  • Reads consistency level: QUORUM

Multiple data centres

The above solutions also work with multiple data centres.

However, if we can send related writes and reads to the same data centre, we can use this combination:

  • Writes consistency level: LOCAL_QUORUM
  • Reads consistency level: LOCAL_QUORUM

What if this solution satisfies our redundancy requirements but we can’t make sure to send related writes and reads to the same data centre? In this case writes can use LOCAL_QUORUM and reads can use EACH_QUORUM.

If this is not enough for our redundancy requirements, writes can use EACH_QUORUM and reads can use LOCAL_QUORUM.

If writes use ALL, reads can use LOCAL_ONE. Or the other way around. Note, this may be a bit risky. The query will fail if the local data centre doesn’t have a healthy node able to accept it.

One, two, three

You may know that Cassandra supports the ONE, TWO and THREE consistency levels. Can we use them to achieve the Maximum Consistency for clusters of 5 logical nodes or less? We can, but these consistency levels are not flexible. They specify a number that remains constant if the number of nodes increases or decreases. If it decreases, this may result in failures. If the number of nodes increases, it will lead to insufficient redundancy .

Should we aim for Maximum Consistency with Cassandra?

This is a reasonable question that we should ask ourselves.

Maximum Consistency means the maximum consistency that Cassandra can guarantee – this is a term I made up, you won’t find it elsewhere. But even if you use Consistency Levels as explained in this article, you might still occasionally face problems with lost Tombstones. In practice, some deletions may be “forgotten” under some circumstances, and erased data may resurrect. This is a topic for another post.

Cassandra is designed to be fast with some compromises that affect data correctness. Trying to achieve the Maximum Consistency means that we don’t take advantage of Cassandra unique characteristics, and we use it for a purpose it wasn’t designed for.

If this is an exception, it is not a problem. We can use relaxed consistency levels for most of our writes and/or queries, and occasionally achieve the Maximum Consistency.

For more practical advice on how to use Cassandra’s consistency levels, see my older post Choosing Cassandra consistency levels.

Federico Razzoli

All content in this blog is distributed under the CreativeCommons Attribution-ShareAlike 4.0 International license. You can use it for your needs and even modify it, but please refer to Vettabase and the author of the original post. Read more about the terms and conditions: https://creativecommons.org/licenses/by-sa/4.0/

About Federico Razzoli
Federico Razzoli is a database professional, with a preference for open source databases, who has been working with DBMSs since year 2000. In the past 20+ years, he served in a number of companies as a DBA, Database Engineer, Database Consultant and Software Developer. In 2016, Federico summarized his extensive experience with MariaDB in the “Mastering MariaDB” book published by Packt. Being an experienced database events speaker, Federico speaks at professional conferences and meetups and conducts database trainings. He is also a supporter and advocate of open source software. As the Director of Vettabase, Federico does business worldwide but prefers to do it from Scotland where he lives.

Recent Posts

Services

Need Help?  Click Here for Expert Support

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *