Cassandra login error: Cannot achieve consistency level LOCAL_ONE

Last updated on 18 September 2021

Do you get this error when applications try to establish a connection to Cassandra?

2019-08-10 14:03:51,698 ERROR [com.datastax.driver.core.Cluster] (Production Cluster-reconnection-0) Authentication error during reconnection to 10.0.10.10/10.0.10.10:9042, scheduling retry in 32000 milliseconds
com.datastax.driver.core.exceptions.AuthenticationException: Authentication error on host 10.0.10.10/10.0.10.10:9042: Error during authentication of user app-01 : org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level LOCAL_ONE

This is confusing. Maybe it’s true that one node is down. But this cluster has multiple nodes, and you are contacting one that is working fine. So why does it say that the LOCAL_ONE consistency level cannot be achieved?

What’s more: you didn’t ask for that consistency level anywhere in the application code. Up to that point, you’re just trying to connect!

TL;DR: If you just want to copy/paste a solution, jump to How to fix the problem.

Blixy is fixing a Cassandra cluster that rejects any login saying UnavailableException: Cannot achieve LOCAL_ONE

Why it happens

Whenever a client connects, Cassandra runs a query against its system tables to check whether the provided credentials are correct. This query is run with the LOCAL_ONE consistency level.

This explains part of the error, but you probably still don’t understand why this query fails. LOCAL_ONE tells the node that received the query (the coordinator): this query must be answered by a single replica, located in the same datacentre as the coordinator. This differs from other consistency levels in these ways (a quick way to try this from cqlsh follows the list):

  • There is no need to contact other nodes to check that the data is correct, so there is less latency compared to any consistency level stricter than ONE.
  • With consistency levels stricter than ONE, if the data to return is not fully consistent, a small repair is triggered. This operation is more lightweight if it doesn’t involve traffic between datacentres.
  • LOCAL_ONE requires the node that answers your query to be local. With the ONE consistency level there is no such restriction, so the query could cause cross-datacentre traffic (higher latency).
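You can try this yourself from cqlsh, reproducing roughly the same read that Cassandra performs at login. This is just a sketch: it assumes Cassandra 2.2 or later, where credentials are stored in the system_auth.roles table, and it reuses the app-01 username from the log above.

-- force the consistency level used by the following queries (cqlsh command)
CONSISTENCY LOCAL_ONE;

-- a lookup similar to the one Cassandra runs during authentication
SELECT role, can_login FROM system_auth.roles WHERE role = 'app-01';

If every replica holding that row is down, this SELECT fails with the same UnavailableException that you see at login.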

This may still be confusing, if you’re new to Cassandra: the node that answers the query should be the one you contacted, right? Well, not necessarily.

As you know, in Cassandra every keyspace has a replication factor – a number that indicates how many nodes contain a copy of each row. The rows are then automatically – and hopefully evenly – distributed across the nodes, in a way that you can’t control. If the replication factor is equal to the number of nodes, all nodes have all the data and can answer any query. But if it’s lower, a node often doesn’t hold the requested rows itself, and has to send the query to a node that does. In that case, it is possible that all the nodes containing the data you are looking for are down; so even if you contacted a running node, it cannot return results. This risk is higher with stricter consistency levels. The extreme case is ALL, which requires all the nodes containing the desired rows to be up and to agree on the results.
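For reference, the replication factor is chosen when a keyspace is created (and can be changed later with ALTER KEYSPACE, as we’ll do below). The keyspace and datacentre names in this sketch are purely hypothetical:

CREATE KEYSPACE my_app
WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'dc1': 2
};

Here, each row of my_app would be stored on 2 of the nodes in the datacentre called dc1.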

To visualise this, check the figure below. The rows are divided into three sets (A, B, C), distributed across nodes 1, 2 and 3. If a client contacts node 1 with a query that returns rows from set C, and node 1 doesn’t hold that set, node 2 or node 3 will have to be involved in the query execution.

The data used for authentication is located in the system_auth keyspace. Its default replication factor is 1: each portion of the data is located on a single node, and if that node is down, LOCAL_ONE cannot be satisfied.
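You can verify this on your own cluster with a query like the one below, assuming Cassandra 3.0 or later, where keyspace definitions are exposed in the system_schema keyspace:

SELECT keyspace_name, replication
FROM system_schema.keyspaces
WHERE keyspace_name = 'system_auth';

On an untouched cluster, the replication column typically shows SimpleStrategy with a replication_factor of 1.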

Now check the figure below. If any node is down, some of the rows are not available.

Why is there no redundancy by default? Probably because at the very beginning of its life, every cluster consists of one node.

How to fix the problem

Just run this command:

ALTER KEYSPACE system_auth
WITH replication = {
  'class': 'NetworkTopologyStrategy',
  '<datacentre_name>': <number_of_nodes>
};

This CQL command increases the replication factor to the number of nodes. Every node will contain a copy of all the rows in the system_auth keyspace, and therefore each node will allow clients to log in – even in the unfortunate case that every other node is unreachable. This is typically what you want.

This keyspace is typically very small, so this is a lightweight operation.
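For example, on a hypothetical cluster with a single datacentre called dc1 and 3 nodes, the command would be:

ALTER KEYSPACE system_auth
WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'dc1': 3
};

After changing the replication of system_auth, it is generally a good idea to run a repair of that keyspace on each node (nodetool repair system_auth), so that the new replicas actually receive the existing credentials.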

If you are having this problem, it may be an indicator that you didn’t configure certain aspects of your Cassandra cluster optimally. You may want to consider a Cassandra Health Check from Vettabase.

Federico Razzoli


About Federico Razzoli

Federico is the director of Vettabase Ltd and an expert database consultant, specialised in the MariaDB and MySQL ecosystems.
