The CAP Theorem's Common Misconception

A couple of days ago I tweeted about the CAP theorem which sparked some interest and a small conversation on Twitter. Since tweets tend to disappear in the void pretty quickly, I thought I also put this here:

The Misconception around the CAP Theorem

The CAP theorem in distributed systems states that you cannot satisfy more than two of these three properties:

Consistency
Availability
Partition Tolerance

But here’s what’s commonly misunderstood about the CAP theorem:

Being Partition Tolerant or not is not really a choice. In distributed systems, partitions may occur at any time. It’s more of a choice what to give up during a network partition: Consistency or Availability.
Giving up Availability implies a part of your system will become unresponsive or return an error during a network partition.
Giving up Consistency implies your system might return stale data (=dirty reads) because data was written to some nodes but not yet to all due to a network partition.
Consistency according to CAP is Linearizability – not what the ‘C’ in ACID means. Linearizability enforces a linear order to operations, e.g. read(key) should return the latest written value of key.
Consistency in A[C]ID is ensured through Serializability which is a mechanism to ensure concurrent transactions in databases can be split up into multiple serial executions. No strict order is required.
Whether a system is actually CP or CA is often very hard to tell. It depends on the configuration of the system, e.g. number of replicas, consensus algorithms, quorum sizes, bugs, etc.
The CAP theorem is a starting point for characterizing systems, but to properly understand guarantees of distributed systems, you have to dive deeper. Here’s an excellent article on this by @martinkl: martin.kleppmann.com/2015/05/11/ple…

Hope you found this useful. Let me know on Twitter if you have any remarks.