Eric Brewer on The CAP Theorem — Episode 227 on se-radio.net — tl;db series

Jaroslaw Kijanowski
SoftwareMill Tech Blog
6 min read · Feb 5, 2020

--

The idea of the tl;db (too long; didn’t bother) blog series is to present blog posts, podcasts and conference talks in a distilled form. More about it here.

The first tl;db features a podcast about the CAP (consistency, availability, partition tolerance) theorem, formulated by Eric Brewer.

What kind of problems led to the discovery of the CAP theorem in the year 2000?

At that time there was a shift towards focusing on availability. Before the internet, a big database system, like one in a bank, would typically go into maintenance mode at night for batch processing — auditing records and generating reports. With the internet we started to see live services requiring higher availability.

The insights about availability were already known to the networking community in the 80s, but had not been articulated in the database literature.

How was the CAP theorem meant to be understood?

Availability tends to correspond with revenue and therefore systems were making trade-offs that favored availability over consistency.

When presented in a keynote at the PODC conference, the CAP theorem stated:

You can have at most two of Consistency, Availability and Partition Tolerance properties for any shared-data system.

Building a distributed system we’d like to have:
- consistent data across replicas,
- the system being available,
- the system working when the network gets partitioned {groups of participants stop seeing each other; they’re living on islands}.

According to the CAP theorem, you have to pick two of these properties. Although this is not the full truth, it’s a required starting point to understand the nuances later on.

Given this two-out-of-three constraint, there are three types of systems one can build: CP, AP and {not really} CA.

An AP system is highly available and not necessarily consistent — like the internet (stale web pages, DNS entries).

Databases with ACID guarantees focus on consistency and forfeit availability to remain partition tolerant — they are therefore CP.

In a database system with a primary and replicas, a network partition either requires taking down the whole system to guarantee consistency (in a CP system) or lets the replicas become inconsistent (in an AP system). In other words, when choosing A over C during a network partition, the system can lose data if the primary goes down and the state is recovered from a replica that, due to the partition, may not be up to date.

Nowadays vendors of different databases try to be explicit about where they position their product on the triangle of the CAP theorem.

This is the end of the one-size-fits-all database.

The trade-offs that can be made depend on the type of data being stored.

What thought process results in a more robust system?

Think of what happens on a timeout. Should the system try again to complete the operation (required for consistency, but delaying forfeits availability), or give up and forfeit consistency? A third option is to escalate the problem to the user and ask what to do.
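The timeout decision can be sketched as a simple retry policy. This is a minimal illustration, not anything from the podcast; the function name and retry count are made up:

```python
# Sketch of the timeout trade-off: retrying favors consistency,
# giving up after a bounded number of attempts favors availability.

def write_with_policy(do_write, retries=3):
    """Try an operation up to `retries` times. Returns True if the
    write is known to have succeeded, False if we gave up."""
    for _ in range(retries):
        try:
            do_write()
            return True
        except TimeoutError:
            continue  # retry: delaying forever would mean unavailability
    return False      # give up: respond anyway, accepting inconsistency
```

An unbounded retry loop would be the pure-consistency choice; `retries=0` would be the pure-availability one. Real systems sit somewhere in between.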

How can a client distinguish between a timeout and a network partition?

You can’t reliably detect a network partition and it often doesn’t matter. If the timeout expires for whatever reason and you’re unable to complete an operation to achieve consistency, you have to decide whether:
- to delay, where delaying forever results in being unavailable, or
- to give up, typically forfeiting consistency.

What are BASE systems in contrast to ACID ones?

BASE stands for Basically Available, Soft state, Eventually consistent. These are the properties you get when choosing availability over consistency. It’s the opposite of ACID.

Real systems are a mix of BASE and ACID. Each component might make different availability and consistency trade-offs. Billing needs to be consistent, but user-facing parts need to be highly available.

Consistency may also be given up for performance reasons, for example by adding a cache, which introduces a time window of inconsistency.

What is Eventual Consistency?

It means the system can become temporarily inconsistent, typically during a network partition. The idea is that after reconnecting, the system has enough data to reconstruct what the right end state is.

Partitions should be rare and short. When favoring availability, it’s possible to create eventually consistent systems, which will be consistent most of the time.

What are offline or disconnected systems?

An ATM or the Git version control system are examples of disconnected systems. They favor availability and try to provide eventual consistency by fixing things up later, like syncing work done on local source files once connectivity to the Git server can be established. It’s not guaranteed that all merges will always work — it’s not always possible to fix things up. Occasionally conflicts have to be resolved by hand. In such cases availability has prevented consistency.

How can a system recover from a network partition?

In their paper Keeping CALM: When Distributed Consistency Is Easy, Joseph Hellerstein and Peter Alvaro claim that monotonic systems are easy to reconcile: if the value of a variable moves in only one direction, then when merging, the system simply picks the higher one.
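The monotonic case can be sketched as a "max register", a minimal example of my own (not from the paper): each replica only ever increases its value, so reconciliation after a partition is just taking the maximum.

```python
# A register whose value only ever grows; merging replicas is max().

class MaxRegister:
    def __init__(self, value=0):
        self.value = value

    def set(self, value):
        # Monotonic update: the value can only increase.
        self.value = max(self.value, value)

    def merge(self, other):
        # Reconciliation after a partition: pick the higher value.
        self.value = max(self.value, other.value)

a = MaxRegister()
b = MaxRegister()
a.set(5)      # update applied on one side of a partition
b.set(3)      # update applied on the other side
a.merge(b)
b.merge(a)
assert a.value == b.value == 5  # replicas converge
```

Because max() is commutative, associative and idempotent, the replicas converge no matter in which order the merges happen.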

And in the world of internet shopping carts:

… so long as the contents of the cart are represented as a set and the internal ordering of its elements is ignored. If two replicas disagree about the contents of the cart, their differing views can be reconciled simply by taking the union of their respective sets.

The idea is, that (in case of a failure) users rather want to remove items from the basket again, instead of adding them one more time.
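Representing the cart as a set makes the reconciliation described in the quote a one-liner. A minimal sketch with made-up item names:

```python
# Two cart replicas that diverged during a partition.
cart_replica_1 = {"book", "laptop"}
cart_replica_2 = {"book", "headphones"}

# Reconciliation: the union of the two sets.
merged = cart_replica_1 | cart_replica_2
# The user may see a previously removed item reappear and can delete
# it again, which is considered less harmful than losing an added item.
```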

Banking operations on the other hand are not commutative:

In mathematics, a binary operation is commutative if changing the order of the operands does not change the result.

Even though there are only plus and minus operations, the minus ones have a bounds check. When a customer withdraws money from a disconnected ATM, the bank may still allow the transaction even though it cannot prevent an overdraft. In the rare cases when this happens, there may be limits for the ATM working in offline mode, and the bank resolves it later, taking a calculated risk. This scenario again favors availability over consistency, or rather shows a system that has been designed with eventual consistency in mind.
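The offline-ATM behavior can be sketched as follows. The limit, function name and log format are hypothetical, not from the podcast:

```python
# An ATM in offline (partitioned) mode stays available by allowing
# withdrawals up to a local cap and recording them for reconciliation.

OFFLINE_LIMIT = 200  # assumed cap while disconnected from the bank

def offline_withdraw(requested, dispensed_so_far, audit_log):
    """Allow a withdrawal while partitioned, bounded by a local limit.
    Returns True if cash is dispensed."""
    if dispensed_so_far + requested > OFFLINE_LIMIT:
        return False  # beyond the calculated risk: forfeit availability
    audit_log.append(("withdraw", requested))  # for later reconciliation
    return True

log = []
assert offline_withdraw(150, 0, log) is True
assert offline_withdraw(100, 150, log) is False  # exceeds offline cap
```

The audit log is what lets the bank restore consistency afterwards, possibly with a compensating transaction such as an overdraft fee.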

Modern distributed systems are achieving consistency based on audit records and compensating transactions.

What has changed during the years since the CAP theorem was proposed?

Lots of database projects emerged to explore the possibilities in terms of CAP.

Further research by Peter Bailis and others also shows that ACID transactions do not have to harm performance that much.

While the CAP Theorem is often interpreted to preclude the availability of transactions in a partition-prone environment, we show that highly available systems can provide useful transactional semantics, often matching those of today’s ACID databases. We propose Highly Available Transactions (HATs) that are available in the presence of partitions. HATs support many desirable ACID guarantees for arbitrary transactional sequences of read and write operations and permit low-latency operation.

It’s a matter of weakening consistency, and it’s doable if invariants can be kept local. For example, providing globally unique numbers is based on a global invariant. But it can be made local by letting each participant of the system return numbers only from a pre-assigned range. With this technique, two participants cannot return colliding numbers.

The same approach is used by network hardware vendors issuing MAC addresses — consistency is achieved among them even though they do not have to communicate with each other.
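Localizing the uniqueness invariant can be sketched with pre-assigned, disjoint ID ranges (a toy example of my own, mirroring how MAC address blocks are handed to vendors):

```python
import itertools

def make_allocator(start, end):
    """Hand out unique IDs from a pre-assigned, disjoint range."""
    counter = itertools.count(start)
    def next_id():
        value = next(counter)
        assert value < end, "range exhausted; request a new block"
        return value
    return next_id

node_a = make_allocator(0, 1000)     # node A owns IDs 0..999
node_b = make_allocator(1000, 2000)  # node B owns IDs 1000..1999
# The two nodes can now generate IDs without ever communicating,
# and their IDs can never collide.
```

Coordination happens only once, when the ranges are assigned; after that, each node satisfies the global invariant purely locally.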

Open questions still remain, such as how to implement eventual consistency, reconciliation and compensation.

My final thoughts:

The follow-up paper, CAP Twelve Years Later: How the “Rules” Have Changed summarizes the confusion around CAP:

Because partitions are rare, CAP should allow perfect C and A most of the time, but when partitions are present or perceived, a strategy that detects partitions and explicitly accounts for them is in order. This strategy should have three steps: detect partitions, enter an explicit partition mode that can limit some operations, and initiate a recovery process to restore consistency and compensate for mistakes made during a partition.

A different point of view can be found on Martin Kleppmann’s blog. I’ll leave the following quote to spark your curiosity:

The CAP theorem is too simplistic and too widely misunderstood to be of much use for characterizing systems. Therefore I ask that we retire all references to the CAP theorem, stop talking about the CAP theorem, and put the poor thing to rest. Instead, we should use more precise terminology to reason about our trade-offs.

Credits go to:

Robert Blumen — host
Eric Brewer — guest

The full experience is available at se-radio.net.

Giving claps shows your interest and encourages me to continue the tl;db series.


Java consultant with experience in the Kafka ecosystem, Cassandra, and the GCP and AWS cloud providers. https://pl.linkedin.com/in/jaroslawkijanowski