Confused CAP Arguments


A bit of a Twittergasm lately over the CAP Theorem, with relevant blog posts.

I am confused. I've narrowed it down to 7 matters.

Confusion #1: Changing system scope from underneath our feet
The biggest confusion I have is that the scope of "distributed system" under the CAP theorem seems to have changed, silently.

If you look at the original papers:

All of them seem to scope the CAP theorem to the server side of a client/server relationship. That is, it's assumed that the distributed system under consideration is a client/server architecture, but the server-side is distributed in some way. This is a reasonable assumption, since the Web is a client/server architecture variation, and CAP was originally about the tradeoffs in huge scale web sites.

But practically speaking, this is particularly important when considering Partition Tolerance. In the original papers, CA (Consistent, Available) was absolutely common, and in fact the most common scenario, with Brewer giving examples like cluster databases (on a LAN), single-site databases, LDAP, etc. The most popular commercial cluster databases today, such as Oracle's RAC, arguably retain this definition of CA.

But now, AP (availability, partition tolerance) advocates are questioning if CA distributed systems are even possible, or merely something that insane people do. It's a very strange, extreme view, and I don't think it offers any enlightenment to this debate. It would be good to clarify, or at least offer two scopes of CAP: one within a server scope, and another in a system-wide scope.

Confusion #2: What is Partition Intolerance?

This is never well defined. Like lactose intolerance, the results can be, err, unpredictable. It seems to imply one of: complete system unavailability, data inconsistency, or both. It's also unclear whether this is recoverable or not. Generally it seems to depend on the implementation of the specific system.

But because of this confusion, it's easy to let one's mind wander and say that a CA system "acts like" an AP or CP system when experiencing a partition. That leads to more confusion as to whether a CA system really was just a CP system in drag all along.
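To make the choice concrete, here is a toy sketch (purely hypothetical, far simpler than any real database) of two replicas of a single register. When the link between them is cut, a CP-leaning node refuses writes it cannot replicate, while an AP-leaning node accepts them locally and lets the replicas diverge - the two flavours of "partition intolerance" described above:

```python
class Replica:
    def __init__(self, name, mode):
        self.name = name
        self.mode = mode          # "CP" or "AP"
        self.value = None
        self.peer = None
        self.partitioned = False  # True = link to peer is down

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency over availability: reject the write.
                raise RuntimeError(f"{self.name}: unavailable during partition")
            # Availability over consistency: accept locally, diverge.
            self.value = value
            return "accepted-locally"
        # Healthy link: replicate synchronously to the peer.
        self.value = value
        self.peer.value = value
        return "replicated"

a, b = Replica("a", "AP"), Replica("b", "AP")
a.peer, b.peer = b, a

a.write(1)                  # replicated; both replicas agree
a.partitioned = b.partitioned = True
a.write(2)                  # accepted locally; b still holds the old value
print(a.value, b.value)     # 2 1 -- inconsistency, the AP cost

a.mode = "CP"
try:
    a.write(3)
except RuntimeError as e:
    print(e)                # unavailability, the CP cost
```

The same partition, handled two ways: one sacrifices availability, the other consistency. Which outcome a "CA" system produces when the supposedly impossible partition happens is exactly what's left undefined.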

Confusion #3: Referring to Failures Uniformly

Equating failures with network partitions seems to be a simplifying assumption, but the challenge is that there are many kinds of failures - node failures, network failures, media failures - and each is handled differently in practice.

Secondly, there are systems with many different kinds of nodes, which tolerate partitions in one kind of node but not another, or perhaps just not a "full" partition across two distinct sets of nodes. Surely this is an interesting property, worthy of study?

Take Oracle RAC or IBM DB2 pureScale as examples. They actually tolerate partitions of many sorts - cluster interconnect failures, node failures, media failures, etc. What they don't tolerate is a total network partition of the SAN. That's a very specific kind of partition. Even given such a partition, the failure is generally predictable in that the system still prefers consistency over availability.

Confusion #4: Rejecting Weak vs. Strong Properties

Coda Hale sez: "You cannot, however, choose both consistency and availability in a distributed system [that also tolerates partitions]".

Yet this seems very misleading. Brewer's original papers and Gilbert/Lynch talk about Strong vs. Weak consistency and availability. The above statement makes it sound as if there is no such thing.

Furthermore, most distributed systems are layered or modular. One part of that system may be CA-focused, the other may be AP-focused. How does this translate to a global system?

Confusion #5: Conflating Reads & Writes

Consider the Web Architecture. The Web is arguably an AP system for reads, due to caching proxies, but is a CP system for writes.

Is it useful to flatly categorize a system as one or the other?
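The read/write asymmetry can be sketched in a few lines (a hypothetical toy, not real HTTP machinery): reads can be served from a possibly-stale cache even when the origin is unreachable, which is AP-ish behaviour, while writes must reach the origin or fail, which is CP-ish behaviour:

```python
class Origin:
    def __init__(self):
        self.data = {}
        self.reachable = True

class CachingProxy:
    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def read(self, key):
        if self.origin.reachable:
            # Refresh the cache while the origin is up.
            self.cache[key] = self.origin.data.get(key)
        # During a partition, serve the (possibly stale) cached copy.
        return self.cache.get(key)

    def write(self, key, value):
        if not self.origin.reachable:
            # Writes cannot be served from cache: fail rather than diverge.
            raise RuntimeError("503: origin unreachable")
        self.origin.data[key] = value

origin = Origin()
proxy = CachingProxy(origin)
proxy.write("/page", "v1")
print(proxy.read("/page"))   # v1, fetched from the origin

origin.reachable = False     # simulate a partition
print(proxy.read("/page"))   # v1 -- stale-but-available read
try:
    proxy.write("/page", "v2")
except RuntimeError as e:
    print(e)                 # the write is refused
```

One system, two answers to CAP depending on the operation - which is the point: a flat one-letter-pair label hides the interesting structure.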

Confusion #6: Whether AP systems can be a generic or the "default" database for people to choose

AP systems to my understanding are application-specific, as weak consistency is difficult to correct without deep application knowledge or business involvement.

The question and debate is, what's the default database for most IT systems... should it be a traditional RDBMS that's still a CA system, like MySQL or Oracle? Or should it be an AP system like a hacked-MySQL setup with Memcached (as AP), or Cassandra?

When I see claims that CA advocates are "not getting it", or "not in their right mind", I have a severe case of cognitive dissonance. Most IT systems don't require the scale of an AP system. There are plenty of cases of rather scalable, highly available CA systems, if one looks at the TPC benchmarks. They're not Google or Amazon scale, and I completely agree that an AP system is necessary under those circumstances. I even agree that consistency needs to be sacrificed, especially as you move to multi-site continuous availability. But it's very debatable whether the basic database itself should be CA by default or AP by default.

For example, many IT shops have built their mission-critical applications in a CA manner within a single LAN/rack/data-centre, and an AP manner across racks or data-centres via asynchronous log shipping. Some can't tolerate any data loss and maintain a CP system with synchronous log shipping (using things like WAN accelerators to keep latency down). And not just for RDBMS - this is generally how I've seen Elastic Data Cache systems (Gigaspaces or Coherence or IBM WebSphere ExtremeScale) deployed. They deal with the lack of partition tolerance at a LAN level through redundancy.
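The synchronous-versus-asynchronous log-shipping tradeoff in that deployment pattern can be sketched as follows (a hypothetical toy model, not any vendor's actual replication machinery): synchronous shipping makes the commit wait for the remote site, so a WAN partition blocks writes (CP across sites), while asynchronous shipping commits locally and accepts possible loss of the unshipped tail on failover (AP across sites):

```python
class Site:
    def __init__(self):
        self.log = []

def commit(local, remote, record, mode, wan_up):
    local.log.append(record)
    if mode == "sync":
        if not wan_up:
            local.log.pop()  # roll back: cannot confirm the remote copy
            raise RuntimeError("commit blocked: WAN partition")
        remote.log.append(record)
    elif mode == "async" and wan_up:
        remote.log.append(record)  # shipped later in reality; eager here

primary, dr = Site(), Site()
commit(primary, dr, "txn1", mode="async", wan_up=True)
commit(primary, dr, "txn2", mode="async", wan_up=False)
print(len(primary.log), len(dr.log))  # 2 1 -- txn2 lost if we fail over now

try:
    commit(primary, dr, "txn3", mode="sync", wan_up=False)
except RuntimeError as e:
    print(e)                          # sync mode blocks instead of diverging
```

The WAN accelerators mentioned above are what make the synchronous branch tolerable in practice, by keeping the round-trip the commit waits on short.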

It's useful to critique the CA single-site / AP multi-site architecture, since it's so common, but it's still not clear this should be replaced with AP-everywhere as a useful default.

Confusion #7: Why people are being a mixture of dismissive and douchey in their arguments with Stonebraker

Stonebraker's arguments make a lot of sense to IT practitioners (I count myself as one). He speaks their language. Yes, he can be condescending, but he's a rich entrepreneur - they get that way. Yes, he gets distracted and talks about somewhat off-topic matters. Neither of those things implies he's wrong overall. Neither of these things implies that you're going to get your point across more by being dismissive or douchey, however good it feels. (I'm not referring to anyone specifically here, just a general tone of Twitter and some of the snark in the blog posts.)

One particular complaint I find rather naive: that he characterizes CAP theorem (more precisely, AP system) proponents as only using the CAP theorem to evaluate distributed system availability. Clearly that's not true, but on the other hand, it is kind of true.

Here's the problem: when I hear venture capitalists discussing the CAP theorem over cocktails because of their NoSQL investments, or young newly-graduated devs bugging their IT managers to use Cassandra, you can bet that plenty of people are making decisions based on a twisted version of the new fad, rather than reality. But that's how every technology fad plays out. I think Stonebraker has seen his share of them, so he's coming from that angle, from my view. Or he could just be an out of touch dinosaur like many seem fond of snarking. I personally doubt it.


The "Problems with CAP" link seems broken.

I mostly agree with what you say here. As I'm sure you've noticed, I'm among those who believe that CA systems are possible, but that they have highly undesirable behavior when partitions do occur. The key is that they do. I used to work on HACMP, specifically on the network monitoring and routing pieces, and in even the most carefully designed local networks there would still be glitches that resulted in actual or effective partitions. With sufficient redundancy these could be reduced to a low enough frequency that the cost of such failures was less than the cost of implementing a different kind of system. IBM wouldn't have sold the tens of thousands of HACMP licenses that they did if CA weren't real and useful.

Nonetheless, the possibility of partition does still exist and tradeoffs that have changed in the fifteen years since then leave a lot more people in the "AP world" than the database die-hards seem to think. It's not just Facebook and Google. As I pointed out in one of my posts, there are dozens of applications *within* Facebook with a million-plus monthly users, which surely puts them at a scale where the tradeoffs change. Those are matched by many more free-standing web applications, which in turn are matched by many more private ones. This might not be the majority model yet, but it is most certainly a common one.

All that said, I think the model of CA within a data center and AP between is not a bad one. In a sense, you're treating a site as a single node with a very low probability of failure and internal latency, but with a much higher latency and probability of disconnection from other nodes.

I'm working with a customer whose major system uses HACMP (and a couple other technologies, leading to a "not very HA" result). They've had major outages lately.
Yeah, everything went to hell in a handbasket after I left. ;) Doesn't every engineer think that? Seriously, though, IBM really messed it up. They threw away most of the Clam codebase, which was ugly but worked, in favor of "Phoenix" which came out of some politically connected group within IBM. I'm sure it was more elegant and so on, but the package didn't include developers who knew how to make things work in the real world. We'd all gone to Revivio by then.
Brewer said the same thing back in PODC 2000, they use CA systems to hold the money.
Hm. I have the slides in front of me right now. The closest I can find is where he says, on slide 18, "we use ACID for user profiles and logging." ACID != CA. In fact, I think most of the systems you characterize as CA are actually CP underneath. With regard specifically to Oracle, for example, we ran it on top of HACMP all the time. The result might have looked like CA to the Oracle folks, but I'd call it "CP in drag" when they're relying on a separate cluster layer (nowadays it'd be virtualization) to make sure quorum enforcement and failover occur before resource-blocking chains engulf the system.



This page contains a single entry by Stu published on October 27, 2010 11:13 AM.

(C) 2003-2011 Stuart Charlton


Disclaimer: All opinions expressed in this blog are my own, and are not necessarily shared by my employer or any other organization I am affiliated with.