cap: All content tagged as cap in NoSQL databases and polyglot persistence
Friday, 7 January 2011
Google Megastore: Scalable, Highly Available Storage for Interactive Services
A new paper from Google:
Megastore blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS in a novel way, and provides both strong consistency guarantees and high availability.
We provide fully serializable ACID semantics within fine-grained partitions of data. This partitioning allows us to synchronously replicate each write across a wide area network with reasonable latency and support seamless failover between datacenters.
This paper describes Megastore’s semantics and replication algorithm.
Megastore seems to be the solution behind the Google App Engine high replication datastore.
Emphases are mine.
Original title and link: Google Megastore: Scalable, Highly Available Storage for Interactive Services (NoSQL databases © myNoSQL)
via: http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf
Wednesday, 5 January 2011
Google App Engine High Replication Datastore
The High Replication Datastore provides the highest level of availability for your reads and writes, at the cost of increased latency for writes and changes in consistency guarantees in the API. The High Replication Datastore increases the number of data centers that maintain replicas of your data by using the Paxos algorithm to synchronize that data across datacenters in real time.
Still not completely decentralized a la Amazon Dynamo.
via: http://googleappengine.blogspot.com/2011/01/announcing-high-replication-datastore.html
Tuesday, 14 December 2010
Errors in Database Systems Still Must Consider Network Partitions
Dan Weinreb on Michael Stonebraker’s network partition scenarios from the overly discussed paper about ☞ Errors in Database Systems, Eventual Consistency, and the CAP Theorem:
If you mean a few computers connected together by an Ethernet, with redundant hardware all around, then the chance of a failure of the network itself is relatively low. However, real-world data centers with a relatively large number of servers rarely work this way. The problem is that a real network is very complicated. It depends on switches at both level 3 (routers) and level 2 (hubs). Situations can arise in which pieces of the network are mis-configured by accident; these can be hard to find due, ironically, to the very redundancy that was added to avoid failures. In particular, there is no way to make any kind of guarantee about the latency within the network, nor the likelihood that a packet will make it from its source to its destination.
John Hugg (VoltDB) made an interesting comment:
Step 1: Figure out how much the network partitions that will cause your system to be unavailable will cost.
Step 2: Figure out how much giving up consistency will cost.
Step 3: Make your CAP decision based on these estimated values.
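The three steps above amount to an expected-cost comparison. Here is a minimal sketch, assuming made-up numbers and hypothetical helper names (`expected_annual_cost`, `cap_decision`); none of the figures come from the comment itself:

```python
def expected_annual_cost(events_per_year: float, cost_per_event: float) -> float:
    """Expected yearly cost of a failure mode: frequency times impact."""
    return events_per_year * cost_per_event


def cap_decision(unavailability_cost: float, inconsistency_cost: float) -> str:
    """Step 3: keep whichever property is more expensive to lose."""
    if unavailability_cost > inconsistency_cost:
        return "favor availability (AP)"
    return "favor consistency (CP)"


# Step 1: hypothetical cost of being unavailable during network partitions.
unavailability = expected_annual_cost(events_per_year=4, cost_per_event=50_000)
# Step 2: hypothetical cost of giving up consistency during those partitions.
inconsistency = expected_annual_cost(events_per_year=4, cost_per_event=20_000)

print(cap_decision(unavailability, inconsistency))
```

The hard part in practice is estimating the two inputs, not the comparison.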
via: http://danweinreb.org/blog/errors-in-database-systems-still-must-consider-network-partitions
Friday, 22 October 2010
CAP Continued: Someone is Wrong on the Internet
Once again closing the circle on this past week’s CAP theorem discussions, Jeff Darcy comments on Stonebraker’s clarifications on the CAP theorem:
In my experience, network partitions do not happen often. Specifically, they occur less frequently than the sum of bohrbugs, application errors, human errors and reprovisioning events. So it doesn’t much matter what you do when confronted with network partitions. Surviving them will not “move the needle” on availability because higher frequency events will cause global outages. Hence, you are giving up something (consistency) and getting nothing in return.
So, because network partitions occur less than some other kind of error, we shouldn’t worry about them? Because more people die in cars than in planes, we shouldn’t try to make planes safer? Also, notice how he says that network partitions are rare in his experience. His experience may be vast, but much of it is irrelevant because the scale and characteristics of networks nowadays are unlike those of even five years ago. People with more recent experience at higher scale seem to believe that network partitions are an important issue, and claiming that partitions are rare in (increasingly common) multi-datacenter environments is just ridiculous. Based on all this plus my own experience, I think dealing with network partitions does “move the needle” on availability and is hardly “nothing in return” at all. Sure, being always prepared for a partition carries a cost, but so does the alternative and that’s the whole point of CAP.
MongoDB and CAP Theorem
In the light of this week’s topic, the CAP theorem, the weekend video is… MongoDB and CAP theorem:
Thursday, 21 October 2010
The CAP Theorem… Again
Today looks to be (again) the day of the CAP theorem[1][2], so let’s do a quick summary:
- We had Coda Hale’s ☞ You can’t sacrifice partition tolerance:
Of the CAP theorem’s Consistency, Availability, and Partition Tolerance, Partition Tolerance is mandatory in distributed systems. You cannot not choose it. Instead of CAP, you should think about your availability in terms of yield (percent of requests answered successfully) and harvest (percent of required data actually included in the responses) and which of these two your system will sacrifice when failures happen.
- Jeff Darcy followed up with ☞ Another CAP article:
It seems to me that there is a consensus emerging. Even if Gilbert and Lynch only formally proved a narrower version of Brewer’s original conjecture, that conjecture and the tradeoffs it implies are still alive and well and highly relevant to the design of real working systems that serve real business needs.
and ☞ Reactions to Coda’s CAP post:
The last point is whether CAP really boils down to “two out of three” or not. Of course not, even though I’ve probably said that myself a couple of times. The reason is merely pedagogical. It’s a pretty good approximation, much like teaching Newtonian physics or ideal gases in chemistry. You have to get people to understand the basic shape of things before you start talking about the exceptions and special cases, and “two out of three” is a good approximation. Sure, you can trade off just a little of one for a little of another instead of purely either/or, but only after you thoroughly understand and appreciate why the simpler form doesn’t suffice. The last thing we need is people with learner’s permits trying to build exotic race cars. They just give the doubters and trolls more ammunition with which to suppress innovation.
- Henry Robinson’s ☞ CAP Confusion: Problems with ‘partition tolerance’ popped up too:
Not ‘choosing’ P is analogous to building a network that will never experience multiple correlated failures. This is unreasonable for a distributed system – precisely for all the valid reasons that are laid out in the CACM post about correlated failures, OS bugs and cluster disasters – so what a designer has to do is to decide between maintaining consistency and availability. Dr. Stonebraker tells us to choose consistency, in fact, because availability will unavoidably be impacted by large failure incidents. This is a legitimate design choice, and one that the traditional RDBMS lineage of systems has explored to its fullest, but it implicitly protects us neither from availability problems stemming from smaller failure incidents, nor from the high cost of maintaining sequential consistency.
- Many of the above articles were referring to Michael Stonebraker’s ☞ Errors in Database Systems, Eventual Consistency, and the CAP Theorem:
In summary, one should not throw out the C so quickly, since there are real error scenarios where CAP does not apply and it seems like a bad tradeoff in many of the other situations.
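Coda Hale's yield and harvest, from the first item in this summary, can be made concrete with a few lines of code. This is a minimal sketch, assuming a hypothetical `Response` record and aggregating harvest over answered requests only; neither the names nor that aggregation choice come from his article:

```python
from dataclasses import dataclass


@dataclass
class Response:
    answered: bool        # did the request get a successful answer?
    shards_needed: int    # pieces of data the complete answer requires
    shards_included: int  # pieces that actually made it into the answer


def yield_pct(responses: list[Response]) -> float:
    """Percent of requests answered successfully."""
    return 100.0 * sum(r.answered for r in responses) / len(responses)


def harvest_pct(responses: list[Response]) -> float:
    """Percent of required data actually included, over answered requests."""
    answered = [r for r in responses if r.answered]
    needed = sum(r.shards_needed for r in answered)
    included = sum(r.shards_included for r in answered)
    return 100.0 * included / needed if needed else 0.0


# Hypothetical traffic during a failure: one request fails outright,
# one is answered with a shard missing.
responses = [
    Response(True, 4, 4),
    Response(True, 4, 3),
    Response(False, 4, 0),
    Response(True, 4, 4),
]

print(f"yield={yield_pct(responses):.1f}% harvest={harvest_pct(responses):.1f}%")
```

A system that fails requests during a partition lowers its yield; one that answers with partial data lowers its harvest instead.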
So, we pretty much went full circle. I just hope that Eric Brewer will ☞ follow up:
I really need to write an updated CAP theorem paper
[1] Michael Stonebraker’s clarifications on the CAP theorem and the older but related Daniel Abadi’s Problems with CAP
[2] Nati Shalom’s ☞ NoCAP and my own NoCAP… is wrong (nb make sure you also read the comments)
Problems with CAP
Michael Stonebraker’s clarifications on the CAP theorem reminded me of Daniel Abadi’s PACELC:
To me, CAP should really be PACELC – if there is a partition (P) how does the system tradeoff between availability and consistency (A and C); else (E) when the system is running as normal in the absence of partitions, how does the system tradeoff between latency (L) and consistency (C)?
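PACELC reads naturally as a two-field classification. The sketch below assumes a hypothetical `Pacelc` type invented for illustration; it classifies no real system:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pacelc:
    on_partition: str  # "A" (availability) or "C" (consistency)
    otherwise: str     # "L" (latency) or "C" (consistency)

    def __post_init__(self):
        if self.on_partition not in ("A", "C"):
            raise ValueError("on_partition must be 'A' or 'C'")
        if self.otherwise not in ("L", "C"):
            raise ValueError("otherwise must be 'L' or 'C'")

    def label(self) -> str:
        """Render the usual shorthand, e.g. PA/EL."""
        return f"P{self.on_partition}/E{self.otherwise}"


# A hypothetical store that stays available under partition and
# favors low latency the rest of the time.
print(Pacelc(on_partition="A", otherwise="L").label())
# A hypothetical store that favors consistency in both situations.
print(Pacelc(on_partition="C", otherwise="C").label())
```

The second field is what plain CAP leaves out: the latency versus consistency choice a system makes even when the network is healthy.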
via: http://dbmsmusings.blogspot.com/2010/04/problems-with-cap-and-yahoos-little.html
Clarifications on the CAP Theorem and Data-Related Errors
Michael Stonebraker detailing his perspective on CAP theorem:
In my experience, network partitions do not happen often. Specifically, they occur less frequently than the sum of bohrbugs, application errors, human errors and reprovisioning events. So it doesn’t much matter what you do when confronted with network partitions. Surviving them will not “move the needle” on availability because higher frequency events will cause global outages. Hence, you are giving up something (consistency) and getting nothing in return.
It is difficult to argue against the idea that in an environment where your application and devops and bugs kill your data storage more often than system and network failures would, the CAP theorem remains one of your system priorities.
via: http://voltdb.com/blog/clarifications-cap-theorem-and-data-related-errors
Saturday, 16 October 2010
NoCAP... Is Wrong
Nati Shalom (Gigaspaces):
The diagram below illustrates one of the examples by which we could achieve write scalability and throughput without compromising on consistency.
As with the previous examples we break our data into partitions to handle our write scaling between nodes. To achieve high throughput we use in-memory storage instead of disk. As in-memory devices tend to be significantly faster and more concurrent than disk, and since network speed is no longer a bottleneck, we can achieve high throughput and low latency even when we use synchronous write to the replica.
Unfortunately, a couple of things are wrong:
- CAP theorem is not about write scalability. If we replace consistency, availability, or partition tolerance with another characteristic we will have to talk about a different theorem.
- Network speed is a bottleneck for cross data center communications. Plus it can go down.
- Considering the diagram below, with a partition in the network between the 2 in-memory data grid nodes, please tell me how you can maintain both consistency and availability.
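The last objection can be played out in a toy model. This is a minimal sketch, assuming two in-memory replicas, a synchronous write path, and a hypothetical `Pair` class; it is not how any particular data grid is implemented:

```python
class Replica:
    def __init__(self):
        self.data = {}


class Pair:
    """Two replicas with synchronous replication and a switchable network link."""

    def __init__(self, mode: str):
        self.mode = mode          # "CP" or "AP": what to do under partition
        self.a, self.b = Replica(), Replica()
        self.partitioned = False

    def write(self, node: Replica, key: str, value: str) -> bool:
        other = self.b if node is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                return False      # refuse the write: consistent, not available
            node.data[key] = value  # accept locally: available, replicas diverge
            return True
        node.data[key] = value
        other.data[key] = value   # synchronous write to the replica
        return True

    def consistent(self) -> bool:
        return self.a.data == self.b.data


def run(mode: str) -> tuple[bool, bool]:
    """Write once normally, then once during a partition."""
    pair = Pair(mode)
    pair.write(pair.a, "k", "v1")
    pair.partitioned = True
    accepted = pair.write(pair.a, "k", "v2")
    return accepted, pair.consistent()


for mode in ("CP", "AP"):
    accepted, consistent = run(mode)
    print(mode, "write accepted:", accepted, "replicas consistent:", consistent)
```

Once the link is cut, the write is either refused or the replicas diverge; faster storage does not add a third outcome.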

My conclusion is simple: whatever solution/product/vendor you’re going to use, the CAP theorem stands for all distributed systems.
Wednesday, 10 March 2010
MySQL and MongoDB Sitting In a Boat
An interesting post from the Lunar Logic guys about using MySQL and MongoDB for their Kanban product, how they got there, and the tools they are using.
As a personal note, I wondered how this system would be characterized in terms of CAP. It should be quite clear that we cannot speak about consistency across the two systems, as MongoDB doesn’t really support transactions (you can check these notes on MongoDB for more details). So, if their system used master-master MySQL replication and replica pairs for MongoDB, and the internal tools knew how to work with this setup, we could probably say that we have an AP system. But if any of these preconditions is not fulfilled, I’d say both A and P are lost.
via: http://lunarlogicpolska.com/blog/2010/02/15/mysql-and-mongodb-working-together-in-kanbanery
Thursday, 28 January 2010
Characterizing Enterprise Systems using the CAP theorem
When building your next distributed system, you will have to make sure that all subsystems are able to deliver the combination of consistency-availability-partition tolerance that you are looking for.
Taylor’s article is a great start for categorizing according to the CAP theorem some of the (enterprise) systems out there: Terracotta, Oracle Coherence, GigaSpaces, but also RDBMS and a couple of NoSQL solutions like Amazon Dynamo, BigTable, Cassandra, CouchDB and Project Voldemort.
Another interesting aspect of the article is that it tries to identify how these systems are coping with the missing CAP dimension. Unfortunately, there are a couple of things in the RDBMS analysis that I do not agree with.
An RDBMS provides availability, but only when there is connectivity between the client accessing the RDBMS and the RDBMS itself.
[…] there are several well-known approaches that can be employed to compensate for the lack of Partition tolerance. One of these approaches is commonly referred to as master/slave replication.
RDBMS are not available by themselves. Leaving aside the connectivity issue, RDBMS can become busy performing complex operations or run out of resources and so they can be unavailable.
What the article identifies as a solution for dealing with partition tolerance, master/slave replication, is in fact meant to provide some level of availability. But with master/slave replication, consistency becomes only “eventual consistency”.
The other approach mentioned, sharding, is indeed a solution meant to provide some level of partition tolerance. But without replication it gives up availability.
As side notes:
- it was interesting to learn that GigaSpaces can behave as either a CA or AP system, depending on the configurable replication scheme (sync vs async).
- I am wondering if there are any CP solutions out there. I’d speculate that financial services would probably be required to be CP (if distributed).
via: http://javathink.blogspot.com/2010/01/characterizing-enterprise-systems-using.html
Wednesday, 6 January 2010
Notes on Distributed Programming and CAP
Firstly, an interesting presentation by Paulo Gaspar (@paulogaspar7) on ☞ Distributed programming and data consistency
Key take-aways:
The network fallacies:
- The network is reliable
- Latency is zero
- Bandwidth is infinite
- The network is secure
- Topology doesn’t change
- There is one administrator
- Transport cost is zero
- The network is homogeneous
CAP Trade-offs:
- CA without P: Databases providing distributed transactions can only do it while their network is up
- CP without A: While there is a partition, transactions to an ACID db may be blocked until the partition heals
- AP without C: Caching provides client-server partition resilience by replicating data, even if the partition prevents verifying if a replica is fresh
Another interesting post on this topic is ☞ The CAP Theorem Distilled by Sid Anand (@r39132). Under the assumption that “any system needs to support ‘P’” (nb I am not sure why the article is limiting the analysis to this case only), the article compares ‘A’ vs ‘C’ in CAP:
If you choose ‘C’, your system might implement 2-phase commit (a.k.a. 2PC). […]
On the other hand, if you opt for an AP system, you are opening the door to potential data inconsistencies. […] AP systems can get quite complicated (relative to CP systems)
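For readers who have not met two-phase commit, here is a minimal in-process sketch. It assumes a hypothetical `Participant` class and treats an unreachable participant as a "no" vote; a real coordinator would also need timeouts, durable logging and recovery, none of which is shown:

```python
class Participant:
    def __init__(self, name: str, reachable: bool = True):
        self.name = name
        self.reachable = reachable
        self.state = "idle"

    def prepare(self) -> bool:
        """Phase 1: vote yes only if this participant can be reached."""
        if not self.reachable:
            return False
        self.state = "prepared"
        return True

    def finish(self, commit: bool) -> None:
        """Phase 2: apply the coordinator's decision, if it arrives."""
        if self.reachable:
            self.state = "committed" if commit else "aborted"


def two_phase_commit(participants: list[Participant]) -> bool:
    votes = [p.prepare() for p in participants]  # phase 1: collect votes
    decision = all(votes)                        # commit only if unanimous
    for p in participants:                       # phase 2: broadcast decision
        p.finish(decision)
    return decision


healthy = [Participant("db1"), Participant("db2")]
print(two_phase_commit(healthy), [p.state for p in healthy])

# db2 sits behind a network partition, so the transaction cannot commit.
split = [Participant("db1"), Participant("db2", reachable=False)]
print(two_phase_commit(split), [p.state for p in split])
```

The second run shows the cost of choosing ‘C’: with one participant unreachable, the write does not go through.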
In two subsequent articles, Sid explains “eventual consistency” for ☞ non-techies and for ☞ techies. I liked the way Paulo visually represented inconsistency across time in his slides:
