NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter



BASE: All content tagged as BASE in NoSQL databases and polyglot persistence

Banks and the Ethernal Consistency Example or What trumps consistency

Todd Hoff extracts and expands on some thoughts about BASE vs ACID from Eric Brewer’s NoSQL: Past, Present, Future published on InfoQ:

Consistency it turns out is not the Holy Grail. What trumps consistency is:

  • Auditing
  • Risk Management
  • Availability

But the cornerstone of the availability vs consistency conversation is:

Availability correlates with revenue and consistency generally does not.

✚ Over time Michael Stonebraker has been the most prominent supporter of exactly the opposite argument.

✚ Remember Emin Gün Sirer’s The NoSQL Partition Tolerance Myth? He used the bank example too.

Original title and link: Banks and the Ethernal Consistency Example or What trumps consistency (NoSQL database©myNoSQL)


9 Things to Acknowledge about NoSQL Databases

Excellent list:

  1. Understand how ACID compares with BASE (Basically Available, Soft-state, Eventually Consistent)
  2. Understand persistence vs non-persistence, i.e., some NoSQL technologies are entirely in-memory data stores
  3. Recognize there are entirely different data models from traditional normalized tabular formats: Columnar (Cassandra) vs key/value (Memcached) vs document-oriented (CouchDB) vs graph oriented (Neo4j)
  4. Be ready to deal with no standard interface like JDBC/ODBC or standarized query language like SQL; every NoSQL tool has a different interface
  5. Architects: rewire your brain to the fact that web-scale/large-scale NoSQL systems are distributed across dozens to hundreds of servers and networks as opposed to a shared database system
  6. Get used to the possibly uncomfortable realization that you won’t know where data lives (most of the time)
  7. Get used to the fact that data may not always be consistent; ‘eventually consistent’ is one of the key elements of the BASE model
  8. Get used to the fact that data may not always be available
  9. Understand that some solutions are partition-tolerant and some are not

Print it out and distribute it among your colleagues.

Original title and link: 9 Things to Acknowledge about NoSQL Databases (NoSQL databases © myNoSQL)


On NoSQL Databases and ACID

Dan Weinreb:

The “NoSQL” systems are ACID, as long as you accept that a transaction can only perform one operation, in the sense that the only thing that gets in the way of being ACID is when there are network partitions and the system is called upon to perform operations while the partition is still there.

That’s why they are said to be ☞ BASE (basically available, soft state, eventually consistent)

Original title and link for this post: On NoSQL Databases and ACID (published on the NoSQL blog: myNoSQL)


Notes on Distributed Programming and CAP

Firstly, an interesting presentation by Paulo Gaspar (@paulogaspar7) on ☞ Distributed programming and data consistency

Key take-aways:

  • The network falacies:

    1. The network is reliable
    2. Latency is zero
    3. Bandwith is infinite
    4. The network is secure
    5. Topology doesn’t change
    6. There is one administrator
    7. Transport cost is zero
    8. The network is homogenous
  • CAP Trade-offs:

    • CA without P: Databases providing distributed transactions can only do it while their network is up
    • CP without A: While there is a partition, transactions to an ACID db may be blocked until the partition heals
    • AP without C: Caching provides client-server partition resilience by replicating data, even if the partition prevents verifying if a replica is fresh

Another interesting post on this topic, is ☞ The CAP Theorem Distilled by Sid Anand (@r39132). Under the assumption that “any system needs to support ‘P’” (nb I am not sure why the article is limiting the analysis to this case only), the article compares ‘A’ vs ‘C’ in CAP:

If you choose ‘C’, your system might implement 2-phase commit (a.k.a 2PC) . […]

On the other hand, if you opt for an AP system, you are opening the door to potential data inconsistencies. […] AP systems can get quite complicated (relative to CP systems)

In two subsequent articles, Sid is explaining “eventual consistency” for ☞ non-techies and for ☞ techies . I liked the way Paulo visually represented inconsistency across time in his slides: