Addressing the NoSQL Criticism

Bill Karwin’s comment is spot on[1]:

First, NoSQL is a marketing term, not a technology term. Once we acknowledge that, all the arguments about “what exactly is NoSQL” are moot.

Second, non-relational data management does optimize better than relational data management — but only for a subset of usage of the data. The process of designing a non-relational database is the same as designing a denormalized relational database, e.g. a Data Warehouse. You need to itemize the queries you run against the non-relational data store, and design the data store with those queries in mind.

It’s not surprising that NoSQL can be faster and more scalable for those queries, if you do this right. So can DW be very scalable, but with similar limitations on the queries that are served by a given DW schema.

It’s also very easy to get your NoSQL database design wrong if you skip your query analysis step, because one believes the marketing message of “just start putting data in.” This explains some of the disappointments over the past couple of years when companies tried to use NoSQL as a drop-in replacement for RDBMS.

One workaround is to store data redundantly, in different document collections optimized to serve different query patterns. This is also similar to DW, materialized views, or other denormalizations.

The problem with NoSQL is not the technology. It’s fine technology when used appropriately, and it fits an important specialty role in data management. The problem is the hype, the advocacy, and the claims that one can get optimization for free without doing analysis. TANSTAAFL is still true.
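
To make the query-first design and the redundant-storage workaround from the comment concrete, here is a minimal sketch in Python. It is not tied to any particular NoSQL product: plain dictionaries stand in for document collections, and the order records, field names, and helper functions (`insert_order`, `customer_orders`, `product_revenue`) are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical order records; the field names are illustrative only.
orders = [
    {"order_id": 1, "customer": "alice", "product": "widget", "total": 30},
    {"order_id": 2, "customer": "bob", "product": "widget", "total": 15},
    {"order_id": 3, "customer": "alice", "product": "gadget", "total": 50},
]

# Step 1: itemize the queries the application must serve.
#   Q1: all orders placed by a given customer
#   Q2: total revenue for a given product

# Step 2: keep one "collection" per query, keyed so that each query
# becomes a single direct lookup instead of a scan.
orders_by_customer = defaultdict(list)  # serves Q1
revenue_by_product = defaultdict(int)   # serves Q2


def insert_order(order):
    """Write the same order redundantly into every query-specific collection."""
    orders_by_customer[order["customer"]].append(order)
    revenue_by_product[order["product"]] += order["total"]


def customer_orders(customer):
    """Q1: one lookup in the collection keyed by customer."""
    return orders_by_customer.get(customer, [])


def product_revenue(product):
    """Q2: one lookup in the pre-aggregated collection keyed by product."""
    return revenue_by_product.get(product, 0)


for order in orders:
    insert_order(order)

print([o["order_id"] for o in customer_orders("alice")])
print(product_revenue("widget"))
```

Both anticipated queries are cheap here, but only because every write pays for both collections. A query nobody itemized up front, such as "all orders above a given total", would need either a full scan or yet another redundant collection, which is the trade-off the comment describes.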

  1. Bradley Holt’s post is partially missing the big picture by focusing too much on CouchDB features.  

Original title and link: Addressing the NoSQL Criticism (NoSQL database©myNoSQL)