Data Grid or NoSQL? What are the common points? The main differences?

A great post by Olivier Mallassi on a topic that comes up very often: how do data grids and NoSQL databases compare?

  • Data grids let you control the way data is stored. They all have a default implementation (Gigaspaces offers RDBMS by default, Gemfire offers file- and disk-based storage by default…), but in all cases you can choose the one that fits your needs: do you need to store data, do you need to relieve the existing databases…
  • In order to minimize latency, data grids let you store data on disk synchronously (write-through) or asynchronously (write-behind). You can also define overflow strategies. In that case, data is stored in memory up to a threshold, beyond which it is flushed to disk (following algorithms like LRU…). NoSQL solutions have not been designed to provide these features.
  • Data grids let you develop event-driven architectures.
  • Querying is maybe the point on which pure NoSQL solutions and data grids are merging.
  • Data grids enable near-cache topologies.
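
To make the write-through, write-behind, and overflow points above concrete, here is a minimal sketch in Python. It is not the API of Gigaspaces, Gemfire, or any other product: `TinyGrid`, `backing`, `pending`, and `flush` are hypothetical names, and a plain dict stands in for the disk or RDBMS behind the grid.

```python
from collections import OrderedDict


class TinyGrid:
    """Toy in-memory store; a dict stands in for the disk or RDBMS behind it."""

    def __init__(self, capacity, write_behind=False):
        self.capacity = capacity          # max entries kept in memory
        self.write_behind = write_behind  # False means write-through
        self.memory = OrderedDict()       # ordered least to most recently used
        self.backing = {}                 # hypothetical durable store
        self.pending = {}                 # writes not yet persisted (write-behind)

    def put(self, key, value):
        self.memory[key] = value
        self.memory.move_to_end(key)      # mark as most recently used
        if self.write_behind:
            self.pending[key] = value     # persisted later, by flush() or overflow
        else:
            self.backing[key] = value     # persisted before put() returns
        self._overflow()

    def get(self, key):
        if key not in self.memory:
            if key not in self.backing:
                return None
            self.memory[key] = self.backing[key]  # reload an overflowed entry
        self.memory.move_to_end(key)
        value = self.memory[key]
        self._overflow()
        return value

    def flush(self):
        """Persist all pending write-behind entries."""
        self.backing.update(self.pending)
        self.pending.clear()

    def _overflow(self):
        # Past the threshold, push least recently used entries out of memory.
        while len(self.memory) > self.capacity:
            key, _ = self.memory.popitem(last=False)
            if key in self.pending:
                # Persist an unflushed entry before dropping it from memory.
                self.backing[key] = self.pending.pop(key)
```

With `write_behind=True`, a `put` returns without touching the backing store, which is where the latency gain comes from; the price is that anything still in `pending` is lost if the process dies before `flush()` runs.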

Taking a step back, you’ll notice that there are actually more similarities than differences. While Olivier Mallassi lists the points above as features that make data grids more configurable, and so more adaptable, some of them also exist in NoSQL databases, in different forms:

  1. Pluggable storage backends. Not many NoSQL databases have this feature, but Riak and Project Voldemort offer different backends that are optimized for specific scenarios.
  2. Replicated and durable writes. Not the same as synchronous vs. asynchronous writes, but a different perspective on writes.
  3. Notification mechanisms. Once again, not all NoSQL databases support notification mechanisms, but a couple of them offer some interesting approaches:
    1. CouchDB: _changes feed with filters
    2. Riak: pre-commit and post-commit hooks
    3. HBase: coprocessors
  4. Most NoSQL databases have local per-node caches.
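
The notification mechanisms in point 3 all come down to running user code around a write. Here is a rough sketch of the pre-commit/post-commit idea in Python. It only illustrates the pattern and is not Riak's hook API, CouchDB's `_changes` feed, or HBase's coprocessor interface; `HookedStore` and its methods are made-up names.

```python
class HookedStore:
    """Toy key-value store that runs callbacks before and after each write."""

    def __init__(self):
        self.data = {}
        self.pre_commit = []   # callables (key, value) -> bool; False rejects the write
        self.post_commit = []  # callables (key, value) -> None; called after the write

    def put(self, key, value):
        # Pre-commit hooks can veto the write, e.g. for validation.
        for hook in self.pre_commit:
            if not hook(key, value):
                return False
        self.data[key] = value
        # Post-commit hooks see only writes that actually happened.
        for hook in self.post_commit:
            hook(key, value)
        return True


store = HookedStore()
events = []  # stands in for a listener, a queue, or a downstream system

store.pre_commit.append(lambda key, value: value is not None)  # reject empty values
store.post_commit.append(lambda key, value: events.append((key, value)))

accepted = store.put("user:1", {"name": "Ada"})
rejected = store.put("user:2", None)
```

After the two calls, `events` holds a single entry for `user:1`: the rejected write never reached the store, so no notification was sent for it.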

With these, I’ve probably made things even blurrier. But let me try to draw a line between data grids and NoSQL databases. Data grids are optimized for handling data in memory. Everything that spills over is secondary. On the other hand, NoSQL databases are for storing data. Thus they focus on how they organize data (on disk or in memory) and optimize access to it. Data grids are a processing/architectural model. NoSQL databases are storage solutions.

Original title and link: Data Grid or NoSQL? What are the common points? The main differences? (NoSQL database©myNoSQL)