Paper: Principles of Distributed Data Management in 2020?

Patrick Valduriez, co-author of the “Principles of Distributed Database Systems” book, has published a paper, Principles of Distributed Data Management in 2020? (pdf), which breaks the main topic down into the following three questions:

  1. What are the fundamental principles behind the emerging solutions?
  2. Is there any generic architectural model, to explain those principles?
  3. Do we need new foundations to look at data distribution?

Wrt (2), I showed that emerging solutions can still be explained along the three main dimensions of distributed data management (distribution, autonomy, heterogeneity), yet pushing the scales of the dimensions high up. However, I raised the question of how generic should distributed data management be, without hampering application-specific optimizations. Emerging NOSQL solutions tend to rely on a specific data model (e.g. Bigtable, MapReduce) with a simple set of operators easy to use from or with a programming language. It is also interesting to witness the development of algebras, with specific operators, to raise the level of abstraction in a way that enables optimization [9]. What is missing to explain the principles of emerging solutions is one or more dimensions on generic/specific data model and data processing.
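To make the "simple set of operators easy to use from or with a programming language" point concrete, here is a minimal single-process sketch of the MapReduce model in Python. It is not taken from the paper and it is not how Hadoop or any other real system is implemented. The names `map_reduce`, `word_mapper`, `count_reducer`, and `word_count` are hypothetical, chosen only for illustration, and the sketch ignores distribution, partitioning, and fault tolerance entirely.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Toy, in-memory MapReduce: map each record, group by key, reduce each group.

    Illustrative assumption: everything fits in memory on one machine.
    """
    groups = defaultdict(list)
    for record in records:
        # The mapper emits zero or more (key, value) pairs per input record.
        for key, value in mapper(record):
            groups[key].append(value)
    # The reducer folds all values collected for a key into a single result.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def count_reducer(key, values):
    """Sum the counts emitted for one word."""
    return sum(values)


def word_count(lines):
    """Classic word count expressed with the two operators above."""
    return map_reduce(lines, word_mapper, count_reducer)


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

The application only supplies two small functions. That narrow operator set is what makes the model easy to call from a programming language, and it is also the kind of application-specific shape the paper contrasts with a generic data management layer.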

I think the paper actually answers two different questions. They are a bit less generic, but still useful in showing that the new generation of distributed database systems was clearly triggered by new requirements and by the evolution of today’s applications:

  1. Is there a need for new approaches in distributed data management systems?
  2. What are some of the approaches used by the emerging solutions to deal with the challenges posed by today’s data-intensive applications?

You can read or download Patrick Valduriez’s paper here:

Original title and link: Paper: Principles of Distributed Data Management in 2020? (NoSQL database©myNoSQL)