Paper: Principles of Distributed Data Management in 2020?
Patrick Valduriez, co-author of the “Principles of Distributed Database Systems” book, has published a paper, Principles of Distributed Data Management in 2020? (pdf), that breaks its main topic down into the following three questions:
1. What are the fundamental principles behind the emerging solutions?
2. Is there any generic architectural model to explain those principles?
3. Do we need new foundations to look at data distribution?
Wrt (2), I showed that emerging solutions can still be explained along the three main dimensions of distributed data management (distribution, autonomy, heterogeneity), yet pushing the scales of the dimensions high up. However, I raised the question of how generic should distributed data management be, without hampering application-specific optimizations. Emerging NOSQL solutions tend to rely on a specific data model (e.g. Bigtable, MapReduce) with a simple set of operators easy to use from or with a programming language. It is also interesting to witness the development of algebras, with specific operators, to raise the level of abstraction in a way that enables optimization [9]. What is missing to explain the principles of emerging solutions is one or more dimensions on generic/specific data model and data processing.
What I think this paper actually does is look at two different questions, a bit less generic but still useful in showing that the new generation of distributed database systems was clearly triggered by new requirements and the evolution of current applications:
- Is there a need for new approaches in distributed data management systems?
- What are some of the approaches used by the emerging solutions to deal with the challenges posed by today’s data-intensive applications?