nosql theory: All content tagged as nosql theory in NoSQL databases and polyglot persistence
Scroll to minute 16:55 of this video to watch Jim Webber explain the benefits of polyglot persistence and how starting (again) the winner-takes-it-all war is just sending us back at least 10 years from the database Nirvana.
We’ve just come from the place where one-size-fits-all and we don’t want to go back there. There is a huge wonderful ecosystem of stores. Pick the right one. Don’t just assume that the one you find the easiest or the one that shouts the loudest is the one you’re going to use. Pick the one that suits your data model.
It doesn’t matter what flavor of relational or NoSQL database you prefer or have experience with or if a small or large database vendor is paying your bills. You really need to get this right as otherwise we’re just going to destroy a lot of valuable options we’ve added to our toolboxes.
Original title and link: The Database Nirvana ( ©myNoSQL)
Besides the many practical lessons emphasized in Jack Clark’s interview with Adrian Cockcroft on ZDNet—luckly I’ve had the chance to see some of Cockcroft’s presentations about Netflix architecture and also talk to him directly—one thing that sticked with me was the ending paragraph:
The thing I’ve been publicly asking for has been better IO in the cloud. Obviously I want SSDs in there. We’ve been asking cloud vendors to do that for a while. With Cassandra, we’ve had to go onto horizontal scale and use the internal disks and triple replicate across availability zones, so you end up with a triple-redundant data store that is careful not to overload the disks.
That reminded me of this old ACM article authored by Brendan Gregg:
When I/O latency is presented as a visual heat map, some intriguing and beautiful patterns can emerge. These patterns provide insight into how a system is actually performing and what kinds of latency end-user applications experience. Many characteristics seen in these patterns are still not understood, but so far their analysis is revealing systemic behaviors that were previously unknown.
I was wondering if in the NoSQL databases space (and data storage space in general) are there any of the monitoring tools that provide such advanced visualization of latency data. Do you know any?
Original title and link: Visualizing System Latency ( ©myNoSQL)
Abel Perez in a post about Cassandra:
Horizontal scalability boils down to the ability to add new hardware to a system without any interruption or downtime. An ideal horizontally scalable system does not require reconfiguration and supports incremental addition of hardware.
Nope. This is the definition of elasticity.
Horizontal scalability is the capability of a system to accept adding or removing multiple nodes (independent units of resources) and making them work as a single system. The scalability of a system can be further categorized as: negative, sub-linear, linear, or supra-linear depending on the shape of the performace1/nodes curve
This is the part where things can get more complicated as there are multiple ways to characterize the performance of a system (e.g. throughput, latency, etc.) ↩
Original title and link: Horizontal Scalability vs Elasticity ( ©myNoSQL)
This question was posted on LinkedIn:
During my research I came across a new database technology called ‘NuoDB’. It seems to share many attributes of NoSQL databases and still maintain a SQL query interface. It uses a key value store as its data storage engine. It also promises the performance and scale of NoSQL databases.
Enterprises that have not yet embraced NoSQL, will be inclined to try this option before going NoSQL way in my opinion. Mainly because it does not require them to change their database interface layer drastically and also because NoSQL databases have not moved towards a standards based query interface yet.
For an outsider this comment might look extremely valid. I mean who in his right mind would give away all the expertise and tools and history of SQL for something like NoSQL?
But the real answer is in the details. “It seems to share many attributes of NoSQL databases” . Ask yourself what are these shared attributes:
- what is the supported data model? The relational model advantages have been discussed over and over for the last 30 years. But there are alternative data models that bring different
- what is the persistence model? Is it disk based, memory based, cluster based? IS it durable?
- what is the distribution model? Is it master-slave or master-master or peer-to-peer or masterless?
- what are the scalability characteristics of the system?
- what are the elasticity characteristics of the system?
In the only comment worth reading, Stefan Edlich correctly points to the tons of NewSQL solutions. Before asking if these systems “pose a threat” to NoSQL databases, I’d firstly ask if they are at least a threat to the existing relational databases first. And the answer is no.
Sid Anand wrote in the State of NoSQL 2012 post:
Many of the NoSQL vendors view the “battle of NoSQL” to be akin to the RDBMS battle of the 80s, a winner-take-all battle. In the NoSQL world, it is by no means a winner-take-all battle. Distributed Systems are about compromises.
I’d go even further and say that data storage is not anymore a winner-takes-all battle. Actually it’s not even a zero-sum game. We are living the polyglot persistence age.
Original title and link: Threat to NoSQL Database? ( ©myNoSQL)