NoSQL comparison: All content tagged as NoSQL comparison in NoSQL databases and polyglot persistence
A long and interesting discussion on comparing Redis and Memcached performance. It all started ☞ here:
After crunching all of these numbers and screwing around with the annoying intricacies of OpenOffice, I’m giving Redis a big thumbs down. My initial sexual arousal from the feature list is long gone. Granted, Redis might have its place in a large architecture, but certainly not a replacement to memcache. When your site is hammering 20,000 keys per second and memcache latency is heavily dependent on delivery times, it makes no business sense to transparently drop in Redis. The features are neat, and the extra data structures could be used to offload more RDBMS activity… but 20% is just too much to gamble on the heart of your architecture.
Salvatore Sanfilippo ☞ followed up:
[…] this is why the sys/toilet benchmark is ill conceived.
- All the tests are run using a single client into a busy loop.
- when you run single clients benchmarks what you are metering actually is, also: the round trip time between the client and the server, and all the other kind of latencies involved, and of course, the speed of the client library implementation.
- The test was performed with very different client libraries
But he also published a new benchmark. And Dormando ☞ published an update picking on the previous two:
The “toilet” bench and antirez’s benches both share a common issue; they’re busy-looping a single client process against a single daemon server. The antirez benchmark is written much better than the original one; it tries to be asyncronous and is much more efficient.
And it didn’t stop here, as Salvatore felt ☞ something was still missing:
The test performed by @dormando was missing an interesting benchmark, that is, given that Redis is single threaded, what happens if I run an instance of Redis per core?
I assume everyone is asking by now: which one of Redis and Memcached performed better? And the answer is: it depends (even if some would like to believe differently).
But why is this the “answer”? Firstly, because creating good benchmarks is really difficult. Most of the benchmarks are focusing on the wrong thing or they are covering not very real-life like problems.
This would be my very simple advise:
- basic benchmarks will not give you real answers
- you are better testing for your very specific scenario (data size, concurrency level,
Some say it is the right time to start having these around. Others are saying it’s way to early to start the “battle”. Users do want to see them and in case they’re lacking they create their own, most of the time using incomplete or wrong approaches.
But what am I talking about? As some of you might have guessed already:
But users are more interested in seeing cross product benchmarks, even if most of the time constructing these is extremely complicated and they end up comparing apples with oranges.
All these being said and accepting that most of the time someone will figure out a way to invalidate the results, lets see what cross product benchmarks do we have in the NoSQL space.
Yahoo! Cloud Serving Benchmark
The Yahoo! Cloud Serving Benchmark’s goal is to facilitate performance comparisons of the new generation of cloud data serving systems. The source code is available on ☞ GitHub and Yahoo! has also published ☞ the results of running this benchmark against Cassandra, HBase, Yahoo!’s PNUTS, and a simple sharded MySQL implementation.
VoltDB a new storage solution that calls itself the next-generation SQL RDBMS with ACID for fast-scaling OLTP applications has recently ☞ published the results of their benchmark comparing VoltDB and Cassandra.
It is worth noting that while being one of those apples to oranges comparisons (nb and the authors are well aware of it), there are still a couple of interesting and useful things to be learned from it (i.e. benchmarking procedure, tested scenarios, etc.)
Unfortunately at this time the source code is not yet available, but hopefully we will see it soon:
Going forward, we’re planning to release the code we used to do these benchmarks. We’d also like to try a few other storage layers
Hypertable and HBase Performance Evaluation
The guys behind Hypertable ☞ have published their results of comparing Hypertable with HBase using a benchmark based on the Google BigTable paper from which both HBase and Hypertable are inheriting their architecture.
Unfortunately, the benchmark code is not available at this moment.
So, as far as I could gather we have:
- ☞ Riak internal benchmark
- ☞ MongoDB internal benchmark
- ☞ Yahoo! Cloud Serving Benchmark
- results only of VoltDB Benchmark comparing VoltDB and Cassandra
- BigTable-inspired benchmark comparing Hypertable and HBase
Did I miss any?