benchmark: All content tagged as benchmark in NoSQL databases and polyglot persistence
A paper by Md. Borhan Uddin, Bo He, and Radu Sion:
Experiments were performed to benchmark the Amazon Relational Database Service (RDS) within a TPC-C benchmarking framework. The TPC-C benchmark is one of the most widely adopted database performance benchmarking frameworks comparing OLTP performance of online transaction processing systems. Two types of Amazon RDS services were tested, namely the standard RDS (single availability zone) and the Multi-AZ RDS (synchronous ‘standby’ replica in multiple availability zones). For each service type, five different RDS instances were tested: Small, Large, Extra Large (XLarge), Double Extra Large (2XLarge), and Quadruple Extra Large (4XLarge).
Results are interesting to say the least:
Overall, we observed that at a very low load, the resulting throughput was also relatively low; at medium load, the throughput increased to a peak; at very high loads, the throughput decreased again.
Redis ☞ SINTER (set intersection) operation benchmarked. It is an O(N * M) operation:
The complete set of benchmark results and the program i ran is at the bottom, but the results i care about are these:
taking the intersection of
- 50,000 x 5,000 took 4 ms
- 50,000 x 400 took 0.7 ms
- 50,000 x 30 took 0.4 ms
The question is: how often do you need to perform set intersections at read time instead of pre-computing them?
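To get a feel for why the size of the smaller set dominates, here is a minimal in-process sketch of the scan-the-smallest-set approach that the O(N * M) bound describes (N being the cardinality of the smallest set, M the number of sets). It uses plain Python sets, not a Redis server, so the timings it prints are not comparable to the numbers quoted above; the function names `sinter` and `time_intersection` are made up for this illustration.

```python
import random
import time


def sinter(*sets):
    """Intersect sets by scanning the smallest one and probing the others."""
    if not sets:
        return set()
    ordered = sorted(sets, key=len)
    smallest, others = ordered[0], ordered[1:]
    # With N = len(smallest) and M = number of sets, this performs
    # at most N * (M - 1) membership probes.
    return {member for member in smallest if all(member in s for s in others)}


def time_intersection(big_size, small_size, repeats=100, seed=0):
    """Return mean wall-clock seconds per intersection of two random sets."""
    rng = random.Random(seed)
    universe = range(big_size * 2)
    big = set(rng.sample(universe, big_size))
    small = set(rng.sample(universe, small_size))
    start = time.perf_counter()
    for _ in range(repeats):
        sinter(big, small)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    # Same set sizes as in the quoted results; absolute times will differ.
    for small_size in (5000, 400, 30):
        seconds = time_intersection(50_000, small_size)
        print(f"50,000 x {small_size:,}: {seconds * 1000:.3f} ms")
```

Running it against your own set sizes is a cheap first check before deciding whether an intersection is worth pre-computing.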
Original title and link: Measuring Redis SINTER/Set Intersection Performance (NoSQL databases © myNoSQL)
A long and interesting discussion on comparing Redis and Memcached performance. It all started ☞ here:
After crunching all of these numbers and screwing around with the annoying intricacies of OpenOffice, I’m giving Redis a big thumbs down. My initial sexual arousal from the feature list is long gone. Granted, Redis might have its place in a large architecture, but certainly not a replacement to memcache. When your site is hammering 20,000 keys per second and memcache latency is heavily dependent on delivery times, it makes no business sense to transparently drop in Redis. The features are neat, and the extra data structures could be used to offload more RDBMS activity… but 20% is just too much to gamble on the heart of your architecture.
Salvatore Sanfilippo ☞ followed up:
[…] this is why the sys/toilet benchmark is ill conceived.
- All the tests are run using a single client into a busy loop.
- when you run single clients benchmarks what you are metering actually is, also: the round trip time between the client and the server, and all the other kind of latencies involved, and of course, the speed of the client library implementation.
- The test was performed with very different client libraries
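The first two points can be illustrated with a back-of-the-envelope model. This is only a sketch under simplifying assumptions (synchronous clients that wait for each reply, a single-threaded server, no pipelining), and the latency figures in it are hypothetical, not taken from any of the benchmarks discussed here.

```python
def max_ops_per_second(round_trip_ms, server_time_ms, clients=1):
    """Rough throughput ceiling for clients that wait for each reply.

    Each client completes one request per (round trip + server time), and a
    single-threaded server cannot exceed one request per server_time_ms.
    """
    per_client = 1000.0 / (round_trip_ms + server_time_ms)
    server_cap = 1000.0 / server_time_ms
    return min(clients * per_client, server_cap)


if __name__ == "__main__":
    # Hypothetical numbers: 0.1 ms network round trip, 0.01 ms of server work.
    baseline = max_ops_per_second(0.1, 0.01, clients=1)
    faster_server = max_ops_per_second(0.1, 0.005, clients=1)
    many_clients = max_ops_per_second(0.1, 0.01, clients=50)
    print(f"1 client:                 {baseline:,.0f} ops/s")
    print(f"1 client, 2x faster srv:  {faster_server:,.0f} ops/s")
    print(f"50 clients:               {many_clients:,.0f} ops/s")
```

With one client in a busy loop, making the server twice as fast barely moves the result, because the round trip dominates each request. Only with many concurrent clients does the server itself become the limit.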
But he also published a new benchmark. And Dormando ☞ published an update picking on the previous two:
The “toilet” bench and antirez’s benches both share a common issue; they’re busy-looping a single client process against a single daemon server. The antirez benchmark is written much better than the original one; it tries to be asynchronous and is much more efficient.
And it didn’t stop here, as Salvatore felt ☞ something was still missing:
The test performed by @dormando was missing an interesting benchmark, that is, given that Redis is single threaded, what happens if I run an instance of Redis per core?
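Running one instance per core means something has to decide which instance owns which key. The quoted post does not say how keys were distributed, so the following is just one possible sketch: client-side sharding by hashing the key. The port layout (consecutive ports starting at 6379) and the helper names are assumptions for illustration.

```python
import zlib


def instance_ports(cores, base_port=6379):
    """One hypothetical Redis instance per core, on consecutive ports."""
    return [base_port + i for i in range(cores)]


def instance_for_key(key, ports):
    """Pick the instance that owns a key using a stable CRC32 hash."""
    index = zlib.crc32(key.encode("utf-8")) % len(ports)
    return ports[index]


if __name__ == "__main__":
    ports = instance_ports(cores=4)
    for key in ("user:1", "user:2", "session:abc"):
        print(key, "->", instance_for_key(key, ports))
```

A plain modulo like this reshuffles most keys when the number of instances changes, which is acceptable for a benchmark but is usually replaced by consistent hashing in production setups.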
I assume everyone is asking by now: which one of Redis and Memcached performed better? And the answer is: it depends (even if some would like to believe differently).
But why is this the “answer”? Firstly, because creating good benchmarks is really difficult. Most benchmarks focus on the wrong thing or cover problems that are not very realistic.
This would be my very simple advice:
- basic benchmarks will not give you real answers
- you are better off testing for your very specific scenario (data size, concurrency level,
There is a new commit to YCSB […] This fixes performance problems in the HBase DB adapter. In my own tests I found that my short scans, which were configured to read 100-column rows, 1-300 in zipfian, went from 60ms to 35ms.
Also there is column selection pushdown enabled, which will improve the speed of any tests that are doing single column gets on a wide row (eg: readallfields=false, fieldcount=X). This is all due to changing how YCSB uses the Result object. Check out the commit for some hints. I have a longer email and patch about this stuff coming really soon.
YCSB is probably the most complete and correct NoSQL benchmark. And going from 60ms to 35ms is roughly a 40% reduction in scan latency.
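The two figures quoted above (60ms before, 35ms after) can be read either as a latency reduction or as a speedup, and the two give different percentages. A small worked calculation, with helper names chosen only for this example:

```python
def latency_reduction(before_ms, after_ms):
    """Fraction of the original latency that was removed."""
    return (before_ms - after_ms) / before_ms


def speedup(before_ms, after_ms):
    """How many times faster the operation is after the change."""
    return before_ms / after_ms


if __name__ == "__main__":
    before, after = 60, 35  # scan latencies quoted in the post, in ms
    print(f"latency reduction: {latency_reduction(before, after):.1%}")
    print(f"speedup:           {speedup(before, after):.2f}x")
```

So the scans take a bit over 40% less time, which is the same thing as completing about 1.7 times as many scans in the same period, assuming nothing else is the bottleneck.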
Original title and link: New HBase YCSB changes - improves speed drastically (NoSQL databases © myNoSQL)