performance: All content tagged as performance in NoSQL databases and polyglot persistence
How many times have you got an answer that applies to your specific scenario when providing a short list of performance and scalability requirements? MySQL/InnoDB can do 750k qps, Cassandra is scaling linearly, MongoDB can do 8 mil ops/s. Are any of these the answer for your application?
How many times did you get all the requirements right at the spec time?
How many times did requirements remain the same during the development cycle?
How many times did production reality confirm your bullet list requirements?
Original title and link: Asking for Performance and Scalability Advice on StackOverflow (©myNoSQL)
These slides have generated quite a reaction on Twitter. I’ll let you decide for yourself the reasons:
While there have been lots of retweets, here’s just a glimpse of what type of reactions I’m referring to:
Judging by the number of posts I’ve seen around, I’d guess you’ve already heard about the MongoDB 1.4 release. Anyway, I definitely had to include it here, as myNoSQL covers all major NoSQL projects and closely follows all things related to the NoSQL ecosystem.
- background indexing and indexing improvements
- concurrency improvements
- the lack of autosharding (still alpha, still pushing, still…)
- the lack of improvements or alternatives for the MongoDB durability tradeoff
Speaking of performance, the 10gen people have run some benchmarks comparing MongoDB 1.2 with MongoDB 1.4. With a couple of exceptions, performance hasn’t improved radically, so I’d speculate that there is still a lot of locking involved. The benchmark source code was made available so you can dig deeper into it.
All in all, good and exciting news for the NoSQL world!
After posting about Scott Motte’s comparison of MongoDB and CouchDB, I thought there should be some more informative sources out there, so I’ve started to dig.
The first I came upon (thanks to Debasish Ghosh @debasishg) is an article about ☞ Raindrop requirements and the issues faced while attacking them with CouchDB and the pros and cons of possibly replacing CouchDB with MongoDB:
- Uses update-in-place, so the file system impact/need for compaction is less, and storing our schemas in one document is likely to work better.
- Queries are done at runtime. Some indexes are still helpful to set up ahead of time though.
- Has a binary format for passing data around. One of the issues we have seen is the JSON encode/decode times as data passes around through couch and to our API layer. This may be improving though.
- Uses language-specific drivers. While the simplicity of REST with CouchDB sounds nice, due to our data model, the megaview and now needing a server API layer means that querying the raw couch with REST calls is actually not that useful. The harder issue is trying to figure out the right queries to do and how to do the “joins” effectively in our API app code.
- easy master-master replication. However, for me personally, this is not so important. […] So while we need backups, we probably are fine with master-slave. To support the sometimes-offline case, I think it is more likely that using HTML5 local storage is the path there. But again, that is just my opinion.
Anyway, while some of the points above are generic, you should consider them from the perspective of the Raindrop requirements, which you can read more about here.
I’d also mention this ☞ benchmark, which compares the performance of MongoDB, CouchDB, and Tokyo Cabinet/Tyrant and uses MySQL results as a reference (note: the author of the benchmark categorizes Tokyo Cabinet as a document database, while Tokyo is a key-value store).
In case you have other resources that you think would be worth including, do not hesitate to send them over.
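To make the point about doing the “joins” in API app code a bit more concrete, here is a minimal sketch in plain Python. The documents, field names (`conversation_id`, `title`, `subject`) and the `join_messages` helper are all hypothetical and are not taken from Raindrop's actual schema or code; the idea is only that the application, not the document store, stitches related documents together.

```python
from collections import defaultdict

# Hypothetical documents, as they might come back from two separate queries.
# Field names are illustrative only, not Raindrop's real schema.
conversations = [
    {"_id": "c1", "title": "Greetings"},
    {"_id": "c2", "title": "Plans"},
]
messages = [
    {"_id": "m1", "conversation_id": "c1", "subject": "Hello"},
    {"_id": "m2", "conversation_id": "c1", "subject": "Re: Hello"},
    {"_id": "m3", "conversation_id": "c2", "subject": "Lunch?"},
]


def join_messages(conversations, messages):
    """Attach each conversation's messages to it, in application code."""
    # Group messages by the conversation they point at (one pass).
    by_conversation = defaultdict(list)
    for message in messages:
        by_conversation[message["conversation_id"]].append(message)

    # Build new dicts so the input documents are left untouched.
    return [
        {**conv, "messages": by_conversation.get(conv["_id"], [])}
        for conv in conversations
    ]


joined = join_messages(conversations, messages)
for conv in joined:
    print(conv["title"], [m["subject"] for m in conv["messages"]])
```

Grouping first and then attaching keeps the work linear in the number of documents. The hard part the quote refers to, figuring out which queries to run to fetch the two sets in the first place, is not something this sketch addresses.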
Update: Just found a nice comparison matrix.
As a teaser, very soon I will introduce you to a new solution available in this space, so make sure to check MyNoSQL regularly.
Update: The main article about this new document store has been published: Terrastore: A Consistent, Partitioned and Elastic Document Database. I would strongly encourage you to check it out, as Terrastore is looking quite promising.