bigcouch: All content tagged as bigcouch in NoSQL databases and polyglot persistence
Several of the most notable NoSQL databases targeting large, scalable systems are written in Java, among them Cassandra and HBase. Then there’s also Hadoop, plus a series of caching and data grid solutions like Terracotta and GigaSpaces. They all face the same challenge: tuning the JVM garbage collector for predictable latency and throughput.
Jonathan Ellis’s slides, presented at FOSDEM 2012, cover some of the problems with GC and the way Cassandra tackles them. While this is one of those presentations where the slides alone are not enough to understand the full picture, going through them will still give you a couple of good hints.
For those saying that Java and the JVM are not the platform for writing large concurrent systems, here’s the quote Ellis finishes his slides with:
Cliff Click: Many concurrent algorithms are very easy to write with a GC and totally hard (to down right impossible) using explicit free.
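To make Click’s point concrete, here is a minimal sketch of a lock-free stack (commonly called a Treiber stack) in Java. It is not taken from the slides; the class and method names are illustrative. The thing to notice is what is missing: there is no code to decide when a popped node may be freed, because the garbage collector only reclaims a node once no thread still holds a reference to it.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative lock-free stack; names are hypothetical, not from the slides. */
class TreiberStack<T> {
    /** Immutable node: once published it is never modified. */
    private static final class Node<T> {
        final T value;
        final Node<T> next;

        Node(T value, Node<T> next) {
            this.value = value;
            this.next = next;
        }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    /** Pushes a value, retrying the compare-and-set until it wins. */
    public void push(T value) {
        Node<T> oldHead;
        Node<T> newHead;
        do {
            oldHead = head.get();
            newHead = new Node<>(value, oldHead);
        } while (!head.compareAndSet(oldHead, newHead));
    }

    /** Pops the top value, or returns null if the stack is empty. */
    public T pop() {
        Node<T> oldHead;
        do {
            oldHead = head.get();
            if (oldHead == null) {
                return null;
            }
            // Reading oldHead.next is safe: the GC cannot reclaim or reuse
            // oldHead while this thread still references it.
        } while (!head.compareAndSet(oldHead, oldHead.next));
        return oldHead.value;
    }
}
```

With explicit free, the same algorithm needs an extra reclamation scheme (hazard pointers, epochs, or tagged pointers) to avoid reading freed memory or hitting the ABA problem when a freed node’s address gets reused.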
Enjoy the slides after the break.
When compared with LSM trees and fractal trees, B+ trees do not show the highest write performance. Recently, the Acunu research team published a paper, “Stratified B-trees and versioning dictionaries”, about a new data structure, the “stratified B-tree”:
A classic versioned data structure in storage and computer science is the copy-on-write (CoW) B-tree — it underlies many of today’s file systems and databases, including WAFL, ZFS, Btrfs and more. Unfortunately, it doesn’t inherit the B-tree’s optimality properties; it has poor space utilization, cannot offer fast updates, and relies on random IO to scale. Yet, nothing better has been developed since. We describe the `stratified B-tree’, which beats all known semi-external memory versioned B-trees, including the CoW B-tree. In particular, it is the first versioned dictionary to achieve optimal tradeoffs between space, query and update performance.
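For readers unfamiliar with the copy-on-write idea the abstract starts from, here is a minimal sketch. It uses a plain binary search tree rather than a B-tree, and it is not the stratified B-tree from the paper; `CowTree`, `put` and `get` are hypothetical names chosen for illustration. Each update copies only the nodes on the path from the root to the changed key and returns a new root, so every older root remains a readable version.

```java
/** Path-copying (copy-on-write) binary search tree; a simplified stand-in for a CoW B-tree. */
final class CowTree {
    /** Immutable node: once created it is never modified. */
    static final class Node {
        final int key;
        final String value;
        final Node left;
        final Node right;

        Node(int key, String value, Node left, Node right) {
            this.key = key;
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    /** Returns the root of a new version; the tree under 'root' is left untouched. */
    static Node put(Node root, int key, String value) {
        if (root == null) {
            return new Node(key, value, null, null);
        }
        if (key < root.key) {
            // Copy this node, share the untouched right subtree.
            return new Node(root.key, root.value, put(root.left, key, value), root.right);
        }
        if (key > root.key) {
            // Copy this node, share the untouched left subtree.
            return new Node(root.key, root.value, root.left, put(root.right, key, value));
        }
        // Same key: new node with the new value, both subtrees shared.
        return new Node(key, value, root.left, root.right);
    }

    /** Looks up a key in the version identified by 'root'; null if absent. */
    static String get(Node root, int key) {
        Node n = root;
        while (n != null) {
            if (key < n.key) {
                n = n.left;
            } else if (key > n.key) {
                n = n.right;
            } else {
                return n.value;
            }
        }
        return null;
    }
}
```

Keeping a version is as cheap as keeping its root, but every update allocates a fresh copy of each node on the root-to-leaf path. In an on-disk B-tree those nodes are whole blocks, which is one way to read the abstract’s complaints about space utilization and update cost.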
With its pluggable storage backends (InnoDB, Bitcask, couch_btree, etc.), Riak might at some point provide a “stratified B-tree” implementation too.
Update: Here’s the Hacker News discussion about the “Stratified B-trees and versioning dictionaries” paper.
While CouchOne is focused on getting CouchDB onto mobile devices (CouchDB is available on Android and is probably coming to iOS), Cloudant, the other CouchDB-oriented company, is focused on CouchDB’s horizontal scalability, both open-sourcing and hosting BigCouch.
Recently Cloudant hosted a webinar on scaling out CouchDB with BigCouch. You can watch the video and slides embedded below:
In a future post I’ll cover in more detail how BigCouch scales CouchDB.