A couple of most notable NoSQL databases targeting large scalable systems are written in Java: Cassandra, HBase, BigCouch. Then there’s also Hadoop. Plus a series of caching and data grid solutions like Terracotta, Gigaspaces. They are all facing the same challenge: tuning the JVM garbage collector for predictable latency and throughput.
Jonathan Ellis’s slides presented at Fosdem 2012 are covering some of the problems with GC and the way Cassandra tackles them. While this is one of those presentations where the slides are not enough to understand the full picture, going through them will still give you a couple of good hints.
For those saying that Java and the JVM are not the platform for writing large concurrent systems, here’s the quote Ellis is finishing his slides with:
Cliff Click: Many concurrent algorithms are very easy to write with a GC and totally hard (to down right impossible) using explicit free.
Enjoy the slides after the break.
Original title and link: Dealing With JVM Limitations in Apache Cassandra ( ©myNoSQL)