NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter



Terrastore: A Consistent, Partitioned and Elastic Document Database

Terrastore is a very young Apache licensed document store solution built on top of the Terracotta (an in-memory clustering technology) that released its 0.2 version a couple of days ago.

I had the opportunity to chat with Sergio Bossa (@sbtourist) and have him answer a couple of questions about Terrastore.

Alex: What is it that made you create Terrastore in the first place?

Sergio: I wanted a scalable document store with consistency features, because I think that’s an uncovered topic/space in current implementations, which are all geared toward BASE.

Being a document database, Terrastore belongs to the same category as CouchDB, MongoDB, and Riak. In some regards (f.e. partitioning), Terrastore is similar to Riak. You should also check [1] to find out more about Terrastore and the CAP theorem.

Terracotta replication is not full, nor geared toward all nodes, but only those actually requiring the replicated data. This is more and more optimized in Terrastore, where, thanks to consistent hashing and partitioning, data is not duplicated at all. Terrastore also guarantees that data will never be duplicated among nodes, unless new nodes are joining or older nodes are leaving, thus requiring data redistribution. A Terrastore client doesn’t need to know where the data is: it can contact whatever Terrastore node and requests will be routed to the proper node holding the value (note: this is similar to the way Dynamo, Project Voldemort, Cassandra and other distributed stores are working)

At this point, more people have joined the chat and so more interesting questions and answers were coming up.

Alex: Considering Terrastore is built on top of Terracotta, is it an in-memory storage making it somehow similar to Redis?

Sergio: Correct, it stores everything in memory, but it is persistent as well. It is not as fast as Redis mainly due to some overhead related to its distributed features.

Paulo Gaspar: Terrastore looks very much like a persistent, transactional Memcached service.

Sergio: Persistent, transactional, and partitioned/sharded. An interesting difference is that afaik Memcached partitioning is done client side, while Terrastore has builtin support for data partitioning, distribution and access routing.

Terrastore is already HTTP and JSON friendly [2] and the future might bring support for the memcached protocol too.

Please see the following resources to learn more about Terrastore: