


Northscale: All content tagged as Northscale in NoSQL databases and polyglot persistence

The Couchbase Genealogy

Looks like Matthew Aslett (the451group) had his own version of the Couchbase genealogy:

Couchbase genealogy

Credit: Matt Aslett.

Original title and link: The Couchbase Genealogy (NoSQL database©myNoSQL)

History of Couch Projects

Just in case you thought someone made up the whole thing about the status of CouchDB being confusing:

History of Couch Projects

Found in Koji Kawamura’s Introduction of CouchDB JP slides.

On the other hand I’m still trying to figure out if things in CouchDB land were more confusing than the various Hadoop versions out there. If you compare the two genealogy trees you’ll notice a reversed pattern.

Original title and link: History of Couch Projects (NoSQL database©myNoSQL)

Membase Releases Membase 1.0

Big day for Membase (the company, formerly NorthScale), which announced the release of Membase (the product) 1.0. We first heard about it at the end of July, when we took a look at what Membase is. At that time we also learned that one of the companies using Membase is Zynga (n.b. Zynga is also a Membase contributor).

Now, three months later, we have the Membase 1.0 release[1], coming in two flavors:

  • Membase Server Enterprise Edition is a certified distribution of Membase, available for download and purchase. Annual product subscriptions start at $999 per node, granting a software use license and access to the Membase Network, which delivers software upgrades, hot fixes, maintenance releases and product support.
  • Membase Server Community Edition is a downloadable community binary; developers can also access and contribute to the source code.

By checking the “supporting quotes” section of the announcement, I’m also noticing a couple of other Membase users: ShareThis, NaviNet, Loggly. So hopefully soon we will also have some case studies.

Correction: This general availability release of Membase carries version number 1.6, but it is still the first production-ready release of Membase.

  1. The official PR announcement can be found ☞ here.

Original title and link: Membase Releases Membase 1.0 (NoSQL databases © myNoSQL)

Membase Success Story: Zynga

There’s no question that Zynga’s latest data numbers are impressive, with the company moving mountains of data per day, or roughly one petabyte per day. As Zynga’s CTO Cadir Lee explained at this morning’s Oracle OpenWorld keynote, the company is adding as many as 1,000 servers each week to satisfy its growing user base (10% of the internet has now played a Zynga game) and increasing connectivity (there are 3 billion connections between its users).

This sounds like a solid case study for Membase — remember NorthScale, Zynga and NHN are listed as Membase contributors.

Original title and link: Membase Success Story: Zynga (NoSQL databases © myNoSQL)


Membase on VMWare

James Phillips of NorthScale talks about scaling out with Membase on VMWare (the actual interview starts at around 1’35”):

Considering Membase is persisting to disk (as opposed to its little brother memcached which is memory only)[1], I’m wondering if virtualized environments provide good enough IO.

  1. Like many other DBMSs, Membase keeps “hot data” in memory, but it also writes it to disk for durability.

Original title and link: Membase on VMWare (NoSQL databases © myNoSQL)

What is Membase?

It is kind of difficult to figure out a complete description of what Membase is as the ratio of signal to noise in today’s announcement is still very low[1]. Anyways, here is what I’ve been able to put together:

  • a cache using memcached protocol
  • Apache licensed open source version of NorthScale Membase Server[2]
  • there is a project homepage, and (some) code can be found on GitHub
  • can persist data
  • supports replication (note: source code repository contains a reference to master-slave setup)
  • elastic, allowing addition and removal of new nodes and automatic rebalancing
  • used by Zynga and NHN[3], which are also listed as project contributors
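Since the first bullet says Membase speaks the memcached protocol, here is a minimal sketch of what a client request looks like on the wire. It assumes the memcached text protocol (the announcement doesn't say whether text or binary is used), and the helper names `build_set_command` and `build_get_command` are hypothetical, made up for illustration.

```python
def build_set_command(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a memcached text-protocol `set` request.

    Wire format: set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
    """
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get_command(key: str) -> bytes:
    """Encode a memcached text-protocol `get` request for a single key."""
    return f"get {key}\r\n".encode("ascii")
```

If the compatibility claim holds, an existing memcached client should be able to send exactly these bytes to a Membase node without modification.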

While details are extremely scarce, this sounds a lot like Gear6 Memcached.

Membase Persistency

According to this paper, the execution of a write operation involves the following steps:

  1. The set arrives into the membase listener-receiver.
  2. Membase immediately replicates the data to replica servers – the number of replica copies is user defined. Upon arrival at replica servers, the data is persisted.
  3. The data is cached in main memory.
  4. The data is queued for persistence and de-duplicated if a write is already pending. Once the pending write is pulled from the queue, the value is retrieved from cache and written to disk (or SSD).
  5. The set acknowledgment is returned to the application.
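To make the steps above more concrete, here is a toy model of the write path. It is only a sketch under my own assumptions: `ToyNode`, its attributes, and the `"STORED"` return value are hypothetical names, replication is modeled as a synchronous loop, and plain dicts stand in for memory and disk. It is not how Membase is actually implemented.

```python
from collections import OrderedDict


class ToyNode:
    """Toy model of the described write path (not Membase's real code)."""

    def __init__(self, replicas=()):
        self.replicas = list(replicas)  # other ToyNode instances (assumption)
        self.cache = {}                 # in-memory data
        self.pending = OrderedDict()    # keys awaiting persistence, de-duplicated
        self.disk = {}                  # stands in for disk or SSD

    def set(self, key, value):
        # Step 1: the set arrives at this node.
        # Step 2: replicate; each replica persists the data on arrival.
        for replica in self.replicas:
            replica.disk[key] = value
        # Step 3: cache the data in main memory.
        self.cache[key] = value
        # Step 4: queue the key once, even if a write is already pending.
        self.pending.setdefault(key, None)
        # Step 5: acknowledge to the application.
        return "STORED"

    def flush_one(self):
        """Pull the oldest pending key and write its current cached value."""
        if not self.pending:
            return None
        key, _ = self.pending.popitem(last=False)
        self.disk[key] = self.cache[key]
        return key
```

The point of the de-duplication is that two quick writes to the same key cost a single disk write: the queue holds the key only once, and the flusher reads whatever value is in the cache at flush time.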

There is also:

In membase 1.6, data migration is based on an LRU algorithm, keeping recently used items in low-latency media while “aging out” colder items; first to SSD (if available) and then to spinning media.
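The quoted LRU migration could look roughly like the sketch below: a chain of tiers (say memory, SSD, spinning disk) where the least recently used item of a full tier ages out into the next one. The class `TieredLRU`, the item-count capacities, and the promotion back to the first tier on a read are all my assumptions; the paper only describes the aging-out direction.

```python
from collections import OrderedDict


class TieredLRU:
    """Toy tiered store: tier 0 is the fastest, the last tier is unbounded."""

    def __init__(self, capacities):
        # capacities[i] = max number of items in tier i (assumption: counts,
        # not bytes); one extra unbounded tier is appended at the end.
        self.capacities = list(capacities)
        self.tiers = [OrderedDict() for _ in range(len(self.capacities) + 1)]

    def put(self, key, value):
        for tier in self.tiers:
            tier.pop(key, None)       # drop any older copy
        self.tiers[0][key] = value    # newest items live in the fastest tier
        self._age_out()

    def get(self, key):
        for tier in self.tiers:
            if key in tier:
                value = tier.pop(key)
                self.tiers[0][key] = value  # assumed: a read re-heats the item
                self._age_out()
                return value
        raise KeyError(key)

    def tier_of(self, key):
        for index, tier in enumerate(self.tiers):
            if key in tier:
                return index
        return None

    def _age_out(self):
        # Cascade the coldest items of each over-full tier into the next tier.
        for index, capacity in enumerate(self.capacities):
            while len(self.tiers[index]) > capacity:
                cold_key, cold_value = self.tiers[index].popitem(last=False)
                self.tiers[index + 1][cold_key] = cold_value
```

With `TieredLRU([2, 2])`, writing five keys in a row leaves the two newest in tier 0, the next two in tier 1, and the oldest in the unbounded last tier.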

A couple of comments:

  1. it looks like a write operation blocks until the data is completely replicated
  2. it is not completely clear whether “hot data” is persisted to disk on a write operation or only once it becomes “cold”

Membase Replication

Membase uses the notion of virtual buckets, or vBuckets (currently it supports up to 4096), each of which contains or owns a subset of the key space (note this is similar to Riak vnodes[4]). Replication can be configured independently for each vBucket, but at any time there is only one master node that coordinates reads and writes for it.
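A rough sketch of how a key could be routed through vBuckets to its master node is below. The hash function (CRC32 modulo the vBucket count), the round-robin placement, and the function names are my assumptions, not something the Membase material spells out. For brevity the sketch uses one replica count for all vBuckets, although the text says replication is configurable per vBucket.

```python
import zlib

NUM_VBUCKETS = 4096  # the upper limit mentioned above


def vbucket_for(key: str, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Map a key to a vBucket id (assumed scheme: CRC32 mod vBucket count)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def build_vbucket_map(nodes, num_vbuckets: int = NUM_VBUCKETS, replicas: int = 1):
    """Give every vBucket one master and `replicas` replica nodes, round-robin."""
    count = len(nodes)
    table = []
    for vb in range(num_vbuckets):
        master = nodes[vb % count]
        copies = [nodes[(vb + r) % count] for r in range(1, replicas + 1)]
        table.append((master, copies))
    return table


def master_for(key: str, table) -> str:
    """Return the single node coordinating reads and writes for this key."""
    return table[vbucket_for(key, len(table))][0]
```

The indirection is what makes the cluster elastic: moving a vBucket to another node only changes one row of the table, while the key-to-vBucket mapping stays fixed.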

Membase Rebalancing

On each node, Membase runs a couple of “processes” that deal with data rebalancing (part of a so-called cluster manager). Once it is determined that a master node (the coordinator for all reads and writes for a particular virtual bucket) has become unavailable, a Rebalance Orchestrator process coordinates the migration of the virtual buckets (note: both the master and the replica data of the virtual bucket will be moved).

When machines are scheduled to join or leave the cluster, these are placed in a pending operation set that is used upon the next rebalancing operation. I’m not sure, but I think it is possible to manually trigger a rebalancing op.
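The pending operation set can be pictured with the following toy model: joins and leaves are only recorded, and nothing moves until a rebalance is run. `ToyCluster`, its method names, and the naive round-robin remapping are hypothetical; a real orchestrator would presumably try to move as few vBuckets as possible and would also relocate replica data.

```python
class ToyCluster:
    """Toy model of deferred membership changes (not Membase's real code)."""

    def __init__(self, nodes, num_vbuckets=4096):
        self.nodes = list(nodes)
        self.num_vbuckets = num_vbuckets
        self.pending_add = []      # nodes scheduled to join
        self.pending_remove = []   # nodes scheduled to leave
        self.masters = self._assign()

    def _assign(self):
        # Naive round-robin placement of vBucket masters (assumption).
        return [self.nodes[vb % len(self.nodes)] for vb in range(self.num_vbuckets)]

    def schedule_join(self, node):
        self.pending_add.append(node)

    def schedule_leave(self, node):
        self.pending_remove.append(node)

    def rebalance(self):
        """Apply all pending changes, remap vBuckets, return how many moved."""
        remaining = [n for n in self.nodes if n not in self.pending_remove]
        self.nodes = remaining + self.pending_add
        self.pending_add, self.pending_remove = [], []
        old_masters = self.masters
        self.masters = self._assign()
        return sum(1 for old, new in zip(old_masters, self.masters) if old != new)
```

Calling `rebalance()` by hand corresponds to the manually triggered rebalancing I suspect is possible; the sketch does not guard against removing the last node.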

  1. Sources:
  2. NorthScale Membase Server web page
  3. While I read that “Membase is currently serving data for some of the busiest web applications on the planet.”, I couldn’t find any other users besides Zynga and NHN.
  4. Riak uses a similar notion: the vnode. While the term is the same, you should not confuse Riak buckets with Membase buckets, though.