northscale: All content tagged as northscale in NoSQL databases and polyglot persistence
Just in case you thought someone made up the whole thing about the status of CouchDB being confusing:
On the other hand I’m still trying to figure out if things in CouchDB land were more confusing than the various Hadoop versions out there. If you compare the two genealogy trees you’ll notice a reversed pattern.
Original title and link: History of Couch Projects ( ©myNoSQL)
Big day for Membase (the company, formerly NorthScale), which is announcing the release of Membase (the product) 1.0. We first heard about it at the end of July, when we took a look at what Membase is. At that time we also learned that one of the companies using Membase is Zynga (nb: Zynga is also a Membase contributor).
Now, 3 months later, we have the Membase 1.0 release, coming in two flavors:
- Membase Server Enterprise Edition is a certified distribution of Membase, available for download and purchase at membase.com. Annual product subscriptions start at $999 per node, granting a software use license and access to the Membase Network, which delivers software upgrades, hot fixes, maintenance releases and product support.
- Membase Server Community Edition is a community binary, downloadable at membase.org, where developers can also access and contribute to the source code.
Checking the “supporting quotes” section of the announcement, I also noticed a few other Membase users: ShareThis, NaviNet, Loggly. So hopefully we will soon have some case studies too.
Correction: This general availability release of Membase has version 1.6, but it is still the first production-ready release of Membase.
James Phillips of NorthScale about scaling out with Membase on VMWare (real interview starts at around 1’35”):
Considering that Membase persists to disk (as opposed to its little brother memcached, which is memory only), I’m wondering if virtualized environments provide good enough IO.
- Like many other DBMSs, Membase keeps “hot data” in memory, but it also writes it to disk for durability. (↩)
It is difficult to put together a complete description of what Membase is, as the signal-to-noise ratio in today’s announcement is still very low. Anyway, here is what I’ve been able to gather:
- a cache using memcached protocol
- Apache licensed open source version of NorthScale Membase Server
- project homepage is membase.org and (some) code can be found on GitHub
- can persist data
- supports replication (note: source code repository contains a reference to master-slave setup)
- elastic, allowing nodes to be added and removed, with automatic rebalancing
- used by Zynga and NHN, which are also listed as project contributors
While details are extremely scarce, this sounds a lot like Gear6 Memcached.
According to this paper, the execution of a write operation involves the following steps:
- The set arrives into the membase listener-receiver.
- Membase immediately replicates the data to replica servers – the number of replica copies is user defined. Upon arrival at replica servers, the data is persisted.
- The data is cached in main memory.
- The data is queued for persistence and de-duplicated if a write is already pending. Once the pending write is pulled from the queue, the value is retrieved from cache and written to disk (or SSD).
- The set acknowledgment is returned to the application.
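The five steps above can be modeled in a few lines. This is only a minimal sketch of the sequence as the paper describes it, not Membase code: the `Replica` and `Node` classes and their methods are hypothetical names made up for illustration, and plain dictionaries stand in for memory and disk.

```python
from collections import OrderedDict


class Replica:
    """Hypothetical replica server: persists whatever it receives."""

    def __init__(self):
        self.disk = {}

    def receive(self, key, value):
        # Step 2 (second half): data is persisted on arrival at the replica.
        self.disk[key] = value


class Node:
    """Hypothetical node that follows the write path described above."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.cache = {}                # main-memory cache
        self.pending = OrderedDict()   # keys waiting to be persisted
        self.disk = {}                 # stand-in for disk or SSD

    def set(self, key, value):
        # Step 1: the set has arrived at the listener (this method call).
        for replica in self.replicas:  # Step 2: replicate before anything else
            replica.receive(key, value)
        self.cache[key] = value        # Step 3: cache in main memory
        self.pending[key] = None       # Step 4: queue; a pending key is not queued twice
        return "STORED"                # Step 5: acknowledge to the application

    def flush(self):
        """Drain the queue, reading the latest value from the cache each time."""
        written = 0
        while self.pending:
            key, _ = self.pending.popitem(last=False)
            self.disk[key] = self.cache[key]
            written += 1
        return written
```

Note that in this model two quick writes to the same key produce a single disk write carrying the latest value, which is what the de-duplication in step 4 suggests.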
There is also:
In membase 1.6, data migration is based on an LRU algorithm, keeping recently used items in low-latency media while “aging out” colder items; first to SSD (if available) and then to spinning media.
A couple of comments:
- it looks like a write operation is blocking until data is completely replicated
- it is not completely clear whether “hot data” is persisted to disk as part of a write operation or only once it becomes “cold”
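To make the LRU-based migration quoted above more concrete, here is a rough sketch of items aging out from memory to SSD and then to spinning disk. The `TieredStore` class, its capacities, and its method names are assumptions made up for this illustration. It models only where an item lives, so it does not answer the durability question raised in the comments.

```python
from collections import OrderedDict


class TieredStore:
    """Hypothetical three-tier store: RAM -> SSD -> spinning disk."""

    def __init__(self, ram_capacity, ssd_capacity):
        self.ram_capacity = ram_capacity
        self.ssd_capacity = ssd_capacity
        self.ram = OrderedDict()   # least recently used item first
        self.ssd = OrderedDict()   # least recently used item first
        self.disk = {}             # coldest items, unbounded here

    def _age_out(self):
        # Push the coldest items one tier down whenever a tier overflows.
        while len(self.ram) > self.ram_capacity:
            key, value = self.ram.popitem(last=False)
            self.ssd[key] = value
        while len(self.ssd) > self.ssd_capacity:
            key, value = self.ssd.popitem(last=False)
            self.disk[key] = value

    def put(self, key, value):
        # A written item is "hot": drop stale copies and place it in RAM.
        self.ssd.pop(key, None)
        self.disk.pop(key, None)
        self.ram[key] = value
        self.ram.move_to_end(key)
        self._age_out()

    def get(self, key):
        # A read also makes the item hot again, wherever it was found.
        for tier in (self.ram, self.ssd, self.disk):
            if key in tier:
                value = tier.pop(key)
                self.put(key, value)
                return value
        raise KeyError(key)

    def tier_of(self, key):
        for name in ("ram", "ssd", "disk"):
            if key in getattr(self, name):
                return name
        return None
```

With a RAM capacity of 2 and an SSD capacity of 1, writing four keys in a row leaves the two newest in RAM, the third newest on SSD, and the oldest on disk.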
Membase uses the notion of virtual buckets, or vBuckets (currently it supports up to 4096), each of which owns a subset of the key space (note this is similar to Riak vnodes). Replication can be configured independently for each vBucket, but at any time there is only 1 master node that coordinates its reads and writes.
On each node Membase runs a couple of “processes” that deal with data rebalancing (part of the so-called cluster manager). Once it is determined that a master node (the coordinator for all reads and writes for a particular virtual bucket) has become unavailable, a Rebalance Orchestrator process will coordinate the migration of the virtual buckets (note: both master and replica data of the virtual bucket will be moved).
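A small sketch may help picture the two-level lookup: key to vBucket, then vBucket to master node. Everything here is an assumption for illustration. The CRC32-modulo hash, the round-robin placement, and the function names are not taken from the Membase source, and the sketch uses the same replica count for every vBucket for simplicity.

```python
import zlib

NUM_VBUCKETS = 4096  # the upper bound mentioned in the text


def vbucket_for(key):
    """Map a key to a vBucket id (assumed hash: CRC32 modulo the bucket count)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_map(nodes, replicas=1):
    """Assign each vBucket a [master, replica, ...] list, round-robin over nodes."""
    if replicas >= len(nodes):
        raise ValueError("need more nodes than replica copies")
    return [
        [nodes[(vb + i) % len(nodes)] for i in range(replicas + 1)]
        for vb in range(NUM_VBUCKETS)
    ]


def master_for(key, vbucket_map):
    """The single node coordinating reads and writes for this key."""
    return vbucket_map[vbucket_for(key)][0]
```

The point of the indirection is that clients only need the (small) vBucket map to route a request: moving a vBucket to another node changes one map entry, not the hash function.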
When machines are scheduled to join or leave the cluster, these are placed in a pending operation set that is used upon the next rebalancing operation. I’m not sure, but I think it is possible to manually trigger a rebalancing op.
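The pending operation set can be pictured roughly as follows. This is a deliberately naive sketch under my own assumptions: the `Cluster` class and its methods are hypothetical, only master ownership is tracked (no replicas), and the modulo reassignment moves far more vBuckets than a real orchestrator would want to.

```python
class Cluster:
    """Hypothetical cluster that defers membership changes until rebalance()."""

    def __init__(self, nodes, num_vbuckets=16):
        self.nodes = list(nodes)
        self.num_vbuckets = num_vbuckets
        self.pending_add = set()      # nodes scheduled to join
        self.pending_remove = set()   # nodes scheduled to leave
        self.masters = {}
        self._assign()

    def _assign(self):
        # Naive placement: spread vBucket masters round-robin over sorted nodes.
        ordered = sorted(self.nodes)
        self.masters = {
            vb: ordered[vb % len(ordered)] for vb in range(self.num_vbuckets)
        }

    def schedule_join(self, node):
        self.pending_add.add(node)

    def schedule_leave(self, node):
        self.pending_remove.add(node)

    def rebalance(self):
        """Apply all pending joins and leaves; return the vBuckets whose master moved."""
        remaining = [n for n in self.nodes if n not in self.pending_remove]
        joining = sorted(self.pending_add - set(remaining) - self.pending_remove)
        new_nodes = remaining + joining
        if not new_nodes:
            raise RuntimeError("rebalance would leave the cluster empty")
        before = dict(self.masters)
        self.nodes = new_nodes
        self.pending_add.clear()
        self.pending_remove.clear()
        self._assign()
        return [vb for vb in before if before[vb] != self.masters[vb]]
```

Scheduling a join or leave changes nothing by itself; ownership only moves when `rebalance()` is called, which mirrors the "used upon the next rebalancing operation" wording above.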
- NorthScale Unleashes Membase Server (NorthScale blog)
- NorthScale, Zynga team up on NoSQL (CNET)
- Open Sourced Membase Joins NoSQL Party (GigaOm)
- NorthScale Releases High-Performance NoSQL Database (marketwire.com)
- NorthScale Membase Server web page (↩)
- While I read that “Membase is currently serving data for some of the busiest web applications on the planet.”, I couldn’t find any other users besides Zynga and NHN. (↩)
- Riak uses a similar notion: the vnode. While the terms are the same, you should not confuse Riak buckets with Membase buckets though. (↩)