Migrating a Membase Cluster

Shawn Chiao documents the migration of an 8-node Membase cluster storing 240 million key-value pairs, 160GB in total (part 1 and part 2):

After being up all night babysitting the rebalance process, I am happy to report that it was a rather uneventful night of maintenance. The rebalance itself took 8-9 hours to complete, and it then took another hour for all the replicas to be saved to disk. Theoretically, I didn’t need to take the site down while the rebalance was happening, but I took the game down just to be safe and not compromise the game experience.
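To put the quoted figures in perspective, here is a back-of-the-envelope calculation. It assumes decimal gigabytes, takes 8.5 hours as the midpoint of the reported 8-9 hours, and pretends the whole 160GB crossed the network during the rebalance, which the source does not state. Under those assumptions the average pair is roughly 667 bytes and the aggregate transfer rate is on the order of 5 MB/s.

```python
# Figures from the post; the unit conventions are assumptions.
TOTAL_BYTES = 160 * 10**9      # 160GB, read as decimal gigabytes
ITEM_COUNT = 240 * 10**6       # 240 million key-value pairs
REBALANCE_HOURS = 8.5          # assumed midpoint of the reported 8-9 hours

# Average size of one key-value pair.
avg_item_bytes = TOTAL_BYTES / ITEM_COUNT

# Aggregate rate if the entire dataset moved during the rebalance
# (an upper-bound assumption; a rebalance may move less).
aggregate_rate_mb_s = TOTAL_BYTES / (REBALANCE_HOURS * 3600) / 10**6

print(f"average item size: {avg_item_bytes:.0f} bytes")
print(f"aggregate rebalance rate: {aggregate_rate_mb_s:.2f} MB/s")
```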

The question is: if the application was stopped anyway, wasn’t there another migration approach that would have shortened the migration window?

What I’m thinking is that, if there are no new writes to the system, one could:

  1. add the new nodes as “slaves” for existing nodes (also change the replication factor)
  2. once these have caught up, change the master to one of the new nodes
  3. kill old nodes

This would basically avoid reshuffling the data across the cluster.
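The three steps above can be sketched as a toy in-memory model. This is not Membase's replication API: `Node`, `build_cluster`, `migrate_by_promotion`, the CRC32 partitioning and the partition count are all hypothetical, chosen only to show that each new node inherits exactly the partitions of the old node it shadows, so the partition layout never changes.

```python
import zlib
from dataclasses import dataclass, field

NUM_PARTITIONS = 1024  # assumed partition count, for illustration only


@dataclass
class Node:
    name: str
    # partition id -> {key: value} for every partition this node holds
    data: dict = field(default_factory=dict)


def partition_of(key: str) -> int:
    """Map a key to a partition with a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def build_cluster(names, items):
    """Spread partitions round-robin over the nodes and load the items."""
    nodes = [Node(name) for name in names]
    masters = {}
    for p in range(NUM_PARTITIONS):
        owner = nodes[p % len(nodes)]
        owner.data[p] = {}
        masters[p] = owner
    for key, value in items.items():
        p = partition_of(key)
        masters[p].data[p][key] = value
    return nodes, masters


def migrate_by_promotion(old_nodes, new_names, masters):
    """Replace every old node with a new one, assuming writes are stopped."""
    if len(old_nodes) != len(new_names):
        raise ValueError("this sketch pairs each old node with one new node")
    new_nodes = []
    for old, new_name in zip(old_nodes, new_names):
        new = Node(new_name)
        # Step 1: the new node joins as a replica of one old node and catches up.
        new.data = {p: dict(kv) for p, kv in old.data.items()}
        # Step 2: promote the caught-up replica to master for those partitions.
        for p in new.data:
            masters[p] = new
        # Step 3: retire the old node.
        old.data.clear()
        new_nodes.append(new)
    return new_nodes


def read(masters, key):
    """Look a key up through the current partition-to-master map."""
    p = partition_of(key)
    return masters[p].data[p][key]
```

In this model each old node streams its data to a single new node, and no partition is ever reassigned between slots. Whether a real Membase cluster allows this kind of pairing is exactly the open question.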

Another thing that causes this warm-up to take a long time is the fact that Membase uses the sqlite3 engine for persisting data to disk. Sqlite3 uses a B-tree to store its data, and when items are deleted, the underlying B-tree pages are merely marked as “free”. Later on, when new items are stored, their content can be spread over different pages, causing fragmentation. So if the Membase cluster is seeing a lot of deletes or expirations, which ours does, the warm-up time will slowly increase over time. This fragmentation issue will be addressed in the next major release, Couchbase 2.0, since it will be replacing sqlite3 with CouchDB. But in the meantime, this is a real problem that we will need to deal with in production.
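The free-page behavior described in the quote can be observed with plain SQLite. The sketch below uses Python's standard `sqlite3` module and the `page_count` and `freelist_count` pragmas. The `kv` table, the row count and the 600-byte payload are made up for illustration and do not reproduce Membase's actual schema.

```python
import os
import sqlite3
import tempfile


def page_stats(conn):
    """Return (total pages in the file, pages sitting on the freelist)."""
    total = conn.execute("PRAGMA page_count").fetchone()[0]
    free = conn.execute("PRAGMA freelist_count").fetchone()[0]
    return total, free


def measure_fragmentation(rows=5000, payload_size=600):
    """Insert rows, delete half of them, then VACUUM, recording page stats."""
    stats = {}
    with tempfile.TemporaryDirectory() as tmp:
        # isolation_level=None gives autocommit, so VACUUM can run later.
        conn = sqlite3.connect(os.path.join(tmp, "kv.db"), isolation_level=None)
        try:
            # Make sure freed pages stay in the file instead of being reclaimed.
            conn.execute("PRAGMA auto_vacuum = NONE")
            conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v BLOB)")

            payload = b"x" * payload_size
            conn.execute("BEGIN")
            conn.executemany(
                "INSERT INTO kv (k, v) VALUES (?, ?)",
                ((f"key:{i}", payload) for i in range(rows)),
            )
            conn.execute("COMMIT")
            stats["after_insert"] = page_stats(conn)

            # Deleting a contiguous range empties whole pages; they move to
            # the freelist, but the file keeps its size.
            conn.execute("DELETE FROM kv WHERE rowid <= ?", (rows // 2,))
            stats["after_delete"] = page_stats(conn)

            # VACUUM rebuilds the database file and drops the free pages.
            conn.execute("VACUUM")
            stats["after_vacuum"] = page_stats(conn)
        finally:
            conn.close()
    return stats


if __name__ == "__main__":
    for stage, (total, free) in measure_fragmentation().items():
        print(f"{stage}: {total} pages, {free} free")
```

After the delete, the free-page count is non-zero while the file is no smaller. New items are later written into those scattered free pages, which is the fragmentation the quote describes.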

Questions:

  1. Is Membase using one sqlite3 engine per node or per bucket?
  2. Isn’t sqlite3 single-threaded, thus making all writes and reads sequential?

Original title and link: Migrating a Membase Cluster (NoSQL database©myNoSQL)