membase: All content tagged as membase in NoSQL databases and polyglot persistence
Wednesday, 15 February 2012
Polyglot persistence at Pinterest: Redis, Membase, MySQL

I’ve created the diagram above based on this very brief answer on Quora:
We use python + heavily-modified Django at the application layer. Tornado and (very selectively) node.js as web-servers. Memcached and membase / redis for object- and logical-caching, respectively. RabbitMQ as a message queue. Nginx, HAproxy and Varnish for static-delivery and load-balancing. Persistent data storage using MySQL. MrJob on EMR for map-reduce.
Data from October 2011 showed Pinterest having over 3 million users generating 400+ million pageviews. There are plenty of questions still to be answered though:
- what is node.js used for? what is RabbitMQ used for?
  Note: the whole section in the diagram about node.js and RabbitMQ is speculative.
- is Amazon Elastic MapReduce used for clickstream analysis only (log based analysis) or more than that?
- how is data loaded in the Amazon cloud?
  Note: if Amazon Elastic MapReduce is used only for analyzing logs, these are probably uploaded regularly on Amazon S3.
- why the need for both Redis and Membase?
Original title and link: Polyglot persistence at Pinterest: Redis, Membase, MySQL (©myNoSQL)
Wednesday, 8 February 2012
The Couchbase Genealogy
Looks like Matthew Aslett (the451group) had his own version of the Couchbase genealogy:

Credit: Matt Aslett.
Monday, 30 January 2012
History of Couch Projects
Just in case you thought someone made up the whole thing about the status of CouchDB being confusing:

Found in Koji Kawamura’s Introduction of CouchDB JP slides.
On the other hand I’m still trying to figure out if things in CouchDB land were more confusing than the various Hadoop versions out there. If you compare the two genealogy trees you’ll notice a reversed pattern.
Monday, 23 January 2012
Couchbase Server 1.8 Released, Rebranding and Some Improvements in Cluster Rebalancing
Couchbase Server 1.8 replaces Membase Server 1.7 as our “flagship” database offering. In addition to the obvious rebranding, we’ve made substantial improvements in the cluster rebalancing process and fixed a number of nagging issues in 1.7.
In case you feel lost about which Couchbase products are which, read my five-bullet-point explanation.
via: http://blog.couchbase.com/membase-server-now-couchbase-server
Couchbase: Clarifying Confusions in 5 Bullet Points
Here are the 5 bullet points that would have helped Couchbase clear up all the confusion about Couchbase, Membase, and CouchDB:
- We are working on Couchbase server 2.0. This is our next major release and the only product we will be focusing on next. It represents the continuation of our current Membase server product.
- Until Couchbase server 2.0 is out, we might release one or two updates to our Membase server that address the most important issues.
- We will provide a migration path for users of Membase server to Couchbase server 2.0.
- We will no longer support our distribution of CouchDB known as Couchbase Single Server. Damien Katz, creator of CouchDB, has decided to step away from the Apache CouchDB project and focus on Couchbase development.
- Due to the major changes in Couchbase server 2.0, we will not offer a migration path for the users of Couchbase Single Server to Couchbase server 2.0.
Tuesday, 13 December 2011
Unintentional Market Confusion... Membase, CouchDB, or Couchbase
Not everything went as we hoped or expected, however. Unfortunately, we confused the heck out of many of our potential users. In addition to Membase Server and our new mobile products we also offered Couchbase Single Server which was a packaged “distribution” of Apache CouchDB. On top of that we began releasing developer previews of Couchbase Server 2.0, which incorporated CouchDB technology into Membase Server – but this product was not compatible with Couchbase Single Server (or CouchDB). If you are confused just reading this you get the point – and so do we.
Finally.
Monday, 12 December 2011
Migrating a Membase Cluster
Shawn Chiao documents the migration of an 8-node Membase cluster storing 240 million key-value pairs for a total of 160GB, in part 1 and part 2:
After up all night babysitting the rebalance process, I am happy to report that it was a rather uneventful night of maintenance. The rebalance itself took 8-9 hours to complete, and then took another hour for all the replicas to get saved to the disk also. Theoretically, I didn’t need to take the site down while the rebalance was happening, but I took the game down just to be safe and not compromise the game experience.
The question is: if the application was stopped anyway, wasn’t there another migration approach that would have shortened the migration window?
What I’m thinking is that if there are no new writes to the system, one could:
- add the new nodes as “slaves” for existing nodes (also change the replication factor)
- once these have caught up, change the master to one of the new nodes
- kill old nodes
This would basically avoid reshuffling the data across the cluster.
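To make the idea concrete, here is a toy simulation of the three steps above. It assumes writes are stopped and that each old node is paired with exactly one new node; the `Node` class and `migrate` function are hypothetical and model nothing of Membase’s real replication or admin API. It only illustrates that each key is copied once to its replacement node instead of being reshuffled across the cluster.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a cluster node; not a Membase API."""
    name: str
    data: dict = field(default_factory=dict)


def migrate(old_nodes, new_names):
    """Replace each old node with a new one via replica promotion.

    Assumes the application is stopped, so no writes arrive meanwhile.
    Returns the new nodes and the number of key copies performed.
    """
    if len(old_nodes) != len(new_names):
        raise ValueError("this sketch pairs each old node with one new node")
    new_nodes, copies = [], 0
    for old, name in zip(old_nodes, new_names):
        replica = Node(name)
        # Step 1: the new node catches up as a replica of the old one.
        replica.data.update(old.data)
        copies += len(old.data)
        # Step 2: promote the replica to master (here: simply keep it).
        new_nodes.append(replica)
        # Step 3: retire the old node.
        old.data.clear()
    return new_nodes, copies
```

In this model the number of copies equals the number of stored keys, which is the lower bound for moving data to new hardware. Whether Membase 1.7 allowed this kind of manual promotion is exactly the open question.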
Another thing that causes this warm-up to take a long time is the fact that membase uses sqlite3 engine for persisting data to the disk. Sqlite3 uses btree to store its data, and when items are deleted, the underlying btree pages are merely marked as “free”. Later on when new items are stored, their content can be spread over different pages, causing fragmentation. So if the membase cluster is seeing a lot of delete or expiration, which ours does, the warm-up time will slowly increase over time. This fragmentation issue will be addressed in the next major release Couchbase 2.0, since it will be replacing sqlite3 with CouchDB. But in the meantime, this is a real problem that we will need to deal with in production.
Questions:
- is Membase using 1 sqlite3 engine per node or per bucket?
- isn’t sqlite3 single threaded thus making all writes and reads sequential?
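The “pages are merely marked as free” behavior from the quote is easy to observe with a standalone SQLite database. The sketch below uses Python’s standard `sqlite3` module; the table layout, row count, and payload size are my own assumptions and say nothing about how Membase actually configures or lays out its SQLite files. It shows only the general SQLite behavior: after a mass delete the file keeps its pages on a free list, and only a `VACUUM` gives them back.

```python
import os
import sqlite3
import tempfile


def free_pages_after_delete(rows=2000, payload=500):
    """Insert rows, delete them all, and report SQLite's page accounting.

    Returns (pages_full, pages_after_delete, free_pages, pages_after_vacuum).
    """
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "kv.db"))
        # Make the default explicit; must be set before the first table exists.
        conn.execute("PRAGMA auto_vacuum = NONE")
        conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v BLOB)")
        conn.executemany(
            "INSERT INTO kv VALUES (?, ?)",
            ((f"key-{i}", b"x" * payload) for i in range(rows)),
        )
        conn.commit()
        pages_full = conn.execute("PRAGMA page_count").fetchone()[0]

        # Deleting frees pages inside the file but does not shrink it.
        conn.execute("DELETE FROM kv")
        conn.commit()
        pages_after = conn.execute("PRAGMA page_count").fetchone()[0]
        free_after = conn.execute("PRAGMA freelist_count").fetchone()[0]

        # VACUUM rebuilds the file and drops the free pages.
        conn.execute("VACUUM")
        pages_vacuumed = conn.execute("PRAGMA page_count").fetchone()[0]
        conn.close()
    return pages_full, pages_after, free_after, pages_vacuumed
```

This does not measure warm-up time, but it does show why a delete-heavy workload leaves a file that is larger and more scattered than the live data requires.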
Tuesday, 6 December 2011
Membase Cluster on EC2 or Amazon ElastiCache?
While there are some advantages to using a Membase cluster on EC2 instead of an ephemeral Memcached-based service like Amazon ElastiCache, one question remains: self-managed vs managed? Answering it is essential to understanding the final OPEX.
Advantages of using a Membase cluster instead of ElastiCache:
- persistent vs ephemeral data
- backup & restore
- SASL authentication
- using reserved instances
- cluster elasticity with automatic rebalancing and no need for cache warming
When calculating the OPEX for each of these solutions, one would need to account for:
- licensing fees [1]
- monitoring, maintenance, repairs
- salary and wages
In terms of service fees here is a quick comparison:

[1] Membase has both Community and Enterprise editions.
via: http://forecastcloudy.net/2011/12/06/membase-cluster-instead-of-elasticache-in-5-minutes/
Friday, 4 November 2011
How to Cache PHP Sessions in Membase
Why Membase is the next step after Memcached:
Memcache is great, but once you start running low on memory (as you cache more info) lesser-used items in the cache will be destroyed to free up more space for new items. This can result in users getting logged out. Also, if one of the servers in the pool fails or gets rebooted, all the data it was holding is lost, and then the cache must get “warmed up” again.
Membase is memcache with data persistence. The improvement of having data persistence is that if you need to bring down a server, you don’t have to worry about all that dainty, floaty data in memory that is gonna get burned. Since membase has replication built-in, you can feel free to restart a troublesome server without fear of your database getting pounded as the caches need to refill, or that a set of unlucky users will get logged out. I’ll let you read about all the many other advantages of membase here. It’s much more than I’ve mentioned here.
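The “users getting logged out” problem from the first quote is plain least-recently-used eviction. Here is a minimal sketch of it; `TinyLRUCache` is a hypothetical toy that counts items rather than bytes, whereas real memcached tracks memory and keeps a separate LRU per slab class.

```python
from collections import OrderedDict


class TinyLRUCache:
    """Toy least-recently-used cache with a fixed item capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


cache = TinyLRUCache(capacity=2)
cache.set("sess:alice", {"user": "alice"})
cache.set("sess:bob", {"user": "bob"})
cache.get("sess:alice")                     # alice is active, bob is idle
cache.set("sess:carol", {"user": "carol"})  # cache is full: bob's session goes
```

After the last call, `cache.get("sess:bob")` returns `None`, which to a session handler looks exactly like a logged-out user. A persistent store behind the cache avoids that, because an evicted item can still be read back from disk.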
via: http://www.startupnextdoor.com/2011/11/how-to-cache-php-sessions-in-membase/
Thursday, 22 September 2011
From Memcached to Membase Memcached Buckets
SaltwaterC:
But this post isn’t about switching from a volatile cache to a persistent solution. It is about removing the dumb part from the memcached setup.
So I thought I’d read about the advantages of virtual nodes/buckets and elastic clusters, cold vs warm caches, cluster recoverability, the widely used memcached protocol and the possibility to use extensions in future versions, etc. Instead I’ve learned about Moxi-based cluster configuration discoverability and how stupid the memcached PHP libraries are.
But I really enjoyed Matt Ingenthron’s quote:
at Membase Inc they view Memcached as a rabbit. It is fast, but it is pretty dumb and procreates quickly. Before you know it, it will be running wild all over your system.
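For readers wondering what the virtual buckets mentioned above buy you, here is a rough sketch of the idea. The CRC32 hash, the count of 1024 buckets, and the donor-selection rule are illustrative assumptions of mine, not values read from Membase’s code or configuration. The point is the two-level mapping: a key always hashes to the same virtual bucket, and only the bucket-to-server table changes when the cluster grows.

```python
import zlib
from collections import Counter

NUM_VBUCKETS = 1024  # illustrative value, not read from any Membase config


def vbucket_for(key, num_vbuckets=NUM_VBUCKETS):
    """Map a key to a virtual bucket with a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def build_map(servers, num_vbuckets=NUM_VBUCKETS):
    """Assign virtual buckets to servers round-robin."""
    return [servers[i % len(servers)] for i in range(num_vbuckets)]


def add_server(vbucket_map, new_server):
    """Give the new server an even share of buckets; leave the rest in place."""
    new_map = list(vbucket_map)
    load = Counter(new_map)  # buckets currently owned by each existing server
    target = len(new_map) // (len(load) + 1)
    for _ in range(target):
        donor = max(load, key=load.get)  # take from the most loaded server
        new_map[new_map.index(donor)] = new_server
        load[donor] -= 1
    return new_map


old_map = build_map(["node-a", "node-b"])
new_map = add_server(old_map, "node-c")
moved = sum(1 for before, after in zip(old_map, new_map) if before != after)
```

Here `moved` is about a third of the buckets, which is the share the new node has to receive anyway. With a naive `hash(key) % number_of_servers` scheme, most keys would change servers whenever a node is added.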
Monday, 5 September 2011
Use Membase and You'll Never Want to Mess With Memcached Servers Again
All I can say is WOW. I’ll never use stand alone memcached server(s) again.
- crazy easy to install and make a cluster.
- 0 changes to your app code. Operates seamlessly with memcached protocol. If you want to take advantage of advanced features, you need to modify app code.
- you can dynamically add and remove nodes without losing all your keys/data.
- 2 bucket types:
  - Membase: supports data persistence (writes them ionicely to disk) and replication (one node dies, you don’t lose your key/value pairs). It sends data to disk as fast as it can (while giving priority to getting data back from disk). This is done asynchronously (with an option for synchronous), so clients shouldn’t be able to perceive a difference between Membase and memcached data buckets.
  - Memcached: no persistence or replication. all in memory. I would highly recommend going membase bucket unless you have some I/O concerns (like you get charged for I/O in the cloud).
- Awesome admin web UI.
- lots of documentation
- helpful community
The only concern I can think of about replacing memcached with Membase is the maturity of the cluster solution. But on this front, things will only get better, probably before memcached gets an auto-scaling solution.
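The “0 changes to your app code” claim rests on Membase speaking the memcached protocol. As a reminder of how little there is to that protocol, here is a sketch that builds `set` and `get` requests in the memcached text format. The helper names are my own, and the sketch only produces the bytes; opening a socket to a server and parsing the reply are left out.

```python
def encode_set(key, value, flags=0, exptime=0):
    """Build a memcached text-protocol `set` request for a bytes value."""
    if not isinstance(value, bytes):
        raise TypeError("value must be bytes")
    # Header: set <key> <flags> <exptime> <bytes>, then the data block.
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def encode_get(key):
    """Build a memcached text-protocol `get` request for a single key."""
    return f"get {key}\r\n".encode("ascii")
```

A client that emits these bytes does not need to know whether a memcached or a Membase node is listening on the other end, which is why existing memcached client libraries keep working after the switch.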
via: http://rynop.com/use-membase-and-youll-never-want-to-mess-with
Wednesday, 6 July 2011
Zynga, Data Centers, Polyglot Persistence, and Big Data
Cadir Lee (CTO Zynga) quoted in a VentureBeat post:
It’s not the amount of hardware that matters. It’s the architecture of the application. You have to work at making your app architecture so that it takes advantage of Amazon. You have to have complete fluidity with the storage tier, the web tier. We are running our own data centers. We are looking more at doing our own data centers with more of a private cloud.
Couple of thoughts:
- Zynga is going in the opposite direction from Netflix. While Netflix is focusing (by using Amazon for most of their infrastructure), Zynga is diversifying (building their own data centers).
- Zynga’s applications are great examples of where fully distributed NoSQL databases fit. Availability is key.
- My answer to the question: “how many Zyngas are out there” would be: “enough to ensure some good business for the most reliable and scalable distributed databases”
- Zynga has contributed to and is an investor in Membase, the company that merged with CouchOne to form Couchbase. But Zynga was using a custom version of Membase.
- Zynga also operates a large MySQL cluster.
- Zynga processes over 15 terabytes of game data every day (according to their SEC filing). That’s Hadoop’s sweet spot.
PS: I’d love to talk to someone from Zynga about their data storage approach. If you have any connections I’d really appreciate an introduction.