Memcached: All content tagged as Memcached in NoSQL databases and polyglot persistence
Monday, 28 November 2011
Rails Caching Benchmarked: MongoDB, Redis, Memcached
A couple of Rails caching solutions (file, memcached, MongoDB, and Redis) were benchmarked first by Steph Skardal and then by Thomas W. Devol. Devol concludes:
Though it looks like mongo-store demonstrates the best overall performance, it should be noted that a mongo server is unlikely to be used solely for caching (the same applies to redis), it is likely that non-caching related queries will be running concurrently on a mongo/redis server which could affect the suitability of these benchmarks.
I’m not a Rails user, so please take these with a grain of salt:
- Without knowing the size of the cached objects it is hard to be sure, but at 20000 iterations most probably neither MongoDB nor Redis had to persist to disk. This means that all three of memcached, MongoDB, and Redis stored data in memory only[1].
- If no custom object serialization is used by any of the memcached, MongoDB, or Redis caches, then the performance difference is mostly caused by the performance of the driver.
- It should not be a surprise to anyone that the size of the cached objects can and will influence the results of such benchmarks.
- There doesn’t seem to be any concurrent access to the caches. Concurrent access and concurrent updates of caches are real-life scenarios, and not including them in a benchmark greatly reduces the value of the results.
- None of these benchmarks seems to contain code that measures the performance of cache eviction.
[1] Except the case where any of these forces a disk write.
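To make the concurrency point concrete, here is a minimal Python sketch of a benchmark that hits a cache from several threads at once. It is not taken from either benchmark: `DictCache`, `run_worker`, and `benchmark` are hypothetical names, and the lock-protected dictionary only stands in for a real memcached, MongoDB, or Redis client.

```python
import threading
import time


class DictCache:
    """Hypothetical stand-in for a real cache client: a lock-protected dict."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value


def run_worker(cache, iterations, payload):
    # Every worker reads and writes the same 100 keys, so threads
    # contend for the same entries instead of working in isolation.
    for i in range(iterations):
        key = "key:%d" % (i % 100)
        cache.set(key, payload)
        cache.get(key)


def benchmark(cache, workers=4, iterations=5000, payload_size=1024):
    payload = "x" * payload_size  # object size is an explicit parameter
    threads = [
        threading.Thread(target=run_worker, args=(cache, iterations, payload))
        for _ in range(workers)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    total_ops = workers * iterations * 2  # one set plus one get per iteration
    return total_ops, elapsed


if __name__ == "__main__":
    ops, seconds = benchmark(DictCache())
    print("%d ops in %.3fs" % (ops, seconds))
```

Varying `workers` and `payload_size` is the point of the sketch: both are dimensions the published numbers leave out.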
Original title and link: Rails Caching Benchmarked: MongoDB, Redis, Memcached (©myNoSQL)
Friday, 25 November 2011
Griffon and NoSQL Databases
Andres Almiray:
The following list enumerates all NoSQL options currently supported by Griffon via plugins:
- BerkeleyDB
- CouchDB
- Memcached
- Riak
- Redis
- Terrastore
- Voldemort
- Neo4j
- Db4o
- Neodatis
The first 7 are Key/Value stores. Neo4j is a Graph based database. The last two are object stores. All of them support multiple datasources, data bootstrap and a Java friendly API similar to the one shown earlier.
Griffon is a Groovy-based framework for developing desktop applications. While the coolness factor of Java-based desktop apps is close to zero, having some multi-platform management utilities for these NoSQL databases might be interesting.
Original title and link: Griffon and NoSQL Databases (©myNoSQL)
via: http://www.jroller.com/aalmiray/entry/griffon_to_sql_or_nosql
Tuesday, 15 November 2011
Twitter's Real-Time URL Fetcher Using Cassandra and Memcached
Twitter’s real-time URL fetcher, code named SpiderDuck, is an excellent example of how NoSQL databases fit in the architecture of today’s systems:
Metadata Store: This is a Cassandra-based distributed hash table that stores page metadata and resolution information keyed by URL, as well as fetch status for every URL recently encountered by the system. This store serves clients across Twitter that need real-time access to URL metadata.
SpiderDuck is also using memcached:
Memcached: This is a distributed cache used by the fetchers to temporarily store robots.txt files.
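The post does not show any code, but the robots.txt use case is easy to picture. Below is a rough Python sketch, assuming a cache with per-key expiry; `TTLCache`, `get_robots`, the `fetch` callback, and the one-hour default are all hypothetical and only illustrate the pattern, not SpiderDuck's implementation.

```python
import time


class TTLCache:
    """Hypothetical in-process stand-in for a memcached client with expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl):
        self._items[key] = (self._clock() + ttl, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries behave like a miss
            return None
        return value


def get_robots(cache, host, fetch, ttl=3600):
    """Return robots.txt for host, calling fetch(host) only on a cache miss."""
    key = "robots:" + host
    body = cache.get(key)
    if body is None:
        body = fetch(host)
        cache.set(key, body, ttl)
    return body
```

Because the entry expires on its own, a fetcher never has to decide when a robots.txt file is stale; it simply refetches after a miss.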

Original title and link: Twitter’s Real-Time URL Fetcher Using Cassandra and Memcached (©myNoSQL)
via: http://engineering.twitter.com/2011/11/spiderduck-twitters-real-time-url.html
Friday, 4 November 2011
How to Cache PHP Sessions in Membase
Why Membase is the next step after Memcached:
Memcache is great, but once you start running low on memory (as you cache more info) lesser-used items in the cache will be destroyed to free up more space for new items. This can result in users getting logged out. Also, if one of the servers in the pool fails or gets rebooted, all the data it was holding is lost, and then the cache must get “warmed up” again.
Membase is memcache with data persistence. The improvement of having data persistence is that if you need to bring down a server, you don’t have to worry about all that dainty, floaty data in memory that is gonna get burned. Since membase has replication built-in, you can feel free to restart a troublesome server without fear of your database getting pounded as the caches need to refill, or that a set of unlucky users will get logged out. I’ll let you read about all the many other advantages of membase here. It’s much more than I’ve mentioned here.
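The "users getting logged out" effect can be reproduced with a toy least-recently-used cache. This is only an illustrative Python sketch: `LRUCache` is a hypothetical class bounded by item count, whereas real memcached is bounded by memory and evicts per slab class.

```python
from collections import OrderedDict


class LRUCache:
    """Toy least-recently-used cache bounded by item count (an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


sessions = LRUCache(capacity=2)
sessions.set("session:alice", {"user": "alice"})
sessions.set("session:bob", {"user": "bob"})
sessions.get("session:alice")               # alice is active, bob is idle
sessions.set("session:carol", {"user": "carol"})
print(sessions.get("session:bob"))          # None: bob's session was evicted
```

Nothing failed here; the cache did exactly what it is designed to do, which is why a pure cache is a risky place for data that has no other home.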
Original title and link: How to Cache PHP Sessions in Membase (©myNoSQL)
via: http://www.startupnextdoor.com/2011/11/how-to-cache-php-sessions-in-membase/
Friday, 23 September 2011
Memcached and Sherpa for Yahoo! News Activity Data Service
Mixer, Yahoo!’s recently announced data service for news activities, uses Memcached and Sherpa for its data backend, plus a combination of asynchronous libraries and task execution tools.

The data processing model and the clear separation between read and write data solutions is not only compelling, but essential for maintaining the SLA (max. 250ms/response):
Memcache maintains two types of materialized views: 1) Consumer-pivoted, and 2) Producer-pivoted. Consumer-pivoted views (e.g. user’s friends’ latest read activity) are refreshed at query time by refresh tasks. Producer-pivoted views (e.g. user’s latest read activity) are refreshed at update time (i.e. when “read” event is posted). And producer-pivoted views are used to refresh consumer-pivoted views.
Sherpa is Yahoo!’s cloud-based NoSQL data store that provides low-latency reads and writes of key-value records and short range scans. Efficient range scans are particularly important for the Mixer use cases. The “read” event is stored in the Updates table. The Updates table is a Sherpa Distributed Ordered Table that is ordered by “user,timestamp desc”. This provides efficient scans through a user’s latest read activity. A reference to the “read” record is stored in the UpdatesIndex table to support efficient point lookups. UpdatesIndex is a Sherpa Distributed Hash Table.
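The split between the two kinds of materialized views can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the names `producer_views`, `post_read_event`, and `friends_latest_reads`, the `VIEW_SIZE` cap, the hard-coded friend list, and the use of plain dictionaries where Mixer uses Memcached.

```python
import heapq
from collections import defaultdict

VIEW_SIZE = 3  # hypothetical cap on entries kept per view

# Producer-pivoted views: user -> that user's latest reads, newest first.
producer_views = defaultdict(list)
friends = {"carol": ["alice", "bob"]}  # hypothetical social graph


def post_read_event(user, timestamp, article):
    """Update time: refresh the producer-pivoted view of the reader.

    Assumes events arrive in increasing timestamp order per user.
    """
    view = producer_views[user]
    view.insert(0, (timestamp, user, article))
    del view[VIEW_SIZE:]


def friends_latest_reads(user):
    """Query time: build the consumer-pivoted view from producer-pivoted ones."""
    merged = heapq.merge(
        *(producer_views[f] for f in friends.get(user, [])), reverse=True
    )
    return list(merged)[:VIEW_SIZE]


post_read_event("alice", 1, "a1")
post_read_event("bob", 2, "b1")
post_read_event("alice", 3, "a2")
print(friends_latest_reads("carol"))
```

The write path touches one small view per event, while the read path merges a handful of already sorted lists, which is what keeps both sides cheap.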
Original title and link: Memcached and Sherpa for Yahoo! News Activity Data Service (©myNoSQL)
Thursday, 22 September 2011
From Memcached to Membase Memcached Buckets
SaltwaterC:
But this post isn’t about switching from a volatile cache to a persistent solution. It is about removing the dumb part from the memcached setup.
So I thought I’d read about the advantages of virtual nodes/buckets and elastic clusters, cold vs warm caches, cluster recoverability, the widely used memcached protocol and the possibility to use extensions in future versions, etc. Instead I’ve learned about Moxi-based cluster configuration discoverability and how stupid the memcached PHP libraries are.
But I really enjoyed Matt Ingenthron’s quote:
at Membase Inc they view Memcached as a rabbit. It is fast, but it is pretty dumb and procreates quickly. Before you know it, it will be running wild all over your system.
Original title and link: From Memcached to Membase Memcached Buckets (©myNoSQL)
Monday, 5 September 2011
Use Membase and You'll Never Want to Mess With Memcached Servers Again
All I can say is WOW. I’ll never use stand alone memcached server(s) again.
crazy easy to install and make a cluster.
0 changes to your app code. Operates seamlessly with memcached protocol. If you want to take advantage of advanced features, you need to modify app code.
you can dynamically add and remove nodes without losing all your keys/data.
2 bucket types:
- Membase: supports data persistence (writes them ionicely to disk) and replication (one node dies, you don’t lose your key/value pairs). It sends data to disk as fast as it can (while giving priority to getting data back from disk). This is done asynchronously (with an option for synchronous), so clients shouldn’t be able to perceive a difference between Membase and memcached data buckets.
- Memcached: no persistence or replication, all in memory. I would highly recommend going with a Membase bucket unless you have some I/O concerns (like you get charged for I/O in the cloud).
Awesome admin web UI.
lots of documentation
helpful community
The only concern I can think of about replacing memcached with Membase is the maturity of the cluster solution. But on this front, things will only get better, probably before memcached gets an auto-scaling solution.
Original title and link: Use Membase and You’ll Never Want to Mess With Memcached Servers Again (©myNoSQL)
via: http://rynop.com/use-membase-and-youll-never-want-to-mess-with
Tuesday, 23 August 2011
Powered by Redis: PUMA.com
Puma.com and other related web properties are using Redis’ hashes, lists, and sets (sorted and unsorted) for fragment caching and third party responses caching:
We used Redis as our cache store for two reasons. First, we were already using it for other purposes, so reusing it kept the technology stack simpler. But more importantly, Redis’ wildcard key matching makes cache expiration a snap. It’s well known that cache expiration is one of two hard things in computer science, but using wildcard key searching, it’s dirt simple to pull back all keys that begin with “views” and contain the word “articles” and expire them every time an article is changed. Memcached has no such ability.
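The expiration trick described in the quote can be sketched in Python without a running server. This is an illustration only: `expire_matching` is a hypothetical helper, a dictionary stands in for Redis, and the key names are made up. Against a real Redis server the same idea would rely on the KEYS command or on SCAN with a MATCH pattern; KEYS walks the whole keyspace and blocks the server while it does so.

```python
import fnmatch


def expire_matching(cache, pattern):
    """Delete every key in cache (a dict) that matches a glob-style pattern."""
    doomed = [key for key in cache if fnmatch.fnmatchcase(key, pattern)]
    for key in doomed:
        del cache[key]
    return sorted(doomed)


cache = {
    "views/articles/42": "<li>...</li>",
    "views/home/articles/latest": "<ul>...</ul>",
    "views/products/7": "<div>...</div>",
    "api/articles/42": "{}",
}

# Begins with "views" and contains "articles", as in the quote above.
expired = expire_matching(cache, "views*articles*")
print(expired)
```

Only the two fragment keys that start with "views" and mention "articles" are dropped; the product fragment and the API response stay cached.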
Original title and link: Powered by Redis: PUMA.com (©myNoSQL)
Memcached in the Cloud: Amazon ElastiCache
Today Amazon announced a new service, Amazon ElastiCache, or Memcached in the cloud. The new service is still in beta and available only in the US East (Virginia) Region.
While many will find this new service useful, it is a bit of a disappointment that Amazon took the safe route and went with pure Memcached. The only notable feature of Amazon ElastiCache is automatic failure detection and recovery. But compared with Membase (and the soon to be released Couchbase 2.0) it is missing clustering, replication, support for virtual nodes, etc. Even though it advertises push-button scaling, ElastiCache will lose cached data when instances are added or removed.
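Why adding a node costs cached data depends on how clients map keys to nodes, which the announcement does not specify. The Python sketch below assumes the simplest scheme, hashing modulo the node count, and compares it with a consistent-hash ring using virtual nodes. The node names, key names, and the `HashRing` class are all hypothetical.

```python
import bisect
import hashlib


def h(value):
    """Stable integer hash (Python's built-in hash() is salted per process)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def modulo_node(key, nodes):
    return nodes[h(key) % len(nodes)]


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted(
            (h("%s#%d" % (node, i)), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, h(key)) % len(self._ring)
        return self._ring[idx][1]


def moved_fraction(lookup_before, lookup_after, keys):
    moved = sum(1 for k in keys if lookup_before(k) != lookup_after(k))
    return moved / len(keys)


keys = ["key:%d" % i for i in range(10000)]
old_nodes = ["cache1", "cache2", "cache3"]
new_nodes = old_nodes + ["cache4"]

modulo_moved = moved_fraction(
    lambda k: modulo_node(k, old_nodes), lambda k: modulo_node(k, new_nodes), keys
)
old_ring, new_ring = HashRing(old_nodes), HashRing(new_nodes)
ring_moved = moved_fraction(old_ring.node_for, new_ring.node_for, keys)
print("modulo: %.0f%% moved, ring: %.0f%% moved" % (modulo_moved * 100, ring_moved * 100))
```

Going from three to four nodes, modulo hashing is expected to remap about three quarters of the keys, while the ring is expected to hand only about a quarter of them to the new node. Every remapped key is a cache miss until it is regenerated.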
The pace at which Amazon is launching new services is indeed impressive. I’m wondering what will be the first NoSQL database that will get official Amazon support.
Original title and link: Memcached in the Cloud: Amazon ElastiCache (©myNoSQL)
Monday, 6 June 2011
RethinkDB Launches 1.0 Version With Memcached Compatibility Only
Just as I speculated, RethinkDB has finally launched the 1.0 version with Memcached compatibility only. Jason Kincaid (Techcrunch) writes:
RethinkDB has just launched its 1.0 release to the public, and it’s offering a product geared toward NoSQL installations — and it will work on SSDs, traditional drives, and cloud-based services like AWS. The startup has also moved away from MySQL and now fully supports Memcached.
But RethinkDB is not the first product providing a Memcached compatible (disk) persistent storage engine. One year ago Membase was launched promising not only a persistent Memcached compatible solution, but also elastic scalability.
RethinkDB has also published a performance report (PDF) demonstrating RethinkDB speed compared to Membase and MySQL. But if I’m reading those numbers correctly, while RethinkDB leads the majority of query-per-second (QPS) benchmarks, MySQL is consistently showing better latency numbers (which is kind of weird). For a strong durability scenario, the benchmark shows MySQL delivering 2x QPS compared to RethinkDB.
Another interesting aspect of the RethinkDB 1.0 release is the licensing model, which I don’t fully get:
RethinkDB Basic is currently identical in feature-set to RethinkDB Premium and Enterprise. However, the paid versions of RethinkDB include phone and email support, access to all future updates, and volume licensing options.
Or spelled out on the TechCrunch post:
Akhmechet says that the free version will get security updates, but that it won’t necessarily receive new features in the future, whereas the premium version will.
Original title and link: RethinkDB Launches 1.0 Version With Memcached Compatibility Only (NoSQL databases © myNoSQL)
Thursday, 7 April 2011
Optimizing Memcached Performance on a Rapidly Growing Site
Predicting operational growth by monitoring the correct metrics:
Such simple data can reveal a wealth of insights. Most important is the cache’s miss rate: how frequently do we need to regenerate data? It is the miss rate that ultimately impacts site performance. Using such data, we were shocked to discover that we were caching a lot less than we thought, and that our cache actually behaved quite erratically, with a greater than 2x difference between peak and trough miss rates.
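The miss rate itself is simple arithmetic on two counters that memcached exposes through its `stats` command, `get_hits` and `get_misses`. A small Python sketch follows; the function names and the snapshot values are made up for illustration.

```python
def miss_rate(stats):
    """Miss rate from memcached-style counters (get_hits, get_misses)."""
    hits = int(stats["get_hits"])
    misses = int(stats["get_misses"])
    total = hits + misses
    return misses / total if total else 0.0


def interval_miss_rate(earlier, later):
    """Miss rate between two snapshots of the cumulative counters."""
    delta = {
        name: int(later[name]) - int(earlier[name])
        for name in ("get_hits", "get_misses")
    }
    return miss_rate(delta)


# Hypothetical snapshots; real values would come from the stats command.
peak = interval_miss_rate(
    {"get_hits": "1000", "get_misses": "100"},
    {"get_hits": "1800", "get_misses": "300"},
)
trough = interval_miss_rate(
    {"get_hits": "1800", "get_misses": "300"},
    {"get_hits": "2700", "get_misses": "400"},
)
print(peak, trough)
```

The counters are cumulative since server start, so comparing snapshots per interval is what exposes the swings between peak and trough that a single lifetime average would hide.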
The story reminded me of the Foursquare accident.
Original title and link: Optimizing Memcached Performance on a Rapidly Growing Site (NoSQL databases © myNoSQL)
via: http://technology.posterous.com/planning-and-engineering-your-cache-for-maxim-0
Tuesday, 29 March 2011
How Digg is Built? Using a Bunch of NoSQL technologies
The picture should speak for Digg’s polyglot persistency approach:

But here is also a description of the data stores in use:
Digg stores data in multiple types of systems depending on the type of data and the access patterns, and also for historical reasons in some cases :)
Cassandra: The primary store for “Object-like” access patterns for such things as Items (stories), Users, Diggs and the indexes that surround them. Since the Cassandra 0.6 version we use does not support secondary indexes, these are computed by application logic and stored here. […]
HDFS: Logs from site and API events, user activity. Data source and destination for batch jobs run with Map-Reduce and Hive in Hadoop. Big Data and Big Compute!
MySQL: This is mainly the current store for the story promotion algorithm and calculations, because it requires lots of JOIN heavy operations which is not a natural fit for the other data stores at this time. However… HBase looks interesting.
Redis: The primary store for the personalized news data because it needs to be different for every user and quick to access and update. We use Redis to provide the Digg Streaming API and also for the real time view and click counts since it provides super low latency as a memory-based data storage system.
Scribe: the log collecting service. Although this is a primary store, the logs are rotated out of this system regularly and summaries written to HDFS.
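The remark about secondary indexes being "computed by application logic" deserves a concrete picture. Here is a minimal Python sketch of the pattern, assuming a plain key-value interface; the `store` dictionary, the key naming scheme, and the `save_item` and `find_by` helpers are hypothetical and not Digg's code.

```python
store = {}  # hypothetical stand-in for a key-value store such as Cassandra


def index_key(field, value):
    return "index:%s:%s" % (field, value)


def save_item(item_id, item, indexed_fields=("user",)):
    """Write the item and keep the hand-rolled secondary indexes in sync.

    Not atomic: a real system has to cope with a crash between the writes.
    """
    old = store.get("item:" + item_id)
    for field in indexed_fields:
        if old is not None and old.get(field) != item.get(field):
            # The indexed value changed, so drop the stale index entry.
            store.get(index_key(field, old.get(field)), set()).discard(item_id)
        store.setdefault(index_key(field, item.get(field)), set()).add(item_id)
    store["item:" + item_id] = dict(item)


def find_by(field, value):
    """Secondary-index lookup: ids first, then one point read per id."""
    ids = sorted(store.get(index_key(field, value), set()))
    return [store["item:" + i] for i in ids]


save_item("1", {"user": "alice", "title": "A"})
save_item("2", {"user": "alice", "title": "B"})
save_item("1", {"user": "bob", "title": "A"})  # item 1 changes owner
print(find_by("user", "alice"))
```

Every write now costs several operations and the application owns the consistency of the index, which is the price of doing this outside the database.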
I know this will sound strange, but isn’t it too much in there?
Original title and link: How Digg is Built? Using a Bunch of NoSQL technologies (NoSQL databases © myNoSQL)