Memcached: All content tagged as Memcached in NoSQL databases and polyglot persistence

MongoDB, memcached, EHCache: Compared as Distributed L2 Caches

As can be seen, whether the off-host process that manages the cache-data is MongoD or MemcacheD or Terracotta-Server, architecturally they all look equivalent - i.e. a pure-L2 with no-L1 - so that all data needs to be retrieved from over the network and then massaged into a POJO for consumption by the application.
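To make the "pure L2 with no L1" point concrete, here is a minimal sketch of what adding an in-process L1 in front of an off-host cache changes. Everything in it is an assumption for illustration: `RemoteL2` is a dictionary standing in for mongod, memcached, or a Terracotta server, JSON stands in for whatever serialization the application uses, and there is no eviction or invalidation, which a real L1 would need.

```python
import json


class RemoteL2:
    """Hypothetical stand-in for an off-host cache process.

    Values are kept as serialized bytes, and every call is counted
    as one simulated network round trip.
    """

    def __init__(self):
        self._store = {}
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self._store.get(key)

    def set(self, key, raw):
        self.round_trips += 1
        self._store[key] = raw


class TwoLevelCache:
    """In-process L1 dictionary in front of the remote L2."""

    def __init__(self, l2):
        self._l1 = {}
        self._l2 = l2

    def get(self, key):
        if key in self._l1:
            return self._l1[key]      # no network hop, no deserialization
        raw = self._l2.get(key)
        if raw is None:
            return None
        obj = json.loads(raw)         # the "massage into an object" step
        self._l1[key] = obj
        return obj

    def put(self, key, obj):
        self._l2.set(key, json.dumps(obj).encode("utf-8"))
        self._l1[key] = obj


l2 = RemoteL2()
cache = TwoLevelCache(l2)
cache.put("user:1", {"name": "Ada"})
first = cache.get("user:1")
second = cache.get("user:1")
```

With a pure L2, each of the two reads would cost a round trip plus deserialization. Here both are served from the L1, at the price of having to keep the L1 copies consistent across application hosts.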

MongoDB, memcached, EHCache compared

When speaking about caching systems, I’d also include criteria like:

  • warm up strategy
  • locking strategy
  • single-machine memory spill strategy

Original title and link: MongoDB, memcached, EHCache: Compared as Distributed L2 Caches (NoSQL databases © myNoSQL)

via: http://javamuse.blogspot.com/2011/03/nosql-document-based-or-distributed.html


Redis: Le système de cache parfait

I love how this sounds in French:

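Après 3 ans d’une histoire d’amour fidèle avec Memcached; le serveur de cache notamment utilisé par Facebook, Youtube ou Twitter; je suis au bord de la rupture après avoir rencontré redis.

(In English: after three years of a faithful love story with Memcached, the cache server used notably by Facebook, YouTube, and Twitter, I am on the verge of a breakup after meeting Redis.)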

The author, Julien Crouzet, mentions three key features of Redis:

  • non-volatile data
  • performance
  • support for data types

On these points:

But Redis’ support for data types (lists, sets, sorted sets, and hashes) is not up for debate.

Original title and link: redis : Le système de cache parfait (NoSQL databases © myNoSQL)

via: http://blog.juliencrouzet.fr/484/redis-le-systeme-de-cache-parfait/


How to Maintain a Set in Memcached

Could you imagine a solution for storing a set into memcached satisfying these requirements:

  • must minimize round trips to the servers
  • O(1) add (for both current size and new items coming in)
  • O(1) remove (for both current size and items being removed)
  • O(1) fetch
  • lock and wait free
  • easy to use
  • easy to understand
  • no required explicit maintenance

Dustin Sallings describes how to achieve it using just three memcached operations and some clever but artificial data encoding.
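As a rough illustration of how such an encoding can work, the sketch below keeps the set as an append-only string of `+item ` and `-item ` tokens. It is not Sallings’ code, and the exact operations and encoding in the linked post may differ. `FakeMemcached` is a hypothetical in-memory stand-in that only mimics the semantics of memcached’s `add`, `append`, `gets`, and `cas` commands, and the function names and the `compact_ratio` parameter are invented for this example.

```python
class FakeMemcached:
    """Hypothetical in-memory stand-in with add/append/gets/cas semantics."""

    def __init__(self):
        self._data = {}       # key -> (value, cas_id)
        self._next_cas = 0

    def _store(self, key, value):
        self._next_cas += 1   # every write produces a new cas id
        self._data[key] = (value, self._next_cas)

    def add(self, key, value):
        if key in self._data:
            return False
        self._store(key, value)
        return True

    def append(self, key, value):
        if key not in self._data:
            return False
        self._store(key, self._data[key][0] + value)
        return True

    def gets(self, key):
        return self._data.get(key, (None, None))

    def cas(self, key, value, cas_id):
        if key not in self._data or self._data[key][1] != cas_id:
            return False
        self._store(key, value)
        return True


def encode(op, items):
    # Produces "+a +b " or "-a "; items must not contain whitespace.
    return "".join(f"{op}{item} " for item in items)


def modify(mc, key, op, items):
    token = encode(op, items)
    if mc.append(key, token):      # usual case: one round trip, no lock
        return
    if not mc.add(key, token):     # set was missing: try to create it
        mc.append(key, token)      # someone else created it first


def decode(raw):
    members = set()
    for token in raw.split():
        if token[0] == "+":
            members.add(token[1:])
        else:
            members.discard(token[1:])
    return members


def fetch(mc, key, compact_ratio=2):
    raw, cas_id = mc.gets(key)
    if raw is None:
        return set()
    members = decode(raw)
    # Rewrite the value when the token log is much longer than the set.
    # Losing the cas race is harmless: a later fetch will try again.
    if len(raw.split()) > compact_ratio * max(len(members), 1):
        mc.cas(key, encode("+", sorted(members)), cas_id)
    return members


mc = FakeMemcached()
modify(mc, "s", "+", ["a", "b", "c"])
modify(mc, "s", "-", ["b"])
result = fetch(mc, "s")
```

Adds and removes are single appends whose cost does not depend on the current size of the set, and no client ever waits on another. The stored value grows with every modification until a fetch compacts it, which is the "clever but artificial" part.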

Nuno Job suggests the right answer is using Redis.

Original title and link: How to Maintain a Set in Memcached (NoSQL databases © myNoSQL)

via: http://dustin.github.com/2011/02/17/memcached-set.html


From No Cache to Membase: The Knot

Jason Sirota tells the story of how The Knot (a media company) went from no cache to Membase, passing through memcached and Gear6 along the way.

In talking to Membase and through our own research, we found that Membase solved all of our original problems, plus our new problems with Gear6.

  1. Membase provides a rich set of both GUI and programmatic tools to manage and monitor the cache.

  2. Membase not only runs on multiple physical nodes but balances keys across those nodes using the vBuckets

  3. Membase runs on Windows and can handle quite a bit more capacity (evidenced by Zynga) than we could possibly use.

  4. Membase uses both HA replication and distributed nodes for different solutions, in our case, it easily supports the 5 node-configuration

  5. Membase provides Buckets that can be configured by Port to allow different teams to have a set amount of space

  6. Hardware can be added both horizontally and vertically to a Membase cluster. However, one limitation is that all nodes have to run the same cache limit so you do need to think carefully about your node size

  7. No company is immune to going under but, in addition to their strong financial state, the risk for Membase is mitigated by two factors:

If you want the simplified version:

  • a typical story where, to maintain the quality of the service, caching had to be used
  • a typical story where scale also brought the need for better administration and monitoring tools
  • a typical story where operational costs had to be kept under control and even reduced if possible

What made Membase the winning solution for The Knot?

Some would say the feature set, and I’d probably agree, while pointing out that such features can be found in other NoSQL databases too.

I’d say it’s Membase’s use of a well-established protocol, which meant The Knot did not have to rewrite its whole persistence layer. Even if Membase had not had all the required features, speaking the memcached protocol made it the easiest solution to try out, as no application changes were needed.
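To illustrate why a shared wire protocol makes a server swappable, here is a small sketch of the memcached text protocol as a client sees it. The function names are hypothetical and a real application would use an existing client library. The point is only that the bytes on the wire are the same for any server that implements the protocol, so switching servers comes down to changing a host and port.

```python
def build_set(key, value, flags=0, exptime=0):
    """Encode a text-protocol `set` command for a bytes value."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key):
    """Encode a text-protocol `get` command for a single key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(data):
    """Parse a single-key `get` reply; return the value, or None on a miss."""
    if data == b"END\r\n":
        return None
    header, rest = data.split(b"\r\n", 1)
    parts = header.split(b" ")     # VALUE <key> <flags> <bytes>
    nbytes = int(parts[3])
    return rest[:nbytes]


request = build_set("greeting", b"hello")
reply = parse_get_response(b"VALUE greeting 0 5\r\nhello\r\nEND\r\n")
```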

Original title and link: From No Cache to Membase: The Knot (NoSQL databases © myNoSQL)

via: http://jasonsirota.com/the-knot-cache-architecture-part-i-choosing-a?c=1


Tarantool/Silverbox: Another In-Memory Key-Value Store from Mail.Ru

Mail.ru, one of the most popular Russian web sites, has open sourced ☞ Tarantool, which, among other components, also includes (yet another) in-memory key-value store.

From the ☞ project home:

  • The system is optimized for work with large volumes of data;
  • Tarantool uses snapshot files, which contain the state of the database at the time of copy to disk;
  • Transaction logging in binary log files preserves all changes to database state, allowing automatic restoration of information after system reboot;
  • The system provides high availability, automatic switchover to an available replica in case of crash of any part of the system;
  • The system is fully compatible with the memcached protocol;
  • Local replicas allow system update without interruption to client services;
  • The system provides data replication over the network;
  • Tarantool supplies a simply binary protocol for replication, supporting the creation of additional logic.
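The snapshot and transaction-log bullets above describe a common recovery pattern: load the last snapshot, then replay the changes logged since. The sketch below shows that pattern in general terms. It is not Tarantool’s implementation or file format: the class, the file names, and the JSON text log are all assumptions made for illustration (the project describes binary log files).

```python
import json
import os
import tempfile


class SnapshotLogStore:
    """Hypothetical in-memory store made durable by a snapshot plus a change log."""

    def __init__(self, directory):
        self._snap = os.path.join(directory, "snapshot.json")
        self._log = os.path.join(directory, "changes.log")
        self.data = {}
        self._recover()

    def _recover(self):
        # 1. Load the state captured by the last snapshot, if there is one.
        if os.path.exists(self._snap):
            with open(self._snap) as f:
                self.data = json.load(f)
        # 2. Replay, in order, every change logged after that snapshot.
        if os.path.exists(self._log):
            with open(self._log) as f:
                for line in f:
                    op, key, value = json.loads(line)
                    if op == "set":
                        self.data[key] = value
                    else:
                        self.data.pop(key, None)

    def _append(self, record):
        with open(self._log, "a") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def set(self, key, value):
        self._append(["set", key, value])    # log first, then apply
        self.data[key] = value

    def delete(self, key):
        self._append(["del", key, None])
        self.data.pop(key, None)

    def snapshot(self):
        tmp = self._snap + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.data, f)
        os.replace(tmp, self._snap)          # swap the new snapshot in
        open(self._log, "w").close()         # logged changes are now redundant


workdir = tempfile.mkdtemp()
store = SnapshotLogStore(workdir)
store.set("a", 1)
store.snapshot()
store.set("b", 2)
store.delete("a")
restarted = SnapshotLogStore(workdir)        # simulates a process restart
```

The restarted instance rebuilds its state from the snapshot taken after the first write plus the two logged changes that followed it.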

It sounds like an improved, HA memcached, which would place it close to products like Membase[1].

@igrigorik


  1. Details about Tarantool are still scarce, so I’m not 100% sure about it.

Original title and link: Tarantool/Silverbox: Another In-Memory Key-Value Store from Mail.Ru (NoSQL databases © myNoSQL)


Microsoft coaches NoSQL options for Azure cloud

The Register writes about Microsoft’s initiative to bring NoSQL databases to the Azure cloud; Membase and MongoDB are mentioned in the article[1]:

The addition of NoSQL suits Microsoft - by bringing more people to Azure - and it suits the NoSQLers, because they get more Windows devs to support.

You can run NoSQL options like Mongo and Memcached on Azure after some fiddling and configuring. The goal now is to deliver a development, deployment, and management experience already familiar to those on Windows, SQL Server, and Visual Basic.

Is VMware/Spring making the same bet for the Java world? Judging by the Spring Data initiative, plus Grails support for Redis and for MongoDB, I’d say they are.

A question I’d like to clarify for myself: how popular is memcached in the Java world? My impression is that Java people have stayed away from memcached so far, using Java-based solutions like EHCache or Terracotta, but I might be completely wrong.

Original title and link: Microsoft coaches NoSQL options for Azure cloud (NoSQL databases © myNoSQL)

via: http://www.theregister.co.uk/2010/11/12/windows_azure_nosql/


Why Redis? And Memcached, Cassandra, Lucene, ElasticSearch

Why do we keep jumping from one storage engine to another? Can’t we make up our minds already and settle with the “best” storage engine that meets our needs?

In short, No.

A common misconception is the belief that all storage engines are created equal, all designed to simply “store stuff” and provide fast access to your data. Unless your application performs one clearly defined simple task, it is a dire mistake to expect a single storage engine will effectively fulfill all of your data warehousing and processing needs.

I don’t think I need to say that I’m a proponent of polyglot persistence, and that I believe in the Unix tools philosophy. But as you add more components to your system, you should realize that its complexity “explodes” and that operational costs grow with it (nb: do you remember why Twitter started looking into using Cassandra?). Not to mention that the more components your system has, the more attention and care must be invested in figuring out critical aspects like overall system availability, latency, throughput, and consistency.

Original title and link: Why Redis? And Memcached, Cassandra, Lucene, ElasticSearch (NoSQL databases © myNoSQL)

via: http://www.softwareprojects.com/resources/programming/t-redis-a-persistent-key-value-store-2021.html


Memcached/Membase: Writing Your Own Storage Engine

I’m not sure how many will need to implement their own storage engine, but given that a couple of projects support pluggable engines (Project Voldemort, Riak), special engines might well perform better in special scenarios. Now you can learn how to do it for Membase:

Right now we’ve got an engine capable of running get and set load, but it is doing synchronous filesystem IO. We can’t serve our client faster than we can read the item from disk, but we might serve other connections while we’re reading the item off disk.
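The idea in the quote, serving other connections while one request waits on the disk, can be sketched independently of the engine interface. The example below does not reproduce the C engine API from the linked article. It is a generic illustration in which the blocking read is handed to a worker thread so the event loop stays free, and `DISK`, `blocking_read`, and `handle_get` are hypothetical names.

```python
import asyncio
import time

DISK = {"k1": "v1", "k2": "v2"}    # stand-in for items stored on disk


def blocking_read(key):
    time.sleep(0.05)               # pretend this is a slow, synchronous disk read
    return DISK.get(key)


async def handle_get(key):
    loop = asyncio.get_running_loop()
    # Run the blocking read on a worker thread from the default executor,
    # so the event loop can keep serving other connections in the meantime.
    return await loop.run_in_executor(None, blocking_read, key)


async def main():
    # Two "connections" issue a get at the same time.
    return await asyncio.gather(handle_get("k1"), handle_get("k2"))


results = asyncio.run(main())
```

Each individual request still takes as long as its disk read, but the two reads overlap instead of queuing behind each other.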

Original title and link: Memcached/Membase: Writing Your Own Storage Engine (NoSQL databases © myNoSQL)

via: http://blog.membase.com/writing-your-own-storage-engine-memcached-part-3


Terrastore and ElasticSearch to Replace MySQL, Memcached and Sphinx

Currently we are using PHP, MySQL, Sphinx, and Memcached to serve up pages so quick. […]

[…] Our (MY) final decision was to use Terrastore. I’m not sure if it is the fastest, but it is fast. The main reason is how easy it is to scale with growth, how well it protects the data and keeps multiple copies always available, and the fast release cycle which means it is always improving.

As a replacement for Sphinx , we have considered many, but have landed on ElasticSearch, which just so happens to have a direct integration with Terrastore. A no-brainer for us to choose ElasticSearch for our search and ranking algorithms.

While each piece is important, sometimes it is also about the combo.

Original title and link: Terrastore and ElasticSearch to Replace MySQL, Memcached and Sphinx (NoSQL databases © myNoSQL)

via: http://blog.blogreaction.com/terrastorephp-a-lightweight-php5-class-interfacing-terrastore,15759/


Redis and Memcached Benchmarks

A long and interesting discussion on comparing Redis and Memcached performance. It all started ☞ here:

After crunching all of these numbers and screwing around with the annoying intricacies of OpenOffice, I’m giving Redis a big thumbs down. My initial sexual arousal from the feature list is long gone. Granted, Redis might have its place in a large architecture, but certainly not a replacement to memcache. When your site is hammering 20,000 keys per second and memcache latency is heavily dependent on delivery times, it makes no business sense to transparently drop in Redis. The features are neat, and the extra data structures could be used to offload more RDBMS activity… but 20% is just too much to gamble on the heart of your architecture.

Salvatore Sanfilippo ☞ followed up:

[…] this is why the sys/toilet benchmark is ill conceived.

  • All the tests are run using a single client into a busy loop.
  • when you run single clients benchmarks what you are metering actually is, also: the round trip time between the client and the server, and all the other kind of latencies involved, and of course, the speed of the client library implementation.
  • The test was performed with very different client libraries

But he also published a new benchmark. And Dormando ☞ published an update picking on the previous two:

The “toilet” bench and antirez’s benches both share a common issue; they’re busy-looping a single client process against a single daemon server. The antirez benchmark is written much better than the original one; it tries to be asyncronous and is much more efficient.

And it didn’t stop here, as Salvatore felt ☞ something was still missing:

The test performed by @dormando was missing an interesting benchmark, that is, given that Redis is single threaded, what happens if I run an instance of Redis per core?

I assume everyone is asking by now: which one of Redis and Memcached performed better? And the answer is: it depends (even if some would like to believe differently).

But why is this the “answer”? Firstly, because creating good benchmarks is really difficult. Most benchmarks focus on the wrong thing or cover problems that are not very realistic.
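The complaint in the quotes above, that a single busy-looping client mostly measures round-trip time, can be checked with back-of-the-envelope arithmetic. The model below is a deliberate simplification and all numbers are made up: it assumes a single-threaded server that handles one request at a time, a fixed network round trip, and clients that each keep exactly one request in flight.

```python
def single_client_ops_per_sec(rtt_s, server_time_s):
    """One client, one request in flight: every operation pays a full round trip."""
    return 1.0 / (rtt_s + server_time_s)


def concurrent_ops_per_sec(clients, rtt_s, server_time_s):
    """Many clients: throughput grows until the server itself saturates."""
    offered = clients / (rtt_s + server_time_s)
    server_limit = 1.0 / server_time_s
    return min(offered, server_limit)


RTT = 200e-6          # hypothetical network round trip: 200 microseconds
FAST = 10e-6          # hypothetical per-request time of server A
SLOW = 12e-6          # hypothetical server B, 20% slower per request

single_a = single_client_ops_per_sec(RTT, FAST)
single_b = single_client_ops_per_sec(RTT, SLOW)
loaded_a = concurrent_ops_per_sec(50, RTT, FAST)
loaded_b = concurrent_ops_per_sec(50, RTT, SLOW)
```

With these invented numbers, the two single-client figures differ by roughly 1%, because the round trip dominates, while under 50 concurrent clients both servers saturate and the results reflect the full difference in per-request server time. Which of the two regimes matters depends on the workload.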

This would be my very simple advice:

  • basic benchmarks will not give you real answers
  • you are better off testing for your very specific scenario (data size, concurrency level,

Original title and link: Redis and Memcached Benchmarks (NoSQL databases © myNoSQL)


Membase Releases the 4th Beta, Featuring Memcached Buckets

The fourth beta release of Membase, memcached’s scalable big brother, features memcached buckets:

You now can create buckets in your Membase Server cluster that behave exactly like memcached, which means you can use Membase Server as a drop-in replacement for your existing memcached setup. In a single cluster you can now share the resources between memcached buckets and membase buckets.

The compatibility with memcached is a clear statement that Membase is after memcached users. But so is Redis!

Original title and link: Membase Releases the 4th Beta, Featuring Memcached Buckets (NoSQL databases © myNoSQL)

via: http://blog.northscale.com/northscale-blog/2010/09/membase-server-beta4-with-memcached-buckets.html