memcacheDB: All content tagged as memcacheDB in NoSQL databases and polyglot persistence

Redis: Le système de cache parfait

I love how this sounds in French:

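Après 3 ans d’une histoire d’amour fidèle avec Memcached; le serveur de cache notamment utilisé par Facebook, Youtube ou Twitter; je suis au bord de la rupture après avoir rencontré redis.

(In English: “After three years of a faithful love affair with Memcached, the cache server used notably by Facebook, YouTube, and Twitter, I am on the verge of a breakup after meeting Redis.”)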

The author, Julien Crouzet, mentions three key features of Redis:

  • non-volatile data
  • performance
  • support for data types

On these points:

But Redis’ support for data types (lists, sets, sorted sets, and hashes) is not up for debate.

Original title and link: redis : Le système de cache parfait (NoSQL databases © myNoSQL)


Another NoSQL Comparison: Evaluation Guide

The requirements were clear:

  • Fast data insertion.
  • Extremely fast random reads on large datasets.
  • Consistent read/write speed across the whole data set.
  • Efficient data storage.
  • Scale well.
  • Easy to maintain.
  • Have a network interface.
  • Stable, of course.

The list of NoSQL databases to be compared (Tokyo Cabinet, Berkeley DB, MemcacheDB, Project Voldemort, Redis, and MongoDB): not so clear.

The evaluation methodology and the results: definitely not clear at all.

NoSQL Comparison Guide / A review of Tokyo Cabinet, Tokyo Tyrant, Berkeley DB, MemcacheDB, Voldemort, Redis, MongoDB

And the conclusion is quite wrong:

Although MongoDB is the solution for most NoSQL use cases, it’s not the only solution for all NoSQL needs.

There were a couple of people asking for more details about my comments on this NoSQL comparison, so here they are:

  1. the initial list of NoSQL databases to be evaluated looks, at first glance, a bit random. It includes some rarely used solutions (memcachedb), some that are not […], while leaving aside others that, at least at a high level, match the characteristics of those already in the list (Riak, Membase)
  2. another reason for considering the initial choice a bit random is that, while scaling is listed as one of the requirements, the only truly scalable solution in the list would be Project Voldemort. The recently added auto-sharding and replica sets would make MongoDB a candidate too, but a search on the MongoDB group would show that the solution is still young
  3. even if the set of requirements is clear, there’s no indication of what kind of evaluation was done or how it was performed. Without knowing what was measured and how, it is impossible to consider the results relevant.
  4. as Janl wrote about benchmarks, most of the time you are doing it wrong. Creating good, trustworthy, useful, relevant benchmarks is very difficult
  5. the matrix lists characteristics that are difficult to measure, and there are no comments on how the thumbs up were given. Examples: what is manageability and how was it measured? Same questions for stability and feature set.
  6. because most of it sounds speculative, here are a couple of speculations:
    1. judging by the thumbs up MongoDB received for insertion/random reads on large data sets, I can assume that the data never exceeded the available memory. On the other hand, Redis was dismissed and received fewer votes due to its “more” in-memory character
    2. Tokyo Cabinet and Redis project activity and community were ranked the same. When was the last release of Tokyo Cabinet?
  7. I’m leaving it up to you to decide why the conclusion, “Although MongoDB is the solution for most NoSQL use cases”, is wrong.
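To make the point about undocumented methodology concrete, here is a minimal sketch of a benchmark that reports its own parameters next to its numbers. Everything in it is an assumption for illustration: the `benchmark` function, the 100-byte payload, and the use of a plain Python dict as a stand-in for a real store. It is not the method used in the comparison guide, which does not describe one.

```python
import random
import statistics
import time


def benchmark(store, n_keys=10_000, n_reads=2_000, seed=42):
    """Time inserts and random reads against any dict-like store.

    Returns the test parameters alongside the measurements so that a
    reader can judge what was actually measured.
    """
    rng = random.Random(seed)        # fixed seed: repeatable key selection
    value = "x" * 100                # hypothetical 100-byte payload

    start = time.perf_counter()
    for i in range(n_keys):
        store[f"key:{i}"] = value
    insert_total = time.perf_counter() - start

    read_times = []
    for _ in range(n_reads):
        key = f"key:{rng.randrange(n_keys)}"
        start = time.perf_counter()
        _ = store[key]
        read_times.append(time.perf_counter() - start)

    cuts = statistics.quantiles(read_times, n=100)  # 99 percentile cut points
    return {
        "n_keys": n_keys,
        "n_reads": n_reads,
        "value_bytes": len(value),
        "inserts_per_sec": n_keys / insert_total,
        "read_p50_s": cuts[49],
        "read_p99_s": cuts[98],
    }


if __name__ == "__main__":
    # A dict lives entirely in memory, so this says nothing about
    # behavior once the data set outgrows RAM.
    print(benchmark({}))
```

Even a toy like this shows why the dataset size matters: with a dict, or with any store whose data fits in memory, random reads look uniformly fast, which is exactly the situation speculated about above.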

Original title and link: Another NoSQL Comparison: Evaluation Guide (NoSQL databases © myNoSQL)


MemcacheDB History at Reddit

Steve Huffman (co-founder and programmer of Reddit) speaking at FOWA Miami 2010 (around minute 18:30)[1]:

And then there is another software that is really handy MemcacheDB, which is like memcached but is persistent. […] It’s very very fast, super-handy, we store far more data in MemcacheDB than we do in Postgres

Then bam! MemcacheDB’s bursting, blocking writes led Reddit to switch to Cassandra, as its friends at Digg and Twitter did.

Lesson learned: take such pieces of advice with a grain of salt and always test your own scenario.

  1. It looks like Steve was no longer working at Reddit when the presentation was given, so he might not have been aware of the problems related to MemcacheDB.

Memcachedb Bursting Blocking Writes

I read that the Reddit guys are using Memcachedb and that its bursting, blocking writes have been causing them a lot of problems lately [1].

Memcachedb is Memcached with a built-in permanent storage system using BDB. One of the “features” of this system is that it saves up its disk writes and then bursts them to the disk. Unfortunately, the single EBS volumes they were on could not handle these bursting writes. Memcachedb also has another feature that blocks all reads while it writes to the disk. These two things together would cause the site to go down for about 30 seconds every hour or so lately.
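The combination described in that quote, buffered writes flushed in bursts plus reads blocked during the flush, can be modeled with a toy store. This is only a sketch under stated assumptions: `BurstFlushStore`, `flush_every`, `disk_delay`, and `read_wait_during_flush` are hypothetical names, the "disk" is a dict with an artificial sleep, and none of it is MemcacheDB or Berkeley DB code.

```python
import threading
import time


class BurstFlushStore:
    """Toy key-value store: buffers writes in memory, flushes them in bursts.

    Hypothetical model only; it is not MemcacheDB's or Berkeley DB's code.
    """

    def __init__(self, flush_every=3, disk_delay=0.05):
        self._lock = threading.Lock()      # one lock shared by reads and flushes
        self._buffer = {}                  # writes not yet on "disk"
        self._disk = {}                    # stand-in for the persistent store
        self._flush_every = flush_every
        self._disk_delay = disk_delay      # simulated cost of one disk write
        self.flushing = threading.Event()  # set while a burst is in progress

    def set(self, key, value):
        with self._lock:
            self._buffer[key] = value
            if len(self._buffer) >= self._flush_every:
                self._flush_locked()

    def get(self, key):
        # Readers need the same lock, so they stall while a flush runs.
        with self._lock:
            if key in self._buffer:
                return self._buffer[key]
            return self._disk.get(key)

    def _flush_locked(self):
        # Burst: every buffered write hits the slow "disk" in one go.
        self.flushing.set()
        for key, value in self._buffer.items():
            time.sleep(self._disk_delay)
            self._disk[key] = value
        self._buffer.clear()
        self.flushing.clear()


def read_wait_during_flush(store, n_writes=3):
    """Return (value, seconds waited) for a read issued mid-flush."""
    def write_burst():
        for i in range(n_writes):
            store.set(f"k{i}", i)

    writer = threading.Thread(target=write_burst)
    writer.start()
    store.flushing.wait(timeout=5)         # wait until the flush has begun
    start = time.perf_counter()
    value = store.get("k0")                # blocks until the flush releases the lock
    waited = time.perf_counter() - start
    writer.join()
    return value, waited


if __name__ == "__main__":
    print(read_wait_during_flush(BurstFlushStore()))
```

In this model the reader's wait grows with the size of the burst and the slowness of the disk, which is the same shape as the problem in the quote: a slow volume and a large burst turn into a visible stall for every reader.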

I’d really love to hear how other NoSQL solutions behave in this particular scenario. Based on the characteristics of Memcachedb, I’m looking for an answer at least from the key-value stores: Project Voldemort, Redis, SimpleDB, Tokyo Cabinet, M/DB, etc. But if others want to jump in and present their solution, that would be great. You can either post your reply as a comment or send it over by email, and I’ll make sure it gets included here. Thanks!

Update: I already got answers from Redis and Terrastore (document store).

Update: CouchDB and Tokyo Cabinet (note it is a bit too generic) answers are in. Still a couple are missing so please keep them coming!

Update: FleetDB (note: a data store that I haven’t covered yet, but I can promise you it is coming) has posted details about its behavior. I have also received some hints about Project Voldemort behavior but it looks like I’ll have to dig a bit deeper myself (I’ll do the same for Tokyo Cabinet). I am still awaiting comments from Riak and MongoDB. At least until now, nobody I know from Cassandra and HBase has offered to comment.

NoSQL: Distributed and Scalable Non-Relational Database Systems

NoSQL makes it in the Linux Magazine:

There’s an interesting shift happening in the world of Web-scale data stores. A whole new breed of scalable data stores is gaining popularity very quickly. The traditional LAMP stack is starting to look like a thing of the past. For a few years now, memcached has often appeared right next to MySQL, and now the whole “data tier” is being shaken up.

The article enumerates a couple of the solutions we have become very interested in: Redis, Tokyo Cabinet, CouchDB, Riak, etc.