Measuring Redis Storage Overhead

Jeremy Zawodny (of craigslist.com) has published two articles, ☞ here and ☞ here, sharing his experiments measuring the storage overhead of the recently released Redis 2.0.0 RC3 for two scenarios:

  • simple key-values
  • hashes, a new data type that will be available with the upcoming Redis 2.0

The experiments are worth a look because the code used is shared, so you can run it against your own scenarios. Do keep in mind that the results will vary, as they depend heavily on the size of the stored values.
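For readers who want to reproduce this kind of measurement without digging up the original scripts, here is a minimal sketch (not Zawodny's code) that estimates the per-key overhead of plain string keys using the redis-py client; the key names, value size, and batch size are assumptions you should adapt to your own workload:

```python
# Rough estimate of Redis per-key overhead for plain string keys.
# Assumes a local Redis instance and the redis-py client; numbers below
# are illustrative, not taken from the article.
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
r.flushdb()                                   # start from an empty database

N = 100_000                                   # number of keys to insert
before = r.info("memory")["used_memory"]      # bytes in use before the test

pipe = r.pipeline(transaction=False)
for i in range(N):
    pipe.set(f"key:{i:012d}", "x" * 10)       # small value, as in the experiments
    if i % 10_000 == 0:
        pipe.execute()                        # flush the pipeline periodically
pipe.execute()

after = r.info("memory")["used_memory"]
print(f"approx. overhead per key: {(after - before) / N:.1f} bytes")
```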

This tells me that on a 32GB box, it’s not unreasonable to host 200,000,000 keys (if their values are sufficiently small). […] The resulting dump file (dump-0.rdb) was 1.8GB in size.

[…]

If you do the math, that yields 1.25 billion (1,250,000,000) key/value pairs stored. […] So it took about 2 hours and 40 minutes to complete. The resulting dump file (.rdb file) was 13GB in size (compared to the previous 1.8GB) and the memory usage was roughly 17GB.
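A quick back-of-the-envelope calculation, using only the figures quoted above and treating the GB values loosely as binary gigabytes, shows the scale of the difference between the two layouts:

```python
# Rough per-item budget implied by the quoted numbers.
GB = 1024 ** 3

# Scenario 1: ~200,000,000 small string keys fitting on a 32 GB box
print(32 * GB / 200_000_000)    # ~172 bytes of budget per key/value pair

# Scenario 2: 1,250,000,000 field/value pairs in hashes, ~17 GB of RAM used
print(17 * GB / 1_250_000_000)  # ~15 bytes per stored pair
```

The exact per-item cost depends on key and value lengths, but the order-of-magnitude gap between top-level keys and hash fields is what the articles are highlighting.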

Salvatore Sanfilippo (@antirez), Redis creator and main developer, has a good explanation of the storage overhead:

If you turn a txt file with a list of “common surnames -> percentage of population” into a binary tree it will get more or less an order of magnitude bigger in memory compared to the raw txt file.

This is a common pattern: when you add a lot of metadata for fast access, memory management, “zero-copy” transmission of this information, expires, and so on, the size is not going to be that of simply concatenating all the data into a single string.

[…]

But for now our reasoning is: it’s not bad to be able to store 1 million keys with less than 200 MB of memory (100 MB on 32-bit systems) if an entry-level box is able to serve this data at a rate of 100k requests/second, including the networking overhead. And with hashes we have much better memory performance compared to top-level keys. So… with a few GB our users can store tens or hundreds of millions of things in a Redis server.
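To illustrate antirez’s point about hashes versus top-level keys, the following rough sketch compares the memory used by the two layouts. The bucket size of 100 and the key names are assumptions for illustration only, and the actual savings depend on the hash encoding thresholds in your configuration (e.g. hash-max-zipmap-entries in the 2.0 series):

```python
# Compare memory used by one-key-per-item versus items bucketed into hashes.
# Assumes a local Redis instance and the redis-py client; a sketch, not a benchmark.
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
N = 100_000

def used_memory():
    return r.info("memory")["used_memory"]

# Variant A: one top-level string key per item
r.flushdb()
before = used_memory()
pipe = r.pipeline(transaction=False)
for i in range(N):
    pipe.set(f"item:{i}", "value")
pipe.execute()
print("top-level keys:   ", used_memory() - before, "bytes")

# Variant B: the same items packed into hashes of ~100 fields each
r.flushdb()
before = used_memory()
pipe = r.pipeline(transaction=False)
for i in range(N):
    pipe.hset(f"bucket:{i // 100}", f"item:{i}", "value")
pipe.execute()
print("bucketed hashes:  ", used_memory() - before, "bytes")
```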

☞ Hacker News