


MongoDB Use Case: Genome Data Experiments

Jan Aerts, a genetics researcher at Cambridge, is using MongoDB to run some experiments on data from the 1000 Genomes Project. I am not sure the motivation that brought Jan to MongoDB is the best one, but my goal is to report NoSQL experiments, not to discourage them:

However we would end up with a lot of NULLs in that table. […] This is where you can start thinking of using a document-oriented database for storing these SNP data: each document will be tailored to a specific SNP and will e.g. not refer to the JPTCHB population if it is not present in that population. Enter mongodb.
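To make the NULL argument concrete, here is a minimal sketch in plain Python (no database involved) that contrasts a fixed-column row with a per-SNP document. It is not taken from Jan's post: the SNP names, the frequency values, and the helper names `to_row` and `to_document` are made up for illustration, and the CEU and YRI population labels are assumed alongside the JPTCHB label mentioned in the quote.

```python
# Hypothetical SNP records: names and frequencies are invented for illustration.
POPULATIONS = ["CEU", "YRI", "JPTCHB"]

snps = [
    {"name": "rs0001", "freqs": {"CEU": 0.12, "YRI": 0.30, "JPTCHB": 0.05}},
    {"name": "rs0002", "freqs": {"CEU": 0.44}},  # absent in YRI and JPTCHB
]


def to_row(snp):
    """Relational-style row: one column per population, None (NULL) when absent."""
    row = {"name": snp["name"]}
    for population in POPULATIONS:
        row[population] = snp["freqs"].get(population)
    return row


def to_document(snp):
    """Document-style record: only populations where the SNP is present are stored."""
    return {"name": snp["name"], "populations": dict(snp["freqs"])}


rows = [to_row(snp) for snp in snps]
docs = [to_document(snp) for snp in snps]

# Count the NULL-like cells the fixed-column layout forces us to store.
null_count = sum(value is None for row in rows for value in row.values())

if __name__ == "__main__":
    print("rows:", rows)
    print("docs:", docs)
    print("NULL cells in the table layout:", null_count)
```

In the document layout the second SNP simply has no JPTCHB key, which is the property the quote is describing. Whether that alone justifies moving to a document database is a separate question.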

The post includes code for loading the data into MongoDB and for running MapReduce to get some results out of it. Some additional notes from the post:

This script [the MapReduce one] takes 50 minutes to run using a mongo database on my MacBook laptop.


Unfortunately, you have to define the map and reduce functions in javascript, which is a bit unsightly within a ruby script, but so be it.
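For readers unfamiliar with the map/reduce shape, the following is a rough sketch of the idea in plain Python, run in memory rather than against MongoDB. It is not the query from Jan's post: the sample documents and the function names `map_snp`, `reduce_counts`, and `map_reduce` are assumptions made for illustration, and in MongoDB itself the map and reduce functions would be written in JavaScript, as the quote above notes.

```python
from collections import defaultdict

# Hypothetical per-SNP documents: names and frequencies are invented.
docs = [
    {"name": "rs0001", "populations": {"CEU": 0.12, "YRI": 0.30, "JPTCHB": 0.05}},
    {"name": "rs0002", "populations": {"CEU": 0.44}},
]


def map_snp(doc):
    """Map step: emit a (population, 1) pair for each population the SNP appears in."""
    for population in doc["populations"]:
        yield population, 1


def reduce_counts(key, values):
    """Reduce step: sum all the counts emitted for a single key."""
    return sum(values)


def map_reduce(documents, map_fn, reduce_fn):
    """Group the emitted values by key, then reduce each group to one value."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


# Number of SNPs observed in each population.
result = map_reduce(docs, map_snp, reduce_counts)

if __name__ == "__main__":
    print(result)
```

The sketch only shows the emit, group, and reduce steps. It says nothing about why the real job took 50 minutes on a laptop.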

He also points to the excellent MongoDB aggregation tutorial.