An Introduction to MongoDB

Good slide deck from Chris Westin (10gen engineer). I particularly liked the slides summarizing some of the limitations of relational databases:

Schema Evolution

  • Applications are evolving all the time
    • Applications need new fields
    • Applications need new indexes
    • Data is growing — sometimes very fast
  • Users need to be able to alter their schemas without making their data unavailable
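
To make the schema evolution point concrete, here is a minimal sketch of what "new fields without downtime" can look like in a schema-less document model. It uses a plain Python list of dicts as a stand-in for a collection; the `users` data and the `email_of` helper are made up for illustration and are not MongoDB's API.

```python
# Hypothetical in-memory "collection": a list of dict documents standing in
# for a schema-less document store. This is not MongoDB's API.
users = [
    {"_id": 1, "name": "Ada"},
    {"_id": 2, "name": "Grace"},
]

# A newer version of the application starts writing an extra field.
# Existing documents are left untouched, so nothing has to go offline.
users.append({"_id": 3, "name": "Linus", "email": "linus@example.com"})


def email_of(doc):
    # Older documents simply lack the field; the reader supplies a default.
    return doc.get("email", "unknown")


emails = [email_of(doc) for doc in users]
print(emails)
```

The trade-off in this style is that the application code, rather than the database, has to cope with documents written by older versions.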

Write Rates

  • Replication is a solution for high read loads
  • Sooner or later, writing becomes a bottleneck
  • Sharding
    • Joins and aggregation become a problem
    • Distributed transactions are too slow for the web
    • Manual management of shards
      • Choosing shard partitions
      • Rebalancing shards
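
The rebalancing bullet is easier to appreciate with numbers. Below is a rough sketch, assuming the simplest possible manual scheme: hash the key and take it modulo the number of shards. The `shard_for` function and the `user:N` keys are hypothetical, and this is not how any particular database picks shards.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    # Use a stable hash so the same key always maps to the same shard.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Hypothetical keys, placed first on 4 shards and then on 5.
keys = [f"user:{i}" for i in range(10_000)]
before = {k: shard_for(k, 4) for k in keys}
after = {k: shard_for(k, 5) for k in keys}

# Count the keys whose shard changes when one shard is added.
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys change shard")
```

With plain modulo placement, a key stays put only when its hash gives the same remainder for 4 and for 5, so roughly four out of five keys have to move when the fifth shard is added. That is the kind of cost that makes manual shard management painful.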

Reading through these reminded me of the PNUTS paper, which describes the data storage solution used by Yahoo! and seems to have good answers to many of these limitations. The BigTable paper led to the creation of HBase and Hypertable, and its data model is used in Cassandra too. The Dynamo paper led to the creation of Riak and Project Voldemort, and it provides the distribution model for Cassandra. But I don’t think there’s anything out there taking inspiration from the PNUTS paper.

Original title and link: An Introduction to MongoDB (NoSQL database©myNoSQL)