MongoDB and Site Analytics

Within a few hours, using Sinatra and the MongoDB Ruby driver, I had a little prototype working. Each hit was a single MongoDB operation, an upsert based on the host, with year, month, day, and hour information stored in nested hashes. The nested hashes were updated in the operation using $inc. It did not do much, but it was pretty cool.
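A rough sketch of what such a per-hit upsert could look like. The field names (`host`, `total`, `hits`) and the helper names are my assumptions, not the prototype's actual schema, and `apply_inc` is only an in-memory stand-in for what the server does with `$inc` on dotted paths, so the sketch runs without a database:

```ruby
# Build the filter and update documents for a single page hit.
# "host", "total" and "hits" are illustrative field names (assumed).
def hit_upsert(host, at)
  t = at.utc
  # Dotted path into the nested year/month/day/hour hashes.
  path = "hits.#{t.year}.#{t.month}.#{t.day}.#{t.hour}"
  filter = { "host" => host }
  update = { "$inc" => { "total" => 1, path => 1 } }
  [filter, update]
end

# In-memory stand-in for $inc: walk the dotted path, create missing
# nested hashes, and add the amount to the leaf counter.
def apply_inc(doc, update)
  update.fetch("$inc").each do |dotted, amount|
    *parents, leaf = dotted.split(".")
    node = parents.reduce(doc) { |h, key| h[key] ||= {} }
    node[leaf] = node.fetch(leaf, 0) + amount
  end
  doc
end

filter, update = hit_upsert("example.com", Time.utc(2011, 3, 14, 9, 30))

# Simulate two hits in the same hour landing on the same document.
doc = apply_inc(filter.dup, update)
doc = apply_inc(doc, update)

# With the mongo gem (2.x) the same documents would go to the server as:
#   collection.update_one(filter, update, upsert: true)
```

One document per host keeps every hit down to a single write, which is presumably what made the prototype so quick to build.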

MongoDB being fast, fun, easy, and web scale, with its “fire and forget” write behavior, seems to convince everyone to build a new site analytics tool on top of it.

Ignoring for a second insignificant aspects like business models and feature sets, I do have a technical question: how can you scale such a tool? Sharding by site or referrer or user will not work. Sharding by id or timestamp would spread your data all over and make aggregate functions painful. So what’s the plan?
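To make the trade-off concrete, here is a toy model. It is not how MongoDB's chunk-based sharding actually routes data; the shard count, both routing functions, and the data are made up. It only counts how many shards a one-day total for a single site has to read under each key choice:

```ruby
require "zlib"

SHARDS = 4 # hypothetical cluster size

# Hypothetical router: all documents for a host land on one shard.
def shard_by_site(host)
  Zlib.crc32(host) % SHARDS
end

# Hypothetical router: consecutive hours rotate across the shards.
def shard_by_hour(ts)
  (ts / 3600) % SHARDS
end

# 24 hourly hit records for one site (made-up data).
hits = (0...24).map do |hour|
  { host: "example.com", ts: Time.utc(2011, 3, 14, hour).to_i }
end

by_site = hits.map { |h| shard_by_site(h[:host]) }.uniq
by_ts   = hits.map { |h| shard_by_hour(h[:ts]) }.uniq

puts "sharded by site: daily total reads #{by_site.size} shard(s)"
puts "sharded by timestamp: daily total reads #{by_ts.size} shard(s)"
```

In this model, keying by site keeps the aggregate on one shard but also sends every write for a busy site to that same shard, while keying by time spreads the writes and turns every per-site aggregate into a scatter-gather query.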

Original title and link: MongoDB and Site Analytics (NoSQL databases © myNoSQL)