


foursquare: All content tagged as foursquare in NoSQL databases and polyglot persistence

MongoDB in Numbers: Foursquare, Wordnik, Disney

Derrick Harris:

If you’re wondering what kind of performance and scalability requirements forced these companies to MongoDB, and then to customize it so heavily, here are some statistics:

  • Foursquare:
    • 15 million users;
    • 8 production MongoDB clusters;
    • 8 shards of user data;
    • 12 shards of check-in data;
    • ~250 updates per second on user database, with maximum output of 46 MBps;
    • ~80 check-ins per second on check-in database, with maximum output of 45 MBps;
    • up to 2,500 HTTP queries per second.
  • Wordnik:
    • Tens of billions of documents with more always being added;
    • more than 20 million REST API calls per day;
    • mapping layer supports 35,000 records per second.
  • Disney:
    • More than 1,400 MongoDB instances (although “your eyes start watering after 30,” Stevens said);
    • adding new instances every day, via a custom-built self-service portal, to test, stage and host new games.

Add to these Viber Media numbers:

  • 30 million plus registered mobile users
  • 18 million active users talking 11 million minutes every day

I have an exclusive interview with the Viber Media people queued up for the coming days.

Original title and link: MongoDB in Numbers: Foursquare, Wordnik, Disney (NoSQL databases © myNoSQL)


MongoDB at Foursquare: Practical Data Storage

The fine folks at 10gen have granted me access to publish here videos from their Mongo events organized across the US and Europe—thanks Meghan.

I’ve decided to start this series with one high-profile company in the MongoDB portfolio: Foursquare—you’ll understand why if you check the content I’ve published before about Foursquare.

Without further ado, Harry Heymann’s talk: Practical data storage: MongoDB at Foursquare.

Foursquare Setup for MongoDB Replica Sets

Foursquare describes the three different replica set setups for their MongoDB servers:

  1. Straight-up replica set with arbiter node
  2. Replica set with dedicated read slaves and single backup node
  3. Shard on top of replica set with dedicated read slaves
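A rough Python sketch of the config documents behind the first two setups. Host names and set names are invented for illustration; on a real deployment, documents like these would be passed to `rs.initiate()` (or the `replSetInitiate` command) on a live MongoDB node.

```python
# Build replSetInitiate-style config documents for the setups described
# above. All host names are hypothetical.

def replica_set_config(set_name, members):
    """Assemble a replica set config document, numbering members in order."""
    return {
        "_id": set_name,
        "members": [dict(m, _id=i) for i, m in enumerate(members)],
    }

# 1. Straight-up replica set with an arbiter node
basic = replica_set_config("users", [
    {"host": "db1:27017"},
    {"host": "db2:27017"},
    {"host": "db3:27017", "arbiterOnly": True},  # votes in elections, holds no data
])

# 2. Replica set with a dedicated read slave and a single backup node
with_slaves = replica_set_config("checkins", [
    {"host": "db1:27017"},
    {"host": "db2:27017"},
    {"host": "read1:27017", "priority": 0},                     # never becomes primary
    {"host": "backup1:27017", "priority": 0, "hidden": True},   # invisible to clients
])
```

Setup 3 is the same config repeated per shard, with `mongos` routing on top.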

Their conclusion:

If this all looks simple, it’s because it is. We’ve been very pleased with how replica sets have smoothed out the bumps in handling operational emergencies as well as for general maintenance and scaling out. Our next step is to configure our replica sets for data center awareness, which is not fully supported but workable with the latest version of MongoDB.

Original title and link: Foursquare Setup for MongoDB Replica Sets (NoSQL databases © myNoSQL)


Rogue: Type-Safe Scala DSL for MongoDB

Foursquare’s post about Rogue:

Rogue is our newly open-sourced, badass (read: type-safe) library for querying MongoDB from Scala.

Rogue is available on Github.

Original title and link: Rogue: Type-Safe Scala DSL for MongoDB (NoSQL databases © myNoSQL)


Rogue: MongoDB Scala-based Query DSL


Rogue is a type-safe internal Scala DSL for constructing and executing find and modify commands against MongoDB in the Lift web framework. It is fully expressive with respect to the basic options provided by MongoDB’s native query language, but in a type-safe manner, building on the record types specified in your Lift models

Open sourced by Foursquare.
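Rogue itself is a Scala DSL, but the core idea—a fluent builder that assembles a MongoDB query document instead of a hand-written one—can be sketched in a few lines of Python. This loosely mirrors Rogue’s `where`/`eqs` style; it is not Rogue’s actual API, and the field names are invented.

```python
# A minimal fluent query builder, as an analogy to Rogue's approach.
# Rogue additionally enforces, at compile time, that fields and value
# types match the Lift record definition; Python cannot offer that.

class Query:
    def __init__(self, collection):
        self.collection = collection
        self.filters = {}

    def where(self, field, value):
        # eqs-style equality condition on a field
        self.filters[field] = value
        return self

    def to_mongo(self):
        # The document that would be passed to a find() call
        return self.filters

q = Query("venues").where("mayor_id", 1234).where("category", "coffee")
# q.to_mongo() → {"mayor_id": 1234, "category": "coffee"}
```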

Original title and link: Rogue: MongoDB Scala-based Query DSL (NoSQL databases © myNoSQL)

Foursquare MongoDB Outage Post Mortem

We are finally getting some details about how and why MongoDB brought down Foursquare:

On Monday morning, the data on one shard (we’ll call it shard0) finally grew to about 67GB, surpassing the 66GB of RAM on the hosting machine. Whenever data size grows beyond physical RAM, it becomes necessary to read and write to disk, which is orders of magnitude slower than reading and writing RAM. Thus, certain queries started to become very slow, and this caused a backlog that brought the site down.


  • Were writes configured to use MongoDB’s default fire-and-forget behavior? If so, the write going to disk wouldn’t matter much for request processing.
  • If replicas were available, were reads distributed among them?
  • If no replicas were available, what would have been the quickest way to bring up read-only replicas?
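The quoted numbers can be put into a back-of-the-envelope model: once shard0’s ~67 GB of data outgrew the machine’s 66 GB of RAM, a small fraction of every query’s working set had to come from disk. The latency figures below are illustrative assumptions, not measured Foursquare numbers.

```python
# Rough model of why exceeding RAM by ~1 GB caused a collapse rather than
# a mild slowdown. Latencies are order-of-magnitude assumptions.

ram_gb, data_gb = 66, 67
resident_fraction = min(1.0, ram_gb / data_gb)   # share of data servable from RAM
miss_fraction = 1.0 - resident_fraction          # share that must hit disk (~1.5%)

ram_ns, disk_ns = 100, 10_000_000  # assumed ~100 ns RAM access vs ~10 ms disk seek

# Average access cost, weighted by where the data lives
avg_ns = resident_fraction * ram_ns + miss_fraction * disk_ns
slowdown = avg_ns / ram_ns  # vs. the all-in-RAM case
```

Even a ~1.5% miss rate makes the average access roughly three orders of magnitude slower, which is exactly the kind of backlog the post-mortem describes.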

We first attempted to fix the problem by adding a third shard. We brought the third shard up and started migrating chunks. Queries were now being distributed to all three shards, but shard0 continued to hit disk very heavily. When this failed to correct itself, we ultimately discovered that the problem was due to data fragmentation on shard0. In essence, although we had moved 5% of the data from shard0 to the new third shard, the data files, in their fragmented state, still needed the same amount of RAM.


  • Why could the third shard accommodate only 5% of the data?

This can be explained by the fact that Foursquare check-in documents are small (around 300 bytes each), so many of them can fit on a 4KB page. Removing 5% of these just made each page a little more sparse, rather than removing pages altogether.[2]
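The arithmetic behind that explanation: with ~300-byte check-ins packed about 13 to a 4 KB page, removing 5% of documents uniformly just thins each page slightly; absent compaction, no page is freed, so the resident set barely shrinks. The total document count below is an invented illustration.

```python
# Why migrating 5% of tiny documents freed essentially no RAM.

page_bytes, doc_bytes = 4096, 300
docs_per_page = page_bytes // doc_bytes          # ~13 documents fit per page

total_docs = 1_000_000                           # illustrative count
pages_before = -(-total_docs // docs_per_page)   # ceiling division

# Remove 5% of documents spread across the keyspace: every page loses a
# document or two, but (without defragmentation) no page becomes empty.
docs_after = int(total_docs * 0.95)
pages_after = pages_before                       # same pages remain in use

avg_docs_per_page_after = docs_after / pages_after  # pages just get sparser
ram_saved = 1 - pages_after / pages_before          # → 0.0
```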


  • I might be wrong, but it sounds like the problem here is that chunks of data do not use contiguous space. Considering that MongoDB is supposed to work with documents of any size, what solutions are planned to address this issue?

There has been a separate discussion in which Dwight Merriman (10gen) provided ☞ more details:

  • If your shard key has any correlation to insertion order, I think you are ok.
  • If you add new shards very early s.t. the source shard doesn’t have high load, i think you are ok.
  • If your objects are fairly large (say 4KB+), i think you are ok.
  • If the above don’t hold, you will need the defrag enhancement which we will do asap.

The first point above seems to confirm my last comment on this subject: sharding keys.

Since repairing shard0 and adding a third shard, we’ve set up even more shards, and now the check-in data is evenly distributed and there is a good deal of extra capacity.

Given that the shard key is user ids, how exactly can you guarantee even distribution? And as long as bringing up more shards doesn’t immediately address the issue of automatic balancing, wouldn’t it be better to shard on data that grows continuously and can evolve unpredictably?
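To make that question concrete, here is a toy simulation (all numbers invented) of range-sharding on user id when user activity is skewed: document counts balance, but write load does not.

```python
# Range-partition user ids across shards, then apply a skewed check-in
# distribution: a small block of ids is far more active than the rest.

n_users, n_shards = 10_000, 5

def shard_for(user_id):
    # Range-style partitioning on user id, as with a plain (non-hashed) key
    return user_id * n_shards // n_users

# Skewed activity: users 0-499 check in 100x more often than everyone else
checkins_per_user = [100 if u < 500 else 1 for u in range(n_users)]

load = [0] * n_shards
for u, c in enumerate(checkins_per_user):
    load[shard_for(u)] += c

# Every shard holds exactly 2,000 users, yet shard 0 — which happens to
# contain all the hot ids — absorbs the overwhelming share of writes.
```

The shard key controls where documents live, but says nothing about how often they are written, which is the gap the balancing question points at.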

While it’s extremely interesting to hear all these details, and I highly appreciate that the Foursquare and 10gen engineers have decided to share this information (nb: I’ve been trying to convince 10gen to do this myself), I think there are still a few open questions.

Update: ☞ the Hacker News thread

Original title and link: Foursquare MongoDB Outage Post Mortem (NoSQL databases © myNoSQL)


MongoDB Auto-sharding and Foursquare Downtime

Foursquare is one of the companies we’ve previously mentioned as being excited about the MongoDB geo support added in version 1.4. It looks like they’ve also been users of the auto-sharding feature included in the 1.6 series, but it didn’t work so well for them:

Starting around 11:00am EST yesterday, we noticed that one of these shards was performing poorly because a disproportionate share of check-ins were being written to it. For the next hour and a half, until about 12:30pm, we tried various measures to ensure a proper load balance. None of these things worked. As a next step, we introduced a new shard, intending to move some of the data from the overloaded shard to this new one.

There are quite a few questions about what happened with Foursquare’s sharded MongoDB:

  • Why did adding a new shard bring down the whole system?
  • What is the best approach to migrating live data?
  • Why was a shard reindexing needed?
  • Was there any data loss or corruption?
  • Are there ways to predict possible overloads on specific shards?

Once things calm down, I hope to find some answers from Foursquare and 10gen (if you have contacts, please hook me up).

Original title and link: MongoDB Auto-sharding and Foursquare Downtime (NoSQL databases © myNoSQL)