Fun With Numbers: How Much Data Is Facebook Ingesting

A recent GigaOM article provides some interesting data points about how much data Facebook is handling:

  • 2.5 billion content items shared per day
  • 2.7 billion likes per day
  • 300 million photos uploaded per day
  • 500+ terabytes of data ingested per day

The numbers above do not include any details about how many data points Facebook is collecting for analytic purposes. I don’t think I’d be far off in assuming that number is a sizable multiple of the figures above. We’ll go with a multiplier of 10 to keep things simple.

A couple of days ago, James Hamilton posted an analysis of Facebook’s Carbon and Energy Impact:

Using the Facebook PUE number of 1.07, we know they are delivering 54.27MW to the IT load (servers and storage). We don’t know the average server draw at Facebook but they have excellent server designs (see Open Compute Server Design) so they likely average at or below 300W per server. Since 300W is an estimate, let’s also look at 250W and 350W per server:

  • 250W/server: 217,080 servers
  • 300W/server: 180,900 servers
  • 350W/server: 155,057 servers
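
The server counts above follow from dividing the IT load by an assumed average draw per server. Here is a minimal Python sketch of that arithmetic. The function and constant names are my own labels; only the 54.27MW figure and the wattages come from the quoted analysis.

```python
# IT load delivered to servers and storage, per the quoted analysis (54.27 MW).
IT_LOAD_WATTS = 54_270_000


def estimated_servers(watts_per_server: int) -> int:
    """Whole servers the IT load can power at a given average draw."""
    return IT_LOAD_WATTS // watts_per_server


if __name__ == "__main__":
    # The three per-server draws considered in the list above.
    for watts in (250, 300, 350):
        print(f"{watts}W/server: {estimated_servers(watts):,} servers")
```

Floor division drops the fractional server, which matches the rounded figures in the list.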

It’s difficult to determine how many of the 180k servers are databases, but assuming a 1:10 ratio of database servers to front-end and cache servers, that gives us roughly 18k database servers ingesting 500+ terabytes of data per day through a guesstimated 50+ billion calls (the roughly 5.5 billion daily items listed above, times the multiplier of 10).
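
For anyone who wants to play with these guesses, here is a rough Python sketch of the estimate. Everything in it is an assumption layered on the figures above: the names are mine, the 1:10 ratio is treated as one server in ten, and the per-server rate assumes calls are spread evenly across database servers and across the day, which is certainly not how real traffic behaves.

```python
# Daily activity figures from the GigaOM data points.
CONTENT_ITEMS = 2_500_000_000
LIKES = 2_700_000_000
PHOTOS = 300_000_000

ANALYTICS_MULTIPLIER = 10  # guessed multiplier for analytics data points
TOTAL_SERVERS = 180_900    # the 300W/server estimate
DB_SHARE = 10              # assumption: one server in ten is a database
SECONDS_PER_DAY = 86_400


def database_servers() -> int:
    """Approximate number of database servers under the one-in-ten guess."""
    return TOTAL_SERVERS // DB_SHARE


def daily_calls() -> int:
    """Guesstimated calls per day: user-visible items times the multiplier."""
    return (CONTENT_ITEMS + LIKES + PHOTOS) * ANALYTICS_MULTIPLIER


def calls_per_db_server_per_second() -> float:
    """Average call rate per database server, assuming a perfectly even spread."""
    return daily_calls() / database_servers() / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"database servers: {database_servers():,}")
    print(f"calls per day: {daily_calls():,}")
    print(f"calls per db server per second: {calls_per_db_server_per_second():.1f}")
```

Changing the multiplier or the database share shows how sensitive the result is to those two guesses.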

There’s also something that confuses me about these numbers. If Facebook is getting 300 million photo uploads per day and ingests 500+ terabytes per day, that could mean either 1) the average photo size is very low; or 2) Facebook doesn’t count photos when reporting the ingested data size.
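
A quick way to see the puzzle is to compute the largest average photo size the numbers allow. The Python sketch below assumes decimal terabytes (10^12 bytes) and, as a deliberately extreme case, that every ingested byte is photo data. Both are my assumptions, not something the GigaOM figures state.

```python
# 500 TB per day, assuming decimal terabytes (10**12 bytes each).
INGESTED_BYTES_PER_DAY = 500 * 10**12
PHOTOS_PER_DAY = 300_000_000


def max_average_photo_bytes() -> float:
    """Upper bound on average photo size if all ingested data were photos."""
    return INGESTED_BYTES_PER_DAY / PHOTOS_PER_DAY


if __name__ == "__main__":
    size_mb = max_average_photo_bytes() / 1_000_000
    print(f"at most {size_mb:.2f} MB per photo")
```

Under these assumptions the ceiling is about 1.67 MB per photo, and it drops further once any non-photo data is counted toward the 500 terabytes.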

Original title and link: Fun With Numbers: How Much Data Is Facebook Ingesting (NoSQL database©myNoSQL)