


Automating MySQL Backups at Facebook Scale

Eric Barrett (Facebook) describes the process used for backing up Facebook’s MySQL cluster [1]:

Backups are not the most glamorous type of engineering. They are technical, repetitive, and when everything works, nobody notices. They are also cross-discipline, requiring systems, network, and software expertise from multiple teams. But ensuring your memories and connections are safe is incredibly important, and at the end of the day, incredibly rewarding.

If you wanted to make it sound simple, you could just enumerate the steps:

  1. Binary logs and mysqldump
  2. Hadoop DFS
  3. Long-term storage

Then start asking how you’d accomplish this: with one server, with more servers, and with more servers while maintaining the availability of the system. See how far you can get in answering these questions, at least in theory.
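To make the three steps concrete, here is a minimal sketch of how one might plan them per server and then for a fleet. It only builds the command lines and runs nothing. It is not Facebook's tooling: the `Step` type, the `plan_backup` and `plan_fleet` functions, the host names, and every path are hypothetical, and the long-term storage stage is left empty because the post does not say what tool is used for it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    stage: str       # which of the enumerated stages this belongs to
    host: str        # database server the step applies to
    argv: List[str]  # command line to run; empty when the tool is unspecified


def plan_backup(host: str, day: str, first_binlog: str = "binlog.000001") -> List[Step]:
    """Plan the backup stages for a single server (paths are assumptions)."""
    dump_file = f"/backups/{host}/{day}.sql"   # hypothetical local staging path
    hdfs_dir = f"/mysql-backups/{day}/{host}/"  # hypothetical HDFS layout
    return [
        # Stage 1a: stream binary logs from the server as raw files.
        Step("binlog", host, [
            "mysqlbinlog", "--read-from-remote-server", f"--host={host}",
            "--raw", "--stop-never", first_binlog,
        ]),
        # Stage 1b: consistent logical dump of all databases.
        Step("mysqldump", host, [
            "mysqldump", f"--host={host}", "--single-transaction",
            "--all-databases", f"--result-file={dump_file}",
        ]),
        # Stage 2: copy the dump into Hadoop DFS.
        Step("hdfs", host, ["hdfs", "dfs", "-put", dump_file, hdfs_dir]),
        # Stage 3: long-term storage; the tool is not named in the post.
        Step("long-term", host, []),
    ]


def plan_fleet(hosts: List[str], day: str) -> List[Step]:
    """Going from one server to many is, naively, just a loop."""
    return [step for host in hosts for step in plan_backup(host, day)]


if __name__ == "__main__":
    for step in plan_fleet(["db1", "db2"], "2013-01-01"):
        print(step.stage, step.host, " ".join(step.argv))
```

The loop in `plan_fleet` is exactly where the hard questions start: scheduling thousands of these without hurting availability, retrying failures, and verifying that the dumps can actually be restored are all outside this sketch.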

  1. As a side note, in “Fun with numbers: How much data is Facebook ingesting”, I guesstimated the number of MySQL servers to be in the 20k range. This post mentions “thousands of database servers in multiple regions”.
