Big and Small Data at Twitter: MySQL CE 2011

Jeremy Cole, DBA Lead at Twitter, gave a talk about MySQL at Twitter at MySQL CE 2011.

Roland Bouman had some interesting notes (nb: actually tweets) from the talk:

  • 115 mln tweets a day, 1 bln tweets a week, about 50,000 new accounts a day

  • A random server: uptime 212 days, 127 bln questions (6943/s), rows read: 1.36 mln/s

  • Use MySQL when it works, something else when not - fortunately MySQL often does work

  • MySQL is used by Twitter because it’s robust, replication works, and it’s easy to use and run

  • MySQL doesn’t work well for graphs or auto_increment; replication lag is a problem

  • MySQL replication improvements such as a crash-safe, multi-threaded slave are exactly what they need

  • Twitter open-sourced Snowflake (ID generation system) and Gizzard (distributed data storage)

  • Use soft launches: new code is launched in a disabled state, turn up slowly, back down if needed

  • Gizzard builds on MySQL/InnoDB and handles sharding, replication, and job scheduling

  • Twitter also uses Cassandra for some projects: high-velocity writes, schemaless design

  • Twitter uses Hadoop for analyzing extremely large datasets: 10 to 100 bln rows (HTTP logs)

  • Twitter also uses Vertica for analysis: 100 mln to 10 bln rows. Runs 100x faster than MySQL

  • MySQL’s happy place: <= 1.5 TB datasets, archive store for larger sets.
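The notes mention that auto_increment is a pain point and that Twitter open sourced an ID generation system. As a rough illustration of why a dedicated generator helps, here is a minimal Python sketch of a Snowflake-style ID. It is not Twitter's code: the bit layout (41 bits of milliseconds since a custom epoch, 10 bits of worker id, 12 bits of sequence), the epoch value, and the names `IdGenerator`, `next_id` and `decode` are all assumptions made for this example.

```python
import threading
import time

# Assumed layout (not stated in the talk notes):
# 41 bits of milliseconds since a custom epoch | 10 bits worker id | 12 bits sequence
EPOCH_MS = 1288834974657  # assumed custom epoch, in Unix milliseconds
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class IdGenerator:
    """Hypothetical Snowflake-style generator; every name here is illustrative."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.clock = clock          # injectable so the logic can be tested
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                # Refuse to hand out ids that could collide with earlier ones.
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & MAX_SEQUENCE
                if self.sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next.
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.sequence = 0
            self.last_ms = now
            timestamp_part = (now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS)
            worker_part = self.worker_id << SEQUENCE_BITS
            return timestamp_part | worker_part | self.sequence


def decode(id_value):
    """Split an id back into (unix_ms, worker_id, sequence)."""
    sequence = id_value & MAX_SEQUENCE
    worker_id = (id_value >> SEQUENCE_BITS) & MAX_WORKER_ID
    unix_ms = (id_value >> (WORKER_BITS + SEQUENCE_BITS)) + EPOCH_MS
    return unix_ms, worker_id, sequence


if __name__ == "__main__":
    gen = IdGenerator(worker_id=7)
    new_id = gen.next_id()
    print(new_id, decode(new_id))
```

The point of the design is that each worker can mint roughly time-ordered, unique 64-bit ids without asking a central database for the next counter value, which is the coordination that auto_increment requires once data is sharded.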

Original title and link: Big and Small Data at Twitter: MySQL CE 2011 (NoSQL databases © myNoSQL)