
Learning NoSQL from Twitter’s Experience

Leaving aside the tons of NoSQL Twitter applications (and if that is not enough, here are more NoSQL-based Twitter apps, and even more), Twitter itself seems to be having a lot of fun in the NoSQL space, where "fun" should be read as work and innovation.

It all started with the problem of handling big data in real time. Nick Kallen’s (@nk) slides below explain the problems Twitter faced and how it tackled them:

Then it was time to consider Cassandra at Twitter:

We have a lot of data, the growth factor in that data is huge and the rate of growth is accelerating. We have a system in place based on shared mysql + memcache but it’s quickly becoming prohibitively costly (in terms of manpower) to operate. We need a system that can grow in a more automated fashion and be highly available.

and to scale Twitter with Cassandra (Ryan King’s (@rk) presentation):

But storing data is not enough, and Twitter had to put the NoSQL data to work. For that, Twitter is using Hadoop, Pig and HBase, as “Cassandra is OLTP and HBase is OLAP”. The slides from Kevin Weil (@kevinweil), presented at nosql:eu, and from Dmitriy Ryaboy (@squarecog) give a lot of detail about the HBase, Hadoop and Pig usage:

That’s a ton to learn from NoSQL at Twitter!