A new great article from Todd Hoff dissecting the DataSift architecture:
In terms of data store, DataSift architecture includes:
- MySQL (Percona server) on SSD drives
- HBase cluster (currently, ~30 hadoop nodes, 400TB of storage)
- Memcached (cache)
- Redis (still used for some internal queues, but probably going to be dismissed soon)
Leave whatever you were doing and go read it now.
Original title and link: DataSift Using MySQL, HBase, Memcached to Deal With Twitter Firehose ( ©myNoSQL)