ALL COVERED TOPICS

NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter

NAVIGATE MAIN CATEGORIES

Close

Cassandra, Zookeeper, Scribe, and Node.js Powering Rackspace Cloud Monitoring

Paul Querna describes the original architecture of Cloudkick and the one that powers the recently announced Rackspace Cloud Monitoring service:

Development framework: from Twisted Python and Django to Node.js

Cloudkick was primarily written in Python. Most backend services were written in Twisted Python. The API endpoints and web server were written in Django, and used mod_wsgi. […] Cloud Monitoring is primarily written in Node.js.

Storage: from master-slave MySQL to Cassandra

Cloudkick was reliant upon a MySQL master and slaves for most of its configuration storage. This severely limited both scalability, performance and multi-region durability. These issues aren’t necessarily a property of MySQL, but Cloudkick’s use of the Django ORM made it very difficult to use MySQL radically differently. The use of MySQL was not continued in Cloud Monitoring, where metadata is stored in Apache Cassandra.

Even more Cassandra:

Cloudkick used Apache Cassandra primarily for metrics storage. This was a key element in keeping up with metrics processing, and providing a high quality user experience, with fast loading graphs. Cassandra’s role was expanded in Cloud Monitoring to include both configuration data and metrics storage.

Event processing: from RabbitMQ to Zookeeper and a bit more Cassandra

RabbitMQ is not used by Cloud Monitoring. Its use cases are being filled by a combination of Apache Zookeeper, point to point REST or Thrift APIs, state storage in Cassandra and changes in architecture.

And finally Scribe:

Cloudkick used an internal fork of Facebook’s Scribe for transporting certain types of high volume messages and data. Scribe’s simple configuration model and API made it easy to extend for our bulk messaging needs. Cloudkick extended Scribe to include a write ahead journal and other features to improve durability. Cloud Monitoring continues to use Scribe for some of our event processing flows.

Original title and link: Cassandra, Zookeeper, Scribe, and Node.js Powering Rackspace Cloud Monitoring (NoSQL database©myNoSQL)

via: http://journal.paul.querna.org/articles/2011/12/17/technology-cloud-monitoring/