We use multiple NoSQL solutions because each one is best suited for a specific set of use cases. For example, HBase is naturally integrated with the Hadoop platform, whereas Cassandra is best for cross-regional deployments and scaling with no single points of failure. Adopting the non-relational model in general is not easy, and Netflix has been paying a steep pioneer tax while integrating these rapidly evolving and still maturing NoSQL products. There is a learning curve and an operational overhead. Still, the scalability, availability and performance advantages of the NoSQL persistence model are evident and already paying for themselves, and they will be central to our long-term cloud strategy.
Summarizing the pros for each of the 3 solutions:
Amazon SimpleDB Pros
- highly durable, writes spanning multiple availability zones
- handy query and data formats
- batch operations
- consistent reads
- hosted solution
Hadoop/HBase Pros
- dynamic partitioning model
- built-in support for compression
- range queries
- support for distributed counters
- strong consistency
- interoperability with Hadoop
Cassandra Pros
- no dedicated name nodes
- no practical architectural limitations on data sizes, row/column counts, etc.
- flexible data model
- no underlying storage format requirements like HDFS
- uniquely flexible consistency and replication models
- cross-datacenter and cross-regional replication
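To make the "flexible consistency" item a bit more concrete: Cassandra lets each read and write choose a consistency level such as ONE, QUORUM, or ALL, and a read is guaranteed to overlap the latest acknowledged write when the read and write replica counts together exceed the replication factor. The sketch below only illustrates that arithmetic. The function names are hypothetical, it talks to no database, and it says nothing about the settings Netflix actually uses.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must acknowledge an operation at the given level.

    Only the three classic levels are modeled here; this is an
    illustrative helper, not part of any Cassandra client library.
    """
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # A majority of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


def reads_see_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when every read set must intersect every write set (R + W > N)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    rf = 3  # assumed replication factor, chosen for illustration
    for write_level, read_level in [("ONE", "ONE"),
                                    ("QUORUM", "QUORUM"),
                                    ("ALL", "ONE")]:
        overlap = reads_see_latest_write(write_level, read_level, rf)
        print(f"write={write_level} read={read_level} overlap={overlap}")
```

Lower levels trade that overlap guarantee for latency and availability, which is the kind of per-operation choice the list item is pointing at.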
I hope the next post will be about the “small” issues Netflix ran into when adopting each of these systems. In the past they’ve shared some of the challenges of an Oracle - Amazon SimpleDB hybrid solution.
Original title and link: Why Netflix Picked Amazon SimpleDB, Hadoop/HBase, and Cassandra (NoSQL databases © myNoSQL)