NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter



HBase on EC2 using EBS volumes : Lessons Learned

There lies the answer! We have a requirement of recreating the cluster in case we accidentally delete entire data or if we loose our master. In such a case the reliable backup can only be taken if your HDFS data does not reside on the root devices. A reliable backup of the root device cannot be taken without rebooting the device. Furthermore it’s stored as an AMI which mean you have to create a new AMI every day and delete the old one. This means to solve all of our problems we need HBase installation and data both stored on attached EBS volumes that are not the root devices.

Update: after reading the post both Bradford Stephens[1] and Andrew Purtell[2] recommended using instance store instead of EBS:

EBS adds complexity, failure risk, and cost

  1. CEO of Drawn to Scale  

  2. Systems architect and HBase committer, @akpurtell  

Original title and link: HBase on EC2 using EBS volumes : Lessons Learned (NoSQL databases © myNoSQL)