Your Hadoop in Amazon's Cloud
Adam Horwich of MetaBroadcast shares their experience of running a Hadoop cluster on Amazon, taking advantage of availability zones, spot instances, and other tricks:
Oh Hadoop, how you infuriate me with your spurious failures and endless bugs, but how fantastic you can actually be when it comes down to it. I’ve been fighting with Hadoop a lot this past year, from a Region Server domino apocalypse, to the seemingly impossible job of duplicating a cluster. […] But to make the most of what you’ve got, I’ve been researching better ways of using resources available. There’s, of course, always been the option of using Amazon’s EMR service, but we originally built our cluster before that existed as a product, and have built our services around a standardised Hadoop cluster, with local DataNodes. This blog post will be about adding in some nice EMR style features to your dedicated Hadoop cluster running in AWS.
Original title and link: Your Hadoop in Amazon’s Cloud (©myNoSQL)