Automating Hadoop/HBase deployments with Puppet
The guys from the Adobe SaaS team, the same ones who shared with us their experience and reasons for using HBase, have ☞ open sourced their Puppet[1] recipes for automating Hadoop/HBase deployments.
Right now we are open-sourcing on GitHub, Puppet recipes for:
- creating the user under which the entire hstack runs.
- changing system settings, like the ssh keys, authorizing machines to talk to each other, aliases for hadoop and hbase executables, /tmp rules.
- standalone puppet module to deploy Hadoop
- standalone puppet module to configure the Hadoop NameNode in High-Availability mode via DRBD, heartbeat and mon. For more details on this recipe check out the Cloudera blog post on this topic.
- standalone puppet module to deploy HBase
- standalone puppet module to deploy ZooKeeper.
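To give a feel for what the first two items in that list involve, here is a minimal sketch of a Puppet manifest that creates a service user and authorizes an ssh key for it. This is not taken from the Adobe recipes: the class name, the `hstack` user name, the home directory and the key value are all placeholders I made up for illustration. It uses only the `user` and `ssh_authorized_key` resource types (on recent Puppet versions, as far as I know, the latter ships in a separate module rather than in core).

```puppet
# Hypothetical sketch, NOT the Adobe recipes: every name and value is a placeholder.
class hstack_user_sketch {

  # The account under which the Hadoop/HBase stack would run (assumed name).
  user { 'hstack':
    ensure     => present,
    home       => '/home/hstack',
    managehome => true,
    shell      => '/bin/bash',
  }

  # Authorize a public key so that cluster machines can ssh to each other
  # as this user. Replace the key value with a real public key.
  ssh_authorized_key { 'hstack-cluster-key':
    ensure  => present,
    user    => 'hstack',
    type    => 'ssh-rsa',
    key     => 'REPLACE_WITH_PUBLIC_KEY',
    require => User['hstack'],
  }
}
```

Applying something like this to every node would give you the same account and the same authorized key everywhere, which is the kind of groundwork the Hadoop and HBase modules can then build on.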
Their ☞ announcement goes into a lot of detail about why they created these recipes and how to use them (N.B. it would be excellent if the ☞ GitHub project pointed back to the article as part of the documentation).
To get an idea of how complex this process can be, check out the HBase/Hadoop MacOS Installation Guide. I’d say these recipes will definitely make things a lot easier!