VMware Project Serengeti: Virtualization-Friendly Hadoop
Serengeti is an open source project initiated by VMware to enable the rapid deployment of an Apache Hadoop cluster (HDFS, MapReduce, Pig, Hive, and so on) on a virtual platform.
Serengeti 0.5 currently supports vSphere, with the ability to support other platforms. The project is at an early stage, but it is endorsed by all the major Hadoop distribution vendors, including Cloudera, Greenplum, Hortonworks and MapR.
The Hadoop wiki has a page dedicated to running Hadoop in a virtual environment. There is also a recent post by Steve Loughran about the pros and cons of Hadoop in the cloud, and a paper authored by VMware about virtualizing Apache Hadoop (pdf).
Original title and link: VMWare Project Serengeti: Virtualization-Friendly Hadoop (©myNoSQL)