virtualization: All content tagged as virtualization in NoSQL databases and polyglot persistence
Serengeti is an open source project initiated by VMware to enable the rapid deployment of an Apache Hadoop cluster (HDFS, MapReduce, Pig, Hive, etc.) on a virtual platform.
Serengeti 0.5 currently supports vSphere, with the ability to support other platforms. The project is at an early stage, and is endorsed by all major Hadoop distributions including Cloudera, Greenplum, Hortonworks and MapR.
The Hadoop wiki has a page dedicated to running Hadoop in a virtual environment. There’s also a recent post by Steve Loughran about the pros and cons of Hadoop in the cloud, and a paper authored by VMware about virtualizing Apache Hadoop (pdf).
Original title and link: VMWare Project Serengeti: Virtualization-Friendly Hadoop ( ©myNoSQL)
James Phillips of NorthScale talks about scaling out with Membase on VMware (the real interview starts at around 1’35”):
Considering Membase is persisting to disk (as opposed to its little brother memcached, which is memory only), I’m wondering if virtualized environments provide good enough IO.
- Like many other DBMSs, Membase keeps “hot data” in memory, but it also writes it to disk for durability.
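To make the "hot data in memory, written to disk for durability" idea concrete, here is a minimal Python sketch of a toy key-value store that serves reads from a dictionary and appends every write to a log file. It is only an illustration of the general pattern, not Membase's actual persistence mechanism; the class `WriteThroughStore`, the JSON-lines log format, and the `demo` helper are all hypothetical names chosen for this example.

```python
import json
import os
import tempfile


class WriteThroughStore:
    """Toy store: reads come from memory, writes also go to an on-disk log.

    Hypothetical illustration only; this is not how Membase is implemented.
    """

    def __init__(self, log_path):
        self.log_path = log_path
        self.hot = {}  # in-memory "hot data"
        # Rebuild the in-memory view by replaying the log, if one exists.
        if os.path.exists(log_path):
            with open(log_path, "r", encoding="utf-8") as f:
                for line in f:
                    record = json.loads(line)
                    self.hot[record["key"]] = record["value"]

    def set(self, key, value):
        # Append the write to the log and force it to disk before
        # updating memory. The fsync call is the step whose latency
        # depends on the IO the (possibly virtualized) host provides.
        with open(self.log_path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.hot[key] = value

    def get(self, key, default=None):
        # Reads never touch the disk.
        return self.hot.get(key, default)


def demo():
    """Write two values, simulate a restart, and read the key back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "store.log")
        store = WriteThroughStore(path)
        store.set("user:1", "alice")
        store.set("user:1", "bob")
        # A fresh instance stands in for a process restart.
        recovered = WriteThroughStore(path)
        return recovered.get("user:1")
```

Calling `demo()` returns the last value written for the key, recovered purely from the log. Reads stay fast regardless of the disk, but every `set` waits on `os.fsync`, which is why disk IO in a virtualized environment matters for a store that persists data.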