Amazon Elastic MapReduce Upgrades Hadoop, Hive and Pig
Amazon has upgraded its set of tools for working with NoSQL data (and more):
Customers can now take advantage of improved Hadoop performance and the following new features:
- Multiple inputs class for reading multiple types of data.
- Multiple outputs class for writing multiple types of data.
- ChainMapper and ChainReducer, which allow users to perform M+RM* (one or more mappers, then a reducer, then zero or more mappers) within one Hadoop job. Previously customers could only run one mapper and one reducer per job.
- Skip bad records in the dataset that cause jobs to fail. This allows a job to complete even if some records in a dataset are erroneous.
- JVM reuse across task boundaries to increase performance when processing small files.
- Support for bzip2 compression.
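
To make the M+RM* notation from the ChainMapper/ChainReducer item concrete, here is a minimal in-memory sketch in plain Python. It does not use the Hadoop API. The names run_chain, tokenize, lowercase, sum_counts and keep_frequent are hypothetical and chosen only for illustration, and the sample input is made up. It only shows the shape of the pipeline: a chain of mappers, a single reduce step, then optional mappers applied to the reducer output.

```python
from collections import defaultdict


def run_chain(records, mappers, reducer, post_mappers=()):
    """Simulate an M+RM* pipeline over (key, value) pairs in memory.

    mappers      -- one or more map functions run in sequence (the M+ part)
    reducer      -- a single reduce function (the R part)
    post_mappers -- zero or more map functions run on reducer output (the M* part)
    """
    if not mappers:
        raise ValueError("M+ requires at least one mapper")

    pairs = list(records)
    # M+: each mapper consumes the output of the previous one.
    for mapper in mappers:
        pairs = [out for key, value in pairs for out in mapper(key, value)]

    # Shuffle: group values by key, as happens between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)

    # R: reduce each key's values, visiting keys in sorted order.
    reduced = [out for key in sorted(groups) for out in reducer(key, groups[key])]

    # M*: optional mappers applied to the reducer output.
    for mapper in post_mappers:
        reduced = [out for key, value in reduced for out in mapper(key, value)]
    return reduced


def tokenize(_offset, line):
    # First mapper: split a line into (word, 1) pairs.
    for word in line.split():
        yield word, 1


def lowercase(word, count):
    # Second mapper: normalize the key.
    yield word.lower(), count


def sum_counts(word, counts):
    # Reducer: total the counts for each word.
    yield word, sum(counts)


def keep_frequent(word, total):
    # Post-reduce mapper: keep only words seen at least twice.
    if total >= 2:
        yield word, total


lines = [(0, "Pig Hive pig"), (1, "hive Hadoop PIG")]
result = run_chain(lines, [tokenize, lowercase], sum_counts, [keep_frequent])
print(result)  # [('hive', 2), ('pig', 3)]
```

In this sketch, two mappers, one reducer and one post-reduce mapper run as a single pipeline, which is the kind of work that would otherwise have been split across separate jobs.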
via: http://developer.amazonwebservices.com/connect/ann.jspa?annID=697