Amazon Elastic MapReduce Upgrades Hadoop, Hive and Pig

Amazon has upgraded the Hadoop, Hive, and Pig tooling in Elastic MapReduce, which is useful for NoSQL data and beyond:

Customers can now take advantage of improved Hadoop performance and the following new features:

  • Multiple inputs class for reading multiple types of data.
  • Multiple outputs class for writing multiple types of data.
  • ChainMapper and ChainReducer, which allow users to perform M+RM* (one or more mappers, then a reducer, then zero or more mappers) within one Hadoop job. Previously customers could only run one mapper and one reducer per job.
  • Skip bad records in the dataset that cause jobs to fail. This allows a job to complete even if some records in a dataset are erroneous.
  • JVM reuse across task boundaries to increase performance when processing small files.
  • Support for bzip2 compression.
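
To make the M+RM* shape from the ChainMapper/ChainReducer item a bit more concrete, here is a minimal in-memory sketch in plain Python. It is not Hadoop's API: `run_chain` and the sample mappers and reducer are hypothetical names invented for illustration, and a real job would wire the stages together through Hadoop's own classes instead.

```python
from collections import defaultdict


def run_chain(records, pre_mappers, reducer, post_mappers):
    """Simulate an M+RM* pipeline in memory (illustrative, not Hadoop's API)."""
    if not pre_mappers:
        raise ValueError("M+ needs at least one mapper before the reducer")

    # M+: pass every (key, value) pair through each pre-reduce mapper in order.
    stream = list(records)
    for mapper in pre_mappers:
        stream = [out for key, value in stream for out in mapper(key, value)]

    # Shuffle: group the mapped values by key, as the framework would.
    groups = defaultdict(list)
    for key, value in stream:
        groups[key].append(value)

    # R: a single reducer sees each key once, with all of its values.
    stream = [out for key in sorted(groups) for out in reducer(key, groups[key])]

    # M*: zero or more mappers post-process the reducer output.
    for mapper in post_mappers:
        stream = [out for key, value in stream for out in mapper(key, value)]
    return stream


# Hypothetical stages for a small word-count style job.
def lowercase(key, value):
    yield key.lower(), value


def drop_empty(key, value):
    if key:  # discard records with an empty key
        yield key, value


def sum_reducer(key, values):
    yield key, sum(values)


def only_frequent(key, value):
    if value >= 2:  # keep keys seen at least twice
        yield key, value


records = [("Pig", 1), ("hive", 1), ("", 1), ("pig", 1), ("Hive", 1), ("sqoop", 1)]
result = run_chain(records, [lowercase, drop_empty], sum_reducer, [only_frequent])
print(result)
```

Here two mappers run before the reducer and one after it, all inside a single logical job. Without chaining, the same pipeline would need separate jobs, each writing its intermediate output to storage.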

via: http://developer.amazonwebservices.com/connect/ann.jspa?annID=697