Apache Pig 0.8: What is New

Dmitriy Ryaboy[1] has a guest post on the Cloudera blog covering the new features in Apache Pig 0.8:


  • Support for user-defined functions (UDFs) written in scripting languages
  • Generic UDFs: allow invocation of static Java methods
  • PigUnit: as the name suggests, a testing tool for Pig scripts
  • PigStats: again, the name hints at what it does: better visibility into Pig jobs through a series of statistics, XML-based metadata injected into Map-Reduce jobs, and listeners for the Pig process
  • Scalar values: simplifying access to single-row relations
  • Option to start a monitoring thread for long-running executions
  • HBaseStorage: works with HBase 0.20 releases only
  • Data flows can include custom Map-Reduce jobs
  • Automatic merging of small files
  • Custom partitioners
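
To make the first item in the list more concrete, here is a minimal sketch of what a scripting-language UDF might look like in Python. The function name normalize_word and its schema string are hypothetical, chosen only for illustration. The outputSchema decorator is assumed to be supplied by Pig's Jython runtime; the fallback definition is an addition of this sketch so the file can also be run and tested under a plain Python interpreter.

```python
# Sketch of a Python UDF for Pig. Assumption: when the script is loaded by
# Pig's Jython engine, a decorator named outputSchema is already available.
# The fallback below only exists so the file runs under plain Python too.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        """Stand-in decorator: records the schema string on the function."""
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("word:chararray")
def normalize_word(word):
    """Trim and lower-case a single chararray field; pass nulls through."""
    if word is None:
        return None
    return word.strip().lower()


if __name__ == "__main__":
    # Quick local check outside of Pig.
    print(normalize_word("  HBase "))
```

Assuming the file is saved as udfs.py (a hypothetical name), a Pig script would typically register it with a statement along the lines of REGISTER 'udfs.py' USING jython AS myudfs; and then call myudfs.normalize_word(word) inside a FOREACH ... GENERATE. Check the Pig documentation for the exact syntax supported by your release.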

The Pig 0.8 release includes a large number of bug fixes and optimizations, but at the core it is a feature release. It’s been in the works for almost a full year and the amount of time spent on 0.8 really shows.

You can also check out Dmitriy’s presentations about the NoSQL ecosystem at Twitter: “Twitter, Pig, and HBase” and “HBase and Pig: The Hadoop ecosystem at Twitter”.

  1. Dmitriy Ryaboy: Twitter engineer, @squarecog  

Original title and link: Apache Pig 0.8: What is New (NoSQL databases © myNoSQL)