Presentations on Hadoop, HBase, Pig and Cascalog from Hadoop Meet-Up
The Yahoo! Developer Network Blog has ☞ posted the materials presented at the monthly Hadoop user group meeting. I’ve embedded these below for your convenience:
What’s New With Pig: Alan Gates
Pig is one of the solutions used for data processing and analysis in the NoSQL world. For example, Pig is heavily used at Twitter.
The Pig project recently released ☞ two new versions (0.6.0 and 0.7.0). This talk focuses on the new features in these versions and on a plan for compatibility with Hadoop.[1]
Cascalog: Powerful and easy-to-use data analysis tool for Hadoop: Nathan Marz
Cascalog is a Clojure-based query language for analyzing data stored in Hadoop. Nathan Marz (BackType) demos this cool tool:
HBase and Pig: The Hadoop ecosystem at Twitter: Dmitriy Ryaboy
As already mentioned, Twitter uses HBase, Pig, and Hadoop extensively (in their words, Cassandra is OLTP and HBase is OLAP), and Dmitriy provides an overview of their Hadoop-based ecosystem:
References
- [1] The Yahoo! Developer Network Blog has an article on this topic: ☞ Towards Enterprise-Class Compatibility for Apache Hadoop. Considering that HBase became a top-level Apache project after its last release, and that both HBase and Hadoop have a very strong user base, ensuring a healthy ecosystem for all these projects is extremely important.