An Overview of Cascading

Earlier today I posted Dean Wampler’s video Overview of Scalding. Scalding is a Scala API on top of Cascading [1]. Below you can find the video and slides from Paco Nathan’s Cascading presentation at the Chicago Hadoop User Group:

In this video he introduces Cascading, then examines the concept of a “workflow” as an abstraction for integrating Hadoop with other systems. He also shows new features, including support for SQL-92 and PMML, plus an application manager.

✚ Leaving aside the Java vs. Scala part, I’m still not sure I see any major advantage of one of these libraries over the other, besides tighter integration with an existing environment.

  1. Cascading: an application framework for Java developers to quickly and easily develop robust data analytics and data management applications on Apache Hadoop. 

