Scalding: All content tagged as Scalding in NoSQL databases and polyglot persistence
Earlier today I posted Dean Wampler’s video Overview of Scalding. Scalding is a Scala API on top of Cascading. Below you can find the video and slides from Paco Nathan’s Cascading presentation at the Chicago Hadoop User Group:
In this video he introduces Cascading, then examines the concept of a “workflow” as an abstraction for integrating Hadoop with other systems. He also shows new features, including support for SQL-92 and PMML, plus an application manager.
✚ Leaving aside the Java vs. Scala part, I’m still not sure I see any major advantage of any of these libraries over the others, besides tighter integration with an existing environment.
Original title and link: An Overview of Cascading ( ©myNoSQL)
“There’s no better way to write general-purpose Hadoop MapReduce programs when specialized tools like Hive and Pig aren’t quite what you need.”
Watch the video and slides below.
✚ At Twitter, where Scalding was created, different teams use different libraries for different scenarios.
✚ Dean Wampler is the co-author of the book Programming Scala, so his preference for Scala is understandable.
✚ Do you know any other teams or companies using Scalding instead of Cascading or Cascalog?
Original title and link: An Overview of Scalding ( ©myNoSQL)
After posting Impressions About Hive, Pig, Scalding, Scoobi, Scrunch, Spark, I found myself wondering why so many of these libraries are built in Scala and what their main purpose is. A day later, I found Age Mooij’s presentation about Scoobi and Scalding, which provides an answer to my question, plus a quick intro to Scoobi and Scalding. Check the slides after the break.