Wednesday, 30 January 2013
DataFu: A Collection of Pig UDFs for Data Analysis on Hadoop by LinkedIn
Sam Shah, in a guest post on the Hortonworks blog:
If Pig is the “duct tape for big data”, then DataFu is the WD-40. Or something. […] Over the years, we developed several routines that were used across LinkedIn and were thrown together into an internal package we affectionately called “littlepiggy.”
“A penetrating oil and water-displacing spray”? “littlepiggy”? Seriously?
How could one come up with these names for such a useful library of statistical functions, PageRank, and set and bag operations?
Original title and link: DataFu: A Collection of Pig UDFs for Data Analysis on Hadoop by LinkedIn (©myNoSQL)