Phil Simon (Wired) covers some details of Netflix’s Hadoop Summit talk “Big Data Platform as a Service @ Netflix” (alternatively titled “Watching Pigs Fly with the Netflix Hadoop Toolkit”):
At Netflix, comparing the hues of similar pictures isn’t a one-time
experiment conducted by an employee with far too much time on his hands.
It’s a regular occurrence. Netflix recognizes that there is tremendous
potential value in these discoveries. To that end, the company has created
the tools to unlock that value. At the Hadoop Summit, Magnusson and Smith
talked about how data on titles, colors, and covers helps Netflix in many
ways. For one, analyzing colors allows the company to measure the distance
between customers. It can also determine, in Smith’s words, the “average
color of titles for each customer in a 216-degree vector over the last N
While this is quite fascinating, I’m wondering how one could prove the value of such details. There’s no way you can run an A/B test, a predictive model, or a historical analysis on them.
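To make the quoted idea a bit more concrete, here is a minimal sketch of what an "average color per customer" vector and a "distance between customers" could look like. Everything in it is an assumption for illustration, not Netflix's actual method: it reads the 216 in the quote as 216 color bins (6 levels per RGB channel), represents each cover as a normalized histogram over those bins, averages the histograms of the titles a customer watched, and uses plain Euclidean distance. The function names are hypothetical.

```python
import math

# Assumption: 216 = 6 * 6 * 6, i.e. 6 quantization levels per RGB channel.
LEVELS = 6
BINS = LEVELS ** 3


def color_bin(rgb):
    """Map an (r, g, b) pixel with 0-255 channels to one of the 216 bins."""
    r, g, b = (c * LEVELS // 256 for c in rgb)
    return (r * LEVELS + g) * LEVELS + b


def cover_histogram(pixels):
    """Return the normalized 216-bin color histogram of one cover image."""
    hist = [0.0] * BINS
    for px in pixels:
        hist[color_bin(px)] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist


def customer_vector(cover_histograms):
    """Average the histograms of the covers a customer watched."""
    n = len(cover_histograms)
    if n == 0:
        return [0.0] * BINS
    return [sum(column) / n for column in zip(*cover_histograms)]


def distance(u, v):
    """Euclidean distance between two customer color vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Toy data: two single-color "covers" of four pixels each.
red_cover = cover_histogram([(255, 0, 0)] * 4)
blue_cover = cover_histogram([(0, 0, 255)] * 4)

alice = customer_vector([red_cover, red_cover])   # watched only red covers
bob = customer_vector([red_cover, blue_cover])    # watched a mix
carol = customer_vector([blue_cover])             # watched only blue covers
```

With this toy data, `distance(alice, bob)` is smaller than `distance(alice, carol)`, which is the kind of customer-to-customer comparison the quote describes. Whether such a distance has any business value is exactly the open question.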
Original title and link: Big Data lessons from Netflix