Two notable things in Denny Lee’s post about optimizing some of the Hive joins used by Microsoft’s Online Services Division:
- Microsoft is drinking its own HDInsight on Azure champagne. That should take the HDInsight product far, since Microsoft will always have first-hand feedback about the parts of the system that need improvement.
- Know the different types of JOINs supported by Hive, and don’t be afraid to experiment.
✚ An extra point for the link to Liyin Tang and Namit Jain’s Join strategies in Hive (PDF)
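To make the "know your JOIN types" point concrete, here is a minimal Python sketch of the idea behind two of the strategies usually discussed for Hive: a common (shuffle) join, which groups both inputs by key before pairing rows, and a map join, which loads the small table into an in-memory hash table and streams the big one past it. This is not Hive's implementation; the function names and the sample `page_views` and `users` tables are made up for illustration.

```python
from collections import defaultdict


def common_join(left, right, key):
    """Shuffle-style join: bucket rows from both inputs by key, then pair them."""
    buckets = defaultdict(lambda: ([], []))
    for row in left:
        buckets[row[key]][0].append(row)
    for row in right:
        buckets[row[key]][1].append(row)
    out = []
    for left_rows, right_rows in buckets.values():
        for l in left_rows:
            for r in right_rows:
                out.append({**l, **r})
    return out


def map_join(big, small, key):
    """Map-side join: hash the small input once, then stream the big input."""
    table = defaultdict(list)
    for row in small:
        table[row[key]].append(row)
    out = []
    for row in big:
        # .get() avoids creating empty entries for keys missing from `small`
        for match in table.get(row[key], []):
            out.append({**row, **match})
    return out


# Hypothetical sample data: a large fact table and a small dimension table.
page_views = [
    {"user_id": 1, "url": "/a"},
    {"user_id": 2, "url": "/b"},
    {"user_id": 1, "url": "/c"},
    {"user_id": 3, "url": "/d"},
]
users = [
    {"user_id": 1, "name": "ann"},
    {"user_id": 2, "name": "bob"},
]

shuffled = common_join(page_views, users, "user_id")
mapped = map_join(page_views, users, "user_id")
```

Both functions return the same inner-join rows, only in a different order. The difference that matters in practice is where the work happens: the map-style version never has to regroup the big input, which is why it tends to pay off when one side fits in memory.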
Original title and link: Optimizing Joins running on HDInsight Hive on Azure (©myNoSQL)