Curt Monash’s opinion about the various Hadoop distributions:
- For most enterprises, the Hadoop distribution you should go with is still CDH.
- I think Cloudera and Hortonworks are headed for a duopoly in general-purpose Hadoop distributions, and Hortonworks may achieve rough parity sooner than Cloudera likes. But at the moment Cloudera still seems well ahead.
- The same partners who root for Hortonworks to beat Cloudera also point out that they have worked with Cloudera for longer than Hortonworks has even existed. So while those partners are a plausibility argument for Hortonworks catching up with Cloudera in the future, they don’t show a Hortonworks advantage at this time.
- I think it’s already too late in the history of Hadoop to commit to other variants, such as MapR. But there can be credible and useful claims of Hadoop functionality in products like, for example, the DataStax/Cassandra stack.
- The wild card here is Amazon, which in some ways can be said to have majority Hadoop market share all by itself. One of the week’s announcements was some kind of optional integration between MapR and Elastic MapReduce.
I’m not an analyst and I haven’t been in a position to make this choice myself, but I’ve already shared where I’d start when choosing between Hadoop distributions.
Original title and link: An Analyst Perspective in Choosing a Hadoop Distribution (©myNoSQL)