An Analyst Perspective in Choosing a Hadoop Distribution

Curt Monash’s opinion about the various Hadoop distributions:

  • For most enterprises, the Hadoop distribution you should go with is still CDH.
  • I think Cloudera and Hortonworks are headed for a duopoly in general-purpose Hadoop distributions, and Hortonworks may achieve rough parity sooner than Cloudera likes. But at the moment Cloudera still seems well ahead.
  • The same partners who root for Hortonworks to beat Cloudera also point out that they have worked with Cloudera for longer than Hortonworks has even existed. So while those partners are a plausibility argument for Hortonworks catching up with Cloudera in the future, they don’t show a Hortonworks advantage at this time.
  • I think it’s already too late in the history of Hadoop to commit to other variants, such as MapR. But there can be credible and useful claims of Hadoop functionality in products like, for example, the DataStax/Cassandra stack.
  • The wild card here is Amazon, which in some ways can be said to have majority Hadoop market share all by itself. One of the week’s announcements was some kind of optional integration between MapR and Elastic MapReduce.

I’m not an analyst, and I haven’t been in a position to make this choice myself, but I’ve already shared where I’d start when choosing between Hadoop distributions.

Original title and link: An Analyst Perspective in Choosing a Hadoop Distribution (NoSQL database © myNoSQL)