


Michael Stonebraker: All content tagged as Michael Stonebraker in NoSQL databases and polyglot persistence

Hadoop Is the Best Thing Since Sliced Bread, Even if Doomed

I’m not the only one confused by Michael Stonebraker’s “Hadoop is dead” theme. Edward Capriolo:

Let me tell you a story of how I got into Hadoop and Hive. I was following advice like Stonebraker’s that said parallel DBs are the way to go. But I quickly found out parallel databases are too rich for my blood. Now, I am not telling you or anyone else that you should not spend money on parallel DBs, because maybe you have the money, or maybe you need some of those things the parallel database provides. But for the things I need to do:

  • store tons of data
  • process it reasonably fast
  • be LOW on the cost scale

Hadoop and Hive work fine for me.

Original title and link: Hadoop Is the Best Thing Since Sliced Bread, Even if Doomed (NoSQL database©myNoSQL)


Possible Hadoop Trajectories

According to Michael Stonebraker and Jeremy Kepner, Hadoop is doomed:

Computational space | Data management
Adopt Hadoop for pilot projects | Adopt Hadoop for pilot projects
Scale Hadoop to production use | Scale Hadoop to production use
Hit the wall, as the above problems become big issues | Observe an unacceptable performance penalty
Morph to something that deals with our issues | Morph to real parallel DBMS

Let me see if I get this right: you take two problem spaces, generalize them to complete fields, try to use Hadoop, identify the mismatch but still go into production, ignore the solutions built on top of Hadoop/HDFS to address these problem spaces (Apache Hama or Twister), then conclude by scientific generalization that these problems apply to everyone else, thus Hadoop is dead.

What’s wrong with all these companies using Hadoop for solving their problems? A bunch of stubborn people.

Original title and link: Possible Hadoop Trajectories (NoSQL database©myNoSQL)