


5 Top Misconceptions about Big Data and Hadoop

The MapR team analyzes the top 5 misconceptions in the Big Data/Hadoop market:

  1. Big Data is not simply about massive amounts of data — petabytes and beyond. Big Data represents a paradigm shift.
  2. Since Hadoop has a funny name and is somewhat new to people, they assume it must be risky.
  3. Another misconception about Hadoop is that it is batch-only.
  4. Perhaps the biggest misconception is that Hadoop is a single, monolithic component.
  5. With respect to open source, the question about a distribution is not a simple binary “open” or “closed”.

The first four points do indeed reflect how things are seen from the outside.

While I do understand the nuance introduced by the last point (which conveniently leaves room to plug MapR), things are black and white: a component is either open source or it is not. But that is just one dimension of the various components of the Hadoop stack. What really matters is how well a component integrates with the rest of the stack. The questions to ask are: does it maintain the same interfaces? What is the cost of replacing it? Does it allow using 3rd-party components? Does it force me to buy special components or hardware?

Original title and link: 5 Top Misconceptions about Big Data and Hadoop (NoSQL database©myNoSQL)