The MapR team analyzes the top 5 misconceptions in the Big Data/Hadoop market:
- Big Data is not simply about massive amounts of data — petabytes and beyond. Big Data represents a paradigm shift.
- Since Hadoop has a funny name and is relatively new to people, they assume it must be risky.
- Another misconception about Hadoop is that it is a batch process.
- Perhaps the biggest misconception is that Hadoop is a single, monolithic component.
- With respect to open source, the question about a distribution is not a simple binary “open” or “closed”.
The first four points are indeed how things are seen from the outside.
While I do understand the nuance introduced by the last point (it allows MapR to be plugged in), things are black and white: a component is either open source or it is not. But openness is just one dimension of the various components of the Hadoop stack. What really matters is how well a component integrates with the rest of the stack. The questions to ask are:
- does it maintain the same interfaces?
- what is the cost of replacing it?
- does it allow using third-party components?
- does it force me to buy special components or hardware?
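The "same interfaces" question boils down to the classic program-to-an-interface principle: if a vendor's replacement component exposes the same API as the one it replaces, client code does not have to change. A minimal, purely illustrative Java sketch (the `Store` interface and `InMemoryStore` class are hypothetical names, standing in for an API like Hadoop's `FileSystem`):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical storage interface, standing in for a stack API
// such as Hadoop's FileSystem abstraction.
interface Store {
    void put(String key, String value);
    String get(String key);
}

// One implementation; a vendor's drop-in alternative could replace it
// without touching client code, as long as the interface is preserved.
class InMemoryStore implements Store {
    private final Map<String, String> data = new HashMap<>();
    public void put(String key, String value) { data.put(key, value); }
    public String get(String key) { return data.get(key); }
}

public class Client {
    // Client code depends only on the interface, so swapping the
    // backing implementation carries no replacement cost here.
    static String roundTrip(Store store, String key, String value) {
        store.put(key, value);
        return store.get(key);
    }

    public static void main(String[] args) {
        Store store = new InMemoryStore();
        System.out.println(roundTrip(store, "block-0001", "data")); // prints "data"
    }
}
```

The other questions (cost of replacement, third-party components, special hardware) are exactly what this pattern exposes: a component that only works against its vendor's own implementations fails the test even if it is nominally "open".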
Original title and link: 5 Top Misconceptions about Big Data and Hadoop (©myNoSQL)