In a follow-up post to SQL or Hadoop: What Tools Should I Use to Process My Data?, Gwen Shapira presents some reasons why, even if many workloads that are a better fit for Hadoop could also be handled by Oracle, doing so is not a good idea:
But, do you really want to use Oracle to store millions of emails and scanned documents? I have few customers who do it, and I think it causes more problems than it solves. After you stored them, do you really want to use your network and storage bandwidth so the application servers will keep reading the data from the database? Big data is… big. It is best not to move it around too much and run the processing on the servers that store the data. After all, the code takes fewer packets than the data. But, Oracle makes cores very expensive. Are you sure you want to use them to run processing-intensive data mining algorithms?
Then there’s the issue of actually programming the processing code. If your big data is in Oracle and you want to process it efficiently, PL/SQL is pretty much the only option. […]
All these are very solid arguments.
Generalizing Gwen's point a bit, I would say that this is exactly the history of what made relational databases successful. By providing decent solutions, up to a point, to a wide range of problems, and by covering more scenarios than the alternative storage solutions of their time, relational databases became the de facto storage for the last 30 years. But in recent years, more and more problems have crossed the boundary of what could be considered a decent solution, creating the need for specialized alternatives that are better than just good enough. And thus NoSQL databases.
Original title and link: Oracle Database or Hadoop? And What Led to NoSQL Databases (©myNoSQL)