The Beginning of an Interesting Friendship: MapReduce and RDBMS
In the spirit of “reconciliation” at the end of this NoSQL vs. SQL year, I thought it would be interesting to mention that, from what I have read, the list of traditional RDBMS (and not only those) adding support for in-database MapReduce is getting longer by the day. Here is what I could gather so far (nb: not all of the mentioned systems are RDBMS, and some of the linked articles are PR announcements, so they may contain inaccurate data):
- Sybase IQ: ☞ here, ☞ here and ☞ here
- IBM M2: ☞ IBM’s M2 corrals massive data sets with Hadoop
- Oracle: ☞ here
- Aster Data: ☞ here
- Greenplum: ☞ here
- Vertica: ☞ here
- Teradata and Netezza: ☞ here
I assume that is what Jeff Davis meant when writing in his post ☞ NoSQL can be fast, but what if SQL were fast and flexible?:
A unified database management system that integrates NoSQL processing models with a traditional SQL system is the real answer here; and streaming is one way to accomplish that. This integration allows a wide range of data processing strategies to work together – traditional tables offer recovery of streaming data, for instance — rather than forcing you to choose a single processing model.
In other words, the language and logical model should be separate from the processing model. And isn’t that what the relational model is all about?
While some may say that MapReduce adoption is just another validation of the NoSQL movement, I confess I find it quite a natural reaction. My concern is that MapReduce integration currently comes in too many different forms, as Daniel Abadi also remarks:
Teradata, Microsoft, Sybase, and, to an extent, Netezza, all seem to believe that providing a library of preoptimized functions distributed with the software is the way to go.
[…]
The other school of thought is adopted by vendors that allow customers more freedom to implement their own functions, but constrain the language in which this code is written (such as MapReduce or LINQ) to facilitate the automatic parallelization of this code inside the DBMS.
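To make that second school of thought a bit more concrete, here is a minimal sketch in Python of why constraining user code to a map function and a reduce function makes automatic parallelization easier. Everything in it is an assumption made for illustration: `run_mapreduce`, `map_row`, `reduce_key` and the `sales` rows are hypothetical names, not the API of any of the products listed above, and the "engine" runs in a single process.

```python
from collections import defaultdict
from itertools import chain


def map_row(row):
    # User-supplied map step (hypothetical): emit (key, value) pairs for one row.
    yield row["region"], row["amount"]


def reduce_key(key, values):
    # User-supplied reduce step (hypothetical): combine all values seen for one key.
    return key, sum(values)


def run_mapreduce(rows, mapper, reducer, partitions=2):
    # Toy stand-in for the engine: split the rows the way a DBMS might
    # spread a table across nodes.
    chunks = [rows[i::partitions] for i in range(partitions)]

    # Map phase: each chunk only depends on its own rows, so a real engine
    # could run these on different nodes. Here they run one after another.
    mapped = [[pair for row in chunk for pair in mapper(row)] for chunk in chunks]

    # Shuffle phase: bring together all values that share a key.
    groups = defaultdict(list)
    for key, value in chain.from_iterable(mapped):
        groups[key].append(value)

    # Reduce phase: each key is independent of the others as well.
    return dict(reducer(key, values) for key, values in groups.items())


sales = [
    {"region": "EU", "amount": 10},
    {"region": "US", "amount": 5},
    {"region": "EU", "amount": 7},
]
totals = run_mapreduce(sales, map_row, reduce_key)
print(totals)
```

The point of the sketch is that the user code never sees how the rows are partitioned, so the engine is free to choose that on its own; the result is the same whatever the value of `partitions`.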
And until we see some consolidation in this space, hybrid NoSQL solutions may remain the way to go.
This is probably the last post for 2009, so I’d like to use this opportunity to thank all MyNoSQL readers for their contributions and to wish you all a great 2010!
I’d also like to share with you my wish to make MyNoSQL the place to read and learn about NoSQL in 2010 and my hope that you all will be with me in this endeavor.

