The Beginning of an Interesting Friendship: MapReduce and RDBMS

In the spirit of “reconciliation” at the end of this NoSQL vs SQL year, I thought it would be interesting to note that the list of RDBMS (traditional and otherwise) adding support for in-database MapReduce is getting longer by the day. Here is what I could gather so far (note that not all of the mentioned systems are RDBMS, and that some of the linked articles are PR announcements, so they may contain inaccurate data).

I assume that is what Jeff Davis meant when writing in his post ☞ NoSQL can be fast, but what if SQL were fast and flexible?:

A unified database management system that integrates NoSQL processing models with a traditional SQL system is the real answer here; and streaming is one way to accomplish that. This integration allows a wide range of data processing strategies to work together – traditional tables offer recovery of streaming data, for instance — rather than forcing you to choose a single processing model.

In other words, the language and logical model should be separate from the processing model. And isn’t that what the relational model is all about?

While some may say that MapReduce adoption is just another validation of the NoSQL movement, I confess I find it quite a natural reaction. My concern is that MapReduce integration currently comes in too many different forms, as Daniel Abadi also remarks:

Teradata, Microsoft, Sybase, and, to an extent, Netezza, all seem to believe that providing a library of preoptimized functions distributed with the software is the way to go.

[…]

The other school of thought is adopted by vendors that allow customers more freedom to implement their own functions, but constrain the language in which this code is written (such as MapReduce or LINQ) to facilitate the automatic parallelization of this code inside the DBMS.
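To make the second school of thought more concrete, here is a minimal sketch in Python of what "constraining the language" buys you. It is not any vendor's API: the `ROWS` data, `map_fn`, `reduce_fn`, and the toy `run_mapreduce` engine are all hypothetical names made up for illustration. The point is only that once user code is limited to a per-row map function and a per-key reduce function, the engine is free to decide how to partition and parallelize the work.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rows, standing in for a table stored inside the DBMS.
ROWS = [
    {"region": "eu", "amount": 10},
    {"region": "us", "amount": 7},
    {"region": "eu", "amount": 5},
    {"region": "us", "amount": 1},
]

def map_fn(row):
    """User-supplied: emit (key, value) pairs for a single row."""
    yield row["region"], row["amount"]

def reduce_fn(key, values):
    """User-supplied: fold all values seen for one key into a result."""
    return key, sum(values)

def run_mapreduce(rows, map_fn, reduce_fn, partitions=2):
    """Toy engine (an assumption, not a real DBMS interface).

    map_fn only ever sees one row and reduce_fn only ever sees one key,
    so the engine can split the rows however it likes.
    """
    # Partition the input; each chunk can be mapped independently.
    chunks = [rows[i::partitions] for i in range(partitions)]

    def map_chunk(chunk):
        return [pair for row in chunk for pair in map_fn(row)]

    # Map phase: one worker per partition.
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        mapped = list(pool.map(map_chunk, chunks))

    # Shuffle phase: group emitted values by key.
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)

    # Reduce phase: one call per key.
    return dict(reduce_fn(k, vs) for k, vs in sorted(groups.items()))

result = run_mapreduce(ROWS, map_fn, reduce_fn)
print(result)
```

The same job expressed in SQL would be a plain `GROUP BY region` with `SUM(amount)`; the difference is that here the grouping and folding logic is arbitrary user code, while the engine still keeps control of the execution plan.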

Until we see some consolidation in this space, hybrid NoSQL solutions may remain the way to go.

This is probably the last post for 2009, so I’d like to use this opportunity to thank all MyNoSQL readers for their contributions and to wish you all a great 2010!

I’d also like to share with you my wish to make MyNoSQL the place to read and learn about NoSQL in 2010, and my hope that you will all join me in this endeavor.