If MapReduce were formulated differently, many problems would be easier to code. The high-level tools that could arise from it would also be easier to code.
This is the motivation that has led us to pose and formulate Tuple MapReduce.
Just today I wrote that I don’t believe the arguments that present MapReduce as a complex algorithm. The MapReduce model is simple. What is complicated is correctly and efficiently decomposing problems into sequences of MapReduce phases.
But if Tuple MapReduce could make that simpler, why not?
Now we will show an extended MapReduce model, Tuple MapReduce, which we can formalize as:
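One way to write the model down, reconstructed from the prose description that follows. The symbols $i$, $v$, $k$, $m$ and $l$ and the constraint $g \le s \le n$ are assumptions for illustration and may not match the original notation:

```latex
\begin{align*}
\text{map}    &: (i_1, \dots, i_m) \;\rightarrow\; \mathrm{list}\big((v_1, \dots, v_n)\big) \\
\text{reduce} &: \big((v_1, \dots, v_g),\ \mathrm{list}\big((v_1, \dots, v_n)\big)_{\text{sorted by } (v_1, \dots, v_s)}\big) \;\rightarrow\; \mathrm{list}\big((k_1, \dots, k_l)\big) \\
              &\quad \text{with } g \le s \le n
\end{align*}
```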
In this case, the map function processes a tuple as input and emits a certain number of tuples as output. These tuples are made up of “n” fields, of which “s” fields are used to sort and “g” fields are used to group by. This diagram shows how sorting and grouping are done in greater detail:
In the reduce function, for each group, we receive a group tuple with “g” fields and a list of tuples for that group. Finally we’ll emit a certain number of tuples as output.
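To make the description above concrete, here is a minimal in-memory sketch of a single Tuple MapReduce phase. It is not how any real implementation works: the function `tuple_map_reduce`, the helper names, and the sample data are all hypothetical, and it assumes the group-by fields are the first `g` fields and the sort-by fields are the first `s` fields of each emitted tuple, with `g <= s`.

```python
from itertools import groupby


def tuple_map_reduce(records, map_fn, reduce_fn, s, g):
    """Run one Tuple MapReduce phase in memory (illustrative sketch only)."""
    if not 0 < g <= s:
        raise ValueError("expected 0 < g <= s")
    # Map: every input tuple may emit any number of n-field tuples.
    emitted = [out for rec in records for out in map_fn(rec)]
    # Sort on the first s fields (assumed to be the sort-by fields).
    emitted.sort(key=lambda t: t[:s])
    # Group on the first g fields; because g <= s, tuples of the same
    # group are adjacent after sorting, so groupby sees each group once.
    results = []
    for group_key, group in groupby(emitted, key=lambda t: t[:g]):
        # Reduce: receives the g-field group tuple and the sorted tuples.
        results.extend(reduce_fn(group_key, list(group)))
    return results


def visit_map(rec):
    # Hypothetical input: (user, timestamp, url); emitted unchanged (n = 3).
    user, ts, url = rec
    yield (user, ts, url)


def first_last_reduce(group_key, tuples):
    # Tuples arrive sorted by (user, timestamp), so the first and last
    # entries are the user's earliest and latest visits.
    (user,) = group_key
    yield (user, tuples[0][2], tuples[-1][2])


visits = [("bob", 3, "/c"), ("alice", 2, "/b"), ("bob", 1, "/a"), ("alice", 1, "/a")]
result = tuple_map_reduce(visits, visit_map, first_last_reduce, s=2, g=1)
print(result)
```

With `g=1` and `s=2` the tuples are grouped by user and sorted by timestamp within each group, which is the kind of secondary sort that takes extra plumbing (composite keys, custom partitioners and comparators) in plain key/value MapReduce.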
Generalized? Maybe. Simpler? Neah.
Original title and link: Tuple MapReduce: Beyond the Classic MapReduce (©myNoSQL)