Big Data Implications for IT Architecture and Infrastructure
Teradata’s Martin Willcox:
From an IT architecture / infrastructure perspective, I think that the key thing to understand about all of this is that, at least for the foreseeable future, we’ll need at least two different types of “database” technology to efficiently manage and exploit the relational and non-relational data, respectively: an integrated data warehouse, built on a Massively Parallel Processing (MPP) DBMS platform for the relational data, and the relational meta-data that we generate by processing the non-relational data (for example, that a call was made at this date and time, by this customer, and that they were assessed as being stressed and agitated); and another platform for the processing of the non-relational data, that enables us to parallelise complex algorithms - and so bring them to bear on large data-sets - using the MapReduce programming model. Since the value of these data is much greater in combination than in isolation – and because we may be shipping very large volumes of data between the different platforms - considerations of how best to connect and integrate these two repositories become very important.
One of the few corporate blog posts that do not try to position Hadoop (and implicitly MapReduce) in a corner.
This sane perspective could be a validation of my thoughts about the Teradata and Hortonworks partnership.
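To make the two-platform split from the quote a bit more concrete, here is a minimal sketch in Python. It is not Teradata's or Hadoop's API: the map, shuffle, and reduce steps run in-process, an in-memory SQLite database stands in for the MPP data warehouse, and the call records, the customer names, and the keyword-based "stressed" heuristic are all invented for illustration. It only shows the shape of the idea: process non-relational data with MapReduce, then load the derived relational meta-data next to the existing relational data and query them together.

```python
import sqlite3
from collections import defaultdict

# Hypothetical non-relational input: call transcripts (invented sample data).
CALLS = [
    {"customer_id": 1, "call_time": "2012-05-01T09:30",
     "transcript": "this is unacceptable I am furious"},
    {"customer_id": 2, "call_time": "2012-05-01T10:15",
     "transcript": "thanks that was helpful"},
    {"customer_id": 1, "call_time": "2012-05-02T14:00",
     "transcript": "still broken and I am angry"},
]

# Toy stand-in for a real stress/agitation assessment (an assumption).
STRESS_WORDS = {"furious", "angry", "unacceptable"}


def map_call(call):
    """Map step: emit one (customer_id, (call_time, stressed_flag)) pair per call."""
    words = call["transcript"].lower().split()
    stressed = int(any(word in STRESS_WORDS for word in words))
    yield call["customer_id"], (call["call_time"], stressed)


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_calls(customer_id, values):
    """Reduce step: per customer, count all calls and the stressed ones."""
    return (customer_id, len(values), sum(flag for _, flag in values))


def run_mapreduce(calls):
    """Run map, shuffle, and reduce in-process; returns relational rows."""
    pairs = (pair for call in calls for pair in map_call(call))
    return sorted(reduce_calls(key, vals) for key, vals in shuffle(pairs).items())


def load_and_join(rows):
    """Load the derived rows into the stand-in warehouse and join with customers."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    db.executemany("INSERT INTO customer VALUES (?, ?)",
                   [(1, "Alice"), (2, "Bob")])  # invented customers
    db.execute("CREATE TABLE call_summary "
               "(customer_id INTEGER, calls INTEGER, stressed_calls INTEGER)")
    db.executemany("INSERT INTO call_summary VALUES (?, ?, ?)", rows)
    return db.execute(
        "SELECT c.name, s.calls, s.stressed_calls "
        "FROM customer c JOIN call_summary s ON s.customer_id = c.id "
        "ORDER BY c.id"
    ).fetchall()


if __name__ == "__main__":
    print(load_and_join(run_mapreduce(CALLS)))
```

In a real deployment the hand-off between `run_mapreduce` and `load_and_join` is exactly the connection between the two repositories that the quote flags as important, since that is where large volumes of data would be shipped between platforms.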
Original title and link: Big Data Implications for IT Architecture and Infrastructure (©myNoSQL)
via: http://blogs.teradata.com/emea/What-is-meant-by-the-idea-of-big-data/