Big Data, Unstructured Data, and In-Memory Analytics
Two interesting quotes from Teradata CTO Stephen Brobst's interview with Vinita Gupta (InformationWeek):
Structured vs unstructured data:
I don’t believe that any data is unstructured. We have to overcome this myth that anything that is not in rows or columns is unstructured. The blogs and videos are structured, but non-traditional data.
I think of unstructured data as:
- data from which several kinds of structured data can be extracted
  The simplest example is web logs. Each entry contains several pieces of information, and each piece can feed a different investigation.
- data about the same entities that arrives in different forms
  The simplest example is click streams coming from different sources (e.g. a video shared on YouTube, Vimeo, Twitter, etc.). All of this data is needed for analysis, but it comes back in different forms.
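To make the web log case concrete, here is a minimal sketch of pulling three different structured records out of one log line. The log line is made up, the Apache "combined" format is an assumption (the post does not name a format), and the helper names `parse`, `traffic_view`, `search_view`, and `referral_view` are hypothetical.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Made-up log line, assumed to be in Apache "combined" format.
LOG_LINE = (
    '203.0.113.7 - - [10/Oct/2011:13:55:36 +0000] '
    '"GET /search?q=nosql HTTP/1.1" 200 2326 '
    '"http://example.com/home" "Mozilla/5.0"'
)

# One pattern for the raw line; every "view" below reuses its fields.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse(line):
    """Split one log line into named string fields."""
    match = COMBINED.match(line)
    if match is None:
        raise ValueError("line is not in the assumed combined format")
    return match.groupdict()


def traffic_view(rec):
    """Structured record for a bandwidth or error-rate investigation."""
    size = 0 if rec["size"] == "-" else int(rec["size"])
    return {"ip": rec["ip"], "status": int(rec["status"]), "bytes": size}


def search_view(rec):
    """Structured record for a 'what do people search for' investigation."""
    url = urlsplit(rec["target"])
    return {"path": url.path, "terms": parse_qs(url.query).get("q", [])}


def referral_view(rec):
    """Structured record for a 'where does traffic come from' investigation."""
    return {
        "referrer_host": urlsplit(rec["referrer"]).netloc,
        "path": urlsplit(rec["target"]).path,
    }


record = parse(LOG_LINE)
for view in (traffic_view, search_view, referral_view):
    print(view.__name__, view(record))
```

The same raw line yields three unrelated schemas, which is the sense in which the data has many structures rather than none.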
In-memory analytics:
Some of our competitors, who talk about in-memory analytics in India, do not understand analytics because the cost per terabyte of in-memory is at least 50 times the cost of mechanical disk drives. […] From the massive data available, we frequently access only 20 percent of the data. So, customers want that 20 percent of data to be in high-performance storage and the remaining 80 percent of the data to be in low-cost storage. CIOs want an environment that allows both — optimization for price and performance and optimization for price and storage.
This sounds extremely familiar.
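The numbers in the in-memory quote can be turned into rough arithmetic. The sketch below takes the quoted figures at face value (memory at 50 times the per-terabyte cost of disk, which the quote gives as a lower bound, and 20 percent of the data being hot) and assumes a disk cost of 1 unit per terabyte purely for normalization. The function name `blended_cost` is hypothetical.

```python
def blended_cost(hot_fraction, memory_to_disk_ratio, disk_cost_per_tb=1.0):
    """Average cost per terabyte when only the hot fraction lives in memory.

    disk_cost_per_tb defaults to 1.0, an assumed unit used only to
    express the result as a multiple of the all-disk cost.
    """
    hot = hot_fraction * memory_to_disk_ratio * disk_cost_per_tb
    cold = (1.0 - hot_fraction) * disk_cost_per_tb
    return hot + cold


# Figures taken from the quote: ratio of 50 (a lower bound), 20% hot data.
all_in_memory = blended_cost(1.0, 50)
tiered = blended_cost(0.2, 50)

print(f"all in memory: {all_in_memory:.1f}x the all-disk cost")
print(f"20% in memory: {tiered:.1f}x the all-disk cost")
```

Under these assumptions the tiered setup costs 0.2 × 50 + 0.8 × 1 = 10.8 times the all-disk price per terabyte, against 50 times for keeping everything in memory.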
Original title and link: Big Data, Unstructured Data, and In-Memory Analytics (©myNoSQL)