Grant Ingersoll (LucidWorks), in a GigaOm article, asks for a term other than “unstructured” data to describe natural form text:
Text is easily one of the most highly structured data types we face, filled
with misspellings, misdirection, flowery language, ambiguity and implicit
knowledge. Text is so often misunderstood that researchers in the field even
have a metric (inter-annotator agreement) that tracks how often two people
examining the same piece of text agree on the answer to some question on the
Some random thoughts:
- Unstructured data doesn’t refer only to natural form text.
- Speaking of text, from the 4 years of (Romanian) grammar I learned in school, what I remember best is the countless exceptions. To me, that sounds like a lack of structure.
- I’m pretty sure there are analysts out there who have come up with different terms, but sometimes having everyone understand what a term means matters more than the term itself.
Original title and link: Can We Please Stop Saying “Unstructured” Data? (©myNoSQL)