Types of data that land in Hadoop

Jim Walker in The Business Value of Hadoop as seen through the Big Data:

While every organization is different their big data is often very similar. For the most part, Hadoop is collecting massive amounts of data across six basic types of data: social media activity, clickstream data, server logs, unstructured (videos, docs, etc) and machine/sensor data from equipment in the field.

Of these categories, I think only machine/sensor data can be considered critical data. Actually, if you think about it, server logs, clickstreams, and even social media activity are also sensor data: server logs originate in servers, while clickstreams and social media activity originate in humans.

The future of data processing lies in platforms that can bring together all critical data regardless of where it is primarily stored. Some call these federated databases; others call them logical data warehouses. The specific term doesn’t matter, though. It’s the core principles that will make the difference: integration and integrated processing in close to real time.

Original title and link: Types of data that land in Hadoop (NoSQL database©myNoSQL)