NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter



HBase at TellApart for Log Event Processing

Educative post from TellApart explaining how HBase—with the right data modeling and filtering processes—had them covered for the following requirements coming from the need of log analysis:

  • Data must be ingested into the system incrementally – one day or so worth of data at a time.
  • Data is processed at a variety of time scales. Daily reporting often cares only about one day’s worth of data, while machine learning applications may require digesting several months worth of data to build models.
  • Some events are naturally associated with others. An ad click is logged separately from an ad impression, but the two need to be processed together. Some data extraction applications need to process these associated events together, but others only care about the individual events.
  • Random-access lookups to track all the actions of a user across time are often helpful. This is a powerful debugging tool to understand how the user interacts with the web at large and with the TellApart system in particular.

Original title and link: HBase at TellApart for Log Event Processing (NoSQL database©myNoSQL)