ALL COVERED TOPICS

NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter

NAVIGATE MAIN CATEGORIES

Close

Druid: Distributed In-Memory OLAP Data Store

Over the last twelve months, we tried and failed to achieve scale and speed with relational databases (Greenplum, InfoBright, MySQL) and NoSQL offerings (HBase).

Stepping back from our two failures, let’s examine why these systems failed to scale for our needs:

  1. Relational Database Architectures

    • Full table scans were slow, regardless of the storage engine used
    • Maintaining proper dimension tables, indexes and aggregate tables was painful
    • Parallelization of queries was not always supported or non-trivial
  2. Massive NOSQL With Pre-Computation

    • Supporting high dimensional OLAP requires pre-computing an exponentially large amount of data

Many of the questions you have in mind have already been asked in the this comment thread, but with not so many answers until now.

Original title and link: Druid: Distributed In-Memory OLAP Data Store (NoSQL databases © myNoSQL)

via: http://metamarketsgroup.com/blog/druid-part-i-real-time-analytics-at-a-billion-rows-per-second/