Over the last twelve months, we tried and failed to achieve scale and speed with relational databases (Greenplum, InfoBright, MySQL) and NoSQL offerings (HBase).
Stepping back from our two failures, let’s examine why these systems failed to scale for our needs:
Relational Database Architectures
- Full table scans were slow, regardless of the storage engine used
- Maintaining proper dimension tables, indexes and aggregate tables was painful
- Parallelization of queries was not always supported, and was non-trivial where it was
Massive NoSQL With Pre-Computation
- Supporting high-dimensional OLAP requires pre-computing an amount of data that grows exponentially with the number of dimensions
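To make the exponential growth concrete, here is a minimal sketch, not taken from the Druid post, that counts the group-by combinations a fully pre-computed cube would need. It assumes every subset of dimensions gets its own aggregate; the dimension names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def count_rollups(num_dimensions: int) -> int:
    """Number of distinct group-by combinations over the given dimensions.

    Each dimension is either included in a group-by or rolled up,
    so there are 2 ** num_dimensions combinations in total.
    """
    return 2 ** num_dimensions


def enumerate_rollups(dimensions):
    """Yield every subset of dimensions, from the grand total to the full key."""
    for size in range(len(dimensions) + 1):
        for combo in combinations(dimensions, size):
            yield combo


# Hypothetical dimensions, for illustration only.
dims = ["publisher", "country", "gender"]
rollups = list(enumerate_rollups(dims))

for combo in rollups:
    print(combo if combo else "(grand total)")

print(len(rollups), "aggregates for", len(dims), "dimensions")
print(count_rollups(20), "aggregates for 20 dimensions")
```

This counts only the aggregate tables. The number of rows in each one also grows with the cardinality of the dimensions it keeps, so the stored volume rises faster still.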
Many of the questions you have in mind have already been asked in this comment thread, though few have been answered so far.
Original title and link: Druid: Distributed In-Memory OLAP Data Store (NoSQL databases © myNoSQL)