OLAP: All content tagged as OLAP in NoSQL databases and polyglot persistence

OLAP and Reporting That Feels Like 2012

Question on Hacker News:

after researching how to continuously aggregate and mine our tracking data (200GB and growing fast) for almost a week, I’m still stuck. Is it just me or didn’t I just find the right product yet? I must admit that I’m a generalist developer, no DBA - but to me it looks like all the products I’ve looked into just don’t “feel” right. JasperReports, InfiniDB, Pentaho, just to name a few… it’s all - how can I say - crusty, unintuitive, 1999. All the products look very corporate, there are basically no howto’s, no prices, no shiny, bold Open Source products that fit my bill. I wouldn’t even mind using a good commercial product that does what I want, but even the advertisements are “from DBA to DBA”. Lots of termdropping like ETLs, M/ROLAP, BI, Case-Based-Reasoning - but nothing that looks reasonably simple and straightforward. Maybe I’m spoiled by the DX (Developer Experience) that Backbone and Rails give me, but is there really nobody that has done something simpler and more straightforward? Like “these are my dimensions, these are my facts - now generate my Cube here so I can go datamining”? Now I know this is a huge field and I might sound naive but I’d like to know if there are others sharing my pain or having something closer to a solution.

As usual, the thread has lots of links and commentary. Two answers stood out, though:

I’ve been in this field for more than a decade and I couldn’t agree more. The tools, and even the underlying theory, are stuck in the past, and every project I work on is hampered by the technology not meeting the expectations of (business) users who demand a much more intuitive way of working - because that’s what they experience every day on consumer devices.

and

Almost all OLAP products provide this simple functionality, but it’s almost impossible to find where it is because it is hidden under layers and layers of “added value” in the form of tools to extract, transform and present that data in flashy executive dashboards.
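To make the "these are my dimensions, these are my facts, now generate my cube" request concrete, here is a minimal sketch of what that core functionality amounts to. It is not how any of the products mentioned in the thread work; the `build_cube` helper, the sample events, and the field names are all hypothetical, and everything is held in memory.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Sum `measure` for every subset of `dimensions` (a full cube).

    Returns a dict keyed by a tuple of dimension names; each value maps
    a tuple of dimension values to the aggregated measure.
    """
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[dims] = dict(totals)
    return cube


# Hypothetical tracking events: two dimensions, one fact.
events = [
    {"country": "DE", "browser": "Firefox", "views": 3},
    {"country": "DE", "browser": "Chrome", "views": 5},
    {"country": "US", "browser": "Chrome", "views": 7},
]

cube = build_cube(events, ["country", "browser"], "views")
print(cube[()])                      # grand total
print(cube[("country",)])            # totals per country
print(cube[("country", "browser")])  # finest-grained cells
```

This rescans the rows once per grouping, which is fine for a toy example but is exactly the part that gets hard at 200GB and growing.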

Original title and link: OLAP and Reporting That Feels Like 2012 (NoSQL databases © myNoSQL)


Druid: Distributed In-Memory OLAP Data Store

Over the last twelve months, we tried and failed to achieve scale and speed with relational databases (Greenplum, InfoBright, MySQL) and NoSQL offerings (HBase).

Stepping back from our two failures, let’s examine why these systems failed to scale for our needs:

  1. Relational Database Architectures

    • Full table scans were slow, regardless of the storage engine used
    • Maintaining proper dimension tables, indexes and aggregate tables was painful
    • Parallelization of queries was not always supported, or was non-trivial
  2. Massive NoSQL With Pre-Computation

    • Supporting high dimensional OLAP requires pre-computing an exponentially large amount of data
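The "exponentially large" claim in the last point can be checked with a bit of arithmetic. The sketch below is standard combinatorics, not Metamarkets' own numbers: with n dimensions there are 2^n possible group-by combinations, and if every combination of values occurs, the total number of pre-computed cells is the product of (cardinality + 1) over all dimensions. The function names and the sample cardinalities are hypothetical.

```python
from itertools import combinations
from math import prod


def count_groupings(num_dimensions):
    """Number of distinct group-by combinations: every subset of dimensions."""
    return 2 ** num_dimensions


def max_cells(cardinalities):
    """Upper bound on cells across all groupings.

    Each dimension is either fixed to one of its values or rolled up
    ("all"), giving (cardinality + 1) choices per dimension.
    """
    return prod(c + 1 for c in cardinalities)


def max_cells_brute_force(cardinalities):
    """Same bound, computed by summing the cell count of every grouping."""
    total = 0
    for size in range(len(cardinalities) + 1):
        for subset in combinations(cardinalities, size):
            total += prod(subset)
    return total


for n in (5, 10, 20):
    print(n, "dimensions ->", count_groupings(n), "groupings")

# Hypothetical cube: three dimensions with 10 distinct values each.
print(max_cells([10, 10, 10]))
```

Real data is usually sparse, so the actual cell count sits below this bound, but the number of groupings still doubles with every added dimension.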

Many of the questions you have in mind have already been asked in this comment thread, but few have been answered so far.

Original title and link: Druid: Distributed In-Memory OLAP Data Store (NoSQL databases © myNoSQL)

via: http://metamarketsgroup.com/blog/druid-part-i-real-time-analytics-at-a-billion-rows-per-second/


Why NoSQL is hard at The Knot

Jason Sirota summarizes concerns expressed by developers, DBAs, and Ops teams related to NoSQL databases:

  1. Developers: NoSQL databases do not separate tiers or concerns
  2. Developers: Sometimes, you can fix app problems just by changing a stored procedure
  3. DBA: There is not as much instrumentation around NoSQL
  4. Reporting: We have to write an app every time we want to show data?
  5. Ops: WTF is NoSQL, SQL Server seems pretty fast to me.

Points 1, 2, and 4 all relate to the reduced or different data access layer of NoSQL databases. And as much as we like to tout the separation of OLTP and OLAP systems, we have to admit that many companies do not separate them (for lots of reasons), which validates these concerns.

Original title and link: Why NoSQL is hard at The Knot (NoSQL databases © myNoSQL)

via: http://jasonsirota.posterous.com/why-nosql-is-hard-at-the-knot