


Pentaho: All content tagged as Pentaho in NoSQL databases and polyglot persistence

5 Business Intelligence Suites

Pricing takes a bit of work to nail down; it pays to talk to the nice salespeople and let them help you sort it all out. All of these BI suites have free software downloads, and free community support. All of them offer multiple modules, custom support and engineering services, and training and documentation. So the answer to “how much will it cost” is always “it depends.”

Software licenses and custom engineering are especially “it depends.” Your total first-year costs can easily hit $10,000 for a small shop. Presumably you’ll spend less on support and training after your first year. Still, that’s considerably less than the traditional proprietary offerings from SAP, IBM Cognos, SAS and other old-time proprietary BI vendors.

The 5 solutions mentioned: Jaspersoft, SpagoBI, Pentaho, OpenI, Actuate.

Original title and link: 5 Business Intelligence Suites (NoSQL database©myNoSQL)


OLAP and Reporting That Feels Like 2012

Question on Hacker News:

after researching how to continuously aggregate and mine our tracking data (200GB and growing fast) for almost a week, I’m still stuck. Is it just me or didn’t I just find the right product yet? I must admit that I’m a generalist developer, no DBA - but to me it looks like all the products I’ve looked into just don’t “feel” right. JasperReports, InfiniDB, Pentaho, just to name a few… it’s all - how can I say - crusty, unintuitive, 1999. All the products look very corporate, there are basically no howto’s, no prices, no shiny, bold Open Source products that fit my bill. I wouldn’t even mind using a good commercial product that does what I want, but even the advertisements are “from DBA to DBA”. Lots of termdropping like ETLs, M/ROLAP, BI, Case-Based-Reasoning - but nothing that looks reasonably simple and straightforward. Maybe I’m spoiled by the DX (Developer Experience) that Backbone and Rails give me, but is there really nobody that has done something simpler and more straightforward? Like “these are my dimensions, these are my facts - now generate my Cube here so I can go datamining”? Now I know this is a huge field and I might sound naive but I’d like to know if there are others sharing my pain or having something closer to a solution.

As usual, lots of links and commentary. Two answers stood out while reading the thread though:

I’ve been in this field for more than a decade and I couldn’t agree more. The tools, and even the underlying theory, are stuck in the past, and every project I work on is hampered by the technology not meeting the expectations of (business) users who demand a much more intuitive way of working - because that’s what they experience every day on consumer devices.


Almost all OLAP products provide this simple functionality, but it’s almost impossible to find where it is because it is hidden under layers and layers of “added value” in the form of tools to extract, transform and present that data in flashy executive dashboards.

Original title and link: OLAP and Reporting That Feels Like 2012 (NoSQL database©myNoSQL)

BI Pentaho Integrates Hadoop, NoSQL Databases, and Analytic Databases


  • The ability to orchestrate execution of Hadoop related tasks (i.e., executing a Hive Query, Pig Script, or M/R job) as part of a broader IT workflow.
  • The ability to set up dependencies, so if a step fails the job can branch down a recovery path or send a notification, or if it’s a success it goes on to subsequent dependent tasks. Likewise it supports initiating several tasks in parallel.
  • New integration for Pig — so that developers have the ability to execute a Pig job from a PDI Job flow, integrate the execution of Pig jobs in broader IT workflows through PDI Jobs, take advantage of our out of the box scheduler, and so on.
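The branching described in the list above (continue on success, recover or notify on failure, start several tasks in parallel) can be sketched generically in Python. This is not the PDI API; `Step`, `run`, and the step names are hypothetical, and the actions are stand-ins for real Hive, Pig, or M/R calls.

```python
from concurrent.futures import ThreadPoolExecutor

class Step:
    """One task in a job flow, with optional success and failure branches."""
    def __init__(self, name, action, on_success=(), on_failure=()):
        self.name = name
        self.action = action
        self.on_success = list(on_success)
        self.on_failure = list(on_failure)

def run(step, log):
    """Run a step, then follow the branch that matches its outcome."""
    try:
        step.action()
        log.append((step.name, "ok"))
        next_steps = step.on_success
    except Exception:
        log.append((step.name, "failed"))
        next_steps = step.on_failure
    # Steps on the same branch have no ordering between them,
    # so they are started in parallel.
    with ThreadPoolExecutor(max_workers=max(1, len(next_steps))) as pool:
        list(pool.map(lambda s: run(s, log), next_steps))

def failing_hive_query():
    # Stand-in for a Hive query that fails.
    raise RuntimeError("simulated Hive failure")

notify = Step("notify", lambda: None)
load = Step("load_results", lambda: None)
report = Step("build_report", lambda: None)
pig = Step("pig_script", lambda: None, on_success=[load, report])
hive = Step("hive_query", failing_hive_query, on_failure=[notify])

log = []
run(pig, log)    # succeeds, then runs load_results and build_report in parallel
run(hive, log)   # fails, then runs the notify step
for entry in log:
    print(entry)
```

The order of the two parallel steps in the log is not deterministic; everything else is.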

The list of tools Pentaho 4 integrates with is quite long:

  • a long list of traditional RDBMS
  • analytics databases (Greenplum, Vertica, Netezza, Teradata, etc.)
  • NoSQL databases (MongoDB, HBase, etc.)
  • Hadoop variants
  • LexisNexis HPCC

This is the world of polyglot persistence and hybrid data storage.

Original title and link: BI Pentaho Integrates Hadoop, NoSQL Databases, and Analytic Databases (NoSQL database©myNoSQL)

SQL Access to CouchDB Views

Nicholas Goodman:

[…] enabling SQL Access to CouchDB Views […] single, biggest advantage is: The ability to connect, run of the mill, commodity BI tools to your big data system.

While the video below doesn’t show a PRPT it does show Pentaho doing Ad Hoc, drag and drop reporting on top of CouchDB with LucidDB in the middle, providing the connectivity and FULL SQL access to CouchDB. Once again, the overview:

CouchDB → LucidDB → Pentaho

Being able to bring together structured and unstructured data (it doesn’t really matter whether it is Big Data or not) and query it with a language that is familiar to many developers and has tons of tools available represents the future of polyglot persistence.
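As a rough illustration of why SQL on top of view output is useful, the sketch below copies the rows of a CouchDB-style view response into a table and queries it with plain SQL. It uses Python's built-in `sqlite3` purely as a stand-in and is not how LucidDB connects to CouchDB; the sample rows, the `page_views` table, and its column names are assumptions made for the example.

```python
import json
import sqlite3

# Shape of a CouchDB view response: each row carries an id, a key and a value.
# The rows themselves are invented sample data.
view_response = json.loads("""
{"total_rows": 3, "offset": 0, "rows": [
  {"id": "a1", "key": "DE", "value": 3},
  {"id": "a2", "key": "DE", "value": 5},
  {"id": "a3", "key": "US", "value": 7}
]}
""")

# Flatten the view rows into a relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (doc_id TEXT, country TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [(r["id"], r["key"], r["value"]) for r in view_response["rows"]],
)

# Any SQL-speaking tool can now aggregate the data.
totals = conn.execute(
    "SELECT country, SUM(views) FROM page_views GROUP BY country ORDER BY country"
).fetchall()
print(totals)  # [('DE', 8), ('US', 7)]
```

Once the view is exposed as a table, a commodity reporting tool only needs a SQL connection; it does not have to know anything about CouchDB.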

Original title and link: SQL Access to CouchDB Views : Easy Reporting | Goodman on BI (NoSQL database©myNoSQL)