analytic database: All content tagged as analytic database in NoSQL databases and polyglot persistence
Infochimps put together a comprehensive Venn diagram of the database world in the TechCrunch article Big Data Right Now: Five Trendy Open Source Technologies
Original title and link: The Database World in a Venn Diagram ( ©myNoSQL)
after researching how to continuously aggregate and mine our tracking data (200GB and growing fast) for almost a week, I’m still stuck. Is it just me or didn’t I just find the right product yet? I must admit that I’m a generalist developer, no DBA - but to me it looks like all the products I’ve looked into just don’t “feel” right. JasperReports, InfiniDB, Pentaho, just to name a few… it’s all - how can I say - crusty, unintuitive, 1999. All the products look very corporate, there are basically no howto’s, no prices, no shiny, bold Open Source products that fit my bill. I wouldn’t even mind using a good commercial product that does what I want, but even the advertisements are “from DBA to DBA”. Lots of termdropping like ETLs, M/ROLAP, BI, Case-Based-Reasoning - but nothing that looks reasonably simple and straightforward. Maybe I’m spoiled by the DX (Developer Experience) that Backbone and Rails give me, but is there really nobody that has done something simpler and more straightforward? Like “these are my dimensions, these are my facts - now generate my Cube here so I can go datamining”? Now I know this is a huge field and I might sound naive but I’d like to know if there are others sharing my pain or having something closer to a solution.
As usual, lots of links and commentary. Two answers stood out while reading the thread though:
I’ve been in this field for more than a decade and I couldn’t agree more. The tools, and even the underlying theory is stuck in the past, and every project I work on is hampered by the technology not meeting the expectations of (business) users who demand a much more intuitive way of working - because that’s what the experience every day on consumer devices.
Almost all OLAP products provide this simple functionality, but it’s almost impossible to find where it is because it is hidden under layers and layers of “added value” in the form of tools to extract, transform and present that data in flashy executive dashboards.
Original title and link: OLAP and Reporting That Feels Like 2012 ( ©myNoSQL)
Published by a group from Los Alamos National Lab (Hristo Djidjev, Gary Sandine, Curtis Storlie, Scott Vander Wiel):
We propose a method for analyzing traffic data in large computer networks such as big enterprise networks or the Internet. Our approach combines graph theoretical representation of the data and graph analysis with novel statistical methods for discovering pattern and timerelated anomalies. We model the traffic as a graph and use temporal characteristics of the data in order to decompose it into subgraphs corresponding to individual sessions, whose characteristics are then analyzed using statistical methods. The goal of that analysis is to discover patterns in the network traffic data that might indicate intrusion activity or other malicious behavior.
The embedded PDF and download link after the break.
In the blue corner we have IBM with Netezza as analytic database, Cognos for BI, and SPSS for predictive analytics. In the green corner we have EMC with Greenplum and the partnership with SAS. And in the open source corner we have Hadoop and R.
Update: there’s also another corner I don’t know how to color where Teradata and its recently acquired Aster Data partner with SAS.
Who is ready to bet on which of these platforms will be processing more data in the next years?