BI: All content tagged as BI in NoSQL databases and polyglot persistence

Hadoop, HBase and R: Will Open Source Software Challenge BI & Analytics Software Vendors?

Harish Kotadia:

Predictive Analytics has been billed as the next big thing for almost fifteen years, but hasn’t gained mass acceptance so far the way ERP and CRM solutions have. One of the main reasons for this is the high upfront investment required in Software, Hardware and Talent for implementing a Predictive Analytics solution.

Well, this is about to change – […] Using R, HBase and Hadoop, it is possible to build cost-effective and scalable Big Data Analytics solutions that match or even exceed the functionality offered by costly proprietary solutions from leading BI/Analytics software vendors at a fraction of the cost.

Vendors will argue that software licensing represents just a small fraction of the costs of implementing BI or data analytics. What they’ll leave out are the costs of acquiring know-how and, more importantly, the costs of maintaining and modernizing their solutions.

Original title and link: Hadoop, HBase and R: Will Open Source Software Challenge BI & Analytics Software Vendors? (NoSQL database©myNoSQL)

via: http://smartdatacollective.com/hkotadia1/45540/big-data-will-open-source-software-challenge-bi-analytics-software-vendors


Data Science and BI: Similarities and Differences

Data science and BI differ in the foci of their investigations. DS is consumed with supporting the development of data products. As Monica Rogati of LinkedIn notes, “On one side, I’ve been working on building products … The other side is finding interesting stories in the data.” BI, on the other hand, is all about measuring and managing business performance. At their best, though, both disciplines have an evidenced-based “science of business” foundation that makes me reject the contention by some that data science has a higher calling and is more scientifically sophisticated than BI.

Steve Miller emphasizes the difference in maturity between the two fields. I’d say the difference in their approaches is even more important.

Original title and link: Data Science and BI: Similarities and Differences (NoSQL database©myNoSQL)

via: http://www.information-management.com/blogs/data-science-BI-database-Hadoop-Enzee-10021757-1.html


10 BI Trends for 2012 According to Tableau Software

  1. Big data gets even bigger
  2. Self-reliance is the new self-service
  3. The “Consumerization of Enterprise Software” accelerates
  4. Mobile BI goes mainstream
  5. Some companies start to get comfortable with social BI
  6. Companies explore the BI cloud
  7. Most jobs will require analytical skills… leading to talent shortages
  8. BI projects flourish under aligned IT & business
  9. Interactive data visualization becomes a requirement
  10. Hadoop gathers momentum: unstructured data isn’t going anywhere.

It sounds like companies will have to discover a fountain of money to be able to accomplish 2, 6, and 8 within a year.


What's the Relationship Between Traditional Business Intelligence (BI) and Big Data?

Alistair Croll for O’Reilly:

Big data is a successor to traditional BI, and in that respect, there’s bound to be some bloodshed. But both BI and big data are trying to do the same thing: answer questions. If big data gets businesses asking better questions, it’s good for everyone.

Big data is different from BI in three main ways:

  • It’s about more data than BI, and this is certainly a traditional definition of big data.
  • It’s about faster data than BI, which means exploration and interactivity, and in some cases delivering results in less time than it takes to load a web page.
  • It’s about unstructured data, which we only decide how to use after we’ve collected it and need algorithms and interactivity in order to find the patterns it contains.

Original title and link: What’s the Relationship Between Traditional Business Intelligence (BI) and Big Data? (NoSQL database©myNoSQL)

via: http://radar.oreilly.com/2011/11/big-data-business-enterprise.html


Big Data Focus Shifting to Analytics and Visualization

Jeff Kelly:

To reiterate, there’s still plenty of work to do on the infrastructure layer of Hadoop and other Big Data approaches. But the focus of the Big Data industry is — and should be — moving to include analytics and visualization.

Put differently, data is not the end goal.

Original title and link: Big Data Focus Shifting to Analytics and Visualization (NoSQL database©myNoSQL)

via: http://wikibon.org/blog/hadoop-big-data-focus-shifting-to-analytics-and-visualization/


What Is Business Intelligence 3.0?

According to Bill Cabiro, citing Tableau Software, the answer is visual analysis:

[…] visual analysis is not a graphical depiction of data. Virtually any software application can produce a chart, gauge or dashboard. Visual analytics offers something much more profound. Visual analytics is the process of analytical reasoning facilitated by interactive visual interfaces.

I’m not sure that a tool providing data visualization with investigative capabilities qualifies as a business intelligence solution. But I can agree it can be quite sexy for the C-level people.

Original title and link: What Is Business Intelligence 3.0? (NoSQL database©myNoSQL)

via: http://blog.strat-wise.com/2011/08/04/what-is-bi-30.aspx


Top 5 Priorities of BI Professionals

Bob Zurek lists the top 5 priorities of BI people:

  • Performance
  • Speed of loading data for analysis
  • What do we do with all this social data?
  • Keeping the BI lights always on
  • Getting to project completion in shorter bursts of time

Taking a step back, these top priorities are the same for everyone working with Big Data and storage in general.

Original title and link: Top 5 Priorities of BI Professionals (NoSQL database©myNoSQL)

via: http://agileanalytics.endeca.com/2011/08/what-keeps-agile-bi-professionals-up-at-night-3/


BI Pentaho Integrates Hadoop, NoSQL Databases, and Analytic Databases

Dr. Dobb’s:

  • The ability to orchestrate execution of Hadoop related tasks (i.e., executing a Hive Query, Pig Script, or M/R job) as part of a broader IT workflow.
  • The ability to setup dependencies, so if a step fails the job can branch down a recovery path or send a notification, or if it’s a success it goes on to subsequent dependent tasks. Likewise it supports initiating several tasks in parallel.
  • New integration for Pig — so that developers have the ability to execute a Pig job from a PDI Job flow, integrate the execution of Pig jobs in broader IT workflows through PDI Jobs, take advantage of our out of the box scheduler, and so on.
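
The orchestration behavior described in the list above (dependencies, a recovery or notification branch on failure, and parallel tasks) can be illustrated with a small Python sketch. This is not Pentaho's PDI API: `run_step`, `run_workflow`, and the stand-in task functions are hypothetical names invented for illustration, and the "Hive", "Pig", and "M/R" steps are plain functions rather than real Hadoop jobs.

```python
from concurrent.futures import ThreadPoolExecutor


def run_step(name, action, on_failure=None):
    """Run one step; if it raises, branch to the recovery action if one is given."""
    try:
        return (name, "ok", action())
    except Exception as exc:
        if on_failure is not None:
            return (name, "recovered", on_failure(exc))
        return (name, "failed", str(exc))


# Hypothetical stand-ins for a Hive query, a Pig script, and an M/R job.
def hive_query():
    return "hive rows"


def pig_script():
    raise RuntimeError("pig script failed")


def notify(exc):
    return f"notification sent: {exc}"


def mr_job():
    return "mr output"


def run_workflow():
    log = []
    # The two independent steps are started in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(run_step, "hive", hive_query),
            pool.submit(run_step, "pig", pig_script, notify),
        ]
        log.extend(f.result() for f in futures)
    # The dependent step runs only if no upstream step failed outright.
    if all(status != "failed" for _, status, _ in log):
        log.append(run_step("mapreduce", mr_job))
    return log


if __name__ == "__main__":
    for entry in run_workflow():
        print(entry)
```

In this sketch a step that takes its recovery branch counts as handled, so the dependent step still runs; a real workflow engine would let you choose whether a recovered step should block downstream tasks.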

The list of tools Pentaho 4 integrates with is quite long:

  • a long list of traditional RDBMS
  • analytics databases (Greenplum, Vertica, Netezza, Teradata, etc.)
  • NoSQL databases (MongoDB, HBase, etc.)
  • Hadoop variants
  • LexisNexis HPCC

This is the world of polyglot persistence and hybrid data storage.

Original title and link: BI Pentaho Integrates Hadoop, NoSQL Databases, and Analytic Databases (NoSQL database©myNoSQL)


Business Intelligence for Big Data: What Is Missing?

Eric Rogge:

Still, even with improving connections between BI and unstructured data stores, the challenge with today’s business intelligence deployments is that they only enable quantitative analysis of a fraction of an enterprises’ information assets. That’s because the majority of information available to an enterprise is unstructured content held in documents, e-mail messages, collaboration forums, and on the Web. Enterprises now realize that to have a complete, 360-degree view of their operations, they need to analyze that unstructured data. That analysis involves both qualitative assessments as well as quantitative analytics. The challenge of BI isn’t storing the unstructured data; it is the significant back-end development work needed to gather and quantify unstructured information sources.

Missing from an enterprise’s portfolio of BI tools are search and semantic processing technology, which can efficiently process unstructured data into gists and metrics, plus handle large volumes of data from widely dispersed sources.

Just yesterday, I wrote in two separate posts that:

  1. the value of BigData resides both in its volume and in the possibilities to enhance it with metadata and link it with other data sets
  2. bringing together both structured and unstructured data is the future

Original title and link: Business Intelligence for Big Data: What Is Missing? (NoSQL database©myNoSQL)

via: http://tdwi.org/articles/2011/06/08/googlizing-bi-with-search-based-applications.aspx


Big Data Has a Secret

Nicholas Goodman’s (LucidDB) sharp vision about Business Intelligence on Big Data:

It’s just a bunch of technology that propeller heads (I am one myself) sling code with that crunch data to get data into custom built reporting type applications. Unlike SQL databases, they’re NOT ACCESSIBLE to analysts, and reporting tools for easy report authoring and for businesses to quickly and easily write reports.

Until businesses get to ACTUALLY USE Big Data systems (and not via proxy built IT applications) its value to the business will be minimal. When businesses get to use Big Data systems directly, there will be dramatic benefit to the business in terms of timeliness, decision making, and insights.

A must read.

Original title and link: Big Data Has a Secret (NoSQL database©myNoSQL)

via: http://www.nicholasgoodman.com/bt/blog/2011/06/08/a-different-vision-for-the-value-of-big-data/


SQL Access to CouchDB Views

Nicholas Goodman:

[…] enabling SQL Access to CouchDB Views […] single, biggest advantage is: The ability to connect, run of the mill, commodity BI tools to your big data system.

While the video below doesn’t show a PRPT it does show Pentaho doing Ad Hoc, drag and drop reporting on top of CouchDB with LucidDB in the middle, providing the connectivity and FULL SQL access to CouchDB. Once again, the overview:

[Image: overview of CouchDB, LucidDB, and Pentaho]

Being able to bring together both structured and unstructured data (it doesn’t really matter whether it is BigData or not) and query it with a language that is familiar to many developers and has tons of tools available represents the future of polyglot persistence.
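
To make the idea of SQL over a CouchDB view concrete, here is a minimal Python sketch. It is not how LucidDB provides connectivity: for simplicity it copies the rows of a view response into an in-memory SQLite table and queries that. The sample data, the table name `sales_by_category`, the column names, and the `load_view` helper are all assumptions made up for illustration; only the response shape (`total_rows`, `offset`, and `rows` with `id`, `key`, `value`) follows CouchDB's view format.

```python
import json
import sqlite3

# Hypothetical sample shaped like a CouchDB view response.
VIEW_RESPONSE = json.loads("""
{"total_rows": 3, "offset": 0, "rows": [
  {"id": "order-1", "key": "books", "value": 12.5},
  {"id": "order-2", "key": "music", "value": 8.0},
  {"id": "order-3", "key": "books", "value": 7.5}
]}
""")


def load_view(conn, table, response):
    """Copy view rows into a relational table with one column per row field."""
    conn.execute(
        f"CREATE TABLE {table} (doc_id TEXT, view_key TEXT, view_value REAL)"
    )
    conn.executemany(
        f"INSERT INTO {table} VALUES (?, ?, ?)",
        [(r["id"], r["key"], r["value"]) for r in response["rows"]],
    )


conn = sqlite3.connect(":memory:")
load_view(conn, "sales_by_category", VIEW_RESPONSE)

# Once the rows are relational, any SQL-speaking tool could run a query like this.
totals = conn.execute(
    "SELECT view_key, SUM(view_value) FROM sales_by_category "
    "GROUP BY view_key ORDER BY view_key"
).fetchall()
print(totals)
```

Copying the rows keeps the sketch self-contained; the appeal of the approach in the quoted post is that the SQL layer sits between the reporting tool and CouchDB, so the reporting tool only ever sees SQL.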

Original title and link: SQL Access to CouchDB Views : Easy Reporting | Goodman on BI (NoSQL database©myNoSQL)

via: http://www.nicholasgoodman.com/bt/blog/2011/06/22/sql-access-to-couchdb-views-easy-reporting/


MongoDB, BI and Non-Relational Databases

Avoiding storing and maintaining a second copy of large volumes of data is always a good thing.  And if the analysis doesn’t require joining with data from another source, using the original source data can be advantageous.  There are always questions about performance impacts on the operational source, and sometimes security implications as well.  However, the main question is around the types of query possible against a NoSQL store in general or a document-oriented database in this case.  It is generally accepted that normalizing data in a relational database leads to a more query-neutral structure, allowing a wider variety of queries to be handled.  On the other hand, as we saw with the emergence of dimensional schemas and now columnar databases, query performance against normalized databases often leaves much to be desired.  In the case of Operational BI, however, most experience indicates that the queries are usually relatively simple, and closely related to the primary access paths used operationally for the data concerned.  The experience with MongoDB bears this out, at least in the initial analyses users have required.

Let me try rephrasing this for you: MongoDB can be used for operational BI when the following conditions are satisfied:

  1. BI queries are very simple
  2. Collected data is already in the right format and doesn’t require any additional correlations or joins
  3. There’s no need for additional data from external sources
  4. The impact on the performance of the live storage can be ignored
  5. Security is not a concern

How many scenarios fit the above description?

Original title and link: MongoDB, BI and Non-Relational Databases (NoSQL database©myNoSQL)

via: http://www.b-eye-network.com/blogs/devlin/archives/2011/06/mongodb_bi_and.php