


BI: All content tagged as BI in NoSQL databases and polyglot persistence

MongoDB, BI and Non-Relational Databases

Avoiding storing and maintaining a second copy of large volumes of data is always a good thing.  And if the analysis doesn’t require joining with data from another source, using the original source data can be advantageous.  There are always questions about performance impacts on the operational source, and sometimes security implications as well.  However, the main question is around the types of query possible against a NoSQL store in general or a document-oriented database in this case.  It is generally accepted that normalizing data in a relational database leads to a more query-neutral structure, allowing a wider variety of queries to be handled.  On the other hand, as we saw with the emergence of dimensional schemas and now columnar databases, query performance against normalized databases often leaves much to be desired.  In the case of Operational BI, however, most experience indicates that the queries are usually relatively simple, and closely related to the primary access paths used operationally for the data concerned.  The experience with MongoDB bears this out, at least in the initial analyses users have required.

Let me try rephrasing this for you: MongoDB can be used for operational BI when the following conditions are satisfied:

  1. BI queries are very simple
  2. Collected data is already in the right format and doesn’t require any additional correlations or joins
  3. There’s no need for additional data from external sources
  4. The impact on the performance of the live storage can be ignored
  5. Security is not a concern

How many scenarios fit the above description?

Original title and link: MongoDB, BI and Non-Relational Databases (NoSQL databases © myNoSQL)


NoSQL and its Role in the BI Arena

Business intelligence applications are moving from the traditional connection to an OLAP data source based on relational database systems to the ability to link to and consume data from a variety of disparate sources, including social networks. The ability of a modern BI application to use mashups of data, providing agility when integrating multiple types of data sources, has led to NoSQL being promoted by many as the next big thing within BI. Does this mean that we have seen the end of the SQL-style RDBMS within the BI area? There are many pros and cons for both systems, but I believe that there is still a place for both within the BI arena.

Recently I’ve started to read about data virtualization: a common access layer to heterogeneous data sources.

Original title and link: NoSQL and its Role in the BI Arena (NoSQL databases © myNoSQL)


The Disruptive Value of Distributed Key-Value Stores

Martin Schneider (Basho):

A platform like Riak could save organizations whose specific needs it meets best:

  • Millions of dollars in Oracle license/maintenance
  • Hundreds of thousands a year in BI system license/maintenance
  • Up to hundreds of thousands in sys-admin salary/overhead

This sounds correct in theory. But the last couple of Oracle databases I’ve seen were:

  1. serving multiple applications
  2. sharing data between applications
  3. used for generating tens/hundreds of reports

So, the online storage/OLTP costs equation to beat is:

licenses + operational costs + “data integration costs” + ETL + reporting << licenses + operational costs
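To make the comparison concrete, here is a minimal sketch of it as arithmetic. It assumes the left-hand side is the total cost of the replacement platform (including the integration, ETL, and reporting work the shared database used to absorb) and the right-hand side is the incumbent database. The function names, the `factor` threshold used to stand in for "much less than", and every figure are hypothetical placeholders, not numbers from the post.

```python
def total(costs):
    """Sum the yearly cost items of one platform (same currency unit)."""
    return sum(costs.values())


def beats(replacement, incumbent, factor=0.5):
    """Return True when the replacement costs 'much less' than the incumbent.

    'Much less' is modelled as: replacement total < factor * incumbent total.
    The factor of 0.5 is an assumed threshold, not something from the post.
    """
    return total(replacement) < factor * total(incumbent)


# Placeholder figures only, chosen to show the shape of the comparison.
replacement = {
    "licenses": 100,
    "operations": 200,
    "data_integration": 150,
    "etl": 100,
    "reporting": 150,
}
incumbent = {"licenses": 1000, "operations": 300}

print(total(replacement), total(incumbent), beats(replacement, incumbent))
```

The point of the sketch is that the extra terms on the left can erase the license savings: raise the integration, ETL, and reporting figures and `beats` flips to `False` even though the license line is still far smaller.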

Original title and link: The Disruptive Value of Distributed Key-Value Stores (NoSQL databases © myNoSQL)


Types of Big Data Work

Mike Minelli: Working with big data can be classified into three basic categories […] One is information management, a second is business intelligence, and the third is advanced analytics

Information management captures and stores the information, BI analyzes data to see what has happened in the past, and advanced analytics is predictive, looking at what the data indicates for the future.

There’s also a list of tools for BigData: AsterData (acquired by Teradata), Datameer, Paraccel, IBM Netezza, Oracle Exadata, EMC Greenplum.

Original title and link: Types of Big Data Work (NoSQL databases © myNoSQL)


11 Big-Data Analytics Predictions for 2011

Maybe I’m oversimplifying, but I read Ketan Karia’s[1] “11 BigData analytics predictions for 2011” as:

  • hardware will push BigData analytics forward; software will just catch up and follow the hardware’s lead
  • 2011 will bring more BigData analytics adoption, which also means analytics companies will be more profitable

Here are the original 11 predictions:

  1. We’ll hark the chips, not the hardware

    Many companies keep throwing hardware (especially more servers) at the problem, and the chip industry’s enormous investment in computer performance continues to sit idly by.

  2. Chip scale-out will date MPP and shrink big data networks.

    […] we are going to have the capability to run 256 or 512 cores on a single chip. If companies can do that, it will shrink the amount of hardware required to fuel the big data networks.

  3. Memory will go RAM.

    As the core density of chips and the RAM size keeps rising dramatically, total in-memory data warehouses are now feasible.

  4. Chip companies will spend more on R&D in 2011

    It will become more commonplace for engineers to stay in tune with chip technology advancements as they build out their solutions to manage and leverage huge volumes of business data.

  5. Acceleration of analytics will support the agile enterprise

    Analytics technologies will help businesses be more agile and will become a key business differentiator in 2011 and beyond.

  6. Businesses will sponsor their own analytical capabilities

  7. Analytics gets more embedded into business applications

  8. Open source moves to more hybrid models.

    Hybrid models are being used by companies including JasperSoft, SugarCRM, and Ingres (with our VectorWise database).

  9. Subscriptions stack up by the hour.

  10. Self-service BI gets more attention.

    In 2011, companies will have to accelerate report delivery.

  11. Users want to be “in the moment” with data insights

What is the definition of “prediction”? A guess or a bet or an educated thought?

  1. Ketan Karia: Chief Marketing Officer and senior vice president at Ingres  

Original title and link: 11 Big-Data Analytics Predictions for 2011 (NoSQL databases © myNoSQL)


The New Business Intelligence

Excellent visualization of BI processes on InfoWorld:

[Three diagrams: Big Data, Conventional BI, Operational BI. Credit: InfoWorld]

Original title and link: The New Business Intelligence (NoSQL databases © myNoSQL)

No SQL and Big Data from a Business Intelligence & Data Warehousing Perspective

“No SQL” and Big data appearing in Rick Sherman’s list of overhyped trends in BI and Data warehousing:

7. No SQL: Pundits are confusing the complexity of integrating data with the use of SQL. Enterprise business data is complex because business processes and their relationships are complex. Relational databases are a symptom of that complexity, not the cause. Sorry, but the world of data is complex and no amount of wishful thinking (or avoiding SQL) is going to change that.

9. BIG Data: What is this? Ask 10 vendors and you get 10 answers (based on what they are selling). This can be a trend if somebody can define it.

Original title and link: No SQL and Big Data from a Business Intelligence & Data Warehousing Perspective (NoSQL databases © myNoSQL)


6 Trends Driving Data Warehousing and Business Intelligence

Philip Russom and Curt Monash:

most drivers of change in BI and DW concern four Mega-Trends:

  • size
  • speed
  • interoperability
  • economics
  • new kinds of data
  • increased analytic sophistication

I guess what’s new is the impact of the new kinds of data — I’d probably include here social data, sensor data, the continuously increasing size and new analytic approaches.

Original title and link: 6 Trends Driving Data Warehousing and Business Intelligence (NoSQL databases © myNoSQL)

Business Intelligence - SQL or NoSQL

The problem as I see it for most major business clients is who within their organizations to use to implement a NoSQL BI solution. […] However most current technical staff involved in BI projects in the corporate world will be skilled in SQL style RDBMS applications (Oracle, MS SQL, Microstrategy).

I don’t think there’s a need to rewrite the existing BI tools to work on top of either SQL solutions or NoSQL databases. All that is needed are the right pipes to let the data flow naturally back and forth between BI tools and either SQL or NoSQL databases.

Original title and link: Business Intelligence - SQL or NoSQL (NoSQL databases © myNoSQL)