


BI: All content tagged as BI in NoSQL databases and polyglot persistence

A retrospective of two years of Big Data with Andrew Brust

Andrew Brust on his way out from ZDNet to GigaOm Research:

As much as I chide the Hadoop world for having started out artificially siloed and aloof, it did the industry a great service: it took the mostly-ossified world of databases, data warehouses and BI and made it dynamic again.

Suddenly, the incumbent players had to respond, add value to their products, and innovate rapidly. It’s hard to imagine that having happened without Hadoop.

Original title and link: A retrospective of two years of Big Data with Andrew Brust (NoSQL database©myNoSQL)


Datameer raises $19 Million

Announced yesterday:

“This funding is entirely about allowing us to meet the nonstop global demand for our product. Across every industry, companies are moving past Hadoop science projects and realizing they need a proven big data analytics tool that finally frees them from schemas and ETL,” said Stefan Groschupf, CEO of Datameer.

Funding in the Hadoop space is at a higher level than in the pure NoSQL database market. In the Big Data/BI market it’s easier to grasp who the competitors are and the market potential they’re fighting for. In the NoSQL market, many are still afraid to think that some of these players will actually make (big) dents in incumbents’ market segments.



The three most common ways data junkies are using Hadoop

Shaun Connolly (Hortonworks) lists the 3 most common uses of Hadoop in a guest post on GigaOm:

  1. Data refinery
  2. Data exploration
  3. Application enrichment

Nothing new here, except the new buzzwords used to describe Hadoop use cases that were slowly but steadily establishing themselves as patterns. And even if they sound nicer than ETL, analytics, etc., I doubt anyone needed new terms.


Top 10 trends in Business Intelligence for 2014 from Tableau Software

The month of retrospectives is here. But even better, December is the month of predictions—the time when you need to carefully file links for later claim chowder.

Tableau Software’s trends for 2014:

  1. The end of data scientists.
  2. Cloud business intelligence goes mainstream.
  3. Big data finally goes to the sky.
  4. Agile business intelligence extends its lead.
  5. Predictive analytics, once the realm of advanced and specialized systems, will move into the mainstream.
  6. Embedded business intelligence begins to emerge in an attempt to put analytics in the path of everyday business activities.
  7. Storytelling becomes a priority.
  8. Mobile business intelligence becomes the primary experience for leading-edge organizations.
  9. Organizations begin to analyze social data in earnest.
  10. NoSQL is the new Hadoop.

I hope they don’t really mean the 1st one.



5 different kinds of BI

The earlier post from Rob Klopp about The future of BI made me look for other similar classifications of the BI field [1]. I’ve found this post from Curt Monash:

That could lead to the categories:

  • Both operational and root-cause: Real-time monitoring (or more precisely human real-time).
  • Operational but not root-cause: Operational BI without that monitoring aspect — e.g., checking whether an expense submission makes sense.
  • Root-cause but not operational: Investigative/exploratory BI.
  • Neither operational nor root-cause: Monitoring BI without an immediate operational aspect — e.g., checking the dashboard periodically.

This categorization seems to be more high-level and goal-oriented, while Rob Klopp’s post is clearer about what each of the approaches offers.
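
The quoted categories are just the four combinations of two yes/no questions. A minimal sketch of that mapping, where the function name and the shortened labels are my own and not Curt Monash's:

```python
def bi_category(operational: bool, root_cause: bool) -> str:
    """Map the two yes/no axes onto the four quoted BI categories.

    The labels are shortened paraphrases of the quoted list, not official terms.
    """
    categories = {
        (True, True): "real-time monitoring",
        (True, False): "operational BI without monitoring",
        (False, True): "investigative/exploratory BI",
        (False, False): "periodic monitoring BI",
    }
    return categories[(operational, root_cause)]


# Checking a dashboard from time to time is neither operational nor root-cause.
print(bi_category(operational=False, root_cause=False))
```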

  [1] Regular readers already know that BI is not exactly my area of expertise.



The future of BI: Infographics is the next phase

A great summary of the main directions in the history of BI, and of what comes next, from Rob Klopp:

But an interesting thing was happening outside BI… and this is the point of this note. In the same way that PowerPoint led reporting to charting a new presentation technique called infographics that is emerging. It is the state of the art in data visualization… and Powerpoint… and art… rolled into one. And it is very impactful. I imagine that the next wave of BI tools must embrace this more advanced presentation technique.



How to estimate cost of BI deployment

Boris Evelson lists 7 top categories to consider when calculating the costs of future BI deployments:

The effort and costs associated with professional services, whether you use internal staff or hire contractors, depend not only on the complexity of business requirements like metrics, measures, reports, dashboards, and alerts, but also on the number of data sources you are integrating, the complexity of your data integration processes, and logical and physical data modeling. At the very least Forrester recommends considering the following components and their complexity to estimate development, system integration and deployment effort

It would be interesting if this outline were accompanied by survey data about the average cost of each category. Hint: this would tell us where we’ll see companies focusing their efforts in the next couple of years.



Recognizing the Power of Hadoop: Platfora BI Is Better on Hadoop

Ben Werther announcing the general availability of Platfora BI:

At Platfora, we made a bet that Hadoop’s destiny wasn’t simply to be a cheaper, slower cousin of the relational data warehouse. […] Hadoop is superb at two things — it provides a near-infinite data reservoir where data of all kinds can be landed without needing to figure out how it will be used ahead of time, and it is a slow lumbering freight-train of an engine for crunching and aggregating batches of millions or billions of rows.

They are neither the first nor the last to understand and bet on Hadoop. But in some cases this bet originates more in the financial potential of the Hadoop market than in its technological potential.

Indeed, it’s rarely the case that these two can live apart. When they do, it leads to either a smaller market segment or a shorter lifetime. Looking around at what’s happening in the Hadoop space, technologically and business-wise, I assume many economists would recognize the signs of a long-lived opportunity.

As a side note, I find it interesting that very few articles are looking at two other fundamental aspects of the Hadoop platform, which, in my opinion, were, are and will remain critical to the growth of this market: open source and extensibility. Without either of these two, what we would see would be tons of copycats wasting resources on creating small, indistinguishable clones, plus countless and endless negotiations to extend and integrate the platform. Hadoop is open source and the open source developers working on it have built it with extensibility in mind. The proof is out there and is clear: look at the breadth and depth of the tools around Hadoop.

That’s the power of open source. The way of the future.



Reports Indicate That Part of Your Business Algorithm Is Executed by Humans

Jay Kreps [1] had a very interesting follow-up to GigaOM’s article Why big data might be more about automation than insights:

That article reminded me how immature people’s thinking about the use of data is. They are still thinking about “reports”. Reports indicate that that part of your business algorithm that is executed by a human. When you understand it well enough, whatever you are doing looking at a report a computer can do better and faster. But the real advantage is that computers can disaggregate decisions humans make into many many individual cases and be far more accurate.

The algorithm is:

  1. add instrumentation
  2. visualize data
  3. turn visualization into a report
  4. automate reaction to report
  5. wash, rinse, repeat
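
A minimal sketch of that loop, to make the "automate the reaction" step concrete. Everything here (the event shape, the function names, the threshold) is a made-up illustration, not something from Jay Kreps or LinkedIn:

```python
from collections import Counter


def instrument(events):
    """Step 1: turn raw events into per-page counts (the instrumentation)."""
    return Counter(event["page"] for event in events)


def report(counts, threshold):
    """Steps 2-3: the 'report' a human would read, here pages at or above a threshold."""
    return sorted(page for page, hits in counts.items() if hits >= threshold)


def react(flagged, action):
    """Step 4: apply an automated action where a human used to read the report."""
    return [action(page) for page in flagged]


# Hypothetical sample data; a real system would read from its own event stream.
events = [{"page": "/checkout"}, {"page": "/checkout"}, {"page": "/home"}]
flagged = report(instrument(events), threshold=2)
print(react(flagged, lambda page: f"alert: {page}"))
```

Step 5 is then a matter of running the same pipeline again on the next batch of events, with the action adjusted as the decision becomes better understood.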

  [1] Jay Kreps works at LinkedIn on the SNA team.


7 Questions to Understand What Type of Hadoop Integration BI Vendors Mean

Hadoop certainly plays a key role in the big data revolution, so all Business Intelligence (BI) vendors are jumping on the bandwagon and say that they integrate with Hadoop. But what does that really mean?

Boris Evelson suggests 7 questions to clarify the kind and level of integration with Hadoop each BI vendor is providing. While this is not my space, most often when I hear about this integration it only means: “we can use Hadoop as an ETL tool”.



5 Business Intelligence Suites

Pricing takes a bit of work to nail down; it pays to talk to the nice salespeople and let them help you sort it all out. All of these BI suites have free software downloads, and free community support. All of them offer multiple modules, custom support and engineering services, and training and documentation. So the answer to “how much will it cost” is always “it depends.”

Software licenses and custom engineering are especially “it depends.” Your total first-year costs can easily hit $10,000 for a small shop. Presumably you’ll spend less on support and training after your first year. Still, that’s considerably less than the traditional proprietary offerings from SAP, IBM Cognos, SAS and other old-time proprietary BI vendors.

The 5 solutions mentioned: Jaspersoft, SpagoBI, Pentaho, OpenI, Actuate.



Busting 10 Myths About Hadoop

Philip Russom clarifies some myths about Hadoop and MapReduce circulating inside the BI community:

  1. Hadoop consists of multiple products.
  2. Hadoop is open source but available from vendors, too.
  3. Hadoop is an ecosystem, not a single product.
  4. HDFS is a file system, not a database management system (DBMS).
  5. Hive resembles SQL but is not standard SQL.
  6. Hadoop and MapReduce are related but don’t require each other.
  7. MapReduce provides control for analytics, not analytics per se.
  8. Hadoop is about data diversity, not just data volume.
  9. Hadoop complements a DW; it’s rarely a replacement.
  10. Hadoop enables many types of analytics, not just Web analytics.

I do hope this lack of information and these misconceptions are not real, as otherwise some BI careers would really be in danger.
