


NoSQL future: All content tagged as NoSQL future in NoSQL databases and polyglot persistence

The Data Deluge Makes the Scientific Method Obsolete

Chris Anderson in a 2008 article:

Sixty years ago, digital computers made information readable. Twenty years ago, the Internet made it reachable. Ten years ago, the first search engine crawlers made it a single database. Now Google and like-minded companies are sifting through the most measured age in history, treating this massive corpus as a laboratory of the human condition. They are the children of the Petabyte Age.

The Petabyte Age is different because more is different. Kilobytes were stored on floppy disks. Megabytes were stored on hard disks. Terabytes were stored in disk arrays. Petabytes are stored in the cloud. As we moved along that progression, we went from the folder analogy to the file cabinet analogy to the library analogy to — well, at petabytes we ran out of organizational analogies.

Original title and link: The Data Deluge Makes the Scientific Method Obsolete (NoSQL database©myNoSQL)


Mine Is Bigger Than Yours: Hadoop Code Contributions

Who’s bigger? Hortonworks’ The Yahoo! Effect or Cloudera’s The Community Effect?

This is ugly and should never happen to an open source project.

Still, Joe Brockmeier (RWW) describes this as a superb win-win situation:

It might seem unhealthy for companies to be clamoring for credit in open source projects, but it’s a sign of health for projects. If companies position themselves to be top contributors, and care about their standing, the projects win. Users win too. Developers in the ecosystem also win – since it’s far easier to hire existing contributors than trying to push outsiders in to a project.

But there’s just a minor thing missing. Who gets the cheese?


The Best Month for NoSQL, Big Data, and the Data Space?

Last evening I was trying to catch up with the news in the NoSQL and Big Data space. It looks like nobody wants to pick up the job I’m doing here, except maybe GigaOm’s Infrastructure Curator Derrick Harris.

After skimming for a while through the links I’ve bookmarked, I started to realize that this month, September 2011, is looking like the most exciting month in the data space, including but not limited to NoSQL and NewSQL, Big Data, data analytics, etc. Partnerships, funding rounds, acquisitions, major releases: every couple of days there was news of another very interesting announcement.

You’ve probably read about some of these, but I thought I should group them together so you could get the same feeling I got:

For a while I’ll keep updating this post to point to the most interesting news this month.


SDB Explorer for Amazon SimpleDB

I opened my email this morning just to find one of the daily Mac software deal emails promoting an Amazon SimpleDB tool: SDB Explorer[1]. This reminded me that last month I saw NoSQL mentioned twice on TechMeme. I don’t know if any relational database has ever been mentioned on Oprah, but that’s the next stop for NoSQL databases. NoSQL is mainstream.

  1. Here is also the promotion.  


State of HBase With Michael Stack

Michael Stack (StumbleUpon & Hadoop PMC) presents some of the more interesting HBase deployments, HBase usage scenarios, HBase and HDFS, and the near future of HBase:

The Appealing Future of Big Data and Data Analytics

In a RWW article, David Smith writes about the R statistics language:

Over two million analysts worldwide use R, and they come from an extremely diverse pool of industries that ranges from journalism to financial services to life sciences.

If you replace R with data analytics, this could be seen as a very appealing future of Big Data and data analytics. Something like a generalized version of data analytics at work.

But before losing myself in this perspective, I thought I should take a look at the present and see how what is done now is going to lead to that amazing tomorrow:

  1. Tim O’Reilly said a couple of years ago “Data is the Intel inside” and since then we’re seeing lots and lots of companies trying to materialize this slogan.
  2. More new technologies for storage, processing, and analysis are being developed and are reaching the market than in the previous 10 years.
  3. People are starting to embrace big data, overcoming their fear of privacy invasion.

All these are good signs that we could consider a good basis for the future. On the other hand, the past and today’s reality tell a different story:

  1. Even if technology costs have decreased over time, the investment needed to create a data startup is still high.
  2. Financial institutions are not investing (too much) into data technology companies.
  3. There are only a few companies that are able to accumulate significant amounts of useful data.
  4. There are even fewer companies that are able to use these huge amounts of data effectively.

What worries me is that even if we continue to see both commoditization and impressive improvement of data solutions, by the time all the tools are in place and accessible to everyone, as per the opening paragraph, the really valuable data will reside in just a few well-locked private silos.


What Is Business Intelligence 3.0?

According to Bill Cabiro, citing Tableau Software, the answer is visual analysis:

[…] visual analysis is not a graphical depiction of data. Virtually any software application can produce a chart, gauge or dashboard. Visual analytics offers something much more profound. Visual analytics is the process of analytical reasoning facilitated by interactive visual interfaces.

I’m not sure that a tool providing data visualization with investigative capabilities qualifies as a business intelligence solution. But I can agree it can be quite sexy for the C-level people.



Characteristics of Big Data Application Platform

Nati Shalom describes the main characteristics of a future Big Data platform (as compared to existing application platforms like JavaEE):

  • Support Batch and Real Time analytics
    • Domain model and Data access API
    • Business logic
    • Support new semantics that fit the dynamic web era
    • Provide built-in semantics for handling of the tradeoffs between consistency, availability, scalability rather than trying to force a least common denominator as with XA and JTA
    • Provide built-in support for event driven data distribution using pub/sub model
  • Built in support for public/private cloud
  • Open & consistent management and orchestration across the stack

While having such a platform around sounds compelling, we shouldn’t forget that some of the fundamental parts are still under development or are completely missing. Before having an integrated, uniform Big Data application platform, we should at least attempt to have the right building blocks and ensure that they are created with integration in mind. Even if their customer acquisition budgets allow Spring Data and Hibernate OGM to already work towards unifying data access layers, the lack of real scenarios and the lack of integration between storage and processing solutions might prove them to be too early to market. Anyway, it is good to know that when things start to settle and a much clearer perspective on a Big Data application platform is available, they will already have enough experience to provide us with the right solutions.



Is Red Hat Interested in the Database Market?

Jim Whitehurst (Red Hat CEO):

When I say I don’t want to be a database company, I’m saying that I don’t want to be a SQL database company.

Now that Hadoop has become the tool for handling Big Data, everyone wants a seat at its table. What most seem to ignore is that Hadoop is just one part of the puzzle. There are many interesting solutions around it where the table is not yet so crowded and a seat not yet so expensive.



The Enterprise Opportunity of Big Data: Closing the "Clue Gap"

Dion Hinchcliffe’s excellent article analysing the complexity and the opportunities of Big Data:

Knowledge is where the value is being created in business today, and has been the leading source of economic power for several decades now. Many of the most interesting and intrinsically valuable new businesses are ones that are fundamentally powered, almost directly, by the total sum of their information. […] The ultimate challenge in the end is putting enough useful Big Data capabilities into the hands of the largest number of workers. The organizations that figure out this part will reap corresponding rewards.

Big Data: The Moving Parts



40% Penetration for NoSQL: An Interview With Basho's CEO Don Rippert

Don Rippert interviewed by Derrick Harris (GigaOm):

Enterprises will start adopting NoSQL en masse, Rippert thinks, because the types of data they’re now dealing with require new technologies. “We are the data store for the new type of data being stored,” he explained. […]

That data is largely of the unstructured variety coming from web applications, machines and other sources that aren’t the traditional business-transaction data for which relational databases were created. Relational databases were the answer to almost everything previously, but now Rippert thinks NoSQL is “the answer to about 40 percent of business use cases today”.

A couple of follow-up questions for Don Rippert[1]:

  1. Is your prediction of 40% market share relative to scenarios for large scale, unstructured data with high availability requirements? That would basically mean a 40% market share for just a couple of products: Cassandra, HBase, Riak, Project Voldemort, and (probably) Couchbase.

  2. How is the remaining 60% of the market divided between the other NoSQL databases, NewSQL databases, and the traditional relational databases?

  3. Considering the current market structure, when do you think the shift towards large scale, highly available requirements happened?

  4. How long do you think it will take the market to remodel? What factors will accelerate this transition?

  1. I’d really appreciate it if someone could forward these questions to him.  



Why Big Data Is Such a Hot Topic in the World of Data Management?

Dave Kellogg[1]:

First I think Big Data is a hot topic because it represents the first time in about 30 years that people are rethinking databases. Literally, since about 1980 people haven’t had to think much about databases. If you were an SMB, you went SQL server; if you were enterprise, you’d go Oracle or IBM depending on your enterprise preferences. But in terms of technology, to paraphrase Henry Ford: any color you want, as long it’s relational[2].

Bob Warfield’s recent post NoSQL is a premature optimization, which got a lot of press, shows that this mentality is still in wide use despite the fact that during the last two years NoSQL databases, NewSQL databases, and analytic databases have already proved their strength in various markets and scenarios.

  1. Ex-MarkLogic CEO, ex-Aster Data board member, ex-VP of Marketing at Versant and Ingres  

  2. My emphasis  
