NoSQL: All content tagged as NoSQL in NoSQL databases and polyglot persistence

The Database World in a Venn Diagram

Infochimps put together a comprehensive Venn diagram of the database world in the TechCrunch article “Big Data Right Now: Five Trendy Open Source Technologies”:

The Database World

Original title and link: The Database World in a Venn Diagram (NoSQL database©myNoSQL)

Life of Data at Facebook

Nice screenshot by TechCrunch people of the slide talking about the data lifecycle at Facebook:


Credit TechCrunch

Based on this you’ll now have a better picture of how Facebook data ingestion numbers correlate to their architecture.


A Short History of NoSQL, SQL, NoSQL

This post was written by Konstantin Osipov, one of the authors of the Tarantool key-value store, and was posted on the NoSQL Google group.

Way back in the 1960s databases didn’t separate data representation and data access.

To navigate in an index, a database user had to know the physical structure of the index.

Obvious deficiencies of this approach led to the separation of the data model from the data representation. The relational model is one way to do it, and still the most popular.

One of the most well known deficiencies of a relational model is the so-called object-relational impedance mismatch: there is more than one way to map objects to relations, and none of them fits all access patterns well.

It also has a number of advantages: simplicity, ease of analytical processing, and, let’s not forget, performance: by normalizing data, a user is forced to tell the DBMS more about data constraints, distribution, and future access patterns.

This makes building efficient and to-the-point data representation structures easier.

Unfortunately, past generations of database management systems did not address one of the main architectural drawbacks that plague relational databases: the rigidity of schema changes. Very few mainstream DBMSs allow the structure of a relational database to be changed quickly, without downtime or a significant performance penalty. This is a drawback not of the relational model itself, but of its implementations.

It should also be kept in mind that in many cases a relational model is overkill, and a simple key-to-value mapping is sufficient.

And of course no single model can fit all needs (e.g. graph databases are built around the notion of nodes and edges, yet good luck trying to quickly calculate a CUBE on a bunch of nodes in a graph database).

Unfortunately, the world of NoSQL, when it comes to the data model, often simply takes us back to the 60s: there is minimal abstraction of data access from data representation, and once a certain representation has been chosen, there is no way to change it without rewriting your application (e.g. to fit the new performance profile).

Scalability is an answer, but a silly one: throwing more hardware at a problem is not always economical.
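
The point about data access being tied to data representation can be illustrated with a small sketch. It is only an illustration under stated assumptions: the `users` data, the `user:<id>` key layout, and the function names are all hypothetical and do not come from the original post. The key-value side has to walk the stored representation itself, while the SQL side states what it wants and keeps working unchanged when an index is added underneath it.

```python
import sqlite3

# Key-value style: the key layout *is* the access path (hypothetical data).
kv = {
    "user:1": {"name": "Ann", "city": "Oslo"},
    "user:2": {"name": "Bob", "city": "Rome"},
    "user:3": {"name": "Cid", "city": "Oslo"},
}

def kv_users_in_city(store, city):
    # There is no index on city, so the application scans every value itself.
    # Making this fast means maintaining a second mapping in application code.
    return sorted(v["name"] for v in store.values() if v["city"] == city)

# Relational style: the query says what is wanted, not how to find it.
def sql_users_in_city(conn, city):
    rows = conn.execute(
        "SELECT name FROM users WHERE city = ? ORDER BY name", (city,)
    )
    return [name for (name,) in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Ann", "Oslo"), (2, "Bob", "Rome"), (3, "Cid", "Oslo")],
)

before = sql_users_in_city(conn, "Oslo")
# Change the physical representation; the query code above is untouched.
conn.execute("CREATE INDEX idx_users_city ON users (city)")
after = sql_users_in_city(conn, "Oslo")
```

Both lookups return the same names. The difference is where a change of representation has to be absorbed: in the database for the SQL version, in the application for the key-value version.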


Michael Stonebraker Says in Defense of NewSQL

This is becoming a hot-button issue because IT organizations have invested billions of dollars in SQL. Adding new data management frameworks such as Hadoop will add considerable expense in terms of finding people with the skills needed to manage these platforms. Stonebraker isn’t necessarily against Hadoop; he’s just pointing out that there is no one SQL database engine that fits all requirements and that before IT organizations adopt a NoSQL approach, they should consider other SQL-compatible approaches to solving the same problem.

Translated: What do these NoSQL kids know? My products are always the best. So instead of paying them, why not continue paying me?



The Oracle NoSQL Database and Big Data Appliance

There’s been a lot of speculation about the announcements coming from Oracle’s OpenWorld event. A first part was revealed during the keynote in the form of an in-memory analytics appliance called Exalytics [2]. But there’s talk about a Big Data Appliance and an Oracle NoSQL database.

Here are my predictions[1]:

  1. Oracle has become very aggressive in selling products based on hardware, software, and services. So they’ll announce a Hadoop appliance integrated with an existing Oracle product. It could be either the Oracle Exadata or even the newly announced Exalytics.

    This appliance will place Oracle in competition with all other Hadoop appliance sellers: EMC, NetApp, IBM. Also these days most of the analytics databases try to integrate with Hadoop.

  2. Oracle already has a couple of non-relational solutions in their portfolio: Berkeley DB, TimesTen, Coherence. And they’ve already started to test the NoSQL market by announcing the MySQL and MySQL Cluster NoSQL hybrid systems.

    I don’t expect the Oracle NoSQL database to be a new product, just a rebranding or repackaging of one of the above. Probably TimesTen.

  3. Oracle will invest more into integrating its line of products with Hadoop. Having both a Hadoop and an in-memory analytics appliance will make them very competitive in this space.

  4. Oracle will extend the support for NoSQLish interfaces (memcached) to its other database products.

What are your predictions?

  1. or speculations  

  2. I’m currently gathering more details about Exalytics.  


The Best Month for NoSQL, Big Data, and the Data Space?

Last evening I was trying to catch up with the news in the NoSQL and Big Data space. It looks like nobody wants to pick up the job I’m doing here, except maybe GigaOm’s infrastructure curator Derrick Harris.

After skimming for a while through the links I’ve bookmarked, I started to realize that this month, September 2011, is looking like the most exciting month in the data space, including but not limited to NoSQL and NewSQL, Big Data, and data analytics. Partnerships, funding rounds, acquisitions, major releases: every couple of days there was news of another very interesting announcement.

You’ve probably read about some of these, but I thought I should group them together so you could get the same feeling I got:

For a while I’ll keep updating this post to point to the most interesting news this month.


Will Oracle Win the NoSQL Competition

I agree this title is misleading but the problem is clear: today Oracle does not provide any product that can compete with new cloud computing needs and with the NoSQL movement. It is not possible to think that Oracle’s RAC technology can actually be used in a cloud environment, and a cloud service cannot be deployed over an Exadata either.

The real question though is whether Oracle is really interested in the market currently served by NoSQL databases and/or hybrid solutions. And judging by the latest versions of MySQL and MySQL Cluster[1], it looks like they are testing the waters.

  1. Latest versions of MySQL and MySQL Cluster are adding support for using the Memcached protocol. See NoSQL to MySQL with Memcached  



A Few More SQLish Statements

A few more statements:

  • SQL-based relational database systems are indeed as moribund as NoSQL advocates charge

  • Elephants are not slow because they support SQL.

  • Oracle doesn’t scale,

I assume you already know who’s the author.



NoSQL: It’s Beginning to Look a Lot Like SQL

Stephen O’Grady:

It is too early to handicap the probable outcomes for these various query language projects. Nor is it certain that NoSQL will achieve the same consolidation the relational market did around a single approach; the differing approaches of the various NoSQL projects argue against this, in fact.

If by consolidation we mean having a query language that pseudo-works (by imposing tons of limitations, like GQL), I think we’ll be better off with custom query languages that take full advantage of their underlying NoSQL database engine.

Programming languages are not unified. Nor are file systems. And we are still using them to take full advantage of their unique features.



Is NoSQL a Premature Optimization That’s Worse Than Death? Or the Lady Gaga of the Database World?

I was just preparing for a long trip when Michael Stonebraker created a new storm. I only caught Domas Mituzas’ sharp reply and Werner Vogels’ comment:

scaling data systems in real life has humbled me. I would not dare to criticize an architecture that holds the social graphs of 750M and works

So if you feel like watching an action movie featuring A-class actors, Todd Hoff has summarized the whole conversation, paraphrasing a comment about Lady Gaga:

You know, there’s a difference between not liking someone’s music and not recognizing their talent. If you can’t recognize the fact that Lady GaGa is, in fact, extremely talented in many ways, then you may want to try to look at her with less of a bias. There’s plenty of artists I can’t stand, but still respect their talent.

Even if you don’t like Lady Gaga’s schtick, that is a great performance. I get the feeling a lot of SQL people don’t recognize the talent of NoSQL, whereas NoSQL people are generally use-the-best-tool-for-the-job types who have no problem with you using SQL if that works for you.


What Scales Best?

Tony Bain:

What is best?  Well that comes down to the resulting complexity, cost, performance and other trade-offs.  Trade-offs are key as there are almost always significant concessions to be made as you scale up.


So what is my point? Well I guess what I am saying is physical scalability is of course an important consideration in determining what is best. But it is only one side of the coin. What it “costs” you in terms of complexity, actual dollars, performance, flexibility, availability, consistency etc, etc are all important too. And these are often relative, what is complex for you may not be complex for someone else.

I concur—a long time ago I wrote: Complexity is a dimension of scalability.



NoSQL/NewSQL/MySQL Is Not a Zero Sum Game

Although there will be isolated examples, it is going to be rare, therefore, that any potential adopter would be directly comparing NoSQL and NewSQL technologies unless they are still at the stage trying to figure out the level of consistency required for an individual application.

I believe that the future will bring these technologies together, so being aware of their pros and cons will be essential. Categorizing all storage and processing engines purely by their level of consistency is like saying there’s only transactional data out there. We all know that’s not true at all.
