debate: All content tagged as debate in NoSQL databases and polyglot persistence

What Makes It NoSQL?

An interesting post on the NOSQL Group about what it takes for a storage system to be considered NoSQL:

  1. SQL-the-language vs. alternate query languages
  2. A tabular model for data as opposed to one that is not (e.g. key-value, object, graph, …)
  3. ACID vs. non-ACID
  4. Centralized vs. distributed/decentralized

If we agree with the author, Johannes Ernst, we might be tempted to conclude as he does:

It’s interesting to observe that any “NoSQL” product could be “NoSQL” in any number of these dimensions. […]

Which would also explain why so many “NoSQL” products are so dissimilar to each other.

So, what makes it NoSQL?

NoSQL != automatic scalability

James Golick makes an excellent point about the article ☞ Future of RDBMS is RAM Clouds & SSD, which I reviewed and put into more context in SQL or NoSQL? The Conclusion is …:

Most of these new NOSQL systems scale without additional effort.

This simply is not true. Many of them only “scale” using consistent hashing in the client (e.g. redis, tokyo?), which means that you’re still responsible for figuring out how to rebalance shards when the time comes. That’s extra effort.

Many of the popular NoSQL dbs don’t partition at all. Couch certainly doesn’t. Mongo’s “auto-sharding” is still in alpha, and I’m not aware of any major deployments of it.

Cassandra can partition data automatically, but as of the current released version, you can’t remove capacity.

NoSQL != automatic scalability.

So true!
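
To make the "extra effort" concrete, here is a minimal sketch of client-side consistent hashing of the kind Golick describes. Everything in it is an assumption for illustration: the `HashRing` class, the `keys_to_move` helper, and the node names (`redis-a` and so on) are hypothetical and do not come from any particular client library. It shows that the ring only tells the client where a key should live; finding and copying the keys whose owner changed is left to the application.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a point on the ring (MD5 is used only to spread keys)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical client-side consistent hashing over a set of node names."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes   # virtual points per node, smooths the distribution
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def node_for(self, key: str) -> str:
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


def keys_to_move(keys, before: HashRing, after: HashRing):
    """Keys whose owner changes; the client has to copy these itself."""
    return [k for k in keys if before.node_for(k) != after.node_for(k)]


keys = [f"user:{i}" for i in range(1000)]
old_ring = HashRing(["redis-a", "redis-b", "redis-c"])
new_ring = HashRing(["redis-a", "redis-b", "redis-c", "redis-d"])
moved = keys_to_move(keys, old_ring, new_ring)
print(f"{len(moved)} of {len(keys)} keys must be migrated by the application")
```

Consistent hashing keeps the number of moved keys small when a node is added, since only keys now owned by the new node change place. The migration itself (reading those keys from the old nodes, writing them to the new one, and handling writes that arrive in the meantime) is still work the client has to do.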

Make sure that after reading SQL or NoSQL? The Conclusion is …, you do read the comment thread on ☞ Ilya’s post.

SQL or NoSQL? The Conclusion is …

Lately there seem to be quite a few articles reviving an idea that is not so new anymore: RAM is the new disk. Some are connecting this to the NoSQL vs SQL debates.

Jim Gray from Microsoft published “Tape is Dead. Disk is Tape. Flash is Disk. RAM Locality is King” (embedded below) in December 2006. There is a nice roundup of opinions on this subject in this InfoQ article: ☞ RAM is the new disk

[M]emory is several orders of magnitude faster than disk for random access to data (even the highest-end disk storage subsystems struggle to reach 1,000 seeks/second). Second, with data-center networks getting faster, it’s not only cheaper to access memory than disk, it’s cheaper to access another computer’s memory through the network. As I write, Sun’s Infiniband product line includes a switch with 9 fully-interconnected non-blocking ports each running at 30Gbit/sec; yow! The Voltaire product pictured above has even more ports; the mind boggles. (If you want the absolute last word on this kind of ultra-high-performance networking, check out Andreas Bechtolsheim’s Stanford lecture.) Tim Bray in ☞ On Grids

Coming back to the present, Nati Shalom of GigaSpaces has published an article, ☞ Why Existing Databases (RAC) are So Breakable!, in which he writes:

Memory can be more reliable than disk

Many people assume that memory is an unreliable data storage.
That assumption holds true if your data “lives” on a single machine; in this case if the machine fails or crashes your application crashes. But what if you distribute the data across a cluster of nodes and maintain more than one copy of the data over the network? In this case, if a node crashes the data is not gone; it lives elsewhere and can be continuously served from one of its replicas.
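
The replication argument in the quote can be illustrated with a toy sketch. The `ReplicatedMemoryStore` class and the node names below are hypothetical and are not how GigaSpaces or any real data grid works; plain dictionaries stand in for each node's RAM. The sketch assumes every key is written to two distinct nodes and that a crash simply wipes one node's memory.

```python
class ReplicatedMemoryStore:
    """Toy in-memory store that writes every key to `copies` distinct nodes."""

    def __init__(self, node_names, copies=2):
        if copies > len(node_names):
            raise ValueError("need at least as many nodes as copies")
        self.copies = copies
        self.order = list(node_names)                   # fixed placement order
        self.nodes = {name: {} for name in node_names}  # node name -> its "RAM"

    def _replicas(self, key):
        # Pick `copies` consecutive nodes, starting from the key's hash.
        start = hash(key) % len(self.order)
        return [self.order[(start + i) % len(self.order)]
                for i in range(self.copies)]

    def put(self, key, value):
        for name in self._replicas(key):
            if name in self.nodes:          # skip nodes that have crashed
                self.nodes[name][key] = value

    def get(self, key):
        for name in self._replicas(key):
            if name in self.nodes and key in self.nodes[name]:
                return self.nodes[name][key]
        raise KeyError(key)

    def crash(self, name):
        # Losing a node loses everything held in its memory.
        self.nodes.pop(name, None)


store = ReplicatedMemoryStore(["node-0", "node-1", "node-2"], copies=2)
for i in range(100):
    store.put(f"order:{i}", i)

store.crash("node-1")
survivors = [store.get(f"order:{i}") for i in range(100)]
print(len(survivors), "values still readable after one node crashed")
```

With two copies per key, any single crash leaves at least one replica of every key. The sketch deliberately leaves out what a real system must add: restoring the lost copies on the remaining nodes before a second failure occurs.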

The article links to various research papers with real data about disk and RAM reliability.

Then there is Ilya Grigorik’s article ☞ Future of RDBMS is RAM Clouds & SSD in which he writes:

However, while the new storage engines are exciting to see, it is also important to recognize that relational databases still have a bright future ahead - RDBMS systems are headed into main memory, which changes the playing field all together. […] Memory is fast, disks are slow. Nothing is stopping relational systems from taking advantage of main memory or SSD storage.

I do think it is wrong to say that only RDBMS can benefit from the reliability and speed of RAM. NoSQL solutions being built nowadays may be adapting faster, while long-established, massive RDBMS will take a bit longer. But at the end of the day everyone has already agreed that RAM is the new disk, and sooner or later all systems will be rethought to take advantage of this.

Jim Gray: Tape is Dead. Disk is Tape. Flash is Disk. RAM Locality is King

In case the embed doesn’t work (please do let me know) you can also download the ☞ PDF or ☞ PPT.

SQL vs NoSQL Panel at the November 2009 OpenSQLCamp in Portland, Oregon

Judging by the people on the panel, this should be extremely interesting:

  • SQL
    • Brian Aker - Drizzle
    • Monty Widenius - MariaDB
    • Selena Deckelmann - PostgreSQL
  • NoSQL
    • Eric Evans - Cassandra
    • Mike Dirolf - MongoDB
    • Mike Miller - CouchDB

MongoDB and others, convince me. :-)

My experience is that almost any application can be broken up and thought of as tables. Especially in the business world, people naturally think in terms of spreadsheets since the spreadsheet is king there. And a spreadsheet is nothing but a table.

It is always important to try to see things from the other side of the fence too.


Will HTML5 be SQL-free?

A follow-up from Kas Thomas (@kasthomas) to What does NoSQL mean?, offering a different point of view on NoSQL from the perspective of web browsers.

HTML5 talks about SQL quite openly. And it appears Opera, Safari, and (soon) Chrome are implementing WebDB, which is a SQL database in the spirit of the (emerging) Web SQL Database spec. But that’s not to say WebDB is a traditional SQL database. It implements SQLite, which is another beast entirely.


What does NoSQL mean?

A popular question these days has been “What does NoSQL mean?”
Some say it means “Not only SQL” or something.

It’s a different kind of answer than you’d expect. The only hint I’ll give you is that it is related to the upcoming HTML5, but you’ll have to read the rest of the article to discover the answer.


NoSQL No Niche

While this obviously puts the lie to the idea that the market for NoSQL is too early to build a business on, one thing is certain: what people want from NoSQL varies significantly from client to client.

Use cases anyone?


The Confused World of "NoSQL"

I believe it would be beneficial to separate these use-cases and treat them differently (e.g. call one NoSQL and the other DDS for Distributed Data Store).

I’d argue that we should first gather these use cases.


SQL Server, NoSQL, RDBMS, Relational

Think the “NoSQL” movement isn’t prominent on Microsoft’s radar screen?

Well, at least from this post, it is not clear at all.


NoSQL: Not Going Anywhere For a While?

But the challenge with NoSQL is that the name implies that it means something, and that’s enough for folks new to the space to form opinions on the matter.

Sometimes it is not only about the term/name, but also about its origins…