


SQL or NoSQL? The Conclusion is …

Lately there seem to be quite a few articles reviving an idea that is not so new anymore: RAM is the new disk. Some of them connect this to the NoSQL vs SQL debates.

Jim Gray of Microsoft published “Tape is Dead. Disk is Tape. Flash is Disk. RAM Locality is King” (embedded below) in December 2006. There is a nice roundup of opinions on the subject in this InfoQ article: ☞ RAM is the new disk

[M]emory is several orders of magnitude faster than disk for random access to data (even the highest-end disk storage subsystems struggle to reach 1,000 seeks/second). Second, with data-center networks getting faster, it’s not only cheaper to access memory than disk, it’s cheaper to access another computer’s memory through the network. As I write, Sun’s Infiniband product line includes a switch with 9 fully-interconnected non-blocking ports each running at 30Gbit/sec; yow! The Voltaire product pictured above has even more ports; the mind boggles. (If you want the absolute last word on this kind of ultra-high-performance networking, check out Andreas Bechtolsheim’s Stanford lecture.) Tim Bray in ☞ On Grids
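To put the "several orders of magnitude" claim into numbers, here is a rough back-of-the-envelope sketch. The 1,000 seeks/second figure comes from the quote above; the 100 ns RAM access time and the one million reads are my own illustrative assumptions, not measurements.

```python
import math

# Figure taken from the quote above: even high-end disk subsystems
# struggle to reach about 1,000 random seeks per second.
DISK_SEEKS_PER_SEC = 1_000

# Assumption (not from the article): roughly 100 ns per random RAM access.
RAM_ACCESS_SECONDS = 100e-9


def seconds_for_random_reads(n_reads, seconds_per_read):
    """Total time for n_reads strictly sequential random reads."""
    return n_reads * seconds_per_read


n_reads = 1_000_000  # hypothetical workload size
disk_seconds = seconds_for_random_reads(n_reads, 1 / DISK_SEEKS_PER_SEC)
ram_seconds = seconds_for_random_reads(n_reads, RAM_ACCESS_SECONDS)

# How many powers of ten separate the two totals.
orders_of_magnitude = math.log10(disk_seconds / ram_seconds)

print(f"disk: ~{disk_seconds:.0f} s, RAM: ~{ram_seconds:.1f} s, "
      f"gap: ~{orders_of_magnitude:.0f} orders of magnitude")
```

Under these assumptions the disk needs on the order of a quarter of an hour where RAM needs a fraction of a second. Real systems cache, batch and parallelize, so treat this as an intuition aid only.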

Coming back to the present, Nati Shalom of GigaSpaces has published an article, ☞ Why Existing Databases (RAC) are So Breakable!, in which he writes:

Memory can be more reliable than disk

Many people assume that memory is an unreliable data store.
That assumption holds true if your data “lives” on a single machine; in this case if the machine fails or crashes your application crashes. But what if you distribute the data across a cluster of nodes and maintain more than one copy of the data over the network? In this case, if a node crashes the data is not gone; it lives elsewhere and can be continuously served from one of its replicas.
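The idea in the quote can be sketched in a few lines. This is a toy, single-process illustration only, not how GigaSpaces or any particular product implements replication; the `Node` and `ReplicatedStore` classes, the hash-based placement and the replica count are all hypothetical names and choices of mine.

```python
import hashlib


class Node:
    """A hypothetical cluster node that keeps its data only in RAM."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.data = {}


class ReplicatedStore:
    """Toy in-memory store that writes every key to several nodes."""

    def __init__(self, node_names, replicas=2):
        if not 1 <= replicas <= len(node_names):
            raise ValueError("replicas must be between 1 and the node count")
        self.nodes = [Node(name) for name in node_names]
        self.replicas = replicas

    def owners(self, key):
        # Pick `replicas` consecutive nodes, starting at a hash of the key.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(self.replicas)]

    def put(self, key, value):
        # Write a copy to every live owner of the key.
        for node in self.owners(key):
            if node.alive:
                node.data[key] = value

    def get(self, key):
        # Serve the value from the first live replica that has it.
        for node in self.owners(key):
            if node.alive and key in node.data:
                return node.data[key]
        raise KeyError(key)

    def crash(self, name):
        # A crashed node loses everything it held in memory.
        for node in self.nodes:
            if node.name == name:
                node.alive = False
                node.data.clear()


store = ReplicatedStore(["node-a", "node-b", "node-c"], replicas=2)
store.put("user:42", "Ada")

# Crash the first owner of the key; the second replica still serves it.
store.crash(store.owners("user:42")[0].name)
print(store.get("user:42"))
```

The sketch leaves out everything that makes this hard in practice (re-replication after a failure, consistency between copies, network partitions), but it shows why a single node crash does not have to mean data loss.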

The article links to various research papers with real data about disk and RAM reliability.

Then there is Ilya Grigorik’s article ☞ Future of RDBMS is RAM Clouds & SSD in which he writes:

However, while the new storage engines are exciting to see, it is also important to recognize that relational databases still have a bright future ahead - RDBMS systems are headed into main memory, which changes the playing field all together. […] Memory is fast, disks are slow. Nothing is stopping relational systems from taking advantage of main memory or SSD storage.

I think it is wrong to say that only RDBMSs can benefit from the reliability and speed of RAM. NoSQL solutions being built today may adapt faster, while long-established, massive RDBMSs may take a bit longer. But at the end of the day there seems to be broad agreement that RAM is the new disk, and sooner or later all systems will be rethought to take advantage of it.

Jim Gray: Tape is Dead. Disk is Tape. Flash is Disk. RAM Locality is King

In case the embed doesn’t work (please do let me know) you can also download the ☞ PDF or ☞ PPT.