Thoughts from NoSQL Evening in Palo Alto

I should start by saying that I love new technology as long as it proves to be useful and it works. For example, I used a LiveScribe to record the great NoSQL Evening in Palo Alto event, organized by Tim Anglade with InfiniteGraph’s support, only to discover later that it had decided to lose everything. So, instead of being able to quote things from the event, I’ll have to rely on my memory, which is really, really bad.

After Berlin Buzzwords, this was the largest NoSQL event I’ve participated in. With the support of InfiniteGraph’s people, Tim Anglade did a great job organizing this event and gathered in a panel quite a few leaders of the NoSQL market. Unfortunately there were a few notable absences too: Redis, HBase, Project Voldemort, Neo4j, and RavenDB were the ones I missed. Anyway, knowing how difficult it is to put something like this together, this is understandable.

Before getting to what I remember from the event, I have to tell you that I was impressed by the fact that InfiniteGraph did not push their product during the event. They were great hosts and I had enjoyable discussions with many of their people, especially Darren Wood, the lead architect.

Now it is time to test how bad my memory is. In case I got things wrong, please feel free to correct me.

What triggered so much activity in the persistence space?

  • cloud computing
  • failings do lead to innovation
  • the changing nature of the applications
  • old ideas reoccurring in customer work

Simplicity has different meanings for different people

NoSQL databases try to be:

  • developer friendly
  • ops friendly
  • user friendly

As a side note, it looks like there is still a myth out there related to NoSQL databases not needing DBAs. While the title doesn’t need to be the same, every NoSQL database will need someone…

What is the market size?

The relational database market size is estimated at $27 billion. No one wanted to go on record with an estimate of the NoSQL database market. The only number I heard mentioned was that “70% use cases can fit NoSQL solutions” (Roger Bodamer, 10gen).

(New) Data models

The data model determines the access model. This discussion continued over dinner, when people tried to answer the question of how connected SQL is to the relational model.

I also seem to remember some interesting remarks about indexes:

  • indexes are a different data model that enables different access models
  • indexes will be orders of magnitude larger than real data

RAM is the new disk

There are many products out there built on the belief that RAM is the new disk (e.g. VoltDB, elastic caching solutions). Darren Wood (InfiniteGraph) mentioned that “graph analytics is a counter-example of using RAM storage for all data”.


Many NoSQL databases have chosen one form or another of open source licensing. The reasons for doing so:

  • ease market penetration
  • there’ll always be companies willing to pay for software, support, streams of patches, etc.
  • open source gives great hiring resources

Some other disparate notes:

  • Many of the solutions are old, but the wrapping/packaging is new. For example, MarkLogic is good proof that document databases work, even if it is not one of the “cool” products.
  • Not all NoSQL databases are about size
  • OLTP is for knowledge classification; OLAP is for knowledge discovery
  • Will we have multi-purpose NoSQL databases?
  • Cold data should be on disk
  • Using the right protocol can help you skip supporting specific features. CouchDB is HTTP friendly, so it doesn’t need its own caching layer
  • Are key-value stores offering too little compared to file systems?
  • SQLite has a great distribution model: it is basically everywhere.
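
The CouchDB remark about HTTP friendliness is easier to see with a small sketch. What follows is only a toy, in-memory model of HTTP validation caching (ETag plus If-None-Match), not CouchDB’s actual API: the names DocumentStore and CachingClient, and the choice of a SHA-1 hash of the body as the ETag, are assumptions made purely for illustration. The idea is that when the database speaks plain HTTP, any standard client or proxy can do this kind of caching, so the database doesn’t have to ship a caching layer of its own.

```python
import hashlib
import json


class DocumentStore:
    """Toy stand-in for a document database that speaks HTTP (hypothetical)."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, doc):
        self._docs[doc_id] = doc

    def get(self, doc_id, if_none_match=None):
        """Return (status, etag, body), mimicking a conditional GET."""
        body = json.dumps(self._docs[doc_id], sort_keys=True)
        # Assumption: the ETag is a hash of the body; a real server may
        # derive it from something else, such as a revision identifier.
        etag = hashlib.sha1(body.encode("utf-8")).hexdigest()
        if if_none_match == etag:
            return 304, etag, None  # "Not Modified": no body is sent
        return 200, etag, body


class CachingClient:
    """Generic HTTP-style cache that knows nothing about the database."""

    def __init__(self, store):
        self.store = store
        self.cache = {}  # doc_id -> (etag, body)
        self.bodies_transferred = 0

    def fetch(self, doc_id):
        cached = self.cache.get(doc_id)
        etag = cached[0] if cached else None
        status, new_etag, body = self.store.get(doc_id, if_none_match=etag)
        if status == 304:
            return cached[1]  # serve the cached copy
        self.bodies_transferred += 1
        self.cache[doc_id] = (new_etag, body)
        return body


store = DocumentStore()
store.put("talk", {"title": "NoSQL Evening"})
client = CachingClient(store)
client.fetch("talk")  # first request transfers the body
client.fetch("talk")  # second request is answered with 304, served from cache
```

After these two fetches only one body has crossed the (simulated) wire; updating the document changes its ETag, so the next fetch transfers the new body.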

For more accurate coverage of the event you can read:

Update: we’ve now also got the video!

Original title and link: Thoughts from NoSQL Evening in Palo Alto (NoSQL databases © myNoSQL)