NoSQL event: All content tagged as NoSQL event in NoSQL databases and polyglot persistence
- ☞ Part 1: Opening Remarks, Hibari, Okuyama, Cassandra, ROMA, MyCassandra
- ☞ Part 2: MongoDB, kumofs
- ☞ Part 3: Couch DB, HBase/Hadoop, Closing Remarks.
Quite a lot to watch over the weekend!
I should start by saying that I love new technology as long as it proves to be useful and it works. For example, I used LiveScribe to record the great NoSQL Evening in Palo Alto event, organized by Tim Anglade with InfiniteGraph’s support, only to discover later that it had decided to lose everything. So, instead of being able to quote things from the event, I’ll have to rely on my memory, which is really, really bad.
After Berlin Buzzwords, this was the largest NoSQL event I’ve participated in. With the support of InfiniteGraph’s people, Tim Anglade did a great job organizing this event and gathered quite a few leaders of the NoSQL market in one panel. Unfortunately there were a few notable absences too; Redis, HBase, Project Voldemort, Neo4j, and RavenDB being the ones I missed. Anyway, knowing how difficult it is to put something like this together, this is understandable.
Before getting to what I remember from the event, I have to tell you that I was impressed that InfiniteGraph did not push their product during the event. They were great hosts and I had enjoyable discussions with many of their people, especially Darren Wood, the lead architect.
Now it is time to test how bad my memory is. In case I got things wrong, please feel free to correct me.
What triggered so much activity in the persistence space?
- cloud computing
- failings do lead to innovation
- the changing nature of the applications
- old ideas reoccurring in customer work
Simplicity has different meanings for different people
NoSQL databases try to be:
- developer friendly
- ops friendly
- user friendly
As a side note, it looks like there is still a myth out there related to NoSQL databases not needing DBAs. While the title doesn’t need to be the same, every NoSQL database will need someone…
What is the market size?
The relational database market size is estimated at $27 billion. No one wanted to go on record with an estimate for the NoSQL database market. The only number I heard mentioned was that “70% of use cases can fit NoSQL solutions” (Roger Bodamer, 10gen).
(New) Data models
The data model determines the access model. This discussion continued over dinner, when people tried to answer the question of how connected SQL is to the relational model.
I also seem to remember some interesting remarks about indexes:
- indexes are a different data model that enables different access models
- indexes will be orders of magnitude larger than real data
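To make the first remark concrete, here is a minimal Python sketch that is not tied to any product discussed at the event. A plain key-value store answers lookups by key only, while a secondary index built over the same records is a second structure that enables lookups by a field value. The names `records`, `build_index`, `city_index` and the sample data are made up for illustration.

```python
from collections import defaultdict

# Primary data model: key -> record. The only efficient access path is by key.
records = {
    "u1": {"name": "Ann", "city": "Berlin"},
    "u2": {"name": "Bob", "city": "Palo Alto"},
    "u3": {"name": "Cyd", "city": "Berlin"},
}


def build_index(store, field):
    """Build a secondary index: field value -> sorted list of primary keys."""
    index = defaultdict(list)
    for key, record in store.items():
        index[record[field]].append(key)
    return {value: sorted(keys) for value, keys in index.items()}


# The index is a separate structure with its own shape and its own access path.
city_index = build_index(records, "city")

by_key = records["u2"]["name"]   # access through the primary model
by_city = city_index["Berlin"]   # access through the index
```

The index has to be stored and kept in sync with the records, which is one way to read the second remark about index size.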
RAM is the new disk
There are many products out there which believe in RAM being the new disk (e.g. VoltDB, elastic caching solutions, etc.). Darren Wood (InfiniteGraph) mentioned that “graph analytics is a counter-example of using RAM storage for all data”.
Many NoSQL databases have chosen to use some form or another of open source licensing models. The reasons for doing it:
- ease market penetration
- there’ll always be companies willing to pay for software, support, streams of patches, etc.
- open source gives great hiring resources
Some other disparate notes:
- Many of the solutions are old, but the wrapping/packaging is new. For example, MarkLogic is good proof that document databases work, even if it is not one of the “cool” products.
- Not all NoSQL databases are about size
- OLTP is for knowledge classification; OLAP is for knowledge discovery
- Will we have multi-purpose NoSQL databases?
- Cold data should be on disk
- Using the right protocol can help you skip supporting specific features. CouchDB is HTTP friendly, so it doesn’t need a caching layer of its own
- Are key-value stores offering too little compared to file systems?
- SQLite has a great distribution model: it is basically everywhere.
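The note above about HTTP-friendly databases can be sketched with a small in-memory simulation of HTTP conditional requests (the `ETag` / `If-None-Match` mechanism). This is only an illustration under stated assumptions: `Origin`, `Cache`, and the hash-based `etag_for` are hypothetical stand-ins, no network is involved, and CouchDB’s own validators are not reproduced here.

```python
import hashlib


def etag_for(body: bytes) -> str:
    """Derive a validator from the content (hypothetical scheme)."""
    return hashlib.sha256(body).hexdigest()[:16]


class Origin:
    """Stand-in for a database that speaks HTTP."""

    def __init__(self):
        self.docs = {}
        self.full_responses = 0  # counts responses that carried a body

    def put(self, path, body):
        self.docs[path] = body

    def get(self, path, if_none_match=None):
        body = self.docs[path]
        tag = etag_for(body)
        if if_none_match == tag:
            return 304, tag, None  # Not Modified: no body is sent
        self.full_responses += 1
        return 200, tag, body


class Cache:
    """Stand-in for any generic HTTP cache placed in front of the origin."""

    def __init__(self, origin):
        self.origin = origin
        self.entries = {}  # path -> (etag, body)

    def get(self, path):
        cached = self.entries.get(path)
        tag = cached[0] if cached else None
        status, new_tag, body = self.origin.get(path, if_none_match=tag)
        if status == 304:
            return cached[1]  # revalidated: serve the stored copy
        self.entries[path] = (new_tag, body)
        return body


origin = Origin()
origin.put("/db/doc1", b'{"v": 1}')
cache = Cache(origin)

first = cache.get("/db/doc1")    # miss: full response from the origin
second = cache.get("/db/doc1")   # revalidated with a 304, body not resent
sent_before_update = origin.full_responses

origin.put("/db/doc1", b'{"v": 2}')
third = cache.get("/db/doc1")    # validator changed: full response again
```

Because the cache only relies on standard HTTP semantics, an off-the-shelf proxy could play its role, which is the point of the remark.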
For more accurate coverage of the event you can read:
Update: we now also have the video!
The ☞ Strange Loop conference hosted two NoSQL talks: ☞ Steve Smith on Real world modeling with MongoDB (PDF) and ☞ Billy Newport on Enterprise NoSQL: Silver Bullet or Poison Pill?.
The first one was on the practical parts of NoSQL; the second offered an “enterprisey” perspective on NoSQL. Victor Olteanu has a ☞ long post summarizing the talks at the conference, including the two mentioned.
Talking about modeling with MongoDB, here is another slidedeck on this subject:
Update: I stand corrected: there were many more NoSQL talks at StrangeLoop.
Riak from Small to Large
Working with Dimensional data in Distributed Hash Tables
Unifying the Search Engine and NoSQL DBMS with a Universal Index
Chris Biow’s slides also available ☞ as PDF.
There were 4 more that I couldn’t track down
HyperGraphDB - Data Management for Complex Systems
Borislav Iordanov’s slides available ☞ here (pdf)
NoSQL At Twitter
Kevin Weil’s slides available ☞ here (pdf)
Adopting Apache Cassandra
Eben Hewitt’s slides available ☞ here (pdf)
Scaling with MongoDB
Roger Bodamer’s slides available ☞ here.
I still cannot figure out how I managed to miss Hadoop World. All I have left (besides being pissed off) is to follow the tweets coming from the NoSQL event.
Topics covered so far:
- Big Data
- Hadoop security
- Case studies
- Facebook: Hadoop, HBase, Hive, Scribe
- Twitter: Hadoop, Scribe, Oozie, Pig
- Hadoop at Chicago Mercantile Exchange
Check out my curated Hadoop World stream.
Yesterday was the NoSQL Frankfurt conference and today we have the chance to review some of the slide decks presented.
Beyond NoSQL with MarkLogic and The Universal Index
The GraphDB Landscape and sones
Achim Friedland (@ahzf) provided a very interesting overview of graph database products, the goals and some scenarios for graph databases, a brief comparison of property graphs with other models (relational databases, object-oriented, semantic web/RDF), and many other interesting aspects.
Data Modeling with Cassandra Column Families
Neo4j Spatial - GIS for the rest of us
Cassandra vs Redis
Tim Lossen’s (@tlossen) slides compare Cassandra and Redis from the perspective of a Facebook game’s requirements. All I can say is that the conclusion is definitely interesting, but you’ll have to check the slides yourselves.
Mastering Massive Data Volumes with Hypertable
Doug Judd, who impressed me with his fantastic Hypertable: The Ultimate Scaling Machine at the Berlin Buzzwords NoSQL conference, gave a talk on Hypertable, its architecture and performance. The presentation also mentioned two Hypertable case studies: Zvents (an analytics platform) and Rediff.com (spam classification).
More presentations will be added as I receive them.
Fantastic presentation by Doug Judd covering not only Hypertable but also other really scalable NoSQL databases: