


presentation: All content tagged as presentation in NoSQL databases and polyglot persistence

Presentation: MongoDB Internals

Mike Dirolf from 10gen, the company that started the MongoDB project, gave a technical presentation that goes a bit beyond the regular “Introduction to MongoDB”. While the slides, embedded below, might be a bit terse, there is a link at the bottom of each slide that will point you in the right direction for more information about that specific subject.

Some of the things mentioned in the slides:

  • MongoDB wire protocol (slide 7) (documented ☞ here)
  • data file allocation (slide 8) (documented ☞ here)
  • memory management (slides 9-11)
  • replication Oplog (slide 21) (documented ☞ here)

This post was written by new myNoSQL contributor Andrei Maxim. Andrei is a software engineer specializing mainly in web apps, where his “weapon of choice” remains Ruby on Rails. Follow him on Twitter: @xhr

Presentations on Hadoop, HBase, PIG and Cascalog from Hadoop Meet-Up

The Yahoo! Developer Network Blog has ☞ posted the materials presented at Hadoop’s monthly user group meeting. I’ve embedded these below for your convenience:

What’s New With Pig: Alan Gates

Pig is one of the solutions used for data processing and analysis in the NoSQL world. For example, Pig is heavily used at Twitter.

Pig has recently released ☞ two new versions (0.6.0 and 0.7.0), and this talk focuses on the new features included in these versions and on a compatibility plan with Hadoop.

Cascalog: Powerful and easy-to-use data analysis tool for Hadoop: Nathan Marz

Cascalog is a Clojure-based query language solution for Hadoop-stored data analysis. Nathan Marz (BackType) is demoing this cool tool:

HBase and Pig: The Hadoop ecosystem at Twitter: Dmitriy Ryaboy

As already mentioned, Twitter uses HBase, Pig and Hadoop extensively (in their words, Cassandra is OLTP and HBase is OLAP), and Dmitriy provides an overview of their Hadoop-based ecosystem:


Presentation: SQL anti patterns and NoSQL alternatives

One of the presentations at NoSQL Brazil was Gleicon Moraes’ (@gleicon) list of SQL anti patterns which represents a very good checklist for situations in which we should take a step back, reanalyze requirements and figure out if a NoSQL solution might not be a better alternative:

  • The eternal tree
  • Dynamic table creation (and dynamic query building)
  • Table as cache
  • Table as queue
  • Table as log file
  • Stored procedures
  • Row alignment
  • Extreme JOINs
  • Your schema must be printed on an A3 sheet
  • Your ORM issues full queries for dataset iterations

For more details about these, check the slide deck below:

Presentation: Why MongoDB is Awesome

Nicely structured MongoDB intro by John Nunemaker: Easy to try, Easy to understand and Easy to learn:

Building TweetReach with Sinatra, Tokyo Cabinet and Grackle

I’m starting to lose count of the NoSQL-enabled Twitter apps I’ve mentioned on the NoSQL blog (fortunately the consistent tagging helps, so you can find them all under the tag Twitter), but every time I find a new one I feel like posting about it.

This time it is a presentation about building a Twitter utility using Tokyo Cabinet and ☞ Sinatra (a Ruby web framework).

The author concludes with some Tokyo Cabinet lessons learned:

  • Lack of auto-expiration when using as mostly a key-value cache is annoying

  • Would definitely use it again for this type of task

I think it is interesting to note that, of the key-value stores covered here, only Redis comes with support for key expiration.
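To illustrate what the missing auto-expiration means in practice, here is a minimal sketch of how an application could emulate per-key expiry on top of a store that lacks it. It is not Tokyo Cabinet's API: the `ExpiringStore` class, its `put`/`get` methods and the `ttl` parameter are hypothetical names, and a plain Python dict stands in for the real store.

```python
import time


class ExpiringStore:
    """Dict-backed key-value store that emulates per-key expiration.

    Hypothetical illustration: the dict stands in for a store without
    built-in expiry; none of these names come from Tokyo Cabinet.
    """

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expiry timestamp or None)
        self._clock = clock  # injectable so callers can control time

    def put(self, key, value, ttl=None):
        # Store the expiry time next to the value; None means "never expires".
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            # Lazy expiration: the stale entry is dropped only when read.
            del self._data[key]
            return default
        return value
```

Because expiration here is lazy, entries that are never read again stay in the store until something else removes them, which is part of why built-in expiry is more convenient for cache-like workloads.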


Presentation: Introducing Riak and Ripple

A presentation by Sean Cribbs (@seancribbs) on Riak and the Ripple Ruby client library. Together with Kevin Smith’s Introduction to Riak, it should give you a pretty good idea of Riak’s strengths.

My notes:

Mythbusting “scalability”

  • Scalability is not a yes/no question
  • It’s a ratio of benefit to cost
    • Benefits: low latency, throughput, uptime, concurrency, reliability
    • Costs: CPU, RAM, disk, bandwidth, power, hw/sw
  • scalability = bang for your buck (ROI)

What is Riak?

(the horizontal scaling bits)

  • Based on Amazon’s Dynamo (2007)
    • key-value storage
    • masterless, peer-to-peer replication
    • consistent hashing
    • eventual consistency
    • failover - quorums, hinted handoff
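The consistent hashing item in the list above can be sketched generically as follows. This is not Riak's implementation; the `HashRing` class, the `vnodes` parameter, the `preference_list` method and the node names are assumptions made for illustration only.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a point on the ring (a large integer)."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Generic consistent-hash ring (illustrative, not Riak's actual code)."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets several virtual points to even out the distribution.
        self._points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._points]

    def preference_list(self, key, n=3):
        """Return the first n distinct nodes clockwise from the key's position."""
        start = bisect.bisect(self._keys, _hash(key))
        owners = []
        for offset in range(len(self._points)):
            node = self._points[(start + offset) % len(self._points)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == n:
                    break
        return owners
```

A call such as `HashRing(["node-a", "node-b", "node-c"]).preference_list("user:42", n=2)` returns the two nodes that would hold replicas of that key. The point of the technique is that adding a node only takes over the keys that fall next to its new points on the ring, so most keys keep their current owner.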

What is Riak?

(the document database bits)

  • store your objects as JSON (or any format)
  • link between objects (like hypertext)
  • No SQL - query with Javascript Map-Reduce

What is Riak?

(the ops-friendly bits)

  • web-shaped storage
    • store data in its original format
    • it’s just HTTP - same techniques apply
      • load balancing, proxy caches, round-robin DNS
  • no node is special - grow horizontally
  • get some sleep, fix it in the morning

Other topics covered:

  • CRUD in Riak (slide 74)
  • Links (slide 80) and link-walking (slide 83)
  • Riak Map-Reduce (slide 93)
  • Ripple (slide 132)

This Week’s MongoDB Reading List

Articles

☞ Storing ViewState in MongoDb Database

ViewState can serve as both good and evil. Good because it allows to persist the ASP.NET control state during postbacks. Evil because it makes the page size larger and causes more bandwidth to be consumed. In this article we are going to demonstrate how to move the ViewState out of the page and store it in a MongoDb database.

Daniel Wertheim: ☞ Simple-MongoDB – Part 1, Getting started

So, I thought it was time for me to write a “Getting started with MongoDB” article but instead of using Sam Corder’s driver, I will use my own: “Simple-MongoDB”. It will be a series of posts covering this topic. This post is the first and will cover how-to get connected and how-to add some entities.

Daniel Wertheim: ☞ Simple-MongoDB – Part 2, Anonymous types, JSON, Embedded entities and references

This is the second article of me showing some features in the Simple-MongoDB driver. This time I will explore more ways of inserting data. More specifically, I will look at how-to insert Anonymous types as well as how-to insert entities described with JSON. Finally we will look at how to deal with embedded documents and relationships between entities, so called references.

Shiju Varghese: ☞ NoSQL with MongoDB, NoRM and ASP.NET MVC - Part 1

In this post, I will give an introduction to how to work on NoSQL and document database with MongoDB , NoRM and ASP.Net MVC 2.

Shiju Varghese: ☞ NoSQL with MongoDB, NoRM and ASP.NET MVC - Part 2

In this post, let’s discuss on domain entity with deep object graph.

Hernan Garcia: ☞ MongoDb provider for, part 3 – Mappers and more refactoring

Today we need to implement the mapper class for Post. But first let’s check what we have done so far and see if we can improve this a bit. We notice some duplication on the MongoDb class.

Scott Watermasysk: ☞ Dynamics and MongoDB Revisited

In what appeared to be moment of clarity a couple of days ago, I decided to try to use extension methods and add dynamics on top of MongoDB-CSharp.

The first post was mentioned in MongoDB: Articles and Videos

Michael Kennedy: ☞ The NoSQL Movement, LINQ, and MongoDB - Oh My!

I cover the programming model in detail as well as introduce the actual database server below. For some vague motivation, let me just give you a quick look at how you define the data model and maintain it.

You can read more about MongoDB and LINQ in Rob Conery’s Using Mongo With LINQ mentioned in MongoDB: Articles and Videos.


Alex Sharp: ☞ Practical Ruby Projects With Mongo Db

Michael Dirolf: ☞ Introduction to MongoDB

Kyle Banker: ☞ MongoDB: The Way and its Power


☞ MongoDB and Mongoid with Durran Jordan - Hashrocket Lunch ‘n’ Learn

Seth Edwards: ☞ MongoDB

Presentation: NoSQL, Riak and CouchDB: A Very Interesting Association

Mårten Gustafson’s (@martengustafson) slides focus on Riak and CouchDB after a quick intro to the NoSQL ecosystem. I find this association very interesting, but you’ll have to read on to find out why.

Some notes:

Riak


  • decentralized key-value store
  • a flexible map/reduce engine
  • HTTP/JSON API (note: the upcoming version will also support a Protocol Buffers client and API)
  • a database ideally suited for Web applications

Riak “stuff”

Riak - Takeaways

  • No single point of failure
  • Choose your levels for:
    • availability
    • consistency
    • partition tolerance
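One common way Dynamo-style stores expose this choice is through replica and quorum counts. The sketch below assumes the usual N (replicas), R (read quorum) and W (write quorum) naming from the Dynamo design; the slides' notes above do not spell these parameters out, and the function names are hypothetical.

```python
def quorums_overlap(n, r, w):
    """True when every read quorum shares at least one replica with every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    # Two subsets of n replicas with sizes r and w must intersect when r + w > n.
    return r + w > n


def tolerated_failures(n, r, w):
    """Replicas that can be down while both reads and writes can still reach quorum."""
    return n - max(r, w)
```

With three replicas, `quorums_overlap(3, 2, 2)` holds and one replica may be down, while `r = w = 1` tolerates two failures but no longer guarantees that a read sees the latest write. That is the trade between availability and consistency the takeaway refers to.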

Before jumping to the CouchDB section, I should remind you of Kevin Smith’s introduction to Riak presentation.

CouchDB


  • document oriented database
  • Kick ass replication
  • Map/reduce view (index) definitions

CouchDB “stuff”

CouchDB - Takeaways

  • Kick ass replication
  • Views are fast
  • Can host and serve complete webapps

If you don’t think this is enough, you can also check Will Leinweber’s Relaxing with CouchDB video

I don’t know whether this association between CouchDB and Riak was intentional, but personally I find it extremely interesting. Getting started with CouchDB should be quite simple, and it will probably take you far. Once your application needs more scalability, switching to Riak should also be quite simple, not only because of their friendly HTTP/JSON protocols but also because of quite a few similarities between the two products. That’s not to say that Riak and CouchDB lack unique features that you’ll probably miss when moving from one to the other; for example, you’ll need to rework your map/reduce functions a bit.

And now I’ll leave you with the slidedeck:

Presentation: Project Voldemort - Scaling Simple Storage

InfoQ-style video presentation on Project Voldemort:

Jay Kreps discusses the architecture, algorithms, implementation and deployment of Voldemort, a distributed storage system. He also presents the problems solved using Voldemort at LinkedIn.

Some of the features Jay mentions during the talk have already become available in the latest Project Voldemort releases.


Presentation: MapReduce in Simple Terms

Saliya Ekanayake explains what juice blenders and MapReduce have in common. Pretty funny slides!

Just in case you don’t like blenders, you can try out this Quick guide to MapReduce!
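For readers who prefer code to blenders, here is a minimal single-process sketch of the map, shuffle and reduce steps, using the classic word-count example. It is not taken from the slides or the linked guide; all function names are illustrative, and a real framework such as Hadoop would distribute these phases across machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Calling `word_count(["apple banana", "banana cherry banana"])` counts `banana` three times and the other two words once each. Each document is mapped independently and each key is reduced independently, which is what lets the work be spread over many nodes.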

Presentation: Blending NoSQL and SQL at Confoo

Earlier today I wrote about the steps involved in migrating from MySQL to NoSQL. Still, I do feel that in many cases NoSQL and RDBMS will live together under the same project umbrella. Michael Bleigh covers this topic in his presentation, Blending NoSQL and SQL at Confoo:

Presentation: NoSQL: Dealing with the Data Deluge

A presentation by John Quinn (@doofdoofsf) on NoSQL, relational databases and massive amounts of data. In a way, it is a nicer and extended form of NoSQL is here to stay: