Presentation: All content tagged as Presentation in NoSQL databases and polyglot persistence
Friday, 28 May 2010
Presentation: MongoDB Internals
Mike Dirolf from 10gen, the company that started the MongoDB project, gave a technical presentation that goes a bit beyond the regular “Introduction to MongoDB”. While the slides, embedded below, might be a bit terse, there is a link at the bottom of each slide that will send you in the right direction for more information about that specific subject.
Some of the things mentioned in the slides:
- MongoDB wire protocol (slide 7) (documented ☞ here)
- data file allocation (slide 8) (documented ☞ here)
- memory management (slides 9-11)
- replication Oplog (slide 21) (documented ☞ here)
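To give a feel for the wire protocol item above, here is a minimal sketch of the standard message header as I understand it from the documented protocol: four little-endian 32-bit integers (total message length, request id, response-to id, opcode). The helper names `pack_header` and `unpack_header` are mine, not part of any driver, and the sketch does not build a message body.

```python
import struct

# Standard message header: four little-endian signed 32-bit integers.
HEADER = struct.Struct("<iiii")
OP_QUERY = 2004  # opcode of a query message in the documented protocol


def pack_header(body_length, request_id, response_to, op_code):
    """Return the 16-byte header for a message whose body is body_length bytes.

    The length field counts the whole message, header included.
    """
    return HEADER.pack(HEADER.size + body_length, request_id, response_to, op_code)


def unpack_header(data):
    """Split the first 16 bytes into (message_length, request_id, response_to, op_code)."""
    return HEADER.unpack(data[:HEADER.size])
```

A driver would append the opcode-specific body after this header; the linked documentation is the reference for those layouts.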
This post is written by new myNoSQL contributor Andrei Maxim. Andrei is a software engineer specializing mainly in web apps, where his “weapon of choice” remains Ruby on Rails. Follow him on Twitter: @xhr
Wednesday, 26 May 2010
Presentations on Hadoop, HBase, PIG and Cascalog from Hadoop Meet-Up
The Yahoo! Developer Network Blog has ☞ posted the materials presented at Hadoop’s monthly user group meeting. I’ve embedded these below for your convenience:
What’s New With Pig: Alan Gates
Pig is one of the solutions used for data processing/analysis in the NoSQL world. For example Pig is heavily used at Twitter.
Pig recently released ☞ two new versions (0.6.0 and 0.7.0); this talk focuses on the new features included in these versions and on a compatibility plan with Hadoop.[1]
Cascalog: Powerful and easy-to-use data analysis tool for Hadoop: Nathan Marz
Cascalog is a Clojure-based query language solution for Hadoop-stored data analysis. Nathan Marz (BackType) is demoing this cool tool:
HBase and Pig: The Hadoop ecosystem at Twitter: Dmitriy Ryaboy
As already mentioned Twitter is extensively using HBase, Pig and Hadoop — in their words Cassandra is OLTP and HBase is OLAP — and Dmitriy provides an overview of their Hadoop-based ecosystem:
References
- [1] Yahoo! Developer Network Blog has an article on this topic ☞ Towards Enterprise-Class Compatibility for Apache Hadoop. Considering that after the last release HBase has become a top-level Apache project and that there’s a very strong userbase for HBase and Hadoop, ensuring a healthy ecosystem for all these projects is extremely important. (↩)
Friday, 21 May 2010
Presentation: SQL anti patterns and NoSQL alternatives
One of the presentations at NoSQL Brazil was Gleicon Moraes’ (@gleicon) list of SQL anti patterns which represents a very good checklist for situations in which we should take a step back, reanalyze requirements and figure out if a NoSQL solution might not be a better alternative:
- The eternal tree
- Dynamic table creation (and dynamic query building)
- Table as cache
- Table as queue
- Table as log file
- Stored procedures
- Row alignment
- Extreme JOINs
- Your schema must be printed on an A3 sheet
- Your ORM issues full queries for dataset iterations
For more details about these, check the slide deck below:
Wednesday, 19 May 2010
Presentation: Why MongoDB is Awesome
Nicely structured MongoDB intro by John Nunemaker: Easy to try, Easy to understand and Easy to learn:
Thursday, 13 May 2010
Building TweetReach with Sinatra, Tokyo Cabinet and Grackle
I’m starting to forget how many Twitter NoSQL-enabled apps I’ve mentioned on the NoSQL blog (fortunately the consistent tagging helps, so you can find them all under the tag Twitter), but every time I find a new one I feel like posting about it.
This time it is a presentation about building a Twitter utility using Tokyo Cabinet and ☞ Sinatra (a Ruby web framework).
The author concludes with some Tokyo Cabinet lessons learned:
- Lack of auto-expiration when using as mostly a key-value cache is annoying
- Would definitely use it again for this type of task
I think it is interesting to note that, of the key-value stores covered here, only Redis comes with support for key expiration.
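When the store itself has no key expiration, the application has to handle it. Below is a minimal sketch of one way to do that, assuming a plain Python dict as a stand-in for the key-value store; the `ExpiringStore` class and its methods are hypothetical and not taken from the presentation or from any Tokyo Cabinet binding.

```python
import time


class ExpiringStore:
    """Wraps a dict-like store and treats entries past their deadline as missing."""

    def __init__(self, backend=None, clock=time.time):
        # backend is a stand-in for the real store; clock is injectable for testing.
        self._backend = {} if backend is None else backend
        self._clock = clock

    def set(self, key, value, ttl_seconds):
        # Store the absolute expiry time next to the value.
        self._backend[key] = (self._clock() + ttl_seconds, value)

    def get(self, key, default=None):
        entry = self._backend.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Lazy expiration: remove the stale entry on first access.
            del self._backend[key]
            return default
        return value
```

Expired entries are only removed when they are read, so a real deployment would also need some periodic sweep to reclaim space.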
Friday, 7 May 2010
Presentation: Introducing Riak and Ripple
A presentation by Sean Cribbs (@seancribbs) on Riak and the Ripple Ruby client library. Together with Kevin Smith’s Introduction to Riak you should get a pretty good idea of Riak’s strengths.
My notes:
Mythbusting “scalability”
- Scalability is not a yes/no question
- It’s a ratio of benefit to cost
- Benefits: low latency, throughput, uptime, concurrency, reliability
- Costs: CPU, RAM, disk, bandwidth, power, hw/sw
- scalability = bang for your buck (ROI)
What is Riak?
(the horizontal scaling bits)
- Based on Amazon’s Dynamo (2007)
- key-value storage
- masterless, peer-to-peer replication
- consistent hashing
- eventual consistency
- failover - quorums, hinted handoff
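The consistent hashing item in the list above can be illustrated with a small generic ring. This is a sketch of the general technique, not Riak's actual partitioning scheme; the `HashRing` class, the SHA-1 choice, and the `vnodes` parameter are my own assumptions for the example.

```python
import bisect
import hashlib


class HashRing:
    """Maps keys to nodes by walking clockwise on a ring of hashed node points."""

    def __init__(self, nodes, vnodes=64):
        # Each node is placed on the ring at several points to even out the load.
        self._ring = []
        for node in nodes:
            for i in range(vnodes):
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._positions = [pos for pos, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key):
        # First ring point after the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._positions, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

The point of the technique is that adding a node only moves the keys that now belong to the new node; everything else stays where it was.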
What is Riak?
(the document database bits)
- store your objects as JSON (or any format)
- link between objects (like hypertext)
- No SQL - query with Javascript Map-Reduce
What is Riak?
(the ops-friendly bits)
- web-shaped storage
- store data in its original format
- it’s just HTTP - same techniques apply
- load balancing, proxy caches, round-robin DNS
- no node is special - grow horizontally
- get some sleep, fix it in the morning
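Since “it’s just HTTP”, storing and fetching an object boils down to a PUT and a GET. The sketch below only builds the requests and does not send them. The host, the port 8098, and the `/riak/<bucket>/<key>` URL layout are assumptions on my part about a default local setup, so check them against your installation; the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8098/riak"  # assumed host, port and path prefix


def object_url(bucket, key):
    """Build the URL for one object, escaping bucket and key."""
    return "{}/{}/{}".format(
        BASE_URL,
        urllib.parse.quote(bucket, safe=""),
        urllib.parse.quote(key, safe=""),
    )


def build_put(bucket, key, document):
    """Request that would store the document as JSON."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        object_url(bucket, key),
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_get(bucket, key):
    """Request that would fetch the stored object."""
    return urllib.request.Request(object_url(bucket, key), method="GET")
```

Against a running node, passing either request to `urllib.request.urlopen` would perform the call; because these are ordinary HTTP requests, the load balancers and proxy caches from the list above apply unchanged.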
Other topics covered:
- CRUD in Riak (slide 74)
- Links (slide 80) and link-walking (slide 83)
- Riak Map-Reduce (slide 93)
- Ripple (slide 132)
Wednesday, 5 May 2010
This Week’s MongoDB Reading List
Articles
HighOnCoding.com: ☞ Storing ViewState in MongoDb Database
ViewState can serve as both good and evil. Good because it allows to persist the ASP.NET control state during postbacks. Evil because it makes the page size larger and causes more bandwidth to be consumed. In this article we are going to demonstrate how to move the ViewState out of the page and store it in a MongoDb database.
Daniel Wertheim: ☞ Simple-MongoDB – Part 1, Getting started
So, I thought it was time for me to write a “Getting started with MongoDB” article but instead of using Sam Corder’s driver, I will use my own: “Simple-MongoDB”. It will be a series of posts covering this topic. This post is the first and will cover how-to get connected and how-to add some entities.
Daniel Wertheim: ☞ Simple-MongoDB – Part 2, Anonymous types, JSON, Embedded entities and references
This is the second article of me showing some features in the Simple-MongoDB driver. This time I will explore more ways of inserting data. More specifically, I will look at how-to insert Anonymous types as well as how-to insert entities described with JSON. Finally we will look at how to deal with embedded documents and relationships between entities, so called references.
Shiju Varghese: ☞ NoSQL with MongoDB, NoRM and ASP.NET MVC - Part 1
In this post, I will give an introduction to how to work on NoSQL and document database with MongoDB , NoRM and ASP.Net MVC 2.
Shiju Varghese: ☞ NoSQL with MongoDB, NoRM and ASP.NET MVC - Part 2
In this post, let’s discuss on domain entity with deep object graph.
Hernan Garcia: ☞ MongoDb provider for BlogEngine.net, part 3 – Mappers and more refactoring
Today we need to implement the mapper class for Post. But first let’s check what we have done so far and see if we can improve this a bit. We notice some duplication on the MongoDb class.
Scott Watermasysk: ☞ Dynamics and MongoDB Revisited
In what appeared to be moment of clarity a couple of days ago, I decided to try to use extension methods and add dynamics on top of MongoDB-CSharp.
The first post was mentioned in MongoDB: Articles and Videos
Michael Kennedy: ☞ The NoSQL Movement, LINQ, and MongoDB - Oh My!
I cover the programming model in detail as well as introduce the actual database server below. For some vague motivation, let me just give you a quick look at how you define the data model and maintain it.
You can read more about MongoDB and LINQ in Rob Conery’s Using Mongo With LINQ mentioned in MongoDB: Articles and Videos.
Presentations:
Alex Sharp: ☞ Practical Ruby Projects With Mongo Db
Michael Dirolf: ☞ Introduction to MongoDB
Kyle Banker: ☞ MongoDB: The Way and its Power
Videos
☞ MongoDB and Mongoid with Durran Jordan - Hashrocket Lunch ‘n’ Learn
Seth Edwards: ☞ MongoDB
Tuesday, 4 May 2010
Presentation: NoSQL, Riak and CouchDB: A Very Interesting Association
Mårten Gustafson’s (@martengustafson) slides focus on Riak and CouchDB after a quick intro to the NoSQL ecosystem. I find this association very interesting, but you’ll have to read on to find out why.
Some notes:
Riak
- decentralized key-value store
- a flexible map/reduce engine
- HTTP/JSON API (note: the upcoming version will also support a Protocol Buffers client and API)
- a database ideally suited for Web applications
Riak “stuff”

Riak - Takeaways
- No single point of failure
- Choose your levels for:
  - availability
  - consistency
  - partition tolerance
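One common way to reason about “choosing your levels” in Dynamo-style stores is the N/R/W arithmetic: with N replicas, a read quorum R and a write quorum W, every read overlaps the latest write when R + W > N. The sketch below assumes that model; the function names are mine and the slides may present the trade-off differently.

```python
def quorums_overlap(n, r, w):
    """True when every read quorum shares at least one replica with every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


def tolerated_failures(n, quorum):
    """Number of replicas that can be down while the quorum can still be met."""
    return n - quorum
```

For example, with three replicas, R = W = 2 gives overlapping quorums and survives one failed replica, while R = W = 1 favors availability and latency at the cost of possibly stale reads.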
Before jumping to the CouchDB section, I should remind you of Kevin Smith’s introduction to Riak presentation.
CouchDB
- document oriented database
- Kick ass replication
- HTTP/JSON API
- Map/reduce view (index) definitions
CouchDB “stuff”

CouchDB - Takeaways
- Kick ass replication
- Views are fast
- Can host and serve complete webapps
If you don’t think this is enough, you can also check Will Leinweber’s Relaxing with CouchDB video.
I don’t know whether this association between CouchDB and Riak was intentional, but personally I find it extremely interesting. Getting started with CouchDB should be quite simple, and it will probably take you far. Once your application needs more scalability, moving to Riak should also be quite simple, not only because of their friendly HTTP/JSON protocols but also because of quite a few similarities between the two products. That’s not to say that Riak and CouchDB don’t have unique features that you’ll probably miss when moving from one to the other; for example, you’ll need to rework your map/reduce functions a bit.
And now I’ll leave you with the slidedeck:
Monday, 26 April 2010
Presentation: Project Voldemort - Scaling Simple Storage
InfoQ-style video presentation on Project Voldemort:
Jay Kreps discusses the architecture, algorithms, implementation and deployment of Voldemort, a distributed storage system. He also presents the problems solved using Voldemort at LinkedIn.
Some of the features Jay mentions during the talk have already become available in the latest Project Voldemort releases.
via: http://www.infoq.com/presentations/Project-Voldemort-Scaling-Simple-Storage
Friday, 23 April 2010
Presentation: MapReduce in Simple Terms
Saliya Ekanayake explains what juice blenders and MapReduce have in common. Pretty funny slides!
Just in case you don’t like blenders, you can try out this Quick guide to MapReduce!
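If you prefer code to blenders, here is a minimal single-process sketch of the map, shuffle and reduce steps using the classic word count. It is not taken from the slides; the function names are illustrative, and a real framework would run the map and reduce phases in parallel across machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in the document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine the values collected for one key."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Calling `word_count(["apple banana", "banana Apple banana"])` counts two occurrences of "apple" and three of "banana".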
Thursday, 22 April 2010
Presentation: Blending NoSQL and SQL at Confoo
Earlier today I wrote about the steps involved in migrating from MySQL to NoSQL. Still, I feel that in many cases NoSQL and RDBMS will live together under the same project umbrella. Michael Bleigh covers this topic in his presentation, Blending NoSQL and SQL at Confoo:
Tuesday, 20 April 2010
Presentation: NoSQL: Dealing with the Data Deluge
A presentation by John Quinn (@doofdoofsf) on NoSQL, relational databases and massive amounts of data. Somehow a nicer and extended form of NoSQL is here to stay: