


tutorial: All content tagged as tutorial in NoSQL databases and polyglot persistence

Getting Started with MongoDB and C#

Pretty basic intro to using MongoDB from C#:

Of all the various options out there, MongoDB has been getting a lot of press lately. MongoDB is hailed as one of the top NoSql options out there for being ultra-scalable, and highly performant.

The article briefly takes the reader from installing MongoDB to connecting and using collections, but also spends some time defining “documents”, the central piece of document stores, in the section “Introducing the Document: end all, be all”. That reminded me of a nice trick we shared in our daily NoSQL ecosystem news: making your MongoDB C# code more readable with dynamics (instead of dictionaries).

As a final note, it’s good to hear that it is not only me thinking that MongoDB is getting a lot of coverage these days.



Screencast: A step by step intro to MongoDB

A step-by-step MongoDB intro screencast from Michael Dirolf (@mdirolf). While there is nothing new in it, if you haven’t yet played with MongoDB, this 36-minute video (requires Silverlight) will give you a good feel for how to start using it.

Once you are done, you can move on to the next set of tutorials: Introduction to MongoDB and MongoMapper, and MongoDB and Mongoid.


Hadoop Tutorial Part 2: Getting Started with Partitioning

After laying the groundwork for experimenting with and learning Hadoop and MapReduce in the first part of the Hadoop tutorial, Philippe Adjiman continues his series by covering data partitioning. The post presents the default Hadoop partitioner(s), strategies for determining better partitioning using sampling, and the importance of this initial MapReduce phase.

Partitioning in map/reduce is a fairly simple concept, but one that is important to get right.

[…] First, it has a direct impact on the overall performance of your job: a poorly designed partitioning function will not evenly distribute the load over the reducers, potentially losing all the benefit of the map/reduce distributed infrastructure.

[…] Second, it may sometimes be necessary to control the key/value pairs partitioning over the reducers.
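To make the first point concrete, here is a minimal Python sketch of hash partitioning, not code from the tutorial. It mimics the shape of Hadoop's default hash partitioner (a non-negative hash of the key, modulo the number of reducers). The function names are hypothetical, and `zlib.crc32` is an assumed stand-in for Java's `hashCode()`, chosen only because it is stable across runs.

```python
import zlib
from collections import Counter


def hash_partition(key: str, num_reducers: int) -> int:
    """Pick a reducer for a key: non-negative hash modulo the reducer count."""
    # crc32 is an illustrative stand-in for Java's hashCode(); unlike
    # Python's built-in hash() for strings, it is stable across runs.
    return (zlib.crc32(key.encode("utf-8")) & 0x7FFFFFFF) % num_reducers


def reducer_load(keys, num_reducers: int) -> Counter:
    """Count how many records each reducer would receive."""
    return Counter(hash_partition(k, num_reducers) for k in keys)


if __name__ == "__main__":
    # Skewed input: one hot key accounts for 900 of the 1000 records.
    records = ["user-1"] * 900 + [f"user-{i}" for i in range(2, 102)]
    print(reducer_load(records, 4))
```

Because every record with the same key lands on the same reducer, a single hot key overloads one reducer no matter how good the hash is. That is the situation where sampling the keys and choosing a custom partitioning scheme starts to pay off.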


A Step-by-Step Intro to HBase with Ruby

A basic, step-by-step introduction by Olexiy Prokhorenko to setting up HBase and using it to build a forum-like solution with Ruby. If you are interested in how to build a message board system or play with something similar to Stack Overflow in MongoDB, make sure you check the MongoDB use cases.

The post contains details about the available HBase Ruby libraries:

and the documentation that got him started:

I’d definitely add to these: HBase vs BigTable Comparison and HBase Architecture 101 - Storage.

The article contains tons of code snippets and details (sometimes a bit too many) on how to get up and running with HBase and Ruby.

HBase Schema Design Case Studies


MongoDB Aggregation Tutorial

A three part article series by Kyle Banker introducing the aggregation features available in MongoDB:

Hadoop Tutorial Part 1: Setting Up Your MapReduce Learning Playground

If you are talking about large datasets and NoSQL, you are definitely talking about MapReduce, and therefore Hadoop.

This is the first post of a series of small Hadoop tutorials progressively introducing core Hadoop functionalities.

This first post is dedicated to build what I called a “MapReduce Learning Playground”: for practice or for a real need, you read or wrote on a sheet of paper the map and reduce functions that might solve a particular problem and you want to see it in action, not necessarily on huge data sets, just check that it computes the correct answer.
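The same "check it on a small input" idea can be sketched without a cluster at all. The following is a minimal, assumption-laden Python illustration, not the playground the tutorial builds on Hadoop: `run_local`, `map_words`, and `reduce_counts` are hypothetical names, and the runner simply groups mapper output by key in memory before calling the reducer.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit (word, 1) for every word in an input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(key, values):
    """Reduce step: sum all the counts emitted for one key."""
    return key, sum(values)


def run_local(records, mapper, reducer):
    """Run map, an in-memory group-by-key (the shuffle), then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reducers see keys in sorted order, one key at a time.
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


if __name__ == "__main__":
    print(run_local(["a b a", "b c"], map_words, reduce_counts))
```

Once the functions give the right answer on a handful of lines, the same logic can be ported to a real Hadoop job.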


An Introduction to Redis

After the great hands-on introduction to Amazon Dynamo, here is a basic introduction to Redis, whose latest version, Redis 1.1.91, was launched a couple of days ago:

Here on MyNoSQL, we have also linked to the introduction to Redis data types and there are quite a few other notable news and links about Redis.

Understanding Amazon Dynamo by Building it in Erlang

This is what I expect to become a great article series on the Amazon Dynamo paper. The author, Will Larson, suggests a different path to understanding the inner workings of this system:

I decided that a good way to record the ideas (as well as solidify them in my mind) was to go through the process of writing a distributed key-value store, and then incrementally add the enhancements discussed in the Dynamo paper. By the end of this series we’ll have re-implemented most of the interesting ideas from Dynamo in a distributed Erlang system.

He sounds really excited to be dealing with the concepts introduced in the Dynamo paper (consistent hashing, Merkle trees, vector clocks, gossip protocols, sloppy quorums), and so far he has published the first two parts: Hands On Review of the Dynamo Paper and Durable Writes & Consistent Reads.
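For readers who have not met the first of those concepts, here is a minimal Python sketch of consistent hashing. It is not taken from the series (which is written in Erlang). The `HashRing` class, the use of MD5 for ring positions, and the 64 virtual nodes per server are all assumptions made for illustration.

```python
import bisect
import hashlib


def _token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an arbitrary choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._tokens = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        """Place `vnodes` positions for this node on the ring."""
        for i in range(self.vnodes):
            token = _token(f"{node}#{i}")
            self._owner[token] = node
            bisect.insort(self._tokens, token)

    def node_for(self, key):
        """Return the owner of the first ring position clockwise from the key."""
        if not self._tokens:
            raise ValueError("ring is empty")
        idx = bisect.bisect_right(self._tokens, _token(key)) % len(self._tokens)
        return self._owner[self._tokens[idx]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    print(ring.node_for("user:42"))
```

The property that matters is that adding a node only takes keys away from existing nodes and hands them to the new one; keys never shuffle between the old nodes, which is what makes growing the cluster cheap.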