


NoSQL: Guides, Tutorials, Books, Papers

Resources for getting started with NoSQL databases, including NoSQL guides and tutorials, NoSQL books, and papers.

In case you are new to NoSQL databases or NoSQL in general, please start with the NoSQL definitions, what led to the creation of NoSQL databases, and the NoSQL databases classification and reference.

NoSQL Getting Started Guides & Tutorials








Project Voldemort




NoSQL books

Cassandra: The Definitive Guide

Author: Eben Hewitt

The rising popularity of Apache Cassandra rests on its ability to handle very large data sets that include hundreds of terabytes — and that’s why this distributed database has been chosen by organizations such as Facebook, Twitter, Digg, and Rackspace. With this hands-on guide, you’ll get all the details and practical examples you need to understand Cassandra’s non-relational database design and put it to work in a production environment.

Amazon | O’Reilly

Cassandra High Performance Cookbook

Author: Edward Capriolo

This book provides detailed recipes that describe how to use the features of Cassandra and improve its performance. Recipes cover topics ranging from setting up Cassandra for the first time to complex multiple data center installations. The recipe format presents the information in a concise actionable form.

The book describes in detail how features of Cassandra can be tuned and what the possible effects of tuning can be. Recipes include how to access data stored in Cassandra and use third-party tools to help you out. The book also describes how to monitor Cassandra and do capacity planning to ensure it is performing at a high level. Towards the end, it takes you through the use of libraries and third-party applications with Cassandra and Cassandra integration with Hadoop.


CouchDB: The Definitive Guide: Time to Relax (Animal Guide)

Authors: J. Chris Anderson, Jan Lehnardt, Noah Slater

Three of CouchDB’s creators show you how to use this document-oriented database as a standalone application framework or with high-volume, distributed applications. With its simple model for storing, processing, and accessing data, CouchDB is ideal for web applications that handle huge amounts of loosely structured data. That alone would stretch the limits of a relational database, yet CouchDB offers an open source solution that’s reliable, scales easily, and responds quickly.

Amazon | O’Reilly

Beginning CouchDB

Author: Joe Lennon

The new world of cloud computing needs data storage. CouchDB is the scalable, portable, simple database engine that is helping open source cloud architects put their data stores onto a firm foundation. Beginning CouchDB provides the tools to begin using this very powerful database engine without having to pay license fees for the software, or worry about administrator’s certifications or vast hardware requirements. This book teaches the fundamentals of one of the most powerful database engines ever created for the price of a good lunch. After reading this book and working through the examples, you’ll be able to write your own applications for CouchDB quickly and easily.


Scaling CouchDB

Author: Bradley Holt

A practical guide for web developers who need to scale their CouchDB database instances. The book covers the basic concepts behind CouchDB’s scalability (i.e. its distributed shared-nothing architecture), as well as replicating using both Futon and CouchDB’s RESTful API, continuous replication, conflict resolution, load balancing, and clustering with CouchDB Lounge.


Writing and Querying MapReduce Views in CouchDB

Author: Bradley Holt

If you want to use CouchDB to support real-world applications, you’ll need to create MapReduce views that let you query this document-oriented database for meaningful data. With this short and concise ebook, you’ll learn how to create a variety of MapReduce views to help you query and aggregate data in CouchDB’s large, distributed datasets.

You’ll get step-by-step instructions and lots of sample code to create and explore several MapReduce views through the course of the book, using an example database you construct. To work with these different views, you’ll learn how to use the Futon web administration console and the cURL command line tool that come with CouchDB.


Hadoop: The Definitive Guide

Author: Tom White

Discover how Apache Hadoop can unleash the power of your data. This comprehensive resource shows you how to build and maintain reliable, scalable, distributed systems with the Hadoop framework — an open source implementation of MapReduce, the algorithm on which Google built its empire. Programmers will find details for analyzing datasets of any size, and administrators will learn how to set up and run Hadoop clusters.


Pro Hadoop

Author: Jason Venner

You’ve heard the hype about Hadoop: it runs petabyte-scale data mining tasks insanely fast, it runs gigantic tasks on clouds for absurdly cheap, it’s been heavily committed to by tech giants like IBM, Yahoo!, and the Apache Project, and it’s completely open source (thus free). But what exactly is it, and more importantly, how do you even get a Hadoop cluster up and running?


Hadoop in Action

Author: Chuck Lam

Hadoop in Action teaches readers how to use Hadoop and write MapReduce programs. The intended readers are programmers, architects, and project managers who have to process large amounts of data offline. Hadoop in Action will lead the reader from obtaining a copy of Hadoop to setting it up in a cluster and writing data analytic programs.


HBase: The Definitive Guide

Author: Lars George

If your organization is looking for a storage solution to accommodate a virtually endless amount of data, this book will show you how Apache HBase can fulfill your needs. As the open source implementation of Google’s BigTable architecture, HBase scales to billions of rows and millions of columns, while ensuring that write and read performance remain constant. HBase: The Definitive Guide provides the details you require, whether you simply want to evaluate this high-performance, non-relational database, or put it into practice right away.


MongoDB: The Definitive Guide

Authors: Kristina Chodorow, Michael Dirolf

How does MongoDB help you manage a huMONGOus amount of data collected through your web application? With this authoritative introduction, you’ll learn the many advantages of using document-oriented databases, and discover why MongoDB is a reliable, high-performance system that allows for almost infinite horizontal scalability.


Scaling MongoDB

Author: Kristina Chodorow

Create a MongoDB cluster that will grow to meet the needs of your application. With this short and concise book, you’ll get guidelines for setting up and using clusters to store a large volume of data, and learn how to access the data efficiently. In the process, you’ll understand how to make your application work with a distributed database system.


50 Tips and Tricks for MongoDB Developers

Author: Kristina Chodorow

A collection of tips, tricks, and hacks to help MongoDB developers get the most out of the software. The tips cover everything from application design to data safety and monitoring.


MongoDB in Action

Author: Kyle Banker

MongoDB In Action is a comprehensive guide to MongoDB for application developers. The book begins by explaining what makes MongoDB unique and describing its ideal use cases. A series of tutorials designed for MongoDB mastery then leads into detailed examples for leveraging MongoDB in e-commerce, social networking, analytics, and other common applications.


The Definitive Guide to MongoDB: The NoSQL Database for Cloud and Desktop Computing

Authors: Eelco Plugge, Tim Hawkins, Peter Membrey

MongoDB, a cross-platform NoSQL database, is the fastest-growing new database in the world. MongoDB provides a rich document-oriented structure with dynamic queries that you’ll recognize from RDBMS offerings such as MySQL. In other words, this is a book about a NoSQL database that does not require the SQL crowd to re-learn how the database world works!


Redis: The Definitive Guide: Data modeling, caching, and messaging

Authors: Salvatore Sanfilippo, Pieter Noordhuis

Written by the Redis dream team—including its creator, key developers, and the community itself—this book provides the details you need to implement this advanced key-value data store quickly and discerningly. Database administrators, architects, and programmers will learn how to work with different data structures in Redis, how to handle memory, replication, and the cache itself, and how to implement messaging, using just the shell or programming APIs in Ruby, Python, and JavaScript.


Redis Cookbook: Practical Techniques for Fast Data Manipulation

Authors: Tiago Macedo, Fred Oliveira

Two years since its initial release, Redis already has an impressive list of adopters, including Engine Yard, GitHub, Craigslist, and Digg. This open source data structure server is built for speed and flexibility, making it ideal for many applications. If you’re using Redis, or considering it, this concise cookbook provides recipes for a variety of issues you’re likely to face. Each recipe solves a specific problem, and provides an in-depth discussion of how the solution works.


The Architecture of Open Source Applications

Author (NoSQL ecosystem chapter): Adam Marcus

The book has chapters about Berkeley DB, Hadoop Distributed File System (HDFS), the NoSQL ecosystem, and Riak.


NoSQL Handbook

Author: Mathias Meyer

A handy and outright awesome ebook guide to the world of NoSQL databases. Includes heaps of practical material on how to use NoSQL databases like Redis, MongoDB, CouchDB, Riak and Cassandra.

Professional NoSQL

Author: Shashank Tiwari

NoSQL databases are quickly becoming recognized as the most efficient backend for storing vast quantities of online data that can be accessed at any time. This hands-on guide presents solutions to setting up and migrating data to NoSQL databases. Expert author Shashank Tiwari provides unique insight into choosing which NoSQL solutions are best for solving your specific data storage needs. You’ll learn how NoSQL databases are better equipped to handle storage and retrieval of high-volume data, thereby boosting productivity, improving performance, and enhancing usability.


Programming Pig

Author: Alan Gates

This guide is an ideal learning tool and reference for Apache Pig, the programming language that helps you describe and run large data projects on Hadoop. With Pig, you can analyze data without having to create a full-fledged application—making it easy for you to experiment with new data sets.

Programming Pig shows newcomers how to get started, and teaches intermediate users the benefits of using Pig Latin, the data flow language for building and maintaining pipelines for processing data. Advanced users learn how to build complex data processing pipelines with Pig’s macros and modularity features, and discover how to build systems for complex data processing needs by embedding Pig Latin into scripting languages.

  • Learn the advantages and disadvantages of using Pig instead of MapReduce
  • Understand how Pig fits in with other Hadoop components, such as HDFS, Hive, MapReduce, and HBase
  • Follow examples that explain built-in Pig Latin functions, and data operators such as join and group
  • Use grunt, the shell that Pig provides for exploring and working with HDFS
  • Get performance tuning tips for running Pig Latin scripts on Hadoop clusters in less time
  • Extend Pig with powerful user defined functions written in Java or Python


Amazon SimpleDB Developer Guide

Authors: Prabhakar Chaganti, Rich Helms

This book is a practical real-world tutorial covering everything you need to know about Amazon SimpleDB. You will come across examples in three languages: Java, PHP, and Python. This book is aimed at transforming you from a beginner to an advanced developer. If you are a developer wanting to build scalable web-based database applications using SimpleDB, then this book is for you. You do not need to know anything about SimpleDB to read and learn from this book, and no prior knowledge is strictly necessary. This guide will help you to start from scratch and build advanced applications.


A Developer’s Guide to Amazon SimpleDB (Developer’s Library)

Author: Mocky Habeeb

Using SimpleDB, any organization can leverage Amazon Web Services (AWS), Amazon’s powerful cloud-based computing platform, and dramatically reduce the cost and resources associated with application infrastructure. Now, for the first time, there’s a complete developer’s guide to building production solutions with Amazon SimpleDB.

Pioneering SimpleDB developer Mocky Habeeb brings together all the hard-to-find information you need to succeed. Mocky tours the SimpleDB platform and APIs, explains their essential characteristics and tradeoffs, and helps you determine whether your applications are appropriate for SimpleDB. Next, he walks you through all aspects of writing, deploying, querying, optimizing, and securing Amazon SimpleDB applications, from the basics through advanced techniques.


Beginning SimpleDB

Authors: Kevin Marshall, Tyler Freeling

Beginning SimpleDB makes cloud computing a reality for programmers and this is the first book on the market explaining it in detail.

  • When, where, and why using Amazon’s SimpleDB can be a better solution than using a traditional relational database store
  • How to design systems and structure data to be best suited for use with SimpleDB
  • How SimpleDB differs from a relational database and from its competitors, such as CouchDB and Google’s BigTable