NoSQL usecase: All content tagged as NoSQL usecase in NoSQL databases and polyglot persistence

Cross-Platform Global High-Score Using Amazon SimpleDB

Timo Fleisch provides a set of requirements for which using Amazon SimpleDB to store mobile gaming data sounds like a good solution:

In this post I am going to describe my solution for a simple global high score that works with WP7, iOS and Android. […] The preconditions for me for the global high score were:

  • It should be easy and fast to implement and maintain.
  • It should use standard web technologies.
  • It should be scalable.
  • It should use the standard web http protocol.
  • It should be secure,
  • and it should be as cross platform as possible.

SimpleDB definitely fits the bill for these requirements. But there might be some other details that could lead to using a different approach or making things a tad more complex:

  • the Amazon SimpleDB latency
  • the always-connected to the internet requirement

Original title and link: Cross-Platform Global High-Score Using Amazon SimpleDB (NoSQL database©myNoSQL)


Implementing Auto Saves Using RavenDB: NoSQL Tutorials

[…] implementing Auto Save in the RDBMS system could be a problem because of multiple reasons:

  • The schema and overall logic changes to save versioned data in the RDBMS system will be non-trivial
  • There might be validation checks that fail because users haven’t filled out some fields at that point.
  • Making periodic (30 second) transactional updates to any live system is not good for overall performance.

A workaround would be saving your object model to RavenDB directly; if the user visits the document after a timeout, load both the transactional data and the object data, compare the timestamps, and use the freshest set of data.
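The "compare the timestamp and use the freshest" step can be sketched in a few lines. This is only an illustration, not RavenDB client code: the `Snapshot` type, its field names, and the `freshest` helper are all hypothetical, and the sketch assumes both copies carry a comparable save time.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Snapshot:
    """One saved copy of a form (hypothetical type, for illustration)."""
    source: str          # "transactional" or "autosave" (labels are assumptions)
    saved_at: datetime   # when this copy was written
    data: dict           # the object model / form fields


def freshest(transactional: Optional[Snapshot],
             autosave: Optional[Snapshot]) -> Optional[Snapshot]:
    """Return whichever copy was saved most recently.

    Ties go to the transactional copy, since it already passed validation.
    """
    if transactional is None:
        return autosave
    if autosave is None:
        return transactional
    return autosave if autosave.saved_at > transactional.saved_at else transactional


committed = Snapshot("transactional",
                     datetime(2012, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
                     {"title": "Draft"})
draft = Snapshot("autosave",
                 datetime(2012, 1, 1, 12, 0, 30, tzinfo=timezone.utc),
                 {"title": "Draft v2", "body": ""})

print(freshest(committed, draft).source)
```

Because the autosaved copy skips validation, it can hold half-filled forms; the transactional copy only changes when the user actually submits.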

By far the best document database usecase I have read about in quite a while.

Original title and link: Implementing Auto Saves Using RavenDB: NoSQL Tutorials (NoSQL database©myNoSQL)


Real-Time Log Collection With Fluentd and MongoDB

Fluentd is an advanced open-source log collector developed at Treasure Data, Inc (see previous post). Because Fluentd handles logs as semi-structured data streams, the ideal database should have strong support for semi-structured data. There are several databases that meet this criterion, but we believe MongoDB is the market leader.


A bit like Scribe or Flume, but written in Ruby and using MongoDB for storage.

Original title and link: Real-Time Log Collection With Fluentd and MongoDB (NoSQL database©myNoSQL)


When to Use RavenDB?

Ayende, the creator of RavenDB:

Typically, it isn’t the only database in the project. In brown field projects, we usually see RavenDB brought in to serve as a persistent view model store for the critical pages, data is replicated to RavenDB from the main database and then read directly from RavenDB in order to process the perf critical pages. For green field projects, we usually see RavenDB used as the primary application database, most or all of the data resides inside RavenDB. In some cases, there is also an additional reporting database as well.

So the quick answer, and the one we are following, is that RavenDB is eminently suitable for OLTP applications, and can be used with great success as a persistent view model cache.

You’d be right to say that this answer could apply to most NoSQL databases. Looking back at their history, most NoSQL databases started as ways to address problems that people in the field were having with relational databases. Thus NoSQL databases were mostly used together with, or at least in the same environment as, relational databases or other storage solutions. And since NoSQL databases are an evolving technology, not everyone dropped their existing data stores to start using them.

Adding NoSQL technologies to existing stacks is the leanest adoption path and the one that minimizes risks. Not to mention that this is the way to accumulate experience with them without putting your business at risk.

Now going back to RavenDB, there are a few titillating features that might arouse your curiosity.

Original title and link: When to Use RavenDB? (NoSQL database©myNoSQL)


Storing High Scores in Amazon SimpleDB

This article highlights the benefits of connecting mobile devices to the cloud while also presenting an Amazon SimpleDB use case. Amazon SimpleDB is a highly available, flexible, and scalable non-relational data store that offloads the work of database administration. The app described here demonstrates how to store a high score list or leader board in SimpleDB. The app enables the user to view the high scores sorted by name or score, add and remove scores, and more.

What it doesn’t highlight is what happens with your application if your mobile device is disconnected.
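One common way to cope with a disconnected device is to queue scores locally and upload them once the network is back. The sketch below is an assumption about how that could look, not something taken from the article: `PendingScores` and the `upload` callable are hypothetical, and a real app would also persist the queue to disk so it survives restarts.

```python
from typing import Callable, List, Tuple

Score = Tuple[str, int]  # (player name, points)


class PendingScores:
    """Buffers scores locally and uploads them when the network allows."""

    def __init__(self, upload: Callable[[Score], None]) -> None:
        # `upload` is a hypothetical function that writes one score to the
        # remote store and raises OSError (or a subclass) when offline.
        self._upload = upload
        self._pending: List[Score] = []

    def submit(self, score: Score) -> None:
        """Queue a score, then try to send everything queued so far."""
        self._pending.append(score)
        self.flush()

    def flush(self) -> int:
        """Upload queued scores in order; stop at the first failure."""
        sent = 0
        while self._pending:
            try:
                self._upload(self._pending[0])
            except OSError:
                break  # still offline; keep the score for the next attempt
            self._pending.pop(0)
            sent += 1
        return sent

    @property
    def pending(self) -> List[Score]:
        """Copy of the scores that have not been uploaded yet."""
        return list(self._pending)
```

Calling `flush()` again when connectivity returns drains the queue in the order the scores were earned.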

Original title and link: Storing High Scores in Amazon SimpleDB (NoSQL database©myNoSQL)


How to Implement an IMAP Server on Top of a CouchDB/NoSQL Data Store?

Interesting question on Stack Overflow:

To summarize my objective here, I am really just looking for a simple, open source method which allows me to create and maintain a (preferably NoSQL db) backup/archive of one or more remote IMAP email accounts on a per-user basis and sync each individual user’s email accounts using a simple, low-cost solution which easily scales out and consumes server resources in an efficient manner, with the ADDED ABILITY that each user needs to be able to connect to his central email archive by simply adding a new IMAP account to his existing email client using an IMAP server, username and password provided through this archive server/setup.

This reminded me of a GSOC project to design and implement a distributed mailbox on top of Hadoop HDFS as part of the Apache James project. The project description can be found on this JIRA ticket and more details here:

We need to implement mailbox storage as a distributed system on top of Hadoop HDFS. The James mailbox API will be used. A first step is to design how to interact with Hadoop (native api, gora incubator at apache,…) and deal with specific performance questions related to mail loading/parsing in a distributed system (use map/reduce or not, use existing local lucene indexes for search,…). The second step is to implement the HDFS mailbox (maildir mailbox is similar because it stores mails as files and can be an inspiration). A single James server will still be deployed because we don’t have any distributed UID generation.

According to the last comments on the ticket, this project was completed by Ioan Eugen Stan under Eric Charles’ mentorship.

Original title and link: How to Implement an IMAP Server on Top of a CouchDB/NoSQL Data Store? (NoSQL database©myNoSQL)


Building a RSS Feed Processor With Redis

A complete example of how prioritization of queues in BLPOP works:

In this post, I am going to use Redis’ atomic and blocking facilities to build a multi-step RSS feed processor. Along the way, some of the topics I hope to touch upon are: queue prioritization, synchronization between processes, using Redis to gracefully shut down processes and a few race conditions to watch out for.
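The prioritization trick relies on one property of BLPOP: when given several keys, it checks them in the order listed and pops from the first non-empty list. Here is a minimal sketch of that idea, not the code from the linked post. `InMemoryLists` is a hypothetical stand-in so the example runs without a server (it never blocks), and the queue names are made up. With redis-py, `blpop` likewise accepts a list of keys and a timeout.

```python
from collections import deque
from typing import Optional, Sequence, Tuple


class InMemoryLists:
    """Hypothetical stand-in for a Redis client, for illustration only.

    It mimics one property of BLPOP: keys are checked in the order given
    and the pop comes from the first non-empty list. It never blocks.
    """

    def __init__(self) -> None:
        self._lists = {}

    def rpush(self, key: str, *values: str) -> int:
        queue = self._lists.setdefault(key, deque())
        queue.extend(values)
        return len(queue)

    def blpop(self, keys: Sequence[str], timeout: int = 0) -> Optional[Tuple[str, str]]:
        for key in keys:
            queue = self._lists.get(key)
            if queue:
                return key, queue.popleft()
        return None  # a real server would wait up to `timeout` seconds first


def next_job(client, queues_by_priority: Sequence[str], timeout: int = 1):
    """Pop the next job, preferring queues listed earlier."""
    return client.blpop(queues_by_priority, timeout=timeout)


client = InMemoryLists()
client.rpush("feeds:low", "refresh-archive")        # queued first
client.rpush("feeds:high", "fetch-breaking-news")   # queued second

order = []
while (job := next_job(client, ["feeds:high", "feeds:low"])) is not None:
    order.append(job)
print(order)
```

The high-priority job comes out first even though it was queued second, because `feeds:high` is listed first in the call.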

Original title and link: Building a RSS Feed Processor With Redis (NoSQL database©myNoSQL)


Why Netflix Picked Amazon SimpleDB, Hadoop/HBase, and Cassandra

Yury Izrailevsky[1]:

The reason why we use multiple NoSQL solutions is because each one is best suited for a specific set of use cases. For example, HBase is naturally integrated with the Hadoop platform, whereas Cassandra is best for cross-regional deployments and scaling with no single points of failure. Adopting the non-relational model in general is not easy, and Netflix has been paying a steep pioneer tax while integrating these rapidly evolving and still maturing NoSQL products. There is a learning curve and an operational overhead. Still, the scalability, availability and performance advantages of the NoSQL persistence model are evident and are paying for themselves already, and will be central to our long-term cloud strategy.

Summarizing the pros for each of the 3 solutions:

  • Amazon SimpleDB Pros

    • highly durable, writes spanning multiple availability zones
    • handy query and data formats
    • batch operations
    • consistent reads
    • hosted solution
  • HBase Pros

    • dynamic partitioning model
    • built-in support for compression
    • range queries
    • support for distributed counters
    • strong consistency
    • interoperability with Hadoop
  • Cassandra Pros

    • no dedicated name nodes
    • no practical architectural limitations on data sizes, row/column counts, etc.
    • flexible data model
    • no underlying storage format requirements like HDFS
    • uniquely flexible consistency and replication models
    • cross-datacenter and cross-regional replication

I hope the next post will be about the “small” issues Netflix ran into when adopting each of these systems. In the past they’ve shared some of the challenges of an Oracle - Amazon SimpleDB hybrid solution.

  1. Yury Izrailevsky: Netflix Director of Cloud and Systems Infrastructure  

Original title and link: Why Netflix Picked Amazon SimpleDB, Hadoop/HBase, and Cassandra (NoSQL databases © myNoSQL)


CouchDB Use Case: Create offline web applications on mobile and stationary devices

A short tutorial on how to build synchronized apps allowing offline operations using CouchDB:

In this article you learned about the technical viewpoint for offline applications with CouchDB. A prototype of a simple inventory management application demonstrated the CouchDB technology with JSON storage and standard synchronization facilities.
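The "standard synchronization facilities" are CouchDB's replication, which is triggered by POSTing a JSON document to the server's `/_replicate` endpoint. The sketch below only builds such a request and does not send it. The server URL, the database names, and the `build_replication_request` helper are assumptions for illustration, not taken from the tutorial.

```python
import json
from urllib.request import Request


def build_replication_request(server: str, source: str, target: str,
                              continuous: bool = False) -> Request:
    """Build (but do not send) a POST to CouchDB's /_replicate endpoint."""
    body = {"source": source, "target": target}
    if continuous:
        body["continuous"] = True  # keep replicating as new changes arrive
    return Request(
        url=server.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical names: a local "inventory" database pushed to a remote copy.
request = build_replication_request(
    "http://localhost:5984",
    source="inventory",
    target="http://example.com:5984/inventory",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would start the replication. Swapping `source` and `target` pulls remote changes back, which is how an offline device catches up once it reconnects.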

It doesn’t get into using CouchApp.

Original title and link: CouchDB Use Case: Create offline web applications on mobile and stationary devices (NoSQL databases © myNoSQL)