usecase: All content on NoSQL databases and projects about usecase, featuring the best daily NoSQL articles, news, and links on usecase

Too Much Redis?

by Alex Popescu

Ben Curtis thinks that using Redis to manage a friends list, as described in the EngineYard post, is overly complicated:

Yesterday I read a post over at the EngineYard blog about a use case for Redis (in the name of being a polyglot, trying new things, etc.), and I just had to scratch my head. I love Redis — it rocks my world — but that example was too much for me. If you just want to store a set of ids somewhere to avoid normalization headaches, introducing Redis is overkill… just do it in MySQL!

He goes on to propose a MySQL solution in which friend IDs are serialized as a comma-separated list. Frankly speaking, I see quite a few advantages Redis has over this approach:

  1. Redis knows how to handle sets
    1. you don’t have to deal with de-duplication
    2. (most probably) the storage is optimized
  2. with manual serialization you’ll have to deal with all concurrency issues occurring when updating these lists
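To make the two points above concrete, here is a minimal sketch in plain Python. No database is involved: a string stands in for the comma-separated column and a Python set stands in for a Redis set (where each add would be a single SADD). The function names are made up for illustration.

```python
# Illustration only: plain Python values stand in for the stored data.

def add_friend_csv(serialized, friend_id):
    """Add an ID to a comma-separated list, de-duplicating by hand."""
    ids = [int(x) for x in serialized.split(",") if x]
    if friend_id not in ids:
        ids.append(friend_id)
    return ",".join(str(i) for i in ids)


def lost_update_demo():
    """Two clients read the same row, each adds a friend, both write back."""
    row = "1,2"
    seen_by_a = row                      # client A reads
    seen_by_b = row                      # client B reads before A writes
    row = add_friend_csv(seen_by_a, 3)   # A writes "1,2,3"
    row = add_friend_csv(seen_by_b, 4)   # B overwrites A's change
    return row


def set_demo():
    """With set semantics each add is one operation on the stored value."""
    friends = {1, 2}
    friends.add(3)
    friends.add(4)
    friends.add(3)                       # duplicate is ignored
    return friends
```

In the serialized version, friend 3 is silently lost unless the application adds its own locking or versioning around the read-modify-write cycle.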

So what is the advantage of Ben’s suggested solution?

Original title and link for this post: Too Much Redis? (published on the NoSQL blog: myNoSQL)


Exploring Neo4j, the NoSQL Graph Database

by Alex Popescu

Rahul Sharma takes a look at Neo4j and some basic operations with graph databases:

Let us say we want to implement a use-case where there are persons and a person can be connected to other persons. In order to use Neo4J we must think about POJOs in terms of interfaces and corresponding implementations. This is so because the database is a key-value store at the back, so it asks us to store the properties of the POJO in terms of key-value pairs. Moreover there are no foreign keys in Neo4J, objects in the db are connected with other objects using Relationships.

Interestingly, he mentions getting some errors when trying to push 151K names. Sounds like he could use this Neo4j tip for handling long transactions.


InfiniteGraph Use Case: Modeling Stackoverflow

by Alex Popescu

I haven’t heard much about InfiniteGraph since its 1.0 release, except for this post, which uses Stackoverflow data as input to demo some features of graph databases:

The vertices in the graph are represented as the Users, Questions and Answers above while the edges are represented as the interactions between them (i.e. a User “Posts” a Question, an Answer is “For” a Question, a User “Comments On” a Question or Answer). Simple enough, and like most other social graphs, users seem to be the focal points with the majority of connected edges. Now all I needed was a sample application that could construct the graph data model from the XML sources and run some queries.
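The data model in the quote can be sketched with plain Python structures. This is only an illustration of the vertex and edge types described there; it does not use InfiniteGraph's API, and the IDs and sample rows are invented.

```python
from collections import Counter

# Hypothetical sample data: vertex id -> vertex type.
vertices = {
    "u1": "User",
    "u2": "User",
    "q1": "Question",
    "a1": "Answer",
}

# Each edge is (source, label, target), using the labels from the quote.
edges = [
    ("u1", "Posts", "q1"),
    ("a1", "For", "q1"),
    ("u1", "Comments On", "a1"),
    ("u2", "Comments On", "q1"),
]


def degree_by_vertex(edge_list):
    """Count how many edges touch each vertex, in either direction."""
    degree = Counter()
    for source, _label, target in edge_list:
        degree[source] += 1
        degree[target] += 1
    return degree


def answers_for(question_id, edge_list):
    """Return the answers connected to a question by a 'For' edge."""
    return [s for s, label, t in edge_list if label == "For" and t == question_id]
```

Running `degree_by_vertex` over the full data set is one way to check the observation that users end up as the most connected vertices.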


CouchDB, Mobile Devices and The Distributed Web Data

by Alex Popescu

Getting back after two crazy days, I find that the big news (at least in the media) is the release of a CouchDB SDK for Android. You can read more about how to get it here.

We already knew that, thanks to its friendly protocol and advanced replication features, CouchDB is a solid option when looking for distributed web data. Palm webOS and its db8 usage of CouchDB for replication is a very good example of this CouchDB use case.


Graylog2: MongoDB-backed Syslog System

by Alex Popescu

Manage your logs in the dark and have lasers going and make it look like you’re from space.

An open source syslog system that uses a Java server for accepting messages, Ruby on Rails for visualization, and MongoDB for storage. You can get it here.


Presentation: Redis - Persistence Power or Redis Use Cases

by Alex Popescu

Nick Quaranto’s slides are a great summary of a few Redis use cases:


Heroku Encourages Polyglot Persistence

by Alex Popescu

Heroku published an article preaching polyglot persistence through a Database-as-a-Service approach:

Database-as-a-service is one of the coming decade’s most promising business models. […] DaaS also goes hand-in-glove with polyglot persistence. Thanks to database services, you won’t need to learn how to sysadmin/DBA for every datastore you use – you can instead outsource that job to a service provider specializing in each database.

While it definitely sounds exciting to be able to use all these NoSQL databases, we should always keep in mind the cost of complexity, even if DaaS helps alleviate some of the complexity of heterogeneous systems.

The article also includes some interesting use cases for a couple of NoSQL databases:

  • Frequently-written, rarely read statistical data (for example, a web hit counter) should use an in-memory key/value store like Redis, or an update-in-place document store like MongoDB.
  • Big Data (like weather stats or business analytics) will work best in a freeform, distributed db system like Hadoop.
  • Binary assets (such as MP3s and PDFs) find a good home in a datastore that can serve directly to the user’s browser, like Amazon S3.
  • Transient data (like web sessions, locks, or short-term stats) should be kept in a transient datastore like Memcache. (Traditionally we haven’t grouped memcached into the database family, but NoSQL has broadened our thinking on this subject.)
  • If you need to be able to replicate your data set to multiple locations (such as syncing a music database between a web app and a mobile device), you’ll want the replication features of CouchDB.
  • High availability apps, where minimizing downtime is critical, will find great utility in the automatically clustered, redundant setup of datastores like Cassandra and Riak.

These are good examples, but you can find many more in our coverage of NoSQL use cases and the per-product case studies: CouchDB case studies, MongoDB case studies, etc.


Redis Usecase: API Access Logger

by Alex Popescu

Nice combination of Redis and MySQL:

Redis has to keep all stored objects in memory, so just putting all data in there and forgetting about it was out of the question. We decided to only keep a few days of data in Redis and archive the results to MySQL. Daily API usage stats would be served directly by Redis, archived results on date ranges would be fetched from MySQL.

Note also what correct Redis data modeling means: using Redis data structures combined with smart keys (N.B. smart in the sense that the keys carry additional meta-information).
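A rough sketch of what such smart keys could look like follows. A plain dict stands in for Redis (where the increment would be an INCR on the key), and the `api:hits:<client>:<day>` key layout, the function names, and the retention window are assumptions made for illustration; the original post may structure its keys differently.

```python
from collections import defaultdict
from datetime import date, timedelta

# A plain dict stands in for Redis; the key layout is an assumption.
store = defaultdict(int)


def hit_key(api_key, day):
    """Build a key that carries the metric name, the client, and the day."""
    return f"api:hits:{api_key}:{day.isoformat()}"


def record_hit(api_key, day):
    """Count one API call for this client on this day."""
    store[hit_key(api_key, day)] += 1


def archive_older_than(today, keep_days):
    """Remove counters older than the retention window and return them.

    In the setup described above, the returned rows would be written to MySQL.
    """
    cutoff = today - timedelta(days=keep_days)
    archived = {}
    for key in list(store):
        day = date.fromisoformat(key.rsplit(":", 1)[1])
        if day < cutoff:
            archived[key] = store.pop(key)
    return archived
```

Because the day is part of the key, the archiving step can decide what to move without reading any values.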


Building a MongoDB-based Queue

by Alex Popescu

Matt Insler shows how to build a MongoDB-based queue using server-side JavaScript and the findAndModify command, which he uses to replace Amazon SQS in his application:

I have been using MongoDB for a while now and am enamored with what it can do. I know that it can store lots of schema-less data in 4MB chunks (a document is limited to 4MB) and can store larger files through the use of GridFS. I know that it’s lightning fast (almost memcached speed) for indexed lookups and can handle thousands of operations per second without spiking the CPU over 10% even. I know that I’m paying for the CPU and hard drive space on Amazon EC2 already and thoroughly enjoy minimizing my monthly, weekly, and even daily costs. Blah. Blah. Blah. I want to implement this in Mongo!

But make no mistake: this approach just replaced a reliable, highly scalable, hosted (i.e. involving no operational costs) service with a solution that lacks all of these.
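For readers unfamiliar with the pattern: findAndModify lets a consumer find a document and update it in one atomic step, which is what allows a collection to behave like a queue. The sketch below is an in-memory stand-in for that claim step, not MongoDB's API and not Matt's code; the class, the field names, and the lock that plays the role of the server-side atomicity are all assumptions.

```python
import threading


class InMemoryQueue:
    """Stand-in for a collection of message documents; not MongoDB's API."""

    def __init__(self):
        self._docs = []
        # The lock plays the role of the server's atomic find-and-update.
        self._lock = threading.Lock()

    def push(self, payload):
        """Append a new message in the 'ready' state."""
        with self._lock:
            self._docs.append({"payload": payload, "state": "ready"})

    def claim(self, worker):
        """Find the oldest ready message and mark it taken in one step."""
        with self._lock:
            for doc in self._docs:
                if doc["state"] == "ready":
                    doc["state"] = "taken"
                    doc["worker"] = worker
                    return doc
        return None
```

Since finding and marking happen under the same lock, two workers cannot claim the same message. Retries, visibility timeouts, and durability, which a hosted queue provides, are left out of this sketch.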


Updates on Cassandra Usage at Twitter

by Alex Popescu

Just two days after my Cassandra status update, the Twitter engineering blog published an article sharing more details about Cassandra usage at Twitter.

So, how is Twitter using Cassandra today?

  • Cassandra as database of places of interest used by the geo team[1]
  • Cassandra as storage for the data mining research team
  • Cassandra as an upcoming storage solution for real time analytics

In case you wonder what changed: Twitter will not migrate tweet storage to Cassandra and will continue to save and serve tweets from the existing MySQL cluster:

We believe that this isn’t the time to make large scale migration to a new technology. We will focus our Cassandra work on new projects that we wouldn’t be able to ship without a large-scale data store.


  1. Probably this is similar to how SimpleGeo is using Cassandra.

Redis-based Configuration Management at GitHub

by Alex Popescu

Instead of config files and if-s, use Redis to store your flags:


Tekpub: Using both MongoDB and MySQL

by Alex Popescu

You shouldn’t be afraid to use both NoSQL and RDBMS in your projects if they help you address real problems:

We split out the duties of our persistence story cleanly into two camps: reports of things we needed to know to make decisions for the business and data users needed to use our site. Ironically these two different ways of storing data have guided us to do what’s natural: put the application data into a high-read, high-availability environment (MongoDb) - put the historical, reporting data into a system that is built to answer questions: a relational data store.

The high-read stuff (account info, productions and episode info) is perfect for a “right now” kind of thing like MongoDb. The “what happened yesterday” stuff is perfect for a relational system.

We don’t want to run reports on our live server. You don’t know how long they will take - nor what indexing they will work over (limiting the site’s perf). Bad. Bad-bad.

Much better case study than this one!



This post is part of the MongoDB Case Studies series.