


ruby: All content tagged as ruby in NoSQL databases and polyglot persistence

Easy IP Geotargeting with Geokit and MongoMapper

There are several cases in which it might make sense to tailor your app’s content based on a user’s physical location. But asking them directly is a bit of a pain. Luckily, it’s extremely simple to find a user’s location knowing only something you will always know about a visitor: their IP address. Today I’ll walk you through how to use IPs to geolocate your visitors in a Rails application using Geokit and MongoDB’s geospatial indexing with MongoMapper.

And a couple of days ago it was Rails with Geocoder and MongoDB with Mongoid.

Original title and link: Easy IP Geotargeting with Geokit and MongoMapper (NoSQL databases © myNoSQL)


Activity Feeds with Redis

The how:

One brief note about architecture: since it’s impractical to simply query the activity of 500 friends, there are two general approaches for building scalable news feeds:

  1. Fan-out-on-read (do these queries ahead of time and cache them)
  2. Fan-out-on-write (write follower-specific copies of every activity so when a given user asks for a feed you can retrieve it in one simple query)

And why Redis:

First off, why Redis? It’s fast, our data model allows us to store minimal data in each feed entry, and Redis’ data-types are pretty well suited for an activity feed. Lists might seem like an obvious choice and could work for a basic feed implementation, but we ended up using sorted sets for two reasons:

  1. If you use a timestamp as the score parameter, you can really easily query for feed items based on time
  2. You can easily get the union of multiple sorted sets in order to generate an aggregated “friend feed”
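The two reasons above, plus the fan-out-on-write approach, can be sketched in plain Ruby. This is a minimal in-memory stand-in and not the code from the post: `TinySortedSet`, `FeedStore`, and all method names are hypothetical, and the comments only note which Redis sorted-set command each method loosely mirrors.

```ruby
# In-memory stand-in for a Redis sorted set (member => score).
# Hypothetical class, for illustration only.
class TinySortedSet
  attr_reader :scores

  def initialize(scores = {})
    @scores = scores.dup
  end

  # Loosely mirrors ZADD: insert a member or update its score.
  def add(score, member)
    @scores[member] = score
    self
  end

  # Loosely mirrors ZREVRANGEBYSCORE: members whose score (a timestamp
  # here) falls within [min, max], newest first.
  def newest_between(min, max)
    @scores.select { |_member, score| score.between?(min, max) }
           .sort_by { |member, score| [-score, member] }
           .map(&:first)
  end

  # Loosely mirrors ZUNIONSTORE with AGGREGATE MAX: merge several sets,
  # keeping the highest score when a member appears more than once.
  def self.union(*sets)
    merged = sets.map(&:scores).reduce({}) do |acc, scores|
      acc.merge(scores) { |_member, a, b| [a, b].max }
    end
    new(merged)
  end
end

# Fan-out on write: every activity is copied into each follower's feed,
# so reading a feed is a single range query.
class FeedStore
  def initialize
    @followers = Hash.new { |hash, key| hash[key] = [] }
    @feeds = Hash.new { |hash, key| hash[key] = TinySortedSet.new }
  end

  def follow(follower, followee)
    @followers[followee] << follower
  end

  def publish(actor, activity_id, timestamp)
    @followers[actor].each { |user| @feeds[user].add(timestamp, activity_id) }
  end

  def feed(user, since: 0, upto: Float::INFINITY)
    @feeds[user].newest_between(since, upto)
  end
end

store = FeedStore.new
store.follow("alice", "bob")
store.follow("alice", "carol")
store.publish("bob", "bob:posted:1", 100)
store.publish("carol", "carol:liked:7", 105)
store.publish("bob", "bob:posted:2", 110)

p store.feed("alice")             # everything, newest first
p store.feed("alice", since: 104) # only items with timestamp >= 104

# Aggregated "friend feed" built as a union of per-user sets.
bob   = TinySortedSet.new.add(100, "bob:posted:1")
carol = TinySortedSet.new.add(105, "carol:liked:7")
p TinySortedSet.union(bob, carol).newest_between(0, 200)
```

Each feed entry here is just a short activity id, which matches the "minimal data in each feed entry" point; a real implementation would keep these sets in Redis and look the activity details up elsewhere.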

Then the code in Ruby and PHP

Original title and link: Activity Feeds with Redis (NoSQL databases © myNoSQL)

Using Redis with Ruby on Rails

TL;DR: Redis is fucking awesome.

Title and quote say it all. Added to the getting started with NoSQL guides.

Original title and link: Using Redis with Ruby on Rails (NoSQL databases © myNoSQL)


Soulmate: Redis-backed Autocompletion

Inspired by Auto Complete with Redis, Soulmate uses sorted sets to build an index of partially completed words and the corresponding top matching items, and provides a simple Sinatra app to query them.
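To make the "index of partially completed words" idea concrete, here is a minimal in-memory sketch. It is not Soulmate's code: `PrefixIndex`, the `MIN_PREFIX` cutoff, and the scores are assumptions for illustration, and a plain Hash stands in for the Redis sorted set kept per prefix.

```ruby
# Hypothetical prefix index: for every prefix of every word in an item's
# name, remember the item together with its score.
class PrefixIndex
  MIN_PREFIX = 2 # assumption: prefixes shorter than this are not indexed

  def initialize
    # prefix => { item => score }; the inner Hash plays the sorted set role
    @index = Hash.new { |hash, key| hash[key] = {} }
  end

  def add(item, score)
    item.downcase.split.each do |word|
      (MIN_PREFIX..word.length).each do |len|
        @index[word[0, len]][item] = score
      end
    end
  end

  # Top matches for a prefix, highest score first.
  def search(prefix, limit: 5)
    matches = @index.fetch(prefix.downcase, {})
    matches.sort_by { |item, score| [-score, item] }.first(limit).map(&:first)
  end
end

index = PrefixIndex.new
index.add("Redis in Action", 50)
index.add("Ruby on Rails", 90)
index.add("Resque Workers", 70)

p index.search("re") # items with a word starting with "re", best first
p index.search("ra") # matches "Rails" inside "Ruby on Rails"
p index.search("r")  # shorter than MIN_PREFIX, so nothing is indexed
```

Note that the sketch stores one entry per item for every prefix of every word, so the index grows much faster than the data it describes.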

Before getting too excited about Soulmate, check the first comments here to understand why this solution is nice but suboptimal.

Original title and link: Soulmate: Redis-backed Autocompletion (NoSQL databases © myNoSQL)


Ruby On Rails: Cron Job Scheduling using Redis, Resque, and Rufus

The 5 Rs: Ruby, Rails, Redis, Resque, Rufus.

Now the question is what’s the best way to run scheduled tasks in a Rails environment? Generally, Rails developers can use an application specific crontab to run application tasks. […] Downsides of Cron are:

  • Cron works well on a single server but what if you need to scale to multiple app servers? You’ll need to introduce some kind of lock to avoid concurrency problems. Also, you need to maintain these shared locks.
  • The other problem with Cron is that its jobs are difficult to debug.
  • Cron is for scheduling things, not doing them.

It is almost always better to place jobs in a queue and have a worker system execute them. Luckily, there is a very good gem named ‘resque’ available for Ruby on Rails.
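The split between scheduling a job and executing it can be sketched without any gems. The job class below follows the shape Resque expects (a class-level `@queue` and a `self.perform` method), but `TinyQueue` is a hypothetical in-memory stand-in for Resque and Redis, and `ReportJob` is an invented example job.

```ruby
require "json"

# Hypothetical in-memory queue standing in for Resque + Redis.
module TinyQueue
  @queues = Hash.new { |hash, key| hash[key] = [] }

  class << self
    # The scheduler's only responsibility: record that work is needed.
    # Payloads are serialized to JSON, so arguments must be simple values.
    def enqueue(job_class, *args)
      queue = job_class.instance_variable_get(:@queue)
      @queues[queue] << JSON.generate("class" => job_class.name, "args" => args)
    end

    # The worker's responsibility: pull payloads and run them.
    # Returns the number of jobs processed.
    def work_off(queue)
      processed = 0
      while (payload = @queues[queue].shift)
        job = JSON.parse(payload)
        Object.const_get(job["class"]).perform(*job["args"])
        processed += 1
      end
      processed
    end
  end
end

# Invented example job, written in the shape of a Resque job class.
class ReportJob
  @queue = :reports

  def self.results
    @results ||= []
  end

  def self.perform(account_id, period)
    results << "report for account #{account_id} (#{period})"
  end
end

# A scheduler (cron, rufus-scheduler, ...) would only do the enqueueing:
TinyQueue.enqueue(ReportJob, 42, "daily")
TinyQueue.enqueue(ReportJob, 7, "daily")

# A separate worker process would do the executing:
puts TinyQueue.work_off(:reports)
puts ReportJob.results
```

Because each payload is removed from the queue before it runs, several workers can share one queue without the hand-rolled locks that a multi-server crontab would need.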

Original title and link: Ruby On Rails: Cron Job Scheduling using Redis, Resque, and Rufus (NoSQL databases © myNoSQL)


Introducing Couchup. An interactive Couchdb Console

I know Futon is nice, but as a developer I need more power. And even though HTTP is ubiquitous, I need simple ways of sending requests to CouchDB, without dealing with setting the headers and JSON-encoding the parameters.

So I wrote a simple, irb-based CouchDB console; we call it Couchup.

With code available on GitHub.

Original title and link: Introducing Couchup. An interactive Couchdb Console (NoSQL databases © myNoSQL)


Simple Ruby Workers with MongoDB and Resque

After some research, we decided to use Resque from GitHub, but to adapt it to use MongoDB instead of Redis. The advantage here is that we already have a significant investment in MongoDB, so we would not be introducing a new type of server to our infrastructure. MongoDB also has some features that Redis does not, and we used those to build some interesting new features into Resque.

After reading the post, I couldn’t identify those additional MongoDB features mentioned above. But I do agree you shouldn’t bring in new infrastructure requirements with every new tool you adopt. Very heterogeneous platforms are difficult and expensive to maintain.

Original title and link: Simple Ruby Workers with MongoDB and Resque (NoSQL databases © myNoSQL)


New Redis Libraries

Silver: database cacher, indexer and searcher

Probably because Silver was released by a publisher, this new open source library got some press today:

Enter Silver. Silver is designed to be a simple, lightweight wrapper for all your calls to a database that you want to cache or index with Redis. It is completely database/web-service agnostic so you should be able to use it for anything you can imagine caching.

Couple of thoughts about this new Ruby library:

  • it is not a transparent caching layer that you can drop into your application; you’ll basically have to rewrite your app to use it
  • there is no way to update the cache
  • the indexer is, in their own words, “stupidly simple fuzzy text search”

Project is hosted on GitHub.

redis-store: Redis-backed Tomcat session store

Definitely not benefiting from the same media buzz, Jonathan Leibiusky pushed to GitHub a Redis-backed Tomcat session store. Plugging this into your Tomcat would require just updating the conf/server.xml:

<Manager className="org.apache.catalina.session.PersistentManager" saveOnRestart="true" maxActiveSessions="1" minIdleSwap="1" maxIdleSwap="1" maxIdleBackup="1">
    <Store className="org.apache.catalina.session.RedisStore" 

Tomcat’s default session store is disk, so if your app is a heavy session user, you’ll probably see a boost in performance[1].

  1. I haven’t tried this myself yet, and I suppose it will need some extensive testing before going into production.

Original title and link: New Redis Libraries (NoSQL databases © myNoSQL)

MongoDB Ruby Driver Improvements

An intro to the latest MongoDB Ruby driver, featuring:

  • normal (Connection) and replica-set aware connections (ReplSetConnection)

    The ReplSetConnection class is brand new. It has a slightly different API and must be used when connecting to a replica set. To connect, initialize the ReplSetConnection with a set of seed nodes followed by any connection options.

  • allowing reads from slave nodes

    For certain read-heavy applications, it’s useful to distribute the read load to a number of slave nodes, and the driver now facilitates this.

    With :read_secondary => true, the connection will send all reads to an arbitrary secondary node.

  • default setting for safe mode on Connection, DB, and Collection.

  • support for JRuby where it uses a Java based implementation of the BSON library

Original title and link: MongoDB Ruby Driver Improvements (NoSQL databases © myNoSQL)


Mongo Vs Redis, The Increment Battle

The Hacker News thread points out all the flaws in the test:

  • measuring a mix of client library latency and round trip time
  • single threaded
  • no durability requirements
  • wrong way to compute and present stats

Original title and link: Mongo Vs Redis, The Increment Battle (NoSQL databases © myNoSQL)


Running Ruby Map/Reduce with Apache Hadoop

Here I demonstrate, with repeatable steps, how to fire up a Hadoop cluster on Amazon EC2, load data onto the HDFS (Hadoop Distributed File System), write map-reduce scripts in Ruby and use them to run a map-reduce job on your Hadoop cluster. You will not need to ssh into the cluster, as all tasks are run from your local machine.

Overly simplified:

  • use Cloudera’s distribution for Apache Hadoop
  • build, configure and use Whirr scripts to set up the Hadoop cluster on Amazon EC2
  • connect from your laptop to the cluster using a SOCKS proxy
  • check Hadoop and HDFS health status
  • set up the local Hadoop client
  • upload data to HDFS
  • code map/reduce tasks using Ruby
  • run, check stats, and get results for your Ruby map/reduce tasks
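For the "code map/reduce tasks using Ruby" step, here is a minimal word-count sketch. It assumes Hadoop Streaming-style scripts, where the mapper and reducer read lines from standard input and write tab-separated key/value lines to standard output; the post's actual scripts are not reproduced here, and `map_words`, `reduce_counts`, and `run_locally` are hypothetical names.

```ruby
require "stringio"

# Map step: emit "word<TAB>1" for every word on every input line.
def map_words(input, output)
  input.each_line do |line|
    line.downcase.scan(/[a-z0-9']+/).each { |word| output.puts("#{word}\t1") }
  end
end

# Reduce step: expects input sorted by key, so all lines for one word
# are adjacent; sums the counts for each run of identical keys.
def reduce_counts(input, output)
  current = nil
  total = 0
  input.each_line do |line|
    word, count = line.chomp.split("\t", 2)
    if word != current
      output.puts("#{current}\t#{total}") unless current.nil?
      current = word
      total = 0
    end
    total += count.to_i
  end
  output.puts("#{current}\t#{total}") unless current.nil?
end

# Local dry run that imitates map, sort (the shuffle phase), then reduce.
def run_locally(text)
  mapped = StringIO.new
  map_words(StringIO.new(text), mapped)
  shuffled = mapped.string.lines.sort.join
  reduced = StringIO.new
  reduce_counts(StringIO.new(shuffled), reduced)
  reduced.string
end

puts run_locally("Hadoop runs Ruby\nruby runs on Hadoop\n")
```

In a real job, each function would live in its own script wired to `$stdin` and `$stdout`, and the cluster would do the sorting between the two steps; the local dry run is only a cheap way to check the logic before uploading anything.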

Original title and link: Running Ruby Map/Reduce with Apache Hadoop (NoSQL databases © myNoSQL)


Being Awesome with the MongoDB Ruby Driver

Just the basics of MongoDB with Ruby:

The MongoDB Ruby driver is not only simple to use, but it will get you familiar with how queries look and how they operate. Armed with this knowledge, moving into an ORM becomes much easier. You’ll not only be able to understand what is abstracted away, but you’ll be able to spot bad and inefficient generated queries, making performance troubleshooting a snap.

Original title and link: Being Awesome with the MongoDB Ruby Driver (NoSQL databases © myNoSQL)