


HTTP: All content tagged as HTTP in NoSQL databases and polyglot persistence

Quick Guide to Riak HTTP API and Using Riak as Cache Service

A two-part article by Simon Buckle introducing the Riak HTTP API and using Riak's pluggable memory back-end as a caching service for a web application. Somehow I had missed that Riak has a pluggable in-memory (non-persistent) storage back-end. The only missing piece for making it a better caching solution would be the option to set a per-key expiry/time-to-live (TTL) value. It might be interesting to experiment with the Cache-Control and Last-Modified HTTP headers to simulate this behavior. Has anyone tried it?
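One way the Cache-Control idea could work: store a Cache-Control header alongside each object, and have the client treat an object as stale once its max-age has elapsed. A minimal client-side sketch (the function names are mine, not part of Riak's API):

```python
import re
import time

def parse_max_age(cache_control):
    """Extract the max-age value (seconds) from a Cache-Control header."""
    match = re.search(r"max-age=(\d+)", cache_control or "")
    return int(match.group(1)) if match else None

def is_fresh(stored_at, cache_control, now=None):
    """Decide whether a cached object is still usable, given when it was stored."""
    now = time.time() if now is None else now
    max_age = parse_max_age(cache_control)
    if max_age is None:
        return False  # no TTL information: treat as stale
    return (now - stored_at) < max_age

# A cache read would fetch the object plus its headers from Riak
# (e.g. GET /riak/cache/<key>) and apply is_fresh() before using it.
```

Since Riak itself would not evict anything, a stale entry would simply be overwritten on the next cache miss.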

Original title and link: Quick Guide to Riak HTTP API and Using Riak as Cache Service (NoSQL database©myNoSQL)

Apache Mod_redis


This Apache module uses a rule-based engine (built on a regular-expression parser) to map URLs to Redis commands on the fly. It supports an unlimited number of rules and can match on the full URL and the request method (GET, POST, PUT, or DELETE), providing a very flexible way to define a RESTful interface to Redis.
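The rule-based idea is easy to picture: each rule pairs a request method and a URL regex with a Redis command template, and captured groups fill in the command's arguments. A small illustration (the rule table and function are hypothetical, not mod_redis's actual syntax):

```python
import re

# Hypothetical rule table: (method, URL pattern, Redis command template).
RULES = [
    ("GET",    r"^/counter/(\w+)$",       "GET {0}"),
    ("PUT",    r"^/counter/(\w+)/(\d+)$", "SET {0} {1}"),
    ("DELETE", r"^/counter/(\w+)$",       "DEL {0}"),
]

def map_request(method, url):
    """Return the Redis command for the first matching rule, or None."""
    for rule_method, pattern, template in RULES:
        if method == rule_method:
            match = re.match(pattern, url)
            if match:
                return template.format(*match.groups())
    return None
```

So a `GET /counter/hits` request would be translated into the Redis command `GET hits` before being sent over the wire.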

Original title and link: Apache Mod_redis (NoSQL database©myNoSQL)

Hoop - Hadoop HDFS Over HTTP

Cloudera has created a set of tools named Hoop that allows access to HDFS over HTTP/S. My first question was: why would you use HTTP to access HDFS? Here is the answer:

  • Transfer data between clusters running different versions of Hadoop (thereby overcoming RPC versioning issues).
  • Access data in an HDFS cluster behind a firewall. The Hoop server acts as a gateway and is the only system allowed through the firewall.

I'm not sure, though, how many will use HTTP for transferring large amounts of data. But if you want to see how it is implemented, you can find the source code on GitHub.
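The gateway model means a client only ever needs to build plain HTTP requests against the Hoop server. A sketch of what such a request URL might look like (the parameter names here are illustrative assumptions, not Hoop's documented API):

```python
from urllib.parse import urlencode

def hoop_read_url(server, path, user, offset=0, length=None):
    """Build a hypothetical Hoop-style URL for reading a file over HTTP."""
    params = {"user.name": user, "offset": offset}
    if length is not None:
        params["len"] = length
    return "http://{0}{1}?{2}".format(server, path, urlencode(params))
```

Any HTTP client on either side of the firewall could then fetch file ranges without speaking Hadoop's RPC protocol, which is exactly what makes cross-version transfers possible.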

Original title and link: Hoop - Hadoop HDFS Over HTTP (NoSQL database©myNoSQL)


Webdis: A Redis HTTP Interface with JSON Support

Webdis is written in C, supports HTTP 1.1 pipelining, uses only GET and POST, makes Redis commands part of the URI, and can output multiple formats (JSON, BSON, txt, raw). It is ready to be forked or used on GitHub.
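Because the command lives in the URI, building a Webdis request is just string assembly: the command and its arguments become path segments, and an optional extension selects the output format. A sketch based on the description above (the exact URI rules are Webdis's, so treat this as an approximation):

```python
from urllib.parse import quote

def webdis_uri(command, *args, fmt=None):
    """Build a Webdis-style URI: the command and its arguments become path segments."""
    parts = [command.upper()] + [quote(str(a), safe="") for a in args]
    uri = "/" + "/".join(parts)
    if fmt:  # e.g. "json", "txt", "raw"
        uri += "." + fmt
    return uri
```

So `webdis_uri("set", "hello", "world")` yields `/SET/hello/world`, which a plain HTTP GET can execute against the Webdis server.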

Original title and link: Webdis: A Redis HTTP Interface with JSON Support (NoSQL databases © myNoSQL)

Securing MongoDB

So, MongoDB presented us with two problems:

  • When sharding it, we can’t even use the basic security that it supports.
  • The basic security that it offers is not reasonable for allowing external servers to connect to MongoDB.

Let’s just review:

  • you start with a MongoDB sharding setup

  • on top of that add some Nginx

  • switch from using a binary protocol to HTTP

It doesn’t sound easy or out-of-the-box anymore.

Original title and link: Securing MongoDB (NoSQL databases © myNoSQL)

RavenDB and HTTP Caching

Ayende, writing about choosing between implementing a RavenDB cache similar to Hibernate's second-level cache1 or using HTTP caching:

RavenDB is an HTTP server, in the end. Why not use HTTP caching? […] HTTP Caching is a somewhat complex topic, if you think it is not, talk to me after reading this 24 pages document describing it. But in essence, I am actually using only a small bit of it.

Firstly, if you really want to understand web caches, I strongly recommend Mark Nottingham’s Caching Tutorial for Web Authors and Webmasters.

Getting back to Ayende’s post, a couple of comments:

  1. the mechanism described in the post is called conditional GET. According to the HTTP/1.1 RFC:

    The semantics of the GET method change to a “conditional GET” if the request message includes an If-Modified-Since, If-Unmodified-Since, If-Match, If-None-Match, or If-Range header field. A conditional GET method requests that the entity be transferred only under the circumstances described by the conditional header field(s). The conditional GET method is intended to reduce unnecessary network usage by allowing cached entities to be refreshed without requiring multiple requests or transferring data already held by the client.

  2. an HTTP caching mechanism should also consider the behavior of reverse proxies. For these to work, freshness (Expires) and cache-control (Cache-Control) headers must be used.

  3. without reverse proxies, query caching based only on ETags will in most cases just reduce bandwidth (by not sending back a response body), but the server will still fetch the data to calculate the query ETag. Basically, the benefits are smaller.
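The conditional GET mechanism from point 1 boils down to a single comparison on the server side. A minimal sketch (note how, per point 3, the server must already know the resource's ETag, so the data-fetching work is not saved):

```python
def conditional_get(resource_etag, resource_body, if_none_match):
    """Return (status, body) for a GET that may be conditional on an ETag.

    resource_etag/resource_body: the current state of the resource (already computed).
    if_none_match: the value of the client's If-None-Match header, or None.
    """
    if if_none_match is not None and if_none_match == resource_etag:
        # The client's cached copy is current: send no body, just 304 Not Modified.
        return 304, None
    return 200, resource_body
```

The 304 response saves bandwidth, but the server still had to produce `resource_etag`, which is exactly why ETag-only caching yields smaller benefits than a reverse proxy honoring Expires/Cache-Control.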

On the topic of HTTP-based caching, you should also check the article CouchDB and Varnish caching.

  1. I’m still confused why he mentions Hibernate’s second-level caching, as that is related to caching query results, while the post focuses on caching single-value access. Thanks to ewhauser for clarifying this.

Original title and link: RavenDB and HTTP Caching (NoSQL databases © myNoSQL)


Presentation: RestMQ - HTTP/Redis based Message Queue

Gleicon Moraes’ slide deck about RestMQ, an HTTP/Redis based message queue. More about RestMQ can be found ☞ here and the source code is available on ☞ GitHub.

Keep in mind that Redis-backed queues are one of the most often cited use cases for Redis.
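The queue pattern behind RestMQ-style systems is simple: producers push onto one end of a Redis list (LPUSH) and consumers pop from the other (RPOP, or BRPOP to block). An in-memory stand-in illustrating the semantics, no Redis required:

```python
from collections import deque

class SimpleQueue:
    """In-memory stand-in for a Redis list used as a queue:
    LPUSH to enqueue on the left, RPOP to dequeue from the right (FIFO)."""

    def __init__(self):
        self.items = deque()

    def lpush(self, value):
        self.items.appendleft(value)

    def rpop(self):
        return self.items.pop() if self.items else None
```

With Redis doing the same thing server-side, an HTTP front-end like RestMQ only has to translate POSTs into LPUSH and GETs into RPOP.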

Original title and link for this post: Presentation: RestMQ - HTTP/Redis based Message Queue (published on the NoSQL blog: myNoSQL)

NoSQL Protocols Are Important

As NoSQL solutions grow more mature, they increasingly realize the importance of the protocols they are using. And more and more NoSQL projects are trying not to repeat the history of the LDAP protocol.

I’d say that the flagship NoSQL projects that understood the benefits of protocol simplicity are CouchDB, the relaxed document database, and SimpleDB, Amazon’s key-value store, both of which look as if they were built on the web and for the web (note: as one of the myNoSQL readers correctly pointed out, SimpleDB’s use of HTTP is actually quite incorrect). But they are definitely not the only ones.

Riak, the decentralized key-value store, also uses JSON over HTTP. Not only that, but the Basho team, makers of Riak, have lately decided to completely drop their custom protocol ☞ Jiak.

Terrastore, the consistent, partitioned, and elastic document database, being quite young, did its homework and debuted as HTTP/JSON friendly.

Neo4j, the graph database, has recently added a RESTful interface which, even if not included in the Neo4j 1.0 release, makes it accessible to a whole new range of programming languages.

There are some NoSQL solutions that still use custom protocols. Redis has defined its own protocol, but made sure to keep it “easy to parse by a computer and easy to parse by a human”. Redis also got some help from third-party tools/libraries to make it even more accessible over HTTP/JSON: RedBottle, a REST app for Redis, and Sikwamic, a Redis-over-HTTP library.
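The Redis protocol’s human- and machine-friendliness is easy to demonstrate: in the unified request protocol, a command is a `*`-prefixed argument count followed by `$`-prefixed, length-delimited arguments. A sketch of the encoder:

```python
def encode_command(*args):
    """Encode a Redis command using the unified (multi-bulk) request protocol:
    *<argc>\\r\\n then, per argument, $<len>\\r\\n<arg>\\r\\n."""
    out = ["*%d\r\n" % len(args)]
    for arg in args:
        arg = str(arg)
        out.append("$%d\r\n%s\r\n" % (len(arg), arg))
    return "".join(out)
```

Encoding `SET key value` produces a byte stream you can read by eye, which is a big part of why third-party tools and libraries appear so quickly around Redis.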

GT.M, a NoSQL solution about which you can learn more from the Introduction to GT.M and M/DB or from these two FOSDEM talks (GT.M and OpenStreetMap; MDB and MDBX: Open Source SimpleDB Projects based on GTM), has also realized the importance of the protocol and is now introducing ☞ M/Wire, a protocol inspired by the simplicity of the Redis protocol.

MongoDB is another example of a NoSQL store that uses a custom wire protocol. While the MongoDB ecosystem already includes a lot of libraries, I’d really love to see Kristina’s ☞ Sleepy.Mongoose moving forward (nb: Kristina, I’m also pretty sure that Sleepy.Mongoose could get much nicer RESTful URIs too ;-) ).

And the story can go on and on, but the lesson to be learned should be quite obvious: the simpler and easier your protocol is, the more accessible your data will be, and the easier it will be for the community to come up with (innovative) projects and libraries. The NoSQL libraries page should give you a feeling for which NoSQL solutions are using simple protocols and which are not.

Update: I received a hint from Mathias Meyer (@roidrage) that BSON, the binary JSON serialization used by MongoDB, has a new ☞ home.

Redis gets a web interface: Redweb

CouchDB has Futon and, more recently, LoveSeat for accessing it from a web browser. MongoDB has futon4mongo and phpMoAdmin for the same purpose. So why not something similar for Redis?

Starting a couple of days ago, there is a web interface for Redis too: ☞ Redweb. Built on top of the Bottle Python web framework, this initial version supports the following features:

  • adding key-value pairs
  • appending to lists and sets
  • adding/deleting elements from sorted sets
  • checking length and cardinality
  • getting random elements

Ted Nyman, the creator of Redweb, has also brought us RedBottle, a REST-style app for Redis. As with the other Redis-over-HTTP solutions, I believe there is still room for improvement in terms of RESTfulness.


Redis Over HTTP

Dor Kalev’s idea of exposing Redis functionality over HTTP is extremely interesting. I’d like to suggest a couple of things:

  • make it RESTful

    For example:

    • /set/KEY/VALUE: should be a PUT request to /KEY/
    • /get/KEY: should be a GET to /KEY
  • I don’t really see the benefit of using Comet as a way to get notifications for updated values (as suggested in the article).

    Some Redis users are already discussing the need for push notifications. I liked the way CouchDB deals with this requirement and suggested a similar solution for Redis. And I think Comet would play quite nicely with something like CouchDB’s _changes.

  • I think Comet would make more sense for the Redis list read operations.
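The RESTful mapping suggested above can be sketched as a tiny dispatcher: the HTTP method picks the Redis operation and the path supplies the key, with the request body carrying any value. (The function and its return convention are mine, for illustration only.)

```python
def route(method, path):
    """Map a RESTful (method, path) pair to the underlying Redis operation.

    GET /KEY    -> GET KEY          (read the value)
    PUT /KEY    -> SET KEY <body>   (the request body would carry the value)
    DELETE /KEY -> DEL KEY
    """
    key = path.strip("/").split("/")[0]
    if method == "GET":
        return ("GET", key)
    if method == "PUT":
        return ("SET", key)
    if method == "DELETE":
        return ("DEL", key)
    return None  # unsupported method
```

This keeps verbs out of the URL entirely, which is the main RESTfulness complaint about the /set/KEY/VALUE and /get/KEY scheme.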

What else would you like to see?