NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter



haproxy: All content tagged as haproxy in NoSQL databases and polyglot persistence

Factual API Powered by Node.js and Redis

Continuing my search for non trivial node.js + NoSQL database application, here’s Factual stack for serving their API:

Factual API Stack

Factual architectural components:

  • Varnish
  • HAProxy
  • Node.js
  • Redis
  • Solr

Why Node.js?

We chose Node because of three F’s: it’s fast, flexible, and familiar. In particular, the flexibility is  what allowed us to use our Node layer to handle things like caching logic and load balancing, in addition to the aforementioned authentication and authorization. To make our Node layer scalable, we use multiple instances of Node tied together with Redis to keep things in sync.

Also worth mentioning is that data served through Factual API is always JSON, so having a server side JavaScript engine alsa takes reduces the need for converting data to different formats.

Original title and link: Factual API Powered by Node.js and Redis (NoSQL database©myNoSQL)


Building an Ad Network Ready for Failure

The architecture of a fault-tolerant ad network built on top of HAProxy, Apache with mod_wsgi and Python, Redis, a bit of PostgreSQL and ActiveMQ deployed on AWS:

The real workhorse of our ad targeting platform was Redis. Each box slaved from a master Redis, and on failure of the master (which happened once), a couple “slaveof” calls got us back on track after the creation of a new master. A combination of set unions/intersections with algorithmically updated targeting parameters (this is where experimentation in our setup was useful) gave us a 1 round-trip ad targeting call for arbitrary targeting parameters. The 1 round-trip thing may not seem important, but our internal latency was dominated by network round-trips in EC2. The targeting was similar in concept to the search engine example I described last year, but had quite a bit more thought regarding ad targeting. It relied on the fact that you can write to Redis slaves without affecting the master or other slaves. Cute and effective. On the Python side of things, I optimized the redis-py client we were using for a 2-3x speedup in network IO for the ad targeting results.

Original title and link: Building an Ad Network Ready for Failure (NoSQL database©myNoSQL)


Tutorial: Developing in Erlang with Webmachine, ErlyDTL, and Riak

I haven’t got to learn Erlang yet, so I’m mostly bookmarking OJ’s 3 part tutorial for future read:

  • ☞ Part 1
    In Part 1 of the series we covered the basics of getting the development environment up and running. We also looked at how to get a really simple ErlyDTL template rendering
  • ☞ Part 2
    There are a few reasons this series is targeting this technology stack. One of them is uptime. We’re aiming to build a site that stays up as much as possible. Given that, one of the things that I missed in the previous post was setting up a load balancer. Hence this post will attempt to fill that gap.
  • ☞ Part 3

    In this post we’re going to cover:

    • A slight refactor of code structure to support the “standard” approach to building applications in Erlang using OTP.
    • Building a small set of modules to talk to Riak.
    • Creation of some JSON helper functions for reading and writing data.
    • Calling all the way from the Webmachine front-end to Riak to extract data and display it in a browser using ErlyDTL templates.

Original title and link: Tutorial: Developing in Erlang with Webmachine, ErlyDTL, and Riak (NoSQL databases © myNoSQL)

Achieving Zero-Downtime Redis Upgrades

☞ A simple notice about a short scheduled GitHub downtime resulted in a great dialog about how to perform zero downtime Redis upgrades. The following approaches were suggested:

  1. using Redis replication (i.e. SLAVE OF) to move existing data from the old Redis to the new Redis and haproxy to transparently pass client requests to the Redis instance (initially to the existing version and once replication completed to the new one)
    1. Leave the old redis version running on, say, redis:6379.
    2. Install and start a new redis on redis:6380 with a different dump file location.
    3. Execute SLAVE OF redis 6379 against redis:6380. Wait for first SYNC to complete.
    4. echo "enable server redis/redis-6380" | socat stdio unix-connect:/var/run/haproxy/admin.sock
    5. echo "disable server redis/redis-6379" | socat stdio unix-connect:/var/run/haproxy/admin.sock
    6. Execute SLAVE OF no one on redis:6380.
    7. Execute SHUTDOWN on redis:6379.
  2. using virtual IPs instead of haproxy

A more generic idea shared in the comment thread was to architect the solution so that the overall system would continue to work even if some subsystems are temporarily down. If you think in terms of the CAP theorem, this idea could be translated as a system that is AP at all times.

While writing this post I remembered reading about a ☞ solution based on node.js. node.js would be used to proxy requests until it is notified that the back system goes down. It would temporarily hold incoming connections until it would be notified again that the back system is resuming.

Any other ideas?