StackOverflow: All content tagged as StackOverflow in NoSQL databases and polyglot persistence

NoSQL Screencast: Building a StackOverflow Clone With RavenDB

Ayende and Justin pair up to model a StackOverflow clone with RavenDB. They cover:

  • Map/Reduce indexes
  • Modelling tags
    • Root aggregates
    • Metadata
    • Active tags
  • Facets
  • Performance:
    • Built-in caching
    • Lazy loading
    • Aggressive caching
  • RavenDB profiler

Tuning SQL, No Need for NoSQL

The most common complaint against NoSQL is that if you know how to write good SQL queries then SQL works fine. If SQL is slow you can always tune it and make it faster.

Let’s keep in mind two things:

  1. performance is not equivalent to scalability
  2. most NoSQL databases have been created to deal with scalability issues. And they offer a different, non-relational, data model.

A NoSQLite might counter that this is what a key-value database is for. All that data could have been retrieved in one get, no tuning, problem solved. The counter is that you then lose all the benefits of a relational database, and it can be shown that the original was fast enough and could be made very fast through a simple tuning process, so there is no reason to go NoSQL.
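To make the trade-off in that exchange concrete, here is a minimal sketch, not taken from the quoted post. The schema, the sample rows, and the `question:<id>` key layout are all assumptions for illustration, and a plain dict stands in for the key-value store.

```python
import json
import sqlite3

# Relational side: a question and its answers live in separate tables
# (hypothetical schema and sample data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE answers (id INTEGER PRIMARY KEY, question_id INTEGER, body TEXT);
    INSERT INTO questions VALUES (1, 'How do I tune a slow query?');
    INSERT INTO answers VALUES (10, 1, 'Add an index.');
    INSERT INTO answers VALUES (11, 1, 'Check the query plan.');
""")


def load_question_sql(question_id):
    """Assemble the page data with a join; this is the query one would tune."""
    rows = conn.execute(
        "SELECT q.title, a.body FROM questions q "
        "JOIN answers a ON a.question_id = q.id "
        "WHERE q.id = ? ORDER BY a.id",
        (question_id,),
    ).fetchall()
    return {"title": rows[0][0], "answers": [body for _, body in rows]}


# Key-value side: the same data stored pre-assembled under a single key.
# A dict stands in for the key-value store.
kv_store = {
    "question:1": json.dumps({
        "title": "How do I tune a slow query?",
        "answers": ["Add an index.", "Check the query plan."],
    })
}


def load_question_kv(question_id):
    """One get returns everything, but the store cannot answer ad hoc queries."""
    return json.loads(kv_store[f"question:{question_id}"])
```

Both functions return the same structure. The relational version keeps joins and ad hoc queries available, while the key-value version trades them for a single lookup.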

There’s a third thing too: you shouldn’t always obey the law of the instrument. Polyglot persistence is a viable option.

Original title and link: Tuning SQL, No Need for NoSQL (NoSQL databases © myNoSQL)


Powered by Redis: StackOverflow Caches

Posting about this Redis-based StackOverflow demo/clone reminded me that StackOverflow actually uses Redis for its site-level and global caches:

  1. A “local cache,” which can only be accessed from 1 server/site pair
    • Contains things like user sessions, and pending view count updates
    • This resides purely in memory, no network or DB access
  2. A “site cache,” which can be accessed by any instance (on any server) of a single site
    • Most cached values go here, things like hot question id lists and user acceptance rates are good examples
    • This resides in Redis (in a distinct DB, purely for easier debugging)
  3. A “global cache,” which is shared amongst all sites and servers
    • Inboxes, API usage quotas, and a few other truly global things live here
    • This resides in Redis (in DB 0, likewise for easier debugging)
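The three tiers above can be sketched roughly as follows. This is not StackOverflow's code: `FakeRedis` is an in-memory stand-in so the example runs without a server, and the class names, method names, and the one-DB-number-per-site layout are assumptions for illustration.

```python
class FakeRedis:
    """In-memory stand-in for a Redis server with numbered databases (hypothetical)."""

    def __init__(self):
        self._dbs = {}  # db number -> {key: value}

    def get(self, db, key):
        return self._dbs.get(db, {}).get(key)

    def set(self, db, key, value):
        self._dbs.setdefault(db, {})[key] = value


class TieredCache:
    """The three cache tiers as seen by one instance of one site on one server."""

    GLOBAL_DB = 0  # the global cache lives in DB 0, per the description above

    def __init__(self, redis, site_db):
        if site_db == self.GLOBAL_DB:
            raise ValueError("the site cache needs a DB distinct from the global one")
        self.redis = redis      # shared by every instance of every site
        self.site_db = site_db  # assumed layout: one DB number per site
        self.local = {}         # in-process memory only, no network or DB access

    # Local tier: visible only to this server/site pair.
    def set_local(self, key, value):
        self.local[key] = value

    def get_local(self, key):
        return self.local.get(key)

    # Site tier: visible to every instance of the same site.
    def set_site(self, key, value):
        self.redis.set(self.site_db, key, value)

    def get_site(self, key):
        return self.redis.get(self.site_db, key)

    # Global tier: visible to all sites and servers.
    def set_global(self, key, value):
        self.redis.set(self.GLOBAL_DB, key, value)

    def get_global(self, key):
        return self.redis.get(self.GLOBAL_DB, key)


redis = FakeRedis()
site_a_server_1 = TieredCache(redis, site_db=1)
site_a_server_2 = TieredCache(redis, site_db=1)
site_b_server_1 = TieredCache(redis, site_db=2)

site_a_server_1.set_local("session:42", "alice")
site_a_server_1.set_site("hot_questions", [101, 102])
site_a_server_1.set_global("api_quota:alice", 9000)
```

With this setup, the session is visible only to `site_a_server_1`, the hot question list is visible to both instances of site A, and the API quota is visible everywhere.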

It is impressive to read statements like this from engineers who know what they are talking about:

Redis is so fast that the slowest part of a cache lookup is the time spent reading and writing bytes to the network. This is not surprising, really, if you think about it.



Powered by Redis: StackOverflow Demo Clone with ServiceStack

A StackOverflow demo/clone built using ServiceStack[1] and Redis


The demo is live here. Code available on GitHub and discussion on Hacker News.

Note: in case you didn’t know, StackOverflow actually uses Redis for its caches.

  1. ServiceStack: open source .NET and Mono web services framework. Project page  


StackOverflow Preparing to Use Redis

According to Jeff Atwood (@codinghorror):

we are tooling up to begin using redis on stackoverflow for shared memory caching

Sounds like more and more projects are realizing the benefits of using Redis and that Memcached and Membase do have a strong competitor.

While it looks like Redis is currently focusing on optimizing memory consumption, I’d say that the next most important feature should be making Redis truly distributed (nb: currently it only supports master/slave replication and client-side hashing).
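For readers unfamiliar with client-side hashing, here is a minimal sketch of the idea: the client, not the server, decides which node holds a key. The node names and the `ShardedClient` class are hypothetical, and dicts stand in for the Redis nodes so the example runs on its own.

```python
import hashlib


def pick_node(key, nodes):
    """Map a key to one node by hashing it on the client side."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


class ShardedClient:
    """Routes each key to one of several independent stores (dicts here)."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        self.stores = {name: {} for name in self.node_names}

    def set(self, key, value):
        self.stores[pick_node(key, self.node_names)][key] = value

    def get(self, key):
        return self.stores[pick_node(key, self.node_names)].get(key)


client = ShardedClient(["redis-a", "redis-b", "redis-c"])
for user_id in range(100):
    client.set(f"user:{user_id}", user_id)
```

Simple modulo hashing like this remaps most keys whenever the node list changes. Consistent hashing is a common way to reduce that churn, but either way the servers themselves know nothing about each other, which is why this falls short of a truly distributed store.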

Update: the confirmation came today from Kevin Montrose (@kevinmontrose):

We’re field testing #redis on Meta.SO. #stackoverflow



InfiniteGraph Use Case: Modeling Stackoverflow

I didn’t hear much about InfiniteGraph after its 1.0 release, except this post that uses Stackoverflow data as input to demo some features of graph databases:

The vertices in the graph are represented as the Users, Questions and Answers above while the edges are represented as the interactions between them (i.e. a User “Posts” a Question, an Answer is “For” a Question, a User “Comments On” a Question or Answer). Simple enough, and like most other social graphs, users seem to be the focal points with the majority of connected edges. Now all I needed was a sample application that could construct the graph data model from the XML sources and run some queries.
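The data model described in the quote can be sketched in plain Python. This does not use the InfiniteGraph API. The `Graph` class, its method names, and the sample users and posts are assumptions for illustration; only the vertex kinds and edge labels come from the quoted post.

```python
from collections import defaultdict


class Graph:
    """A tiny labeled, directed graph: vertices have a kind, edges have a label."""

    def __init__(self):
        self.kinds = {}                     # vertex id -> "User" / "Question" / "Answer"
        self.out_edges = defaultdict(list)  # source id -> [(label, target id)]
        self.in_edges = defaultdict(list)   # target id -> [(label, source id)]

    def add_vertex(self, vertex_id, kind):
        self.kinds[vertex_id] = kind

    def add_edge(self, source, label, target):
        self.out_edges[source].append((label, target))
        self.in_edges[target].append((label, source))

    def sources(self, target, label):
        """Vertices that have an edge with this label pointing at target."""
        return [src for lbl, src in self.in_edges.get(target, []) if lbl == label]

    def degree(self, vertex_id):
        """Number of edges touching the vertex, incoming plus outgoing."""
        return (len(self.out_edges.get(vertex_id, []))
                + len(self.in_edges.get(vertex_id, [])))


g = Graph()
for vertex_id, kind in [("alice", "User"), ("bob", "User"),
                        ("q1", "Question"), ("a1", "Answer")]:
    g.add_vertex(vertex_id, kind)

g.add_edge("alice", "Posts", "q1")        # a User "Posts" a Question
g.add_edge("bob", "Posts", "a1")          # a User "Posts" an Answer
g.add_edge("a1", "For", "q1")             # an Answer is "For" a Question
g.add_edge("alice", "Comments On", "a1")  # a User "Comments On" an Answer
```

A query such as `g.sources("q1", "For")` then lists the answers to a question, and `g.degree(...)` gives the edge count that the quote uses to identify focal points.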
