Optimizing Memcached Performance on a Rapidly Growing Site

Predicting operational growth by monitoring the correct metrics:

Such simple data can reveal a wealth of insights. Most important is the cache’s miss rate: how frequently do we need to regenerate data? It is the miss rate that ultimately impacts site performance. Using such data, we were shocked to discover that we were caching far less than we thought, and that our cache behaved quite erratically, with a greater than 2x difference between peak and trough miss rates.
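
As a rough illustration of the metric discussed above, the sketch below derives a miss rate from the get_hits and get_misses counters that memcached reports in response to its text-protocol "stats" command. The helper names (parse_stats, miss_rate, interval_miss_rate, peak_to_trough) and the sample numbers are hypothetical, not taken from the original article; how the raw stats text is fetched and how often it is sampled are left as assumptions.

```python
def parse_stats(raw: str) -> dict:
    """Parse the text reply of memcached's `stats` command into a dict.

    Each line of interest looks like `STAT <name> <value>`; the reply
    ends with `END`, which is skipped here along with anything else.
    """
    stats = {}
    for line in raw.splitlines():
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def miss_rate(stats: dict) -> float:
    """Lifetime miss rate: misses / (hits + misses), or 0.0 with no gets."""
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    total = hits + misses
    return misses / total if total else 0.0


def interval_miss_rate(prev: dict, curr: dict) -> float:
    """Miss rate between two samples.

    The counters are cumulative since server start, so the difference
    between consecutive samples gives the rate for that window only.
    """
    d_hits = int(curr["get_hits"]) - int(prev["get_hits"])
    d_misses = int(curr["get_misses"]) - int(prev["get_misses"])
    total = d_hits + d_misses
    return d_misses / total if total else 0.0


def peak_to_trough(rates):
    """Ratio of the highest to the lowest non-zero miss rate, or None."""
    nonzero = [r for r in rates if r > 0]
    return max(nonzero) / min(nonzero) if nonzero else None


if __name__ == "__main__":
    # Hypothetical samples, for illustration only.
    earlier = parse_stats("STAT get_hits 900\r\nSTAT get_misses 100\r\nEND\r\n")
    later = parse_stats("STAT get_hits 1700\r\nSTAT get_misses 300\r\nEND\r\n")
    print("lifetime miss rate:", miss_rate(later))
    print("interval miss rate:", interval_miss_rate(earlier, later))
```

Sampling the counters at a fixed interval and feeding the per-window rates to peak_to_trough is one way to spot the kind of peak-versus-trough swing described above; a lifetime average alone would hide it.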

The story reminded me of the Foursquare outage.

