Or the horrid story of trying distributed transactions:
Second, we’re setting up fire drills to deal with the redis lockups. The client should behave much better with regard to timeouts, similar to how well memcached handles failure.
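To make the timeout point concrete, here is a minimal sketch of bounding a cache call with a deadline and falling back to a default, in the spirit of treating a slow cache like a cache miss. The helper name `call_with_deadline` and its parameters are hypothetical, not taken from the Campfire code or from any redis client. A real client would more likely set connect and read timeouts on the socket itself.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Small shared pool for the illustration; the size is an arbitrary assumption.
_pool = ThreadPoolExecutor(max_workers=4)


def call_with_deadline(fn, timeout_s, fallback=None):
    """Run fn() but stop waiting after timeout_s seconds.

    Returns fallback when the call is too slow or raises, so the caller
    can degrade gracefully instead of hanging on a locked-up server.
    """
    future = _pool.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The worker thread is not killed; we only stop waiting for it.
        future.cancel()
        return fallback
    except Exception:
        # Any other failure of the call is treated the same way as a miss.
        return fallback
```

The caller wraps each cache lookup, for example `call_with_deadline(lambda: cache.get(key), 0.1)`, where `cache` stands for whatever client object is in use. The trade-off is that a stuck call still occupies a worker thread until it returns, which is why socket-level timeouts are the sturdier fix.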
Third, we’re going to get all redis interaction out of MySQL transaction blocks, so that problems with redis don’t also cause problems with MySQL.
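A rough sketch of what keeping the cache call outside the transaction block can look like: commit the database write first, and only then talk to the cache, so a cache failure can neither hold the transaction open nor roll it back. Everything here is an assumption made for illustration. `sqlite3` stands in for MySQL, `FlakyCache` is a hypothetical stand-in for a redis client, and the `messages` table and `post_message` function are invented.

```python
import sqlite3


class FlakyCache:
    """Hypothetical stand-in for a redis client that may be unavailable."""

    def __init__(self, fail=False):
        self.fail = fail
        self.published = []

    def publish(self, channel, message):
        if self.fail:
            raise ConnectionError("cache unavailable")
        self.published.append((channel, message))


def post_message(db, cache, room, body):
    """Store a message, then notify the cache after the commit.

    Returns (message_id, notified). The row is committed whether or
    not the cache call succeeds.
    """
    # Step 1: the transaction contains SQL only. The connection used as a
    # context manager commits on success and rolls back on an exception.
    with db:
        cur = db.execute(
            "INSERT INTO messages (room, body) VALUES (?, ?)", (room, body)
        )
        message_id = cur.lastrowid

    # Step 2: the cache call runs after the commit, so its failure
    # cannot block or undo the database write.
    try:
        cache.publish(room, body)
        return message_id, True
    except ConnectionError:
        return message_id, False
```

The cost of this ordering is that the two stores can briefly disagree: the row exists but the notification was never sent. The caller has to decide whether to retry or accept the gap, which is the usual price of avoiding a distributed transaction.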
On the other hand (and I’ve mentioned this before on Twitter), current programming languages and frameworks do not help us much in building services with built-in SLAs.
Original title and link: Explaining the Campfire outage on November 30th (NoSQL databases © myNoSQL)