Maxim Grinev explains why idempotent data operations are important for scalability, and why highly distributed NoSQL databases like Cassandra, Riak, or Project Voldemort can hold a few small surprises if your updates are not idempotent:
The idea is that you retry the failed update until it is successful. As a result, the same update can be executed several times! If the update increments a counter, the counter value becomes incorrect. Here we come to the main point of this post: all your updates should be idempotent (i.e. applying an update repeatedly has the same effect as applying it once). Designing updates to be idempotent is the standard discipline to cope with repeated updates. Read the great articles ☞ Life beyond Distributed Transactions: an Apostate’s Opinion (pdf) and ☞ Building on Quicksand by Pat Helland, which stress the importance of idempotent updates in highly scalable systems.
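To make the counter problem concrete, here is a minimal Python sketch. It is not taken from Grinev's post and does not use the API of Cassandra, Riak, or Voldemort: `CounterStore`, `increment`, `increment_once`, and `send_with_retry` are hypothetical names, and the store is a plain in-memory object standing in for a remote database. It assumes a client that never sees an acknowledgement and so resends the same update three times.

```python
import uuid


class CounterStore:
    """Toy in-memory store standing in for a remote database (hypothetical)."""

    def __init__(self):
        self.value = 0
        self.applied_ids = set()

    def increment(self, amount):
        # Non-idempotent: every delivery of the update changes the value again.
        self.value += amount

    def increment_once(self, update_id, amount):
        # Idempotent: an update whose id was already applied is ignored.
        if update_id in self.applied_ids:
            return
        self.applied_ids.add(update_id)
        self.value += amount


def send_with_retry(apply_update, deliveries=3):
    """Simulate a client that resends the same update because it never
    sees an acknowledgement; the update reaches the store `deliveries` times."""
    for _ in range(deliveries):
        apply_update()


# Naive increment: three deliveries of one logical update.
naive = CounterStore()
send_with_retry(lambda: naive.increment(1))

# Idempotent increment: the same update id is reused on every retry.
safe = CounterStore()
update_id = str(uuid.uuid4())
send_with_retry(lambda: safe.increment_once(update_id, 1))

print(naive.value)  # 3
print(safe.value)   # 1
```

Tagging each logical update with a unique id is only one way to get idempotence, chosen here because it is easy to show. The cost is that the store has to remember which ids it has already applied, which a real distributed system would need to bound or expire.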