Learn about Conflict Resolution and Vector Clocks
After spending some time in the NoSQL space, you start hearing about conflict resolution, vector clocks, version vectors, etc. Some NoSQL projects do not need any of these, either because they are not distributed or because they use a central node for coordinating writes. Sooner or later you’ll probably still need to learn about them, so I thought I should put together a short list of resources that I’ve found interesting.
While you might be tempted to begin with Wikipedia, the conflict resolution entry[1] is not the best place to start. I’ve found that the CouchDB book’s ☞ conflict management chapter offers a good enough perspective and examples.
One notion closely related to conflict resolution is vector clocks[2]. The guys from Basho, the creators of Riak, have two blog posts covering different aspects of vector clocks (nb: make sure you also read the comment threads on both posts):
- ☞ Why vector clocks are easy: covers vector clocks from the client’s perspective
- ☞ Why vector clocks are hard: covers them from the implementor’s perspective
Last, but not least, you should also check Jeff Darcy’s post on ☞ conflict resolution:
The most important thing about both vector clocks and version vectors (henceforth “vectors” for both) is that they do not by themselves resolve conflicts. All they can do is detect conflicts, meaning updates whose order cannot be determined. The conflicting versions must all be saved until someone, at some time, looks at them and determines how to resolve the conflict – i.e. turn them into a single combined version.
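To make the "detect, not resolve" point concrete, here is a minimal sketch in Python. It is not taken from Riak, Voldemort, or any of the posts above; the function names (`increment`, `descends`, `compare`, `merge`) and the node names are made up for illustration, and a clock is assumed to be a plain dict mapping a node id to a counter.

```python
from typing import Dict

# Assumption: a vector clock is a dict of node id -> number of updates
# that node has made to this value. Missing nodes count as 0.
VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if `a` has seen every update recorded in `b`."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify two versions; "conflict" means their order is undetermined."""
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a descends from b"
    if b_ge_a:
        return "b descends from a"
    return "conflict"


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Entry-wise maximum: a clock that covers both `a` and `b`.

    This only builds the clock for the resolved version. Deciding what the
    resolved *value* should be is still up to the application.
    """
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


# Two clients start from the same version and update it independently.
base = increment({}, "alice")
v1 = increment(base, "bob")
v2 = increment(base, "carol")

print(compare(v1, base))  # v1 was derived from base
print(compare(v1, v2))    # neither saw the other's update

# Someone reads both siblings, picks a combined value, and writes it back.
resolved = increment(merge(v1, v2), "alice")
print(compare(resolved, v1))
print(compare(resolved, v2))
```

Note that `compare` can only report that `v1` and `v2` conflict. Nothing in the clocks says which value should win, which is exactly the point of the quote above.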
While you probably won’t become an expert after reading these, you’ll at least be able to follow along when people around you discuss conflict resolution or vector clocks in Riak, Cassandra, Project Voldemort, CouchDB, or any other system.
References
- [1] ☞ Wikipedia: Conflict resolution
- [2] This time the Wikipedia entry on ☞ vector clocks can be useful (even if a bit too concise).