Notes on Distributed Programming and CAP
First, an interesting presentation by Paulo Gaspar (@paulogaspar7) on ☞ Distributed programming and data consistency
Key take-aways:
The network fallacies:
- The network is reliable
- Latency is zero
- Bandwidth is infinite
- The network is secure
- Topology doesn’t change
- There is one administrator
- Transport cost is zero
- The network is homogeneous
CAP Trade-offs:
- CA without P: Databases providing distributed transactions can only do so while their network is up
- CP without A: While there is a partition, transactions to an ACID database may be blocked until the partition heals
- AP without C: Caching provides client-server partition resilience by replicating data, even if the partition prevents verifying whether a replica is fresh
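To make the CP and AP trade-offs a bit more concrete, here is a minimal sketch in Python. It is not taken from Paulo's slides: the `Replica` class, its `mode` flag, and the `Unavailable` exception are hypothetical names, and a plain dict stands in for the primary store. It only illustrates the behaviour during a partition: the CP replica refuses to answer, while the AP replica answers with a possibly stale value.

```python
class Unavailable(Exception):
    """Raised when a CP-style read refuses to answer during a partition."""


class Replica:
    """Hypothetical replica holding a local copy of one primary's data."""

    def __init__(self, primary, mode):
        self.primary = primary      # dict standing in for the primary store
        self.mode = mode            # "CP" or "AP" (illustrative flag)
        self.local = dict(primary)  # local copy taken at start-up
        self.partitioned = False    # True while the primary is unreachable

    def read(self, key):
        if not self.partitioned:
            # Primary reachable: refresh the local copy, so the answer is fresh.
            self.local[key] = self.primary[key]
            return self.local[key]
        if self.mode == "CP":
            # Consistency first: refuse rather than risk a stale answer.
            raise Unavailable(key)
        # Availability first: answer from the local copy, possibly stale.
        return self.local[key]


primary = {"x": 1}
cp, ap = Replica(primary, "CP"), Replica(primary, "AP")
cp.partitioned = ap.partitioned = True
primary["x"] = 2          # a write lands on the primary during the partition

stale = ap.read("x")      # the AP replica still answers, with the old value
try:
    cp.read("x")
    cp_answered = True
except Unavailable:
    cp_answered = False   # the CP replica gives up availability instead
```

Once the partition heals (`ap.partitioned = False`), the next read refreshes the local copy and the AP replica catches up with the primary.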
Another interesting post on this topic is ☞ The CAP Theorem Distilled by Sid Anand (@r39132). Under the assumption that “any system needs to support ‘P’” (NB: I am not sure why the article limits the analysis to this case only), the article compares ‘A’ and ‘C’ in CAP:
If you choose ‘C’, your system might implement 2-phase commit (a.k.a 2PC). […]
On the other hand, if you opt for an AP system, you are opening the door to potential data inconsistencies. […] AP systems can get quite complicated (relative to CP systems)
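For readers who have not seen 2PC before, the following is a rough in-memory sketch of the idea, not the article's code. `Participant` and `two_phase_commit` are hypothetical names, and the sketch leaves out what a real implementation needs most: durable logging, timeouts, and recovery. A participant that cannot be reached during the prepare phase would stall the whole transaction, which is where the availability cost of choosing ‘C’ comes from.

```python
class Participant:
    """Hypothetical resource manager taking part in a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # illustrative switch to force a "no" vote
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the change could be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if every participant committed, False if all rolled back."""
    # Phase 1 (voting): ask every participant to prepare.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2 (completion): unanimous yes, so commit everywhere.
        for p in participants:
            p.commit()
        return True
    # Any "no" vote aborts the transaction on every participant.
    for p in participants:
        p.rollback()
    return False


good = [Participant("a"), Participant("b")]
bad = [Participant("a"), Participant("b", can_commit=False)]
ok = two_phase_commit(good)       # both vote yes, both commit
failed = two_phase_commit(bad)    # one "no" vote rolls everyone back
```

The point of the two phases is that no participant commits until all of them have promised they can, so the outcome is all-or-nothing.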
In two subsequent articles, Sid explains “eventual consistency” for ☞ non-techies and for ☞ techies. I liked the way Paulo visually represented inconsistency across time in his slides: