CAP Theorem
In distributed computing, there are three properties which we can use to describe our system:
- Consistency
- Availability
- Partition Tolerance
Consistency means that the client will read the same data, no matter which node they read from.
Availability means that the client will get a response from our system, even when some nodes are down.
Partition Tolerance means that our system functions correctly even when it gets partitioned (some nodes are separated from the others, for example because of networking issues).
The CAP theorem says that it's impossible to have all three of those properties at the same time in our system. In other words, we need to sacrifice one of the properties to be able to have the other two.
Because of that, we can divide distributed systems into CA (Consistency and Availability), CP (Consistency and Partition Tolerance) and AP (Availability and Partition Tolerance) systems.
In practice, it's not possible to eliminate networking issues, so it's not possible to have a distributed system without the Partition Tolerance property. That leaves us with CP or AP systems.
In CP systems, we sacrifice Availability. For example, let's say our system gets partitioned: we have two nodes in one data center and one node in another, and the two data centers are not connected. In order to maintain Consistency, we need to disable data modifications until the system is connected again. Otherwise, we would accept writes on nodes one and two that could not be read from node three, so we would lose Consistency. Because we are blocking writes, we are losing Availability.
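The CP behaviour described above can be sketched in a few lines of Python. This is only a toy in-memory model, not how a real database implements it: the `CPCluster` class, its `partitioned` flag and the node names are all made up for illustration.

```python
class CPCluster:
    """Toy three-node store that rejects writes while partitioned (hypothetical)."""

    def __init__(self, node_names):
        # Each node keeps its own copy of the data.
        self.nodes = {name: {} for name in node_names}
        # Assumption: something outside this class detects the partition
        # and flips this flag.
        self.partitioned = False

    def write(self, key, value):
        # Refuse the write when not every node can be reached,
        # so the copies can never diverge.
        if self.partitioned:
            raise RuntimeError("unavailable: cluster is partitioned")
        for store in self.nodes.values():
            store[key] = value

    def read(self, node, key):
        return self.nodes[node].get(key)


cp = CPCluster(["node1", "node2", "node3"])
cp.write("x", 1)           # healthy cluster: the write reaches all nodes
cp.partitioned = True      # node3 is cut off from node1 and node2

try:
    cp.write("x", 2)
    cp_write_accepted = True
except RuntimeError:
    cp_write_accepted = False   # the client gets an error: no Availability

# Every node still returns the same value: Consistency is kept.
cp_values = {cp.read(name, "x") for name in cp.nodes}
```

The write during the partition fails, but reading `x` from any of the three nodes still gives the same answer.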
In AP systems, we sacrifice Consistency. For example, if our system gets partitioned in the same way and we want to keep it available, then clients can still write to nodes one and two, but node three will not see those writes and will return stale data. We have Availability but don't have Consistency.
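The AP case can be sketched in the same spirit. Again, this is a toy in-memory model under simplifying assumptions: the `APCluster` class, its `partition` method and the node names are hypothetical, and it does not show how the nodes would reconcile their data once the partition heals.

```python
class APCluster:
    """Toy three-node store that keeps accepting writes while partitioned (hypothetical)."""

    def __init__(self, node_names):
        # Each node keeps its own copy of the data.
        self.nodes = {name: {} for name in node_names}
        # Groups of nodes that can currently reach each other.
        self.groups = [set(node_names)]

    def partition(self, *groups):
        # Split the cluster into isolated groups of nodes.
        self.groups = [set(group) for group in groups]

    def write(self, node, key, value):
        # Accept the write and replicate it only to the nodes
        # that are reachable from the node the client talked to.
        reachable = next(group for group in self.groups if node in group)
        for name in reachable:
            self.nodes[name][key] = value

    def read(self, node, key):
        return self.nodes[node].get(key)


ap = APCluster(["node1", "node2", "node3"])
ap.write("node1", "x", 1)                       # healthy: all nodes get x = 1
ap.partition({"node1", "node2"}, {"node3"})     # node3 is cut off
ap.write("node1", "x", 2)                       # still accepted: Availability

ap_fresh = ap.read("node2", "x")   # node2 saw the new write
ap_stale = ap.read("node3", "x")   # node3 still has the old value
```

Here the write during the partition succeeds, but `node2` and `node3` now disagree about the value of `x`, which is exactly the loss of Consistency.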