- tags: Distributed Systems
Data replication happens when the same data is stored on multiple storage devices that might not be on the same network.
- High availability since the system is still available even when some nodes are offline.
- Higher performance and scalability since the clients can connect to their closest data source.
- Transparency: the client needs to see one system instead of cluster of collaborating systems.
- Consistency: the ideal model is where a client makes an update and all observers would see it immediately.
Replication can tolerate a maximum number of faults:
- Fail-silent faults: for $f+1$ nodes, we can tolerate $f$ faults since we only need one replica to get the value.
- Byzantine faults: for $2f+1$ nodes, we can tolerate $f$ faults. This way we can assure that we receive $f+1$ equal responses.
Active vs Passive
There are two types of data replication:
- Active, where all servers execute all requests.
- Passive, where there is a primary server and multiple secondary ones. Only the primary server executes the requests.
Building reliable distributed systems at a worldwide scale demands trade-offs between consistency and availability - Werner Vogels1
Data Consistency, High Availability, Tolerance to Network Partition: only 2 of these can be achieved at a given time.
- Data Consistency + High Availability
- Not tolerant to network partitions
- Usually uses transaction protocol
- Client and storage systems must be part of the same environment
- On large scale distributed systems, network partitions are a given, so it’s highly unlikely that the C+A “paradigm” can be applied.
- System may not be available
- System may not be consistent
- ACID (Atomicity, Consistency, Isolation, Durability)
- From the practical POV, monotonic reads and read-your-writes are the most desirable properties, but not always required.
For these models, we will take into account these components:
- A storage system.
- A process A that reads and writes.
- Two other processes B and C.
After A makes an update, both B and C will return the updated value.
After an update, there is no guarantee that subsequent updates will return the updated value for none of the clients, not even for the one who made the update.
- Gossip architecture.
It’s a #Weak Consistency model. If no new updates are made to a given item, eventually all accesses to that item will return the updated value.
Conditions such as propagation delays and others can be taken into account to calculate the inconsistency window.
E.g.: Domain Name System
In this model, all updates that are causally related must be applied in order on all processes. Other updates might be applied at any time.
After a process A writes an update to a given item, all subsequent reads by the same process should return the updated value.
Practical version of #Read-your-writes Consistency. There is a session between a process and a storage system. While the session is on, the process should always read values consistent to their previous updates.
Monotonic Read Consistency
If the process A sees an update for a given item, all subsequent reads by that process on that item must reflect that update.
Monotonic Write Consistency
The updates made by a process A to a given item are all applied in order, i.e., each update is executed after the previous one is executed.