/ˈpæk.sɒs/
noun … “Consensus algorithm for unreliable networks.”
Paxos is a fault-tolerant Consensus algorithm designed to achieve agreement among nodes in a Distributed System, even when some nodes fail or messages are delayed or lost. It ensures that a single value is chosen and consistently replicated across all non-faulty nodes, providing a foundation for reliable state machines, replicated databases, and coordination services.
Key characteristics of Paxos include:
- Roles: nodes operate as Proposers (suggest values), Acceptors (vote on values), and Learners (learn the agreed value).
- Quorum-based decisions: a value is chosen only when a majority of acceptors agree, ensuring safety despite node failures.
- Safety: at most one value can be chosen, preventing conflicting decisions.
- Liveness: the algorithm guarantees progress if a sufficient number of nodes are operational and communication is eventually reliable.
- Fault tolerance: works correctly even if some nodes crash or messages are delayed, provided a majority of acceptors remain reachable.
Workflow example: In a distributed key-value store, when a client proposes a new value for a key, Proposers send prepare requests to Acceptors. Acceptors respond with promises to reject lower-numbered proposals. Once a majority agree, the value is accepted and propagated to Learners, ensuring all non-faulty nodes converge on the same value.
-- Simplified Paxos prepare phase
proposers = ["P1", "P2"]
acceptors = ["A1", "A2", "A3"]
proposal_number = 1
for proposer in proposers {
for acceptor in acceptors {
send_prepare(acceptor, proposal_number)
}
}
-- Acceptors respond with promise to ignore lower-numbered proposalsConceptually, Paxos is like a committee that must choose a single candidate among several options. Even if some members are absent or communication is delayed, the committee follows a strict protocol to ensure only one candidate is elected, and all members eventually learn the result.
See Consensus, Raft, Distributed Systems, Replication, CAP Theorem.