State machine replication

In computer science, state machine replication (SMR) or state machine approach is a general method for implementing a fault-tolerant service by replicating servers and coordinating client interactions with server replicas.

Intuitively, if multiple copies of a system exist, a fault in one would be noticeable as a difference in the State or Output from the others.

[4] All of this deduction pre-supposes that replicas are experiencing only random independent faults such as memory errors or hard-drive crash.

Failures caused by replicas which attempt to lie, deceive, or collude can also be handled by the State Machine Approach, with isolated changes.

Failed replicas are not required to stop; they may continue operating, including generating spurious or incorrect Outputs.

2F+1 replicas, with non-cryptographic hashes suffices to survive all non-malicious Byzantine failures (with high probability).

Malicious attacks require cryptographic primitives to achieve 2F+1 (using message signatures), or non-cryptographic techniques can be applied but the number of replicas must be increased to 3F+1.

The appendix contains discussion on typical extensions used in real-world systems such as Logging, Checkpoints, Reconfiguration, and State Transfer.

The critical step in building a distributed system of State Machines is choosing an order for the Inputs to be processed.

[2][6][7][8][9] A Visible Channel is a communication path between two entities actively participating in the system (such as clients and servers).

An order of Inputs may be defined using a voting protocol whose results depend only on the visible channels.

Client requests are interpreted as Inputs to the State Machine, and processed into Outputs in the appropriate order.

Proof of failure is difficult to obtain, as the replica may simply be slow to respond,[13] or even lie about its status.

A common implementation is to pass checksums of the current replica State and recent Outputs among servers.

It is possible that the local server is compromised, or that the Audit process is faulty, and the replica continues to operate incorrectly.

Realistic deployments must compensate for transient non-failure behaviors of the system such as message loss, network partitions, and slow processors.

A persistent log may compensate for extended transient periods, or support additional system features such as Checkpoints, and Reconfiguration.

The system will ensure non-faulty replicas process this command in the same order, after which all log entries before the checkpoint may be discarded.

Replicas which cannot locate copies of a needed log entry are faulty and must re-join the system (see Reconfiguration).

Reconfiguration allows replicas to be added and removed from a system while client requests continue to be processed.

[7] That is, one of the replicas must remain leader long enough to achieve consensus on the next operation of the state machine.

A number of researchers published articles on the replicated state machine approach in the early 1980s.

Leslie Lamport also proposed the state machine approach, in his 1984 paper on "Using Time Instead of Timeout In Distributed Systems".

Recent work by Miguel Castro and Barbara Liskov used the state machine approach in what they call a "Practical Byzantine fault tolerance" architecture that replicates especially sensitive services using a version of Lamport's original state machine approach, but with optimizations that substantially improve performance.

BFT-SMaRt is the most recent effort to implement state machine replication, still being actively maintained.

Motivated by PBFT, Tendermint BFT[16] was introduced for partial asynchronous networks and it is mainly used for Proof of Stake blockchains.