This often requires coordinating processes to reach consensus, or agree on some data value that is needed during computation.
Example applications of consensus include agreeing on what transactions to commit to a database in which order, state machine replication, and atomic broadcasts.
Real-world applications often requiring consensus include cloud computing, clock synchronization, PageRank, opinion formation, smart power grids, state estimation, control of UAVs (and multiple robots/agents in general), load balancing, blockchain, and others.
The consensus problem requires agreement among a number of processes (or agents) on a single data value.
Some of the processes (agents) may fail or be unreliable in other ways, so consensus protocols must be fault-tolerant or resilient.
Protocols that solve consensus problems are designed to deal with a limited number of faulty processes.
In evaluating the performance of consensus protocols two factors of interest are running time and message complexity.
[3] In the most traditional single-value consensus protocols such as Paxos, cooperating nodes agree on a single value such as an integer, which may be of variable size so as to encode useful metadata such as a transaction committed to a database.
In a fully asynchronous message-passing distributed system, in which at least one process may have a crash failure, it has been proven in the famous 1985 FLP impossibility result by Fischer, Lynch and Paterson that a deterministic algorithm for achieving consensus is impossible.
[5] This impossibility result derives from worst-case scheduling scenarios, which are unlikely to occur in practice except in adversarial situations such as an intelligent denial-of-service attacker in the network.
For instance, the loss of a communication link may be modeled as a process which has suffered a Byzantine failure.
Randomized consensus algorithms can circumvent the FLP impossibility result by achieving both safety and liveness with overwhelming probability, even under worst-case scheduling scenarios such as an intelligent denial-of-service attacker in the network.
In the absence of such a well-defined, closed group with authenticated members, a Sybil attack against an open consensus group can defeat even a Byzantine consensus algorithm, simply by creating enough virtual participants to overwhelm the fault tolerance threshold.
A permissionless consensus protocol, in contrast, allows anyone in the network to join dynamically and participate without prior permission, but instead imposes a different form of artificial cost or barrier to entry to mitigate the Sybil attack threat.
Bitcoin introduced the first permissionless consensus protocol using proof of work and a difficulty adjustment function, in which participants compete to solve cryptographic hash puzzles, and probabilistically earn the right to commit blocks and earn associated rewards in proportion to their invested computational effort.
[2] In a fully asynchronous system there is no consensus solution that can tolerate one or more crash failures even when only requiring the non triviality property.
[5] This result is sometimes called the FLP impossibility proof named after the authors Michael J. Fischer, Nancy Lynch, and Mike Paterson who were awarded a Dijkstra Prize for this significant work.
The Paxos consensus algorithm by Leslie Lamport, and variants of it such as Raft, are used pervasively in widely deployed distributed and cloud computing systems.
These algorithms are typically synchronous, dependent on an elected leader to make progress, and tolerate only crashes and not Byzantine failures.
An example of a polynomial time binary consensus protocol that tolerates Byzantine failures is the Phase King algorithm by Garay and Berman.
[14] The algorithm solves consensus in a synchronous message passing model with n processes and up to f failures, provided n > 4f.
The king broadcasts the majority value it observed in the first round and serves as a tie breaker.
[15] Chubby maintains lock information in small files which are stored in a replicated database to achieve high availability in the face of failures.
The database is implemented on top of a fault-tolerant log layer which is based on the Paxos consensus algorithm.
In this scheme, Chubby clients communicate with the Paxos master in order to access/update the replicated log; i.e., read/write to the files.
Another well-known approach is called MSR-type algorithms which have been used widely from computer science to control theory.
To extend bitcoin's blockchain or distributed ledger, miners attempt to solve a cryptographic puzzle, where probability of finding a solution is proportional to the computational effort expended in hashes per second.
As an example, bitcoin mining (2018) is estimated to consume non-renewable energy sources at an amount similar to the entire nations of Czech Republic or Jordan, while the total energy consumption of Ethereum, the largest proof of stake network, is just under that of 205 average US households.
Researchers defined wait-freedom as the guarantee that the algorithm completes in a finite number of steps.
), which means they can solve consensus among any number of processes and they can simulate any other objects through an operation sequence.