Conflict-free replicated data type

In distributed computing, a conflict-free replicated data type (CRDT) is a data structure that is replicated across multiple computers in a network, with the following features:[1][2][3][4][5][6][7][8] The CRDT concept was formally defined in 2011 by Marc Shapiro, Nuno Preguiça, Carlos Baquero and Marek Zawirski.

The NoSQL distributed databases Redis, Riak and Cosmos DB have CRDT data types.

Accordingly, much of distributed computing focuses on the problem of how to prevent concurrent updates to replicated data.

While optimistic replication might not work in the general case, there is a significant and practically useful class of data structures, CRDTs, where it does work — where it is always possible to merge or resolve concurrent updates on different replicas of the data structure without conflicts.

As an example, a one-way Boolean event flag is a trivial CRDT: one bit, with a value of true or false.

The intuition behind commutativity, associativity and idempotence is that these properties are used to make the CRDT invariant under package re-ordering and duplication.

Operation-based CRDTs (also called commutative replicated data types, or CmRDTs) are defined without a merge function.

State-based CRDTs are often simpler to design and to implement; their only requirement from the communication substrate is some kind of gossip protocol.

However, operation-based CRDTs require guarantees from the communication middleware; that the operations are not dropped or duplicated when transmitted to the other replicas, and that they are delivered in causal order.

Gossip protocols work well for propagating state-based CRDT state to other replicas while reducing network use and handling topology changes.

Updates are propagated in the background, and merged by taking the max() of every element in P. The compare function is included to illustrate a partial order on the states.

In this case, two G-Counters are combined to create a data type supporting both increment and decrement operations.

[16] A sequence, list, or ordered set CRDT can be used to build a collaborative real-time editor, as an alternative to operational transformation (OT).

[19] CRATE[20] is a decentralized real-time editor built on top of LSEQSplit (an extension of LSEQ) and runnable on a network of browsers using WebRTC.

MUTE [22][23] is an online web-based peer-to-peer real-time collaborative editor relying on the LogootSplit algorithm.

Industrial sequence CRDTs, including open-source ones, are known to out-perform academic implementations due to optimizations and a more realistic testing methodology.

[24] The main popular example is Yjs CRDT, a pioneer in using a plainlist instead of a tree (ala Kleppmann's automerge).