Salsa20, the original cipher, was designed by Bernstein in 2005 and later submitted to eSTREAM, the European Union cryptographic validation process.
This gives Salsa20 and ChaCha the unusual advantage that the user can efficiently seek to any position in the key stream in constant time.
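The seeking property can be illustrated with a minimal Python sketch. It assumes the standard Salsa20/20 layout with a 256-bit key, a 64-bit nonce, and a 64-bit block counter. The function names `salsa20_block` and `keystream_at` are hypothetical, and the code has not been validated against official test vectors, so it should be read as an illustration rather than a usable implementation. Because each 64-byte block depends only on the key, nonce, and block counter, the work needed to read the key stream at a given offset depends on the number of bytes requested, not on the offset.

```python
import struct

MASK = 0xFFFFFFFF
# The four constant words spell "expand 32-byte k" in little-endian ASCII.
SIGMA = (0x61707865, 0x3320646E, 0x79622D32, 0x6B206574)

def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK) | (x >> (32 - n))

def quarter_round(s, a, b, c, d):
    """Salsa20 quarter-round applied in place to four words of the state."""
    s[b] ^= rotl32((s[a] + s[d]) & MASK, 7)
    s[c] ^= rotl32((s[b] + s[a]) & MASK, 9)
    s[d] ^= rotl32((s[c] + s[b]) & MASK, 13)
    s[a] ^= rotl32((s[d] + s[c]) & MASK, 18)

def salsa20_block(key, nonce, counter, rounds=20):
    """Return the 64-byte key stream block with the given block counter."""
    k = struct.unpack("<8I", key)      # 32-byte key
    n = struct.unpack("<2I", nonce)    # 8-byte nonce
    ctr = (counter & MASK, (counter >> 32) & MASK)
    init = [SIGMA[0], k[0], k[1], k[2],
            k[3], SIGMA[1], n[0], n[1],
            ctr[0], ctr[1], SIGMA[2], k[4],
            k[5], k[6], k[7], SIGMA[3]]
    s = list(init)
    for _ in range(rounds // 2):
        # Column round
        quarter_round(s, 0, 4, 8, 12)
        quarter_round(s, 5, 9, 13, 1)
        quarter_round(s, 10, 14, 2, 6)
        quarter_round(s, 15, 3, 7, 11)
        # Row round
        quarter_round(s, 0, 1, 2, 3)
        quarter_round(s, 5, 6, 7, 4)
        quarter_round(s, 10, 11, 8, 9)
        quarter_round(s, 15, 12, 13, 14)
    # Add the original state word by word to the mixed state.
    return struct.pack("<16I", *((x + y) & MASK for x, y in zip(s, init)))

def keystream_at(key, nonce, offset, length):
    """Return `length` key stream bytes starting at byte `offset`."""
    first_block, skip = divmod(offset, 64)
    out = bytearray()
    block = first_block
    while len(out) < skip + length:
        out += salsa20_block(key, nonce, block)
        block += 1
    return bytes(out[skip:skip + length])

key = bytes(range(32))
nonce = bytes(8)
stream = keystream_at(key, nonce, 0, 256)
# Seeking directly to byte 100 matches the sequentially generated stream.
assert keystream_at(key, nonce, 100, 50) == stream[100:150]
```

A call such as `keystream_at(key, nonce, 10**12, 16)` computes a single block, since no earlier blocks are needed to reach that position.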
Salsa20 offers speeds of around 4–14 cycles per byte in software on modern x86 processors,[5] and reasonable hardware performance.
It is not patented, and Bernstein has written several public domain implementations optimized for common architectures.
In other words, applying the reverse operations would produce the original 4×4 matrix, including the key.
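The invertibility of the mixing can be seen in a short Python sketch of the Salsa20 quarter-round and its inverse. The function names are hypothetical, and the sketch covers a single quarter-round rather than the full cipher. Each step XORs one word with a value computed from the other words, so repeating the same steps in reverse order undoes them.

```python
MASK = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK) | (x >> (32 - n))

def quarter_round(a, b, c, d):
    """Salsa20 quarter-round on four 32-bit words."""
    b ^= rotl32((a + d) & MASK, 7)
    c ^= rotl32((b + a) & MASK, 9)
    d ^= rotl32((c + b) & MASK, 13)
    a ^= rotl32((d + c) & MASK, 18)
    return a, b, c, d

def inverse_quarter_round(a, b, c, d):
    """Undo quarter_round by applying the same XOR steps in reverse order."""
    a ^= rotl32((d + c) & MASK, 18)
    d ^= rotl32((c + b) & MASK, 13)
    c ^= rotl32((b + a) & MASK, 9)
    b ^= rotl32((a + d) & MASK, 7)
    return a, b, c, d

words = (0x00000001, 0x00000000, 0x00000000, 0x00000000)
mixed = quarter_round(*words)
# The inverse recovers the original words exactly.
assert inverse_quarter_round(*mixed) == words
```

Since every round is built from such quarter-rounds, the whole sequence of rounds can be run backwards in the same way, which is why the final word-by-word addition of the original state is needed to make the output hard to invert.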
These variants were introduced to complement the original Salsa20, not to replace it, and perform better[note 1] in the eSTREAM benchmarks than Salsa20, though with a correspondingly lower security margin.
[12] The eSTREAM committee recommends the use of Salsa20/12, the 12-round variant, for "combining very good performance with a comfortable margin of security."
In 2005, Paul Crowley reported an attack on Salsa20/5 with an estimated time complexity of 2^165 and won Bernstein's US$1000 prize for "most interesting Salsa20 cryptanalysis".
[15] In 2007, Tsunoo et al. announced a cryptanalysis of Salsa20 which breaks 8 out of 20 rounds to recover the 256-bit secret key in 2^255 operations, using 2^11.37 keystream pairs.
In 2008, Bernstein published the closely related ChaCha family of ciphers, which aim to increase the diffusion per round while achieving the same or slightly better performance.
[20] Additionally, the input formatting has been rearranged to support an efficient SSE implementation optimization discovered for Salsa20.
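The increased diffusion per round comes from the ChaCha quarter-round, which updates each of its four words twice, where the Salsa20 quarter-round updates each word once. The following Python sketch shows the ChaCha quarter-round with its rotation distances of 16, 12, 8, and 7. The function name is hypothetical, and the sample input is the quarter-round test vector from RFC 8439, section 2.1.1.

```python
MASK = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK) | (x >> (32 - n))

def chacha_quarter_round(a, b, c, d):
    """ChaCha quarter-round: every word is modified twice."""
    a = (a + b) & MASK
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK
    b = rotl32(b ^ c, 7)
    return a, b, c, d

result = chacha_quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
print([hex(w) for w in result])
# → ['0xea2a92f4', '0xcb1cf8ce', '0x4581472e', '0x5881c4bb']
```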
It also defines a variant using sixteen 64-bit words (1024 bits of state), with correspondingly adjusted rotation constants.
[22] Aumasson argued in 2020 that 8 rounds of ChaCha (ChaCha8) probably provide enough resistance to future cryptanalysis for the same security level, yielding a 2.5× speedup over the 20-round ChaCha20.
[19] Google had selected ChaCha20 along with Bernstein's Poly1305 message authentication code in SPDY, which was intended as a replacement for TLS over TCP.
[31] ChaCha20 is also used for the arc4random random number generator in FreeBSD,[32] OpenBSD,[33] and NetBSD[34] operating systems, instead of the broken RC4, and in DragonFly BSD[35] for the CSPRNG subroutine of the kernel.
[36][37] Starting from version 4.8, the Linux kernel uses the ChaCha20 algorithm to generate data for the nonblocking /dev/urandom device.
As a result, ChaCha20 is sometimes preferred over AES in certain use cases involving mobile devices, which mostly use ARM-based CPUs.