Cyclic redundancy check

A cyclic redundancy check (CRC) is an error-detecting code commonly used in digital networks and storage devices to detect accidental changes to digital data.[1][2] Blocks of data entering these systems get a short check value attached, based on the remainder of a polynomial division of their contents.

On retrieval, the calculation is repeated and, in the event the check values do not match, corrective action can be taken against data corruption.

[3] CRCs are so called because the check (data verification) value is a redundancy (it expands the message without adding information) and the algorithm is based on cyclic codes.

CRCs are popular because they are simple to implement in binary hardware, easy to analyze mathematically, and particularly good at detecting common errors caused by noise in transmission channels.

The use of systematic cyclic codes, which encode messages by adding a fixed-length check value, for the purpose of error detection in communication networks, was first proposed by W. Wesley Peterson in 1961.

[4] Cyclic codes are not only simple to implement but have the benefit of being particularly well suited for the detection of burst errors: contiguous sequences of erroneous data symbols in messages.

Typically an n-bit CRC applied to a data block of arbitrary length will detect any single error burst not longer than n bits, and the fraction of all longer error bursts that it will detect is approximately 1 − 2⁻ⁿ.

The important caveat is that the polynomial coefficients are calculated according to the arithmetic of a finite field, so the addition operation can always be performed bitwise-parallel (there is no carry between digits).
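As a brief illustration (a sketch added here, not part of the cited literature), addition of GF(2) coefficients is simply the bitwise XOR of the two operands, and subtraction is the identical operation:

    # Addition over GF(2) sums each coefficient modulo 2; no carries propagate.
    a = 0b1101     # x^3 + x^2 + 1
    b = 0b1011     # x^3 + x + 1
    s = a ^ b      # x^2 + x
    print(bin(s))  # -> 0b110
    # Subtraction is the same operation, so XORing b back recovers a.
    assert s ^ b == a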

The simplest error-detection system, the parity bit, is in fact a 1-bit CRC: it uses the generator polynomial x + 1 (two terms),[5] and has the name CRC-1.
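For instance (an illustrative sketch; the function name crc1 is not from any standard), reducing a message modulo x + 1 over GF(2) amounts to summing all of its coefficients modulo 2, so the CRC-1 check value is just the XOR of all message bits:

    from functools import reduce
    from operator import xor

    def crc1(bits: str) -> int:
        # Remainder modulo the generator x + 1: the XOR of all bits,
        # which is exactly an even-parity bit.
        return reduce(xor, (int(b) for b in bits), 0)

    print(crc1("1011"))  # -> 1 (three 1 bits, so odd parity)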

When a codeword is received or read, the device either compares its check value with one freshly calculated from the data block, or equivalently, performs a CRC on the whole codeword and compares the resulting check value with an expected residue constant.

If the check values do not match, the block contains a data error; otherwise, the data is assumed to be error-free (though, with some small probability, it may contain undetected errors; this is inherent in the nature of error-checking).
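Both checks can be sketched with Python's built-in CRC-32 (zlib.crc32); the residue constant 0x2144DF1C below is specific to that CRC variant and to appending the check value least significant byte first:

    import zlib

    message = b"Hello, CRC"
    stored_check = zlib.crc32(message)

    # Method 1: recompute the check value and compare with the stored one.
    assert zlib.crc32(message) == stored_check

    # Method 2: run the CRC over the entire codeword (message plus appended
    # check value) and compare the result with the expected residue constant.
    codeword = message + stored_check.to_bytes(4, "little")
    assert zlib.crc32(codeword) == 0x2144DF1C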

[6] CRCs are specifically designed to protect against common types of errors on communication channels, where they can provide quick and reasonable assurance of the integrity of messages delivered.

However, CRCs are not suitable for protecting against intentional alteration of data. Firstly, as there is no authentication, an attacker can edit a message and recompute the CRC without the substitution being detected.

[9] To compute an n-bit binary CRC, line the bits representing the input in a row, and position the (n + 1)-bit pattern representing the CRC's divisor (called a "polynomial") underneath the left end of the row.

Start with the message to be encoded, for example 11010011101100. This is first padded with zeros corresponding to the bit length n of the CRC.

Here is the first calculation for computing a 3-bit CRC:

    11010011101100 000 <--- input right padded by 3 bits
    1011               <--- divisor (4 bits) = x³ + x + 1
    ------------------
    01100011101100 000 <--- result

The algorithm acts on the bits directly above the divisor in each step.

The validity of a received message can easily be verified by performing the above calculation again, this time with the check value appended in place of the zeroes. The remainder should equal zero if there are no detectable errors.
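The procedure can be sketched in Python as follows (the function names crc_remainder and crc_check are illustrative, not taken from any standard):

    def crc_remainder(message: str, divisor: str) -> str:
        """n-bit CRC of a message given as a bit string, using an
        (n + 1)-bit divisor, computed by GF(2) long division."""
        n = len(divisor) - 1
        row = list(message + "0" * n)  # pad with n zeros
        for i in range(len(message)):
            if row[i] == "1":
                for j, d in enumerate(divisor):
                    # Carry-less subtraction of the divisor is XOR.
                    row[i + j] = str(int(row[i + j]) ^ int(d))
        return "".join(row[-n:])       # the remainder is the check value

    def crc_check(message: str, divisor: str, check: str) -> bool:
        """Repeat the division with the check value appended in place of
        the zero padding; an all-zero remainder means no detectable error."""
        n = len(divisor) - 1
        row = list(message + check)
        for i in range(len(message)):
            if row[i] == "1":
                for j, d in enumerate(divisor):
                    row[i + j] = str(int(row[i + j]) ^ int(d))
        return row[-n:] == ["0"] * n

    check = crc_remainder("11010011101100", "1011")
    print(check)                                       # -> 100
    print(crc_check("11010011101100", "1011", check))  # -> True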

Note that this code works with string inputs rather than raw numbers.

Mathematical analysis of this division-like process reveals how to select a divisor that guarantees good error-detection properties.

In this analysis, the digits of the bit strings are taken as the coefficients of a polynomial in some variable x, coefficients that are elements of the finite field GF(2) (the integers modulo 2, i.e. either a zero or a one) rather than more familiar numbers; for example, the divisor 1011 corresponds to the polynomial x³ + x + 1.

A primitive generator polynomial of degree r gives a maximal total blocklength of 2ʳ − 1, within which the code detects all single-bit and double-bit errors. However, choosing a reducible polynomial will result in a certain proportion of missed errors, due to the quotient ring having zero divisors.

A generator polynomial that admits other factorizations may then be chosen so as to balance the maximal total blocklength with a desired error detection power.
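An error pattern goes undetected exactly when it is divisible by the generator polynomial. The following sketch (the bit patterns are arbitrary illustrations) shows a corrupted codeword that still yields a zero remainder, because the injected error is a shifted copy of the divisor:

    def gf2_remainder(bits: str, divisor: str) -> str:
        # Same GF(2) long division as in the earlier sketch.
        row = list(bits)
        for i in range(len(bits) - len(divisor) + 1):
            if row[i] == "1":
                for j, d in enumerate(divisor):
                    row[i + j] = str(int(row[i + j]) ^ int(d))
        return "".join(row[-(len(divisor) - 1):])

    divisor  = "1011"                # x^3 + x + 1
    codeword = "11010011101100100"   # message plus its 3-bit check value
    error    = "00001011000000000"   # x^9 * (x^3 + x + 1): a multiple of the divisor

    corrupted = "".join(str(int(c) ^ int(e)) for c, e in zip(codeword, error))
    print(gf2_remainder(codeword, divisor))   # -> 000
    print(gf2_remainder(corrupted, divisor))  # -> 000: the corruption is missed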

Regardless of the reducibility properties of a generator polynomial of degree r, if it includes the "+1" term, the code will be able to detect error patterns that are confined to a window of r contiguous bits.
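This burst-detection guarantee can be checked by brute force for a small generator (a sketch using the same division as above; the message length of 16 bits is arbitrary):

    def gf2_remainder(bits: str, divisor: str) -> str:
        # Same GF(2) long division as in the earlier sketches.
        row = list(bits)
        for i in range(len(bits) - len(divisor) + 1):
            if row[i] == "1":
                for j, d in enumerate(divisor):
                    row[i + j] = str(int(row[i + j]) ^ int(d))
        return "".join(row[-(len(divisor) - 1):])

    divisor, r, length = "1011", 3, 16  # generator x^3 + x + 1 includes the "+1" term

    # Every nonzero error pattern confined to a window of r contiguous bits
    # must leave a nonzero remainder, i.e. be detected.
    for start in range(length - r + 1):
        for pattern in range(1, 2 ** r):
            error = ["0"] * length
            for j in range(r):
                error[start + j] = str((pattern >> (r - 1 - j)) & 1)
            assert "1" in gf2_remainder("".join(error), divisor)

    print("all error bursts confined to 3 contiguous bits are detected")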

The concept of the CRC as an error-detecting code gets complicated when an implementer or standards committee uses it to design a practical system.

CRCs in proprietary protocols might be obfuscated by using a non-trivial initial value and a final XOR, but these techniques do not add cryptographic strength to the algorithm and can be reverse engineered using straightforward methods.
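As a sketch of how those two parameters enter the computation (the CRC-8 variant, the polynomial 0x07, and the constants below are arbitrary illustrations):

    def crc8(data: bytes, poly: int = 0x07, init: int = 0x00, xor_out: int = 0x00) -> int:
        # Bitwise msbit-first CRC-8; poly 0x07 encodes x^8 + x^2 + x + 1
        # with the leading x^8 term implicit.
        crc = init
        for byte in data:
            crc ^= byte
            for _ in range(8):
                crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
        return crc ^ xor_out

    # The same division; only the initial value and final XOR differ.
    print(hex(crc8(b"123456789")))                           # conventional parameters
    print(hex(crc8(b"123456789", init=0xFF, xor_out=0x55)))  # "obfuscated" parameters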

By no means does one algorithm, or one of each degree, suit every purpose; Koopman and Chakravarty recommend selecting a polynomial according to the application requirements and the expected distribution of message lengths.

[13] The number of distinct CRCs in use has confused developers, a situation which authors have sought to address.

Since 1993, Koopman, Castagnoli and others have surveyed the space of polynomials between 3 and 64 bits in size,[13][15][16][17] finding examples that have much better performance (in terms of Hamming distance for a given message size) than the polynomials of earlier protocols, and publishing the best of these with the aim of improving the error detection capacity of future standards.

The design of the 32-bit polynomial most commonly used by standards bodies, CRC-32-IEEE, was the result of a joint effort for the Rome Laboratory and the Air Force Electronic Systems Division by Joseph Hammond, James Brown and Shyan-Shiang Liu of the Georgia Institute of Technology and Kenneth Brayer of the Mitre Corporation.

The earliest known appearances of the 32-bit polynomial were in their 1975 publications: Technical Report 2956 by Brayer for Mitre, published in January and released for public dissemination through DTIC in August,[18] and Hammond, Brown and Liu's report for the Rome Laboratory, published in May.

CRC-32C computation is implemented in hardware as an operation (CRC32) of the SSE4.2 instruction set, first introduced in Intel processors with the Nehalem microarchitecture.