Endianness

In computing, endianness is the order in which the bytes within a word of digital data are transmitted over a data communication medium or addressed (by rising addresses) in computer memory; it relates the significance of each byte to how early it is transmitted or how low its address is.

Endianness is primarily expressed as big-endian (BE) or little-endian (LE), terms introduced by Danny Cohen into computer science for data ordering in an Internet Experiment Note published in 1980.[1]

The adjective endian has its origin in the writings of the 18th-century Anglo-Irish writer Jonathan Swift.

On most modern computers, the smallest data group with an address is eight bits long and is called a byte.

There are two possible ways a computer could number the individual bytes in a larger group, starting at either end.
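The two numberings can be sketched in Python with the built-in `int.to_bytes`. The value `0x0A0B0C0D` is only an illustrative choice, not one taken from the article.

```python
value = 0x0A0B0C0D  # illustrative 32-bit value (an assumption for this sketch)

# Most significant byte at the lowest address.
big = value.to_bytes(4, byteorder="big")
# Least significant byte at the lowest address.
little = value.to_bytes(4, byteorder="little")

print(big.hex(" "))     # 0a 0b 0c 0d
print(little.hex(" "))  # 0d 0c 0b 0a
```

Index 0 of each `bytes` object plays the role of the lowest address.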

In this article, where data types are kept simple, a "field" consists of a consecutive sequence of bytes and represents a "simple data value" which, at least potentially, can be manipulated by a single hardware instruction.

When an operation such as addition is performed, the processor begins at the low-order positions, at the high addresses of the two fields, and works its way down to the high-order positions.

In pure form this is valid for moderate-sized non-negative integers, e.g. of the C data type unsigned.

These positions can be mapped to memory mainly in two ways:[12] In these expressions, the term "end" is meant as the extremity where the big resp.

On big-endian machines, the value appears left-to-right, coinciding with the correct string order for reading the result ("J O H N").
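As a hedged illustration, the sketch below reads the four ASCII bytes of "JOHN" as a single 32-bit unsigned integer in each byte order and prints the result in hexadecimal. It uses Python's `struct` module, and the variable names are chosen here for illustration.

```python
import struct

text = b"JOHN"  # four ASCII bytes in memory order: 4A 4F 48 4E

# Interpret the same four bytes as one 32-bit unsigned integer.
as_big = struct.unpack(">I", text)[0]
as_little = struct.unpack("<I", text)[0]

print(f"{as_big:08X}")     # 4A4F484E, which reads as J O H N
print(f"{as_little:08X}")  # 4E484F4A, which reads as N H O J
```

Read as a big-endian integer, the hexadecimal digits follow the string order. Read as a little-endian integer, they appear reversed.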

Other compilers have options for generating code that globally enables the conversion for all file I/O operations.

[21] Some operations in positional number systems have a natural or preferred order in which the elementary steps are to be executed.

The implementation of these operations is marginally simpler on little-endian machines, where the first byte contains the least significant digit.
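A minimal sketch of why this is so, taking addition as the example: with little-endian storage, a loop can walk the bytes in rising address order and propagate the carry as it goes. The function name `add_little_endian` is hypothetical, and the sketch assumes equal-length unsigned fields.

```python
def add_little_endian(a: bytes, b: bytes) -> bytes:
    """Add two equal-length little-endian byte strings, dropping any final carry."""
    if len(a) != len(b):
        raise ValueError("fields must have the same length")
    result = bytearray(len(a))
    carry = 0
    for i in range(len(a)):  # index 0 holds the least significant byte
        total = a[i] + b[i] + carry
        result[i] = total & 0xFF
        carry = total >> 8
    return bytes(result)

x = (0x01FF).to_bytes(4, "little")
y = (0x0001).to_bytes(4, "little")
print(int.from_bytes(add_little_endian(x, y), "little"))  # 512
```

On a big-endian layout the same loop would have to start at the highest index and work downward.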

For fixed-length numerical values (typically of length 1, 2, 4, 8, or 16 bytes), the implementation of these operations is marginally simpler on big-endian machines.
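Comparison is one operation that starts at the most significant digit. The sketch below assumes equal-length unsigned fields and uses a hypothetical helper, `compare_big_endian`. It shows that a byte-by-byte scan in rising address order gives the numerical ordering for big-endian fields, but not for little-endian ones.

```python
def compare_big_endian(a: bytes, b: bytes) -> int:
    """Return -1, 0 or 1 for equal-length big-endian unsigned fields."""
    for byte_a, byte_b in zip(a, b):  # index 0 holds the most significant byte
        if byte_a != byte_b:
            return -1 if byte_a < byte_b else 1
    return 0

small_be, large_be = (258).to_bytes(4, "big"), (513).to_bytes(4, "big")
small_le, large_le = (258).to_bytes(4, "little"), (513).to_bytes(4, "little")

print(compare_big_endian(small_be, large_be))  # -1: matches 258 < 513
print(compare_big_endian(small_le, large_le))  # 1: byte order misleads the scan
```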

Some big-endian processors (e.g. the IBM System/360 and its successors) contain hardware instructions for lexicographically comparing varying-length character strings.

The normal data transport by an assignment statement is in principle independent of the endianness of the processor.

Many historical and extant processors use a big-endian memory representation, either exclusively or as a design option.

[22][23] The MOS Technology 6502 family (including Western Design Center 65802 and 65C816), the Zilog Z80 (including Z180 and eZ80), the Altera Nios II, the Atmel AVR, the Andes Technology NDS32, the Qualcomm Hexagon, and many other processors and processor families are also little-endian.

Architectures that support switchable endianness include PowerPC/Power ISA, SPARC V9, ARM versions 3 and above, DEC Alpha, MIPS, Intel i860, PA-RISC, SuperH SH-4, IA-64, C-Sky, and RISC-V.

The word bi-endian, when said of hardware, denotes the capability of the machine to compute or pass data in either endian format.

Many of these architectures can be switched via software to default to a specific endian format (usually done when the computer starts up); however, on some systems, the default endianness is selected by hardware on the motherboard and cannot be changed via software (e.g. Alpha, which runs only in big-endian mode on the Cray T3E).
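Whichever way the mode is selected, software can check which byte order it is actually running under. A minimal Python sketch follows; the helper name `host_byte_order` is hypothetical, and `sys.byteorder` is the standard library's own report of the same fact.

```python
import struct
import sys

def host_byte_order() -> str:
    """Return "little" or "big" by inspecting how the host stores the integer 1."""
    # "=" selects the native byte order with standard sizes.
    first_byte = struct.pack("=H", 1)[0]
    return "little" if first_byte == 1 else "big"

print(host_byte_order(), sys.byteorder)
```

The two printed values should agree on any host.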

In the absence of this unusual motherboard hardware, device driver software must write to different addresses to undo the incomplete transformation and also must perform a normal byte swap.
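The "normal byte swap" mentioned here can be sketched for a 32-bit value as follows. This covers only the swap, not the address adjustment, and the function name `bswap32` is an illustrative choice.

```python
def bswap32(value: int) -> int:
    """Reverse the four bytes of a 32-bit unsigned integer."""
    return (
        ((value & 0x000000FF) << 24)
        | ((value & 0x0000FF00) << 8)
        | ((value & 0x00FF0000) >> 8)
        | ((value & 0xFF000000) >> 24)
    )

print(f"{bswap32(0x0A0B0C0D):08X}")  # 0D0C0B0A
```

Applying the swap twice returns the original value.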

[30] Theoretically, this means that even standard IEEE floating-point data written by one machine might not be readable by another.

However, on modern standard computers (i.e., implementing IEEE 754), one may safely assume that the endianness is the same for floating-point numbers as for integers, making the conversion straightforward regardless of data type.
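Under that assumption, converting a floating-point value between byte orders is the same byte reversal as for an integer of the same width. The sketch below uses Python's `struct` module to pack the single-precision value 1.0 (bit pattern `0x3F800000`) in both orders.

```python
import struct

x = 1.0
le = struct.pack("<f", x)  # IEEE 754 binary32, little-endian
be = struct.pack(">f", x)  # same value, big-endian

print(le.hex(" "))  # 00 00 80 3f
print(be.hex(" "))  # 3f 80 00 00

# The bytes match those of the 32-bit integer with the same bit pattern.
print(le == struct.pack("<I", 0x3F800000))  # True
```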

Machines able to manipulate such data with one instruction (e.g. compare, add) include the IBM 1401, 1410, 1620, System/360, System/370, ESA/390, and z/Architecture, all of which are big-endian.

[32] A way to interpret this endianness is that it stores a 32-bit integer as two little-endian 16-bit words, with a big-endian word ordering.

Segment descriptors of IA-32 and compatible processors keep a 32-bit base address of the segment stored in little-endian order, but in four nonconsecutive bytes, at offsets 2, 3, 4 and 7 from the start of the descriptor.
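Both layouts described above can be sketched in a few lines of Python. The function names and the example descriptor are hypothetical; the descriptor has only its base-address bytes filled in, and all other fields are left as zero.

```python
def to_mixed_endian32(value: int) -> bytes:
    """Store a 32-bit value as two little-endian 16-bit words, high word first."""
    high = (value >> 16) & 0xFFFF
    low = value & 0xFFFF
    return high.to_bytes(2, "little") + low.to_bytes(2, "little")

def descriptor_base(descriptor: bytes) -> int:
    """Assemble the base address from bytes 2, 3, 4 and 7 of an 8-byte descriptor."""
    return (
        descriptor[2]
        | (descriptor[3] << 8)
        | (descriptor[4] << 16)
        | (descriptor[7] << 24)
    )

print(to_mixed_endian32(0x0A0B0C0D).hex(" "))  # 0b 0a 0d 0c

# Hypothetical descriptor: only the base-address bytes are non-zero.
example = bytes([0, 0, 0x78, 0x56, 0x34, 0, 0, 0x12])
print(f"{descriptor_base(example):08X}")  # 12345678
```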

An attempt to read such a file using Fortran on a system of the other endianness results in a run-time error, because the count fields are incorrect.
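The failure can be imitated in Python. The sketch assumes a record layout in which each payload is framed by a leading and a trailing 4-byte count; actual marker sizes and layouts are compiler-specific. The helper names are hypothetical.

```python
import struct

def write_record(payload: bytes, order: str) -> bytes:
    """Frame a payload with leading and trailing 4-byte counts (assumed layout)."""
    marker = struct.pack(order + "I", len(payload))
    return marker + payload + marker

def read_record(data: bytes, order: str) -> bytes:
    """Read one record; raise ValueError if the count fields are implausible."""
    (count,) = struct.unpack_from(order + "I", data, 0)
    if count + 8 > len(data):
        raise ValueError(f"record length {count} exceeds available data")
    payload = data[4:4 + count]
    (trailer,) = struct.unpack_from(order + "I", data, 4 + count)
    if trailer != count:
        raise ValueError("leading and trailing counts disagree")
    return payload

# Three doubles written by a little-endian producer.
record = write_record(struct.pack("<3d", 1.0, 2.0, 3.0), "<")
print(len(read_record(record, "<")))  # 24
try:
    read_record(record, ">")  # reader assumes the other byte order
except ValueError as err:
    print("error:", err)
```

Read with the wrong byte order, the 24-byte count is taken as 0x18000000, far larger than the file, so the read is rejected before any payload is interpreted.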

Unicode text can optionally start with a byte order mark (BOM) to signal the endianness of the file or stream.
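A minimal sketch of BOM detection, restricted to UTF-16 for brevity, using the constants in Python's `codecs` module. The function name `detect_utf16_bom` is hypothetical. A fuller detector would also need to check for UTF-32, whose little-endian BOM begins with the same two bytes as the UTF-16 one.

```python
import codecs

def detect_utf16_bom(data: bytes):
    """Return "little", "big", or None if no UTF-16 byte order mark is present."""
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "little"
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "big"
    return None

print(detect_utf16_bom("hi".encode("utf-16-le")))  # None: this codec writes no BOM
print(detect_utf16_bom(codecs.BOM_UTF16_BE + "hi".encode("utf-16-be")))  # big
```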

[34] TIFF image files are an example of the second strategy: their header tells the application the endianness of their internal binary integers.
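For classic TIFF, the header begins with the two bytes "II" (little-endian) or "MM" (big-endian), followed by the number 42 stored in the indicated byte order. The sketch below checks exactly those four bytes. The function name `tiff_byte_order` is hypothetical, and variants such as BigTIFF are not handled.

```python
import struct

def tiff_byte_order(header: bytes) -> str:
    """Return "little" or "big" from the first four bytes of a classic TIFF file."""
    mark = header[:2]
    if mark == b"II":
        order, prefix = "little", "<"
    elif mark == b"MM":
        order, prefix = "big", ">"
    else:
        raise ValueError("not a TIFF byte-order mark")
    (magic,) = struct.unpack(prefix + "H", header[2:4])
    if magic != 42:
        raise ValueError("unexpected TIFF magic number")
    return order

print(tiff_byte_order(b"II*\x00"))  # little
print(tiff_byte_order(b"MM\x00*"))  # big
```

A reader would then use the returned order when decoding every later integer in the file.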

ZFS, which combines a filesystem and a logical volume manager, is known to provide adaptive endianness and to work with both big-endian and little-endian systems.