Alternatively, one can pack the structure, omitting the padding, which may lead to slower access, but uses three quarters as much memory.
Fortran, Ada,[1][2] PL/I,[3] Pascal,[4] certain C and C++ implementations, D,[5] Rust,[6] C#,[7] and assembly language allow at least partial control of data structure padding, which may be useful in certain special circumstances.
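The trade-off can be illustrated with a minimal C sketch. It assumes a 16-bit field followed by a 32-bit field and uses `#pragma pack`, a compiler extension supported by GCC, Clang and Visual C++ rather than part of ISO C. The structure and member names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Natural layout: the compiler typically inserts 2 bytes of padding
   after `tag` so that `value` starts on a 4-byte boundary. */
struct Natural {
    uint16_t tag;
    uint32_t value;
};

/* Packed layout: pack(1) asks the compiler to omit the padding.
   This is a compiler extension, not ISO C. */
#pragma pack(push, 1)
struct Packed {
    uint16_t tag;
    uint32_t value;
};
#pragma pack(pop)

/* The packed form should never be larger than the natural one. */
static_assert(sizeof(struct Packed) <= sizeof(struct Natural),
              "packing must not grow the structure");
```

On common 32-bit and 64-bit ABIs the natural structure occupies 8 bytes and the packed one 6, which is the three-quarters ratio mentioned above. Accessing `value` in the packed structure may be slower or, on some targets, require special handling.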
When n is a power of two, an n-byte aligned address has at least log2(n) least-significant zero bits when expressed in binary.
Note that the definitions above assume that each primitive datum is a power of two bytes long.
When this is not the case (as with 80-bit floating-point on x86), whether the datum is considered aligned depends on the context.
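The trailing-zeros property of aligned addresses can be checked directly. The following is a minimal sketch, assuming n is a power of two. The helper names `is_power_of_two`, `is_aligned` and `trailing_zeros` are hypothetical and not taken from any library.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* True if n is a nonzero power of two. */
static bool is_power_of_two(uintptr_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

/* True if addr is a multiple of n, where n is a power of two.
   Equivalent to: the low log2(n) bits of addr are all zero. */
static bool is_aligned(uintptr_t addr, uintptr_t n) {
    assert(is_power_of_two(n));
    return (addr & (n - 1)) == 0;
}

/* Number of trailing zero bits in addr.
   Defined here as the full bit width when addr is 0. */
static unsigned trailing_zeros(uintptr_t addr) {
    unsigned count = 0;
    if (addr == 0) {
        return (unsigned)(sizeof addr * CHAR_BIT);
    }
    while ((addr & 1) == 0) {
        addr >>= 1;
        ++count;
    }
    return count;
}
```

For instance, `is_aligned(0x5a0, 4)` holds because 0x5a0 ends in five zero bits, which is at least log2(4) = 2, while `is_aligned(0x59d, 4)` does not.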
This requires a lot of complex circuitry to generate the memory accesses and coordinate them.
Some processor designs deliberately avoid introducing such complexity, and instead yield alternative behavior in the event of a misaligned memory access.
For example, implementations of the ARM architecture prior to the ARMv6 ISA require aligned memory access for all multi-byte load and store instructions.[8]
Depending on which specific instruction was issued, the result of an attempted misaligned access might be to round down the offending address by clearing its least significant bits, turning it into an aligned access (sometimes with additional caveats), or to throw an MMU exception (if MMU hardware is present), or to silently yield other potentially unpredictable results.
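Because the outcome of a misaligned access varies by processor, portable C code usually avoids dereferencing a misaligned pointer at all. One common idiom, shown here as a sketch and not described in the text above, copies the bytes with `memcpy`, which has no alignment requirement. The function name `load_u32_unaligned` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit value, in native byte order, from an address that may
   be misaligned. memcpy copies byte by byte as far as the language is
   concerned; compilers typically lower it to whatever load sequence is
   valid on the target. */
static uint32_t load_u32_unaligned(const unsigned char *p) {
    uint32_t value;
    assert(p != NULL);
    memcpy(&value, p, sizeof value);
    return value;
}
```

The value returned reflects the byte order of the machine running the code, so the sketch is suitable for reading back data written on the same system.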
When a single memory word is accessed the operation is atomic, i.e. the whole memory word is read or written at once and other devices must wait until the read or write operation completes before they can access it.
Although C and C++ do not allow the compiler to reorder structure members to save space, other languages might.
Likewise, in PL/I a structure may be declared UNALIGNED to eliminate all padding except around bit strings.
The following formulas provide the number of padding bytes required to align the start of a data structure (where mod is the modulo operator):

padding = (align - (offset mod align)) mod align
aligned = offset + padding

For example, the padding to add to offset 0x59d for a 4-byte aligned structure is 3.
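The padding computation can be sketched in C as follows. The block computes padding = (align - (offset mod align)) mod align and, for comparison, the bitwise equivalents that apply when the alignment is a power of two. The function names are hypothetical, and offsets are assumed to be unsigned.

```c
#include <assert.h>
#include <stdint.h>

/* padding = (align - (offset mod align)) mod align
   Works for any nonzero alignment. */
static uintptr_t padding_mod(uintptr_t offset, uintptr_t align) {
    assert(align != 0);
    return (align - (offset % align)) % align;
}

/* Same result using bitwise operations: -offset & (align - 1).
   Valid only when align is a power of two. */
static uintptr_t padding_bits(uintptr_t offset, uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (0 - offset) & (align - 1);
}

/* Rounds offset up to the next multiple of a power-of-two alignment. */
static uintptr_t align_up(uintptr_t offset, uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + (align - 1)) & ~(align - 1);
}
```

With offset 0x59d and a 4-byte alignment, both padding functions return 3 and `align_up` returns 0x5a0. An offset that is already aligned yields a padding of 0 and is returned unchanged.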
The following formulas produce the correct values (where & is a bitwise AND and ~ is a bitwise NOT), provided the offset is unsigned or the system uses two's complement arithmetic:

padding = -offset & (align - 1)
aligned = (offset + (align - 1)) & ~(align - 1)

Data structure members are stored sequentially in memory, so that in a structure declaring the members Data1, Data2 and Data3 in that order, Data1 will always precede Data2, and Data2 will always precede Data3.

If each member has the type "short" and that type is stored in two bytes of memory, then each member of the structure would be 2-byte aligned.
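A minimal sketch of such a structure is given here, with compile-time checks of the ordering. The structure name `MyData` is an assumption. The member names come from the text.

```c
#include <assert.h>
#include <stddef.h>

struct MyData {
    short Data1;
    short Data2;
    short Data3;
};

/* Members are laid out in declaration order, so offsets strictly increase. */
static_assert(offsetof(struct MyData, Data1) == 0,
              "the first member starts at offset 0");
static_assert(offsetof(struct MyData, Data1) < offsetof(struct MyData, Data2),
              "Data1 precedes Data2");
static_assert(offsetof(struct MyData, Data2) < offsetof(struct MyData, Data3),
              "Data2 precedes Data3");
```

Where `short` occupies two bytes, the structure typically needs no padding and its size is three times `sizeof(short)`.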
The following typical alignments are valid for compilers from Microsoft (Visual C++), Borland/CodeGear (C++Builder), Digital Mars (DMC), and GNU (GCC) when compiling for 32-bit x86:

The only notable differences in alignment for an LP64 64-bit system when compared to a 32-bit system are:

Some data types are dependent on the implementation.
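Because these values vary with the compiler and target, one way to inspect them is to query them directly. The sketch below uses the C11 `alignof` macro from `<stdalign.h>`. The `SHOW` macro and the function `print_alignments` are hypothetical names, and the output depends entirely on the platform.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>

/* Prints the size and alignment of one type, as defined by the
   compiler and target ABI in use. */
#define SHOW(out, type) \
    fprintf((out), "%-12s size %2zu  align %2zu\n", \
            #type, sizeof(type), alignof(type))

/* char always has size 1 and alignment 1; the rest is ABI-specific. */
static_assert(alignof(char) == 1, "char is 1-byte aligned");

static void print_alignments(FILE *out) {
    SHOW(out, char);
    SHOW(out, short);
    SHOW(out, int);
    SHOW(out, long);
    SHOW(out, long long);
    SHOW(out, float);
    SHOW(out, double);
    SHOW(out, long double);
    SHOW(out, void *);
}
```

Calling `print_alignments(stdout)` on a 32-bit x86 build and on an LP64 build should show the kinds of differences described above, for example in `long` and in pointers.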
Note that Padding1[1] has been replaced (and thus eliminated) by Data4 and Padding2[3] is no longer necessary as the structure is already aligned to the size of a long word.
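The effect of reordering can be sketched as follows. The member types are an assumption inferred from the padding sizes mentioned above (a `char`, a `short`, an `int` and a second `char`), with a 2-byte `short` and a 4-byte `int`. The name `MixedDataReordered` is hypothetical.

```c
#include <assert.h>

struct MixedData {           /* members in their original order */
    char  Data1;             /* typically followed by 1 byte of padding */
    short Data2;
    int   Data3;
    char  Data4;             /* typically followed by 3 bytes of tail padding */
};

struct MixedDataReordered {  /* same members, sorted to close the gaps */
    char  Data1;
    char  Data4;             /* occupies the byte that was padding */
    short Data2;
    int   Data3;             /* structure already ends on a 4-byte boundary */
};

static_assert(sizeof(struct MixedDataReordered) <= sizeof(struct MixedData),
              "reordering should not grow the structure");
```

On common ABIs the first layout occupies 12 bytes and the reordered one 8. The programmer has to perform this reordering by hand, since a C or C++ compiler will not do it.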
If an array is partitioned for more than one thread to operate on, having the sub-array boundaries unaligned to cache lines could lead to performance degradation.
The following example allocates memory (an array of 10 doubles) aligned to a 64-byte cache line.
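One possible way to do this is sketched below, assuming a C11 library that provides `aligned_alloc` (POSIX systems also offer `posix_memalign`). The 64-byte cache-line size and the function name `alloc_cache_aligned_doubles` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum { CACHE_LINE = 64 };  /* assumed cache-line size in bytes */

static_assert((CACHE_LINE & (CACHE_LINE - 1)) == 0,
              "the rounding below needs a power-of-two line size");

/* Allocates `count` doubles starting on a cache-line boundary, or returns
   NULL on failure. C11 aligned_alloc expects the size to be a multiple of
   the alignment, so the byte count is rounded up. Release with free(). */
static double *alloc_cache_aligned_doubles(size_t count) {
    if (count == 0 || count > (SIZE_MAX - CACHE_LINE) / sizeof(double)) {
        return NULL;  /* nothing to allocate, or the size would overflow */
    }
    size_t bytes = count * sizeof(double);
    size_t rounded = (bytes + (CACHE_LINE - 1)) & ~(size_t)(CACHE_LINE - 1);
    return aligned_alloc(CACHE_LINE, rounded);
}
```

A call such as `alloc_cache_aligned_doubles(10)` requests 80 bytes, which the sketch rounds up to 128 so that the allocation covers whole cache lines.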
where aligntonext(p, r) works by adding an aligned increment, then clearing the r least significant bits of p. A possible implementation is
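As a hedged sketch of such an implementation, the functions below treat p as an unsigned address value and r as a bit count, so the alignment is 2^r bytes. The helper `alignto` and the choice of `uintptr_t` are assumptions. Only the name `aligntonext` and its described behaviour come from the text.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Clears the r least significant bits of p, rounding down to a
   multiple of 2^r. */
static uintptr_t alignto(uintptr_t p, unsigned r) {
    assert(r < sizeof(uintptr_t) * CHAR_BIT);
    return (p >> r) << r;
}

/* Rounds p up to the next multiple of 2^r: add (2^r - 1), then clear
   the r least significant bits. An already aligned p is unchanged. */
static uintptr_t aligntonext(uintptr_t p, unsigned r) {
    assert(r < sizeof(uintptr_t) * CHAR_BIT);
    return alignto(p + (((uintptr_t)1 << r) - 1), r);
}
```

For example, `aligntonext(0x59d, 2)` gives 0x5a0, and `aligntonext(65, 6)` gives 128, the next 64-byte boundary.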