x86 assembly language

The x86 assembly languages provide backward compatibility with CPUs dating back to the Intel 8008 microprocessor, introduced in April 1972.[1][2]

As assembly languages, they are closely tied to the architecture's machine code instructions, allowing for precise control over hardware.

[3] Each mnemonic corresponds to a basic operation performed by the processor, such as arithmetic calculations, data movement, or control flow decisions.

Typical applications of x86 assembly include real-time embedded systems, operating-system kernels, and device drivers, all of which may require direct manipulation of hardware resources.

This allows for optimization at the assembly level before producing the final machine code that the processor executes.

[10] The AT&T syntax is nearly universal across other architectures (retaining the same operand order for the mov instruction); it was originally designed for PDP-11 assembly.

[11] x86 processors feature a set of registers that serve as storage for binary data and addresses during program execution.

The original IBM PC restricted programs to 640 KB, but an expanded memory specification was used to implement a bank-switching scheme. That scheme fell out of use when later operating systems, such as Windows, used the larger address ranges of newer processors and implemented their own virtual memory schemes.

To access the extended functionality of the 80286, the operating system would set the processor into protected mode, enabling 24-bit addressing and thus 2^24 bytes of memory (16 megabytes).
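The 16-megabyte figure follows directly from the address width. The short Python sketch below checks the arithmetic for a 24-bit address and, for comparison, for a 32-bit address; the helper name `address_space_bytes` is purely illustrative and is not part of any x86 tool.

```python
def address_space_bytes(address_bits: int) -> int:
    """Return how many distinct byte addresses an address of this width can name."""
    return 2 ** address_bits


MIB = 1024 ** 2  # "megabyte" in the binary sense used in the text
GIB = 1024 ** 3  # "gigabyte" in the same binary sense

protected_286 = address_space_bytes(24)  # 80286 protected mode: 24-bit addresses
flat_32 = address_space_bytes(32)        # 32-bit addresses

print(protected_286, "bytes =", protected_286 // MIB, "MB")
print(flat_32, "bytes =", flat_32 // GIB, "GB")
```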

The 32-bit flat memory model of the 80386's extended protected mode may be the most important feature change for the x86 processor family until AMD released x86-64 in 2003. It helped drive large-scale adoption of Windows 3.1 (which relied on protected mode), since Windows could now run many applications at once, including DOS applications, by using virtual memory and simple multitasking.

The instruction set is similar in each mode but memory addressing and word size vary, requiring different programming strategies.

The 64-bit operating system kernel checks and switches the CPU into long mode and then starts new kernel-mode threads running 64-bit code.

In general, the features of the modern x86 instruction set are:

The x86 architecture has hardware support for an execution stack mechanism.

Instructions such as push, pop, call and ret are used with a properly set-up stack to pass parameters, to allocate space for local data, and to save and restore call-return points.

When setting up a stack frame to hold local data of a recursive procedure, there are several choices; the high-level enter instruction (introduced with the 80186) takes a procedure-nesting-depth argument as well as a local-size argument, and may be faster than more explicit manipulation of the registers (such as push bp ; mov bp, sp ; sub sp, size).

Whether it is faster or slower depends on the particular x86 processor implementation as well as the calling convention used by the compiler, programmer or particular program code. Most x86 code is intended to run on x86 processors from several manufacturers and of different technological generations, which implies highly varying microarchitectures and microcode solutions as well as varying gate- and transistor-level design choices.

The full range of addressing modes (including immediate and base+offset), even for instructions such as push and pop, makes direct usage of the stack for integer, floating-point and address data simple, and keeps the ABI specifications and mechanisms relatively simple compared to some RISC architectures (which require more explicit call stack details).
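The explicit frame set-up sequence mentioned above (push bp ; mov bp, sp ; sub sp, size) can be followed step by step in a small model. The Python sketch below is only an illustration: the class `Cpu16` and the functions `prologue` and `epilogue` are hypothetical names, the stack is modelled as a dictionary of 16-bit words, and the epilogue (mov sp, bp ; pop bp) is assumed as the conventional counterpart rather than taken from the text.

```python
class Cpu16:
    """Toy model holding only SP, BP and a sparse stack memory (illustrative)."""

    def __init__(self, sp=0x1000, bp=0x0000):
        self.sp = sp
        self.bp = bp
        self.mem = {}  # address -> 16-bit word

    def push(self, value):
        # push: decrement SP by one word, then store at the new top of stack
        self.sp = (self.sp - 2) & 0xFFFF
        self.mem[self.sp] = value & 0xFFFF

    def pop(self):
        # pop: read the word at the top of stack, then increment SP
        value = self.mem[self.sp]
        self.sp = (self.sp + 2) & 0xFFFF
        return value


def prologue(cpu, local_size):
    cpu.push(cpu.bp)                         # push bp
    cpu.bp = cpu.sp                          # mov bp, sp
    cpu.sp = (cpu.sp - local_size) & 0xFFFF  # sub sp, size


def epilogue(cpu):
    cpu.sp = cpu.bp     # mov sp, bp  (assumed conventional teardown)
    cpu.bp = cpu.pop()  # pop bp


cpu = Cpu16(sp=0x1000, bp=0x2000)
prologue(cpu, 8)            # reserve 8 bytes of local data
frame = (cpu.sp, cpu.bp)    # locals live between SP and BP
epilogue(cpu)
restored = (cpu.sp, cpu.bp)
print([hex(v) for v in frame], [hex(v) for v in restored])
```

In this model the caller's bp is saved just above the local area, so a recursive call can build its own frame below the current one without disturbing it.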

Various instruction technologies support different operations on different register sets, but taken as a complete whole (from MMX to SSE4.2) they include general integer and floating-point computations (addition, subtraction, multiplication, shift, minimization, maximization, comparison, division or square root).

Streaming SIMD Extensions or SSE also includes a floating-point mode in which only the very first value of the registers is actually modified (expanded in SSE2).
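The difference between the packed and the scalar floating-point mode can be illustrated with a simple model. In the Python sketch below a register is represented as a list of four single-precision lanes; the function names are illustrative, and the instructions named in the comments (addps and addss) are chosen here as examples and are not mentioned in the text above.

```python
def packed_add(dst, src):
    """Model of a packed add (as in addps): every lane of dst is updated."""
    return [x + y for x, y in zip(dst, src)]


def scalar_add(dst, src):
    """Model of a scalar add (as in addss): only lane 0 changes, the rest of dst is kept."""
    return [dst[0] + src[0]] + list(dst[1:])


dst = [1.0, 2.0, 3.0, 4.0]
src = [10.0, 20.0, 30.0, 40.0]

print(packed_add(dst, src))  # all four lanes added
print(scalar_add(dst, src))  # only the first lane added
```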

extensions include addition and subtraction instructions for treating paired floating-point values like complex numbers.

For example, one can encode mov eax, [Table + ebx + esi*4] as a single instruction which loads 32 bits of data from the address computed as (Table + ebx + esi * 4) offset from the ds selector, and stores it in the eax register.
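The address calculation in this example can be reproduced outside the processor. The following Python sketch is a simplified model: it assumes a flat data segment with base 0 (so the ds selector contributes nothing), represents memory as a `bytearray`, and uses illustrative names (`effective_address`, `load32`, `TABLE`) and example register values that do not come from the text.

```python
def effective_address(displacement, base, index, scale):
    """Compute displacement + base + index*scale, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factor must be 1, 2, 4 or 8")
    return (displacement + base + index * scale) & 0xFFFFFFFF


def load32(memory, address):
    """Read a 32-bit little-endian value, as a 32-bit mov load would."""
    return int.from_bytes(memory[address:address + 4], "little")


memory = bytearray(0x2000)  # small model of the data segment
TABLE = 0x1000              # hypothetical address of the label Table
ebx, esi = 0x20, 3          # example register contents

addr = effective_address(TABLE, base=ebx, index=esi, scale=4)
memory[addr:addr + 4] = (0xDEADBEEF).to_bytes(4, "little")

eax = load32(memory, addr)  # models: mov eax, [Table + ebx + esi*4]
print(hex(addr), hex(eax))
```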

[21][22] The x86 instruction set includes string load, store, move, scan and compare instructions (lods, stos, movs, scas and cmps), which perform each operation on an element of a specified size (b for 8-bit byte, w for 16-bit word, d for 32-bit double word) and then increment or decrement (depending on DF, the direction flag) the implicit address register (si for lods, di for stos and scas, and both for movs and cmps).
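As an illustration of the implicit register updates, the Python sketch below models a single movs step. It is a simplification under stated assumptions: segment registers are ignored, memory is a `bytearray`, si and di are treated as 16-bit values, and the function name `movs` with its parameters is chosen here for readability only.

```python
SIZES = {"b": 1, "w": 2, "d": 4}  # element size in bytes for each suffix


def movs(memory, si, di, suffix, df=0):
    """Copy one element from [si] to [di], then step both registers.

    Returns the updated (si, di). With df=0 the registers are incremented,
    with df=1 they are decremented, in both cases by the element size.
    """
    n = SIZES[suffix]
    memory[di:di + n] = memory[si:si + n]
    step = -n if df else n
    return (si + step) & 0xFFFF, (di + step) & 0xFFFF


memory = bytearray(b"ABCD" + bytes(4))
si, di = 0, 4

si, di = movs(memory, si, di, "w", df=0)  # like movsw with DF clear
si, di = movs(memory, si, di, "b", df=1)  # like movsb with DF set
print(bytes(memory), si, di)
```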

The stack pointer is decremented when items are added (‘push’) and incremented when items are removed (‘pop’).

The cmp (compare) and test instructions set the flags as if they had performed a subtraction or a bitwise AND operation, respectively, without altering the values of the operands.
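The flag results of the two instructions can be modelled in a few lines. The Python sketch below assumes 32-bit operands and computes only the zero, sign, carry and overflow flags (the parity and auxiliary-carry flags are omitted); the function names `flags_after_cmp` and `flags_after_test` are illustrative.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def flags_after_cmp(a, b):
    """Flags as set by cmp a, b: computed from a - b, which is then discarded."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "ZF": result == 0,                            # operands equal
        "SF": bool(result & SIGN32),                  # result negative when read as signed
        "CF": a < b,                                  # unsigned borrow
        "OF": bool((a ^ b) & (a ^ result) & SIGN32),  # signed overflow
    }


def flags_after_test(a, b):
    """Flags as set by test a, b: computed from a AND b; CF and OF are cleared."""
    result = a & b & MASK32
    return {
        "ZF": result == 0,
        "SF": bool(result & SIGN32),
        "CF": False,
        "OF": False,
    }


print(flags_after_cmp(3, 5))
print(flags_after_test(0b1010, 0b0101))
```

A conditional jump placed after either instruction would then consult these flags; for example, an unsigned "below" condition corresponds to CF being set after cmp.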

Floating-point comparisons are performed with the fcom or ficom instructions, whose results eventually have to be converted to integer flags.

Hard interrupts are triggered by external hardware events, and must preserve all register values as the state of the currently executing program is unknown.

(Note: There is also an alternative AT&T-syntax flavor in which the order of source and destination operands is swapped, among many other differences.)[23]

Using the software interrupt 21h instruction to call the MS-DOS operating system for output to the display – other samples use libc's C printf() routine to write to stdout.

But this is a static executable because we linked using ld without -pie or any shared libraries; the only instructions that run in user-space are the ones you provide.

The x87 floating-point maths subsystem also has its own independent ‘flags’-type register, the FP status word.