It originated as an extension of the 8086 instruction set in the form of optional floating-point coprocessors that work in tandem with corresponding x86 CPUs.
Like other extensions to the basic instruction set, x87 instructions are not strictly needed to construct working programs, but provide hardware and microcode implementations of common numerical tasks, allowing these tasks to be performed much faster than corresponding machine code routines can.
Before x87 instructions were standard in PCs, compilers or programmers had to use rather slow library calls to perform floating-point operations, a method that is still common in (low-cost) embedded systems.
There are instructions to push, calculate, and pop values on top of this stack; unary operations (FSQRT, FPTAN, etc.) implicitly address the topmost register, ST(0).
The non-strict stack model also allows binary operations to use ST(0) together with a direct memory operand or with an explicitly specified stack register, ST(x), in a role similar to a traditional accumulator (a combined destination and left operand).
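The accumulator-like role of ST(0) can be illustrated with a toy model. The following is a minimal sketch in Python, not a description of real hardware: the class `X87Stack`, its method names, and the helper `hypot_demo` are hypothetical, and the model ignores tag words, status flags, exceptions, and extended precision (it uses ordinary Python floats).

```python
import math


class X87Stack:
    """Toy model of the eight-entry x87 register stack (values only)."""

    def __init__(self):
        self.regs = []  # regs[-1] plays the role of ST(0)

    def st(self, i):
        """Return the value in ST(i), counted down from the top of the stack."""
        return self.regs[-1 - i]

    def fld(self, value):
        """Push a value; the previous ST(0) becomes ST(1)."""
        if len(self.regs) == 8:
            # Simplification: real hardware signals a stack fault instead.
            raise OverflowError("the register stack holds at most eight values")
        self.regs.append(float(value))

    def fadd_st(self, i):
        """Model FADD ST(0), ST(i): ST(0) is destination and left operand."""
        self.regs[-1] = self.st(0) + self.st(i)

    def fmul_mem(self, operand):
        """Model FMUL with a memory operand: ST(0) = ST(0) * operand."""
        self.regs[-1] = self.st(0) * float(operand)

    def fsqrt(self):
        """Model the unary FSQRT, which implicitly operates on ST(0)."""
        self.regs[-1] = math.sqrt(self.st(0))

    def fstp(self):
        """Store-and-pop: remove ST(0) and return it."""
        return self.regs.pop()


def hypot_demo(a, b):
    """Compute sqrt(a*a + b*b) using only the stack operations above."""
    fpu = X87Stack()
    fpu.fld(a)
    fpu.fmul_mem(a)   # ST(0) = a*a
    fpu.fld(b)
    fpu.fmul_mem(b)   # ST(0) = b*b, ST(1) = a*a
    fpu.fadd_st(1)    # ST(0) = b*b + a*a, using an explicit stack register
    fpu.fsqrt()
    return fpu.fstp()
```

In this sketch, every binary operation writes its result back to ST(0), while the second operand comes either from "memory" (a plain argument) or from another stack register selected by index.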
[2][3] The x87 provides single-precision, double-precision and 80-bit double-extended precision binary floating-point arithmetic as per the IEEE 754-1985 standard.
Because the x87 by default carries intermediate results in the 80-bit double-extended format, a given sequence of arithmetic operations may behave slightly differently than on a strict single-precision or double-precision IEEE 754 FPU.
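One way such a difference can arise is sketched below in Python. The example uses exact rational arithmetic to imitate rounding each intermediate result to a significand of a chosen width (53 bits for double precision, 64 bits for the x87 extended format). The helper names `round_to_precision` and `add_then_sub` are hypothetical, and the model deliberately ignores exponent range limits, double rounding on stores, and compiler behaviour.

```python
from fractions import Fraction


def round_to_precision(x, bits):
    """Round x to `bits` significant binary digits, ties to even.

    The exponent range is treated as unlimited (an assumption of this sketch).
    """
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    x = abs(x)
    # Choose e so that 2**(bits-1) <= x / 2**e < 2**bits.
    e = x.numerator.bit_length() - x.denominator.bit_length() - bits
    scaled = x / Fraction(2) ** e
    while scaled >= 2 ** bits:
        e += 1
        scaled /= 2
    while scaled < 2 ** (bits - 1):
        e -= 1
        scaled *= 2
    n = round(scaled)  # round() on a Fraction rounds halves to even
    return sign * n * Fraction(2) ** e


def add_then_sub(bits):
    """Evaluate (1 + 2**-60) - 1, rounding after each operation."""
    one = Fraction(1)
    tiny = Fraction(1, 2 ** 60)
    total = round_to_precision(one + tiny, bits)
    return round_to_precision(total - one, bits)
```

With a 53-bit significand the small addend is lost when the sum is rounded, so `add_then_sub(53)` yields zero; with a 64-bit significand the sum is exactly representable and `add_then_sub(64)` recovers 2^-60.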
Since the introduction of SSE2, the x87 instructions are not as essential as they once were, but remain important as a high-precision scalar unit for numerical calculations sensitive to round-off error and requiring the 64-bit mantissa precision and extended range available in the 80-bit format.
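The layout of the 80-bit format can be made concrete with a short decoder. The sketch below, in Python, assumes the usual little-endian memory image: a 64-bit significand with an explicit integer bit, followed by a 16-bit word holding the sign and a 15-bit exponent biased by 16383. The function name `decode_x87_extended` is hypothetical, and infinities and NaNs are rejected rather than decoded.

```python
import struct
from fractions import Fraction

EXPONENT_BIAS = 16383  # bias of the 15-bit exponent field


def decode_x87_extended(raw):
    """Decode 10 little-endian bytes of x87 extended format to an exact Fraction."""
    if len(raw) != 10:
        raise ValueError("expected exactly 10 bytes")
    # "<QH": 64-bit significand first, then the sign/exponent word.
    significand, sign_exp = struct.unpack("<QH", raw)
    sign = -1 if sign_exp & 0x8000 else 1
    exponent = sign_exp & 0x7FFF
    if exponent == 0x7FFF:
        raise ValueError("infinity or NaN is not handled in this sketch")
    if exponent == 0:
        exponent = 1  # zero and denormals use the minimum exponent
    # The integer bit is bit 63, so the significand is scaled by 2**-63.
    return sign * significand * Fraction(2) ** (exponent - EXPONENT_BIAS - 63)
```

For instance, the byte image with significand 0x8000000000000000 and sign/exponent word 0x3FFF decodes to exactly 1, and the largest finite encoding (exponent field 0x7FFE) decodes to a value just below 2^16384, far beyond the double-precision range.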
Clock cycle counts for examples of typical x87 FPU instructions (only register-register versions shown here).[5]
The A...B notation (minimum to maximum) covers timing variations dependent on transient pipeline status and the arithmetic precision chosen (32, 64 or 80 bits); it also includes variations due to numerical cases (such as the number of set bits, zero, etc.).
Companies that have designed or manufactured[a] floating-point units compatible with the Intel 8087 or later models include AMD (287, 387, 486DX, 5x86, K5, K6, K7, K8), Chips and Technologies (the Super MATH coprocessors), Cyrix (the FasMath, Cx87SLC, Cx87DLC, etc., 6x86, Cyrix MII), Fujitsu (early Pentium Mobile, etc.), Harris Semiconductor (manufactured 80387 and 486DX processors), IBM (various 387 and 486 designs), IDT (the WinChip, C3, C7, Nano, etc.).
(Intel's earlier 8231 and 8232 floating-point processors, marketed for use with the i8080 CPU, were in fact licensed versions of AMD's Am9511 and Am9512 FPUs from 1977 and 1979.[6])
Although the original 1982 datasheet for the (NMOS-based) 80188 and 80186 seems to mention specific math coprocessors,[7] both chips were actually paired with an 8087.
The 80C187 interface to the main processor is the same as that of the 8087, but its core is essentially that of an 80387SX and is thus fully IEEE 754-compliant and capable of executing all the 80387's extra instructions.
[16] Shortly afterwards, it was made available through Intel's Personal Computer Enhancement Operation at a retail price of US$795.
[23] Marketed as "Intel387 SL Mobile Math CoProcessor", it included power-management features which allowed it to run without significantly reducing battery life.