Micro-operation

[1] Various forms of μops have long been the basis for traditional microcode routines used to simplify the implementation of a particular CPU design or perhaps just the sequencing of certain multi-step operations or addressing modes.

More recently, μops have also been employed in a different way in order to let modern CISC processors more easily handle asynchronous parallel and speculative execution: As with traditional microcode, one or more table lookups (or equivalent) is done to locate the appropriate μop-sequence based on the encoding and semantics of the machine instruction (the decoding or translation step), however, instead of having rigid μop-sequences controlling the CPU directly from a microcode-ROM, μops are here dynamically buffered for rescheduling before being executed.

[4]: 6–7, 9–11 This buffering means that the fetch and decode stages can be more detached from the execution units than is feasible in a more traditional microcoded (or hard-wired) design.

As this allows a degree of freedom regarding execution order, it makes some extraction of instruction-level parallelism out of a normal single-threaded program possible (provided that dependencies are checked, etc.).

It opens up for more analysis and therefore also for reordering of code sequences in order to dynamically optimize mapping and scheduling of μops onto machine resources (such as ALUs, load/store units, etc.).

A high-level illustration showing the decomposition of machine instructions into micro-operations, performed during typical fetch-decode-execute cycles [ 1 ] : 11