Memory-level parallelism

In computer architecture, memory-level parallelism (MLP) is the ability to have pending multiple memory operations, in particular cache misses or translation lookaside buffer (TLB) misses, at the same time.

In a single processor, MLP may be considered a form of instruction-level parallelism (ILP).

However, ILP is often conflated with superscalar, the ability to execute more than one instruction at the same time, e.g. a processor such as the Intel Pentium Pro is five-way superscalar, with the ability to start executing five different microinstructions in a given cycle, but it can handle four different cache misses for up to 20 different load microinstructions at any time.

Arguably a machine that has no ILP, which is not superscalar, which executes one instruction at a time in a non-pipelined manner, but which performs hardware prefetching (not software instruction-level prefetching) exhibits MLP (due to multiple prefetches outstanding) but not ILP.

Often, however, we restrict the terms MLP and ILP to refer to extracting such parallelism from what appears to be non-parallel single threaded code.