In modern systems, programs generally use addresses that can reach the theoretical maximum memory of the computer architecture, as set by its 32- or 64-bit address width.
The MMU maps the addresses from each program into separate areas in physical memory, which is generally much smaller than the theoretical maximum.
Most modern operating systems (OS) work in concert with an MMU to provide virtual memory (VM) support.
Another common technique, found mostly on larger machines, was segmented translation, which allowed for variable-size blocks of memory that better mapped onto program requests.
An associative cache of PTEs is called a translation lookaside buffer (TLB) and is used to avoid the necessity of accessing the main memory every time a virtual address is mapped.
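The role of the TLB can be illustrated with a minimal sketch. The class name `TinyTLB`, the 4 KiB page size, the dictionary standing in for page table entries in main memory, and the least-recently-used eviction policy are all assumptions made for illustration; real TLBs are hardware structures whose size and replacement policy vary by design.

```python
from collections import OrderedDict

PAGE_SHIFT = 12  # assumption: 4 KiB pages


class TinyTLB:
    """Hypothetical software model of a small associative cache of PTEs."""

    def __init__(self, page_table, capacity=4):
        # page_table: dict mapping virtual page number -> physical frame
        # number; it stands in for the PTEs held in main memory.
        self.page_table = page_table
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        if vpn in self.entries:
            # Hit: no access to the page table in main memory is needed.
            self.hits += 1
            self.entries.move_to_end(vpn)
        else:
            # Miss: consult the page table (a KeyError models a page fault).
            self.misses += 1
            self.entries[vpn] = self.page_table[vpn]
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
        return (self.entries[vpn] << PAGE_SHIFT) | offset
```

In this model, repeated accesses to the same page after the first one are counted as hits and never touch `page_table`.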
[4] Other MMUs may have a private array of memory,[5] registers,[6] or static RAM[7] that holds a set of mapping information.
[4] This style of access, over time, became common in the mainframe market[citation needed] and was known as segmented translation, although a variety of terms are used here as well.
This style has the advantage of simplicity; the memory blocks are contiguous and thus only the two values, base and limit, need to be stored.
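A minimal sketch of base-and-limit translation follows. The `Segment` record and `translate` function are hypothetical names chosen for illustration, and the `MemoryError` merely stands in for whatever fault a real MMU would raise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical mapping record: only two values are stored."""
    base: int   # physical address where the block starts
    limit: int  # size of the block in bytes


def translate(seg: Segment, offset: int) -> int:
    """Map a program-relative offset to a physical address."""
    if not 0 <= offset < seg.limit:
        # Stand-in for the hardware fault on an out-of-range access.
        raise MemoryError(f"offset {offset:#x} outside limit {seg.limit:#x}")
    return seg.base + offset
```

For example, with `Segment(base=0x4000, limit=0x1000)`, offset `0x10` maps to physical address `0x4010`, while offset `0x1000` is rejected.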
On systems where programs start and stop over time, this can eventually lead to memory being highly fragmented and no large blocks remaining.
This implemented a very simple MMU inside the CPU, with four processor registers holding base values accessed directly by the program.
The first is that as the virtual address space expands, the amount of memory needed to hold the mapping increases as well.
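The growth can be made concrete with a small calculation. The sketch below assumes a single flat table with one entry per page; the function name `flat_table_bytes`, the 4 KiB page size and the 4-byte entry size are illustrative assumptions, not figures from any particular MMU.

```python
def flat_table_bytes(addr_bits, page_bits=12, pte_bytes=4):
    """Bytes needed for a flat table with one entry per virtual page.

    Assumptions: 2**page_bits bytes per page, pte_bytes bytes per entry.
    """
    entries = 1 << (addr_bits - page_bits)
    return entries * pte_bytes


# A 32-bit address space needs 2**20 entries, i.e. 4 MiB of table.
table_32 = flat_table_bytes(32)
# A 48-bit address space with 8-byte entries would need 512 GiB.
table_48 = flat_table_bytes(48, pte_bytes=8)
```

Each additional address bit doubles the size of such a table.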
[4] The paged translation approach was widely used by microprocessor MMUs in the 1970s and early 80s, including the Signetics 68905 (which could operate in either mode).
For instance, the Atari MMU would express additional bits on the address bus to select among several banks of DRAM memory based on which of the chips was currently active, normally the CPU or ANTIC.
Some systems, mainly older RISC designs, trap into the OS when a page translation is not found in the TLB.
The IBM System/360 Model 67, which was introduced in August 1965, included an MMU called a dynamic address translation (DAT) box.
[17] This reduces overhead for the OS, which would otherwise need to propagate accessed and dirty bits from the page tables to a more physically oriented data structure.
[18]: 211–212 Thus, there is effectively a two-level tree, allowing applications to have a sparse memory layout without wasting a lot of space on unused page table entries.
After a TLB miss, low-level firmware machine code (here called PALcode) walks a page table.
A TLB modified exception is generated when a store instruction references a mapped address and the matching entry's dirty status is not set.
The context register is important in a multitasking operating system because it allows the CPU to switch between processes without reloading all the translation state information.
Sharing of virtual address space and inter-context communications can be provided by writing the same values into the segment or page maps of different contexts.
The other lookup, not directly supported by all processors in this family, is via a so-called inverted page table, which acts as a hashed off-chip extension of the TLB.
The OS may generate the new entry from a more normal tree-like page table or from per-mapping data structures which are likely to be slower and more space-efficient.
The OS may avoid reusing segment values to delay facing this, or it may elect to suffer the waste of memory associated with per-process hash tables.
Minor revisions of the MMU introduced with the Pentium have allowed very large 4 MB pages by skipping the bottom level of the tree (this leaves 10 bits for indexing the first level of page hierarchy with the remaining 10+12 bits being directly copied to the result).
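The bit arithmetic can be sketched as follows. The function names `split_4k` and `split_4m` are hypothetical; the field widths are the 10 + 10 + 12 split of the ordinary two-level tree with 4 KB pages and the 10 + 22 split described above for 4 MB pages.

```python
def split_4k(vaddr):
    """Return (first-level index, second-level index, offset): 10 + 10 + 12 bits."""
    return (vaddr >> 22) & 0x3FF, (vaddr >> 12) & 0x3FF, vaddr & 0xFFF


def split_4m(vaddr):
    """Return (first-level index, offset): 10 + 22 bits, bottom level skipped."""
    return (vaddr >> 22) & 0x3FF, vaddr & 0x3FFFFF
```

For the 32-bit address `0xC0401234`, `split_4k` yields indices `0x301` and `0x001` with offset `0x234`, while `split_4m` yields index `0x301` with the 22-bit offset `0x001234`.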
Minor revisions of the MMU with the Pentium Pro added the physical address extension (PAE) feature, enabling 36-bit physical addresses with 2+9+9 bits for three-level page tables and the 12 lowest bits being directly copied to the result.
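As an illustrative sketch of the 2 + 9 + 9 + 12 split, the hypothetical function `split_pae` below decomposes a 32-bit virtual address into its three table indices and page offset. The constant name `PAE_PHYS_BYTES` is likewise an assumption, used only to show the size of a 36-bit physical address space.

```python
PAE_PHYS_BYTES = 1 << 36  # 36-bit physical addresses: 64 GiB


def split_pae(vaddr):
    """Return (top index, middle index, bottom index, offset): 2 + 9 + 9 + 12 bits."""
    return (
        (vaddr >> 30) & 0x3,
        (vaddr >> 21) & 0x1FF,
        (vaddr >> 12) & 0x1FF,
        vaddr & 0xFFF,
    )
```

For the address `0xC0401234`, the three indices are 3, 2 and 1, and the offset `0x234` is copied to the result unchanged.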
The W^X, Exec Shield, and PaX mechanisms described above emulate per-page non-execute support on x86 processors lacking the NX bit by setting the length of the code segment, with a performance loss and a reduction in the available address space.
x86-64, the 64-bit version of the x86 architecture, almost entirely removes segmentation in favor of the flat memory model used by almost all operating systems for the 386 or newer processors.
There are no calls such as malloc or dealloc, since memory blocks are automatically allocated on a pbit interrupt or discarded.
The MCP system is inherently secure and thus has no need of an MMU to provide this level of memory protection.