[4][5] The instruction set architecture was also entirely new and a significant departure from Intel's previous 8008 and 8080 processors: the iAPX 432 programming model is a stack machine with no visible general-purpose registers.
It supports object-oriented programming,[5] garbage collection, and multitasking, as well as more conventional memory management, directly in hardware and microcode.
Direct support for various data structures is also intended to allow modern operating systems to be implemented using far less program code than for ordinary processors.
These properties and features resulted in a hardware and microcode design that was more complex than most processors of the era, especially microprocessors.
[NB 2] Using the semiconductor technology of its day, Intel's engineers were not able to translate the design into a very efficient first implementation.
Along with the lack of optimization in a premature Ada compiler, this contributed to slow yet expensive computer systems, performing typical benchmarks at roughly 1/4 the speed of the new 80286 chip at the same clock frequency (in early 1982).
[7] This initial performance gap to the rather low-profile and low-priced 8086 line was probably the main reason why Intel's plan to replace the latter (later known as x86) with the iAPX 432 failed.
Although engineers saw ways to improve a next-generation design, the iAPX 432 capability architecture had by now come to be regarded more as an implementation overhead than as the simplifying support it was intended to be.
Meanwhile, Intel urgently needed a simpler interim product to meet the immediate competition from Motorola, Zilog, and National Semiconductor.
The architects had total freedom to do a novel design from scratch, using whatever techniques they guessed would be best for large-scale systems and software.
In many cases, the iAPX 432 had a significantly slower instruction throughput than conventional microprocessors of the era, such as the National Semiconductor 32016, Motorola 68010 and Intel 80286.
A larger issue was that the capability architecture needed large associative caches to run efficiently, but the chips had no room left for them.
In addition, the BIU was designed to support fault-tolerant systems, and supporting this tied up as much as 40% of the bus time in wait states.
When running the Dhrystone benchmark, parameter passing took ten times longer than all other computations combined.
However, some hold that the OO support was not the primary problem with the 432, and that the implementation shortcomings (especially in the compiler) mentioned above would have made any CPU design slow.
[citation needed] Intel had spent considerable time, money, and mindshare on the 432, had a skilled team devoted to it, and was unwilling to abandon it entirely after its failure in the marketplace.
According to the New York Times, Intel's collaboration with HP on the Merced processor (later known as Itanium) was the company's comeback attempt for the very high-end market.
[17] Unusually, instructions are not byte-aligned, that is, they may contain odd numbers of bits and directly follow each other without regard to byte boundaries.
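The general idea of decoding fields from such a bit-granular instruction stream can be sketched as follows; the field widths, bit order, and helper names here are purely illustrative assumptions and do not reflect the actual GDP encoding.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Read `nbits` (<= 32) starting at absolute bit offset `bitpos` from a
 * byte buffer, LSB-first.  The bit order and field widths are hypothetical;
 * the real GDP instruction encoding is considerably more involved. */
static uint32_t read_bits(const uint8_t *stream, size_t bitpos, unsigned nbits)
{
    uint32_t value = 0;
    for (unsigned i = 0; i < nbits; i++) {
        size_t bit = bitpos + i;
        uint32_t b = (stream[bit / 8] >> (bit % 8)) & 1u;
        value |= b << i;
    }
    return value;
}

int main(void)
{
    /* Two back-to-back "instructions" of 6 and 11 bits packed into 3 bytes;
     * the second field straddles a byte boundary, as 432 instructions may. */
    const uint8_t stream[] = { 0xE5, 0x93, 0x02 };
    size_t pos = 0;

    uint32_t f1 = read_bits(stream, pos, 6);  pos += 6;
    uint32_t f2 = read_bits(stream, pos, 11); pos += 11;

    printf("first field: 0x%X, second field: 0x%X\n",
           (unsigned)f1, (unsigned)f2);
    return 0;
}
```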
This reduced the number of object table lookups dramatically, and doubled the maximum virtual address space.
Instead, the microcode implements part of the marking portion of Edsger Dijkstra's on-the-fly parallel garbage collection algorithm (a mark-and-sweep style collector).
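The tri-color (white/grey/black) marking idea underlying Dijkstra's on-the-fly collector can be sketched as below. This is a simplified, single-threaded illustration with a hypothetical object layout; the actual algorithm runs concurrently with the mutator using a write barrier, and this is not the 432's microcode.

```c
#include <stddef.h>
#include <stdio.h>

enum color { WHITE, GREY, BLACK };

#define MAX_REFS  4
#define HEAP_OBJS 8

struct object {
    enum color color;
    struct object *refs[MAX_REFS];   /* outgoing references */
};

/* Mark phase: shade roots grey, then blacken grey objects while shading
 * their white children grey.  Objects left white are unreachable. */
static void mark(struct object *roots[], size_t nroots)
{
    struct object *worklist[HEAP_OBJS];
    size_t top = 0;

    for (size_t i = 0; i < nroots; i++)
        if (roots[i] && roots[i]->color == WHITE) {
            roots[i]->color = GREY;
            worklist[top++] = roots[i];
        }

    while (top > 0) {
        struct object *obj = worklist[--top];
        for (size_t i = 0; i < MAX_REFS; i++) {
            struct object *child = obj->refs[i];
            if (child && child->color == WHITE) {
                child->color = GREY;
                worklist[top++] = child;
            }
        }
        obj->color = BLACK;
    }
}

int main(void)
{
    struct object heap[HEAP_OBJS] = {0};   /* all objects start WHITE */
    heap[0].refs[0] = &heap[1];
    heap[1].refs[0] = &heap[2];
    /* heap[3] is never referenced from a root. */

    struct object *roots[] = { &heap[0] };
    mark(roots, 1);

    for (size_t i = 0; i < 4; i++)
        printf("object %zu: %s\n", i,
               heap[i].color == WHITE ? "unreachable" : "reachable");
    return 0;
}
```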
"The Format field permits the GDP to appear to the programmer as a zero-, one-, two-, or three-address architecture."