Complex instruction set computer

Examples of CISC architectures range from complex mainframe computers to simplistic microcontrollers where memory load and store operations are not separated from arithmetic instructions.
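
The distinction can be pictured with a toy model. The Python sketch below is hypothetical and does not correspond to any real instruction set; the names add_mem, load, add and store are invented for the illustration. It contrasts a single CISC-style instruction that folds a memory operand into an addition with the separate load, add and store instructions a load-store (RISC-style) design would need for the same effect.

    # Toy model (hypothetical, not any real instruction set): one CISC-style
    # instruction that folds a memory read, an addition and a memory write
    # together, versus the load/add/store sequence of a load-store design.
    memory = {0x100: 5}        # toy data memory: address -> value
    regs = {"r1": 7, "r2": 0}  # toy register file

    def add_mem(addr, reg):
        """CISC-style: the memory operand is part of the arithmetic instruction."""
        memory[addr] = memory[addr] + regs[reg]

    def load(reg, addr):       # load-store style: arithmetic touches registers only
        regs[reg] = memory[addr]

    def add(dst, src):
        regs[dst] = regs[dst] + regs[src]

    def store(addr, reg):
        memory[addr] = regs[reg]

    add_mem(0x100, "r1")       # one instruction: memory[0x100] becomes 12

    memory[0x100] = 5          # reset, then the same effect in three instructions
    load("r2", 0x100)
    add("r2", "r1")
    store(0x100, "r2")
    assert memory[0x100] == 12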

The benefits of semantically rich instructions with compact encodings can be seen in modern processors as well, particularly in the high-performance segment where caches are a central component (as opposed to most embedded systems).

The fundamental reason such caches are needed is that main memory (i.e., dynamic RAM today) remains slow compared to a (high-performance) CPU core.

While many designs achieved the aim of higher throughput at lower cost and also allowed high-level language constructs to be expressed by fewer instructions, it was observed that this was not always the case.

One reason for this was that architects (microcode writers) sometimes "over-designed" assembly language instructions, including features that could not be implemented efficiently on the basic hardware available.

There could, for instance, be "side effects" (beyond the conventional flags), such as the setting of a register or memory location that was perhaps seldom used; if this was done via ordinary (non-duplicated) internal buses, or even the external bus, it would demand extra cycles on every execution, and thus be quite inefficient.
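
This cost can be sketched with a deliberately simplified cycle-count model. The figures in the Python sketch below (three base cycles per instruction, one extra bus cycle per side-effect write) are assumed for illustration only; the point is that the extra write is paid for on every execution, whether or not the side-effect value is ever read.

    # Simplified cycle-count model (assumed, illustrative figures): a seldom-used
    # side-effect write that shares a single, non-duplicated internal bus with
    # the main result costs one extra cycle on every execution.
    def cycles(executions, base_cycles, side_effect_writes):
        return executions * (base_cycles + side_effect_writes)

    without_side_effect = cycles(1_000_000, base_cycles=3, side_effect_writes=0)
    with_side_effect = cycles(1_000_000, base_cycles=3, side_effect_writes=1)

    # One million extra cycles, paid whether or not the side effect is ever used.
    print(with_side_effect - without_side_effect)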

Even in balanced high-performance designs, highly encoded and (relatively) high-level instructions could be complicated to decode and execute efficiently within a limited transistor budget.

Such architectures therefore required a great deal of work on the part of the processor designer in cases where a simpler, but (typically) slower, solution based on decode tables and/or microcode sequencing was not appropriate.
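
The table-driven alternative can be sketched as follows. The Python model below is an assumed structure rather than any specific processor's microcode, and the opcode and micro-operation names are invented; it shows each architectural opcode selecting a sequence of micro-operations that a simple sequencer steps through one per cycle, trading extra cycles per instruction for much simpler control logic.

    # Sketch of table-driven microcode sequencing (assumed structure, invented
    # opcode and micro-operation names): each architectural opcode selects a row
    # of micro-operations that a simple sequencer executes one per cycle.
    MICROCODE = {
        "ADD_MEM_REG": ["fetch_operand_address", "read_memory", "alu_add",
                        "write_memory", "update_flags"],
        "ADD_REG_REG": ["alu_add", "update_flags"],
    }

    def execute(opcode, do_micro_op):
        """Step through one instruction's micro-program; return its cycle count."""
        for micro_op in MICROCODE[opcode]:   # one micro-operation per cycle
            do_micro_op(micro_op)
        return len(MICROCODE[opcode])

    trace = []
    print(execute("ADD_MEM_REG", trace.append), trace)  # 5 cycles for the
    # memory-operand form, against 2 for the register-only form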

In the mid-1970s, this gave rise to ideas to return to simpler processor designs in order to make it more feasible to cope without (then relatively large and expensive) ROM tables and/or PLA structures for sequencing and/or decoding.

The frequent memory accesses for operands of a typical CISC machine may limit the instruction-level parallelism that can be extracted from the code, although this is strongly mitigated by the fast cache structures used in modern designs, as well as by other measures.
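
One way to picture this limitation is as a data-dependence problem: when operands live in memory, the hardware often cannot prove that two memory references are independent and must serialize them, whereas register operands make independence explicit. The Python sketch below is a hypothetical model; representing instructions as read/write sets and collapsing all memory references under a conservative alias assumption are simplifications invented for the illustration.

    # Hypothetical dependence model: instructions are (reads, writes) sets, and
    # the critical-path length is a rough proxy for how much instruction-level
    # parallelism can be extracted.
    def critical_path(instructions, assume_memory_aliases):
        depth = {}                     # value name -> cycle at which it is ready
        longest = 0
        for reads, writes in instructions:
            if assume_memory_aliases:
                # Conservatively treat every memory reference as the same location.
                reads = {"mem" if r.startswith("mem") else r for r in reads}
                writes = {"mem" if w.startswith("mem") else w for w in writes}
            ready = 1 + max((depth.get(r, 0) for r in reads), default=0)
            for w in writes:
                depth[w] = ready
            longest = max(longest, ready)
        return longest

    # Two independent read-modify-write updates expressed with memory operands.
    updates = [({"mem[A]", "r1"}, {"mem[A]"}),
               ({"mem[B]", "r2"}, {"mem[B]"})]

    print(critical_path(updates, assume_memory_aliases=True))   # 2: serialized
    print(critical_path(updates, assume_memory_aliases=False))  # 1: overlapped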

Due to their inherently compact and semantically rich instructions, the average amount of work performed per machine code unit (i.e., per byte or bit) is higher for a CISC processor than for a RISC processor, which may give it a significant advantage in a modern cache-based implementation.
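
A short worked example makes the cache argument concrete. All figures in the Python sketch below (the 32 KiB instruction cache, the average instruction lengths, and the instructions-per-operation ratios) are assumed for illustration and are not measurements of any particular processor; the point is only that a denser encoding fits more useful work into the same instruction cache.

    # Worked example with assumed, illustrative figures: a denser encoding fits
    # more useful work into a given instruction cache and needs fewer fetched
    # bytes per operation.
    i_cache_bytes = 32 * 1024                 # assumed 32 KiB instruction cache

    cisc_bytes_per_instr, cisc_instrs_per_op = 3.5, 1.0
    risc_bytes_per_instr, risc_instrs_per_op = 4.0, 1.4

    def ops_in_cache(bytes_per_instr, instrs_per_op):
        return i_cache_bytes / (bytes_per_instr * instrs_per_op)

    print(round(ops_in_cache(cisc_bytes_per_instr, cisc_instrs_per_op)))  # ~9362
    print(round(ops_in_cache(risc_bytes_per_instr, risc_instrs_per_op)))  # ~5851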