*** Welcome to piglix ***

Superscalar processor


A superscalar processor is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor. It therefore allows for more throughput (the number of instructions that can be executed in a unit of time) than would otherwise be possible at a given clock rate. A superscalar processor can execute more than one instruction during a clock cycle by simultaneously dispatching multiple instructions to different execution units on the processor. Each execution unit is not a separate processor (or a core if the processor is a multi-core processor), but an execution resource within a single CPU such as an arithmetic logic unit.

In Flynn's taxonomy, a single-core superscalar processor is classified as an SISD processor (Single Instruction stream, Single Data stream), though many superscalar processors support short vector operations and so could be classified as SIMD (Single Instruction stream, Multiple Data streams). A multi-core superscalar processor is classified as an MIMD processor (Multiple Instruction streams, Multiple Data streams).

While a superscalar CPU is typically also pipelined, pipelining and superscalar execution are considered different performance enhancement techniques. The former executes multiple instructions in the same execution unit in parallel by dividing the execution unit into different phases, whereas the latter executes multiple instructions in parallel by using multiple execution units.

The superscalar technique is traditionally associated with several identifying characteristics (within a given CPU):

Seymour Cray's CDC 6600 from 1966 is often mentioned as the first superscalar design. The Motorola MC88100 (1988), the Intel i960CA (1989) and the AMD 29000-series 29050 (1990) microprocessors were the first commercial single-chip superscalar microprocessors. RISC microprocessors like these were the first to have superscalar execution, because RISC architectures frees transistors and die area which could be used to include multiple execution units (this was why RISC designs were faster than CISC designs through the 1980s and into the 1990s).


...
Wikipedia

...