INTRODUCTION TO THE INTEL ARCHITECTURE
The Pentium® II processor and now the Pentium® III processor are based on the Pentium® Pro
processor architecture. Changes or enhancements to the Pentium® Pro processor architecture are
noted where appropriate.
2.5. DETAILED DESCRIPTION OF THE P6 FAMILY PROCESSOR MICROARCHITECTURE
Figure 2-2 shows a functional block diagram of the P6 Family processor microarchitecture. In
this diagram, the following blocks make up the four processing units and the memory subsystem
shown in Figure 2-1:
• Memory subsystem: System bus, L2 cache, bus interface unit, instruction cache (L1),
  data cache unit (L1), memory interface unit, and memory reorder buffer.
• Fetch/decode unit: Instruction fetch unit, branch target buffer, instruction decoder,
  microcode sequencer, and register alias table.
• Instruction pool: Reorder buffer.
• Dispatch/execute unit: Reservation station, two integer units, one x87 floating-point unit,
  two address generation units, and two SIMD floating-point units.
• Retire unit: Retire unit and retirement register file.
2.5.1. Memory Subsystem
The memory subsystem for the P6 Family processor consists of main system memory, the
primary cache (L1), and the secondary cache (L2). The bus interface unit accesses system
memory through the external system bus. This 64-bit bus is a transaction-oriented bus, meaning
that each bus access is handled as separate request and response operations. While the bus
interface unit is waiting for a response to one bus request, it can issue numerous additional requests.
The bus interface unit accesses the close-coupled L2 cache through a 64-bit cache bus. This bus
is also transaction oriented, supporting up to four concurrent cache accesses, and operates at
the full clock speed of the processor.
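
The split between request and response operations can be illustrated with a small model. The following Python sketch is an illustration only and does not describe the actual bus logic: the class name SplitTransactionBus, its methods, the tag numbering, and the example addresses are all hypothetical. The limit of four accesses in flight is taken from the cache bus description above.

```python
class SplitTransactionBus:
    """Toy model: a request and its response are separate bus operations."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding
        self.pending = {}       # tag -> address still awaiting a response
        self.next_tag = 0

    def issue(self, address):
        """Start a request; return its tag, or None if no slot is free."""
        if len(self.pending) >= self.max_outstanding:
            return None
        tag = self.next_tag
        self.next_tag += 1
        self.pending[tag] = address
        return tag

    def respond(self, tag):
        """Deliver the response for an earlier request and free its slot."""
        return self.pending.pop(tag)


# Hypothetical usage: four accesses are issued before any response arrives.
cache_bus = SplitTransactionBus(max_outstanding=4)
tags = [cache_bus.issue(addr) for addr in (0x1000, 0x2000, 0x3000, 0x4000)]
fifth = cache_bus.issue(0x5000)          # refused: four accesses in flight
first_addr = cache_bus.respond(tags[0])  # a response frees one slot
retry = cache_bus.issue(0x5000)          # accepted after the response
```

The point of the model is that the requester does not stall between issuing a request and receiving its response; it stalls only when every slot is occupied.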
Access to the L1 caches is through internal buses, also at full clock speed. The 8-KByte L1
instruction cache is four-way set associative; the 8-KByte L1 data cache is dual-ported and two-
way set associative, supporting one load and one store operation per cycle.
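
To make the associativity figures concrete, the sketch below derives the number of sets in each L1 cache and splits an address into tag, set index, and line offset. The cache sizes and associativities come from the text above. The 32-byte line size is an assumption that this section does not state, and the function names and the example address are hypothetical.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Return (sets, offset_bits, index_bits) for a set-associative cache.

    Assumes the line size and the number of sets are powers of two.
    """
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1
    index_bits = sets.bit_length() - 1
    return sets, offset_bits, index_bits


def split_address(address, offset_bits, index_bits):
    """Split an address into (tag, set index, byte offset within the line)."""
    offset = address & ((1 << offset_bits) - 1)
    index = (address >> offset_bits) & ((1 << index_bits) - 1)
    tag = address >> (offset_bits + index_bits)
    return tag, index, offset


LINE_BYTES = 32  # assumption: the line size is not given in this section

icache = cache_geometry(8 * 1024, ways=4, line_bytes=LINE_BYTES)
dcache = cache_geometry(8 * 1024, ways=2, line_bytes=LINE_BYTES)

# Hypothetical address, decoded with the instruction cache geometry.
tag, index, offset = split_address(0x12345, icache[1], icache[2])
```

Under the 32-byte assumption, the instruction cache has 64 sets of four lines and the data cache has 128 sets of two lines, so the two caches use 6 and 7 index bits respectively.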
Coherency between the caches and system memory is maintained using the MESI (modified,
exclusive, shared, invalid) cache protocol. This protocol fosters cache coherency in single- and
multiple-processor systems. It is also able to detect coherency problems created by
self-modifying code.
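
The four MESI states can be read as a per-line state machine. The sketch below follows the simplified textbook form of the protocol and is not a description of the P6 implementation: the event names, the others_have_copy flag, and the function next_state are hypothetical, and data movement such as the write-back of a modified line is noted only in comments.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def next_state(state, event, others_have_copy=False):
    """Return the next MESI state of one cache line in one cache."""
    if event == "local_read":
        if state is State.INVALID:
            # A read miss fills the line: shared if another cache holds it.
            return State.SHARED if others_have_copy else State.EXCLUSIVE
        return state  # a read hit leaves the state unchanged
    if event == "local_write":
        # A write leaves this cache with the only, modified copy.
        return State.MODIFIED
    if event == "snoop_read":
        # Another processor reads the line; a modified line would be
        # written back before it is shared.
        return State.INVALID if state is State.INVALID else State.SHARED
    if event == "snoop_write":
        # Another processor writes the line; the local copy becomes stale.
        return State.INVALID
    raise ValueError(f"unknown event: {event}")


# Hypothetical sequence for one line, starting invalid.
s = next_state(State.INVALID, "local_read")  # read miss, no other copies
s = next_state(s, "local_write")             # line modified locally
s = next_state(s, "snoop_read")              # another processor reads it
```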
Memory requests from the processor's execution units go through the memory interface unit and
the memory order buffer. These units have been designed to support a smooth flow of memory
access requests through the cache and system memory hierarchy to prevent memory access