INTRODUCTION TO THE INTEL ARCHITECTURE

The Pentium II processor and now the Pentium III processor are based on the Pentium Pro

processor architecture. Changes or enhancements to the Pentium Pro processor architecture are

noted where appropriate.

2.5. DETAILED DESCRIPTION OF THE P6 FAMILY PROCESSOR MICROARCHITECTURE

Figure 2-2 shows a functional block diagram of the P6 Family processor microarchitecture. In

this diagram, the following blocks make up the four processing units and the memory subsystem

shown in Figure 2-1:

Memory subsystem: System bus, L2 cache, bus interface unit, instruction cache (L1),

data cache unit (L1), memory interface unit, and memory reorder buffer.

Fetch/decode unit: Instruction fetch unit, branch target buffer, instruction decoder,

microcode sequencer, and register alias table.

Instruction pool: Reorder buffer.

Dispatch/execute unit: Reservation station, two integer units, one x87 floating-point unit,

two address generation units, and two SIMD floating-point units.

Retire unit: Retire unit and retirement register file.

2.5.1. Memory Subsystem

The memory subsystem for the P6 Family processor consists of main system memory, the

primary cache (L1), and the secondary cache (L2). The bus interface unit accesses system

memory through the external system bus. This 64-bit bus is a transaction-oriented bus, meaning

that each bus access is handled as separate request and response operations. While the bus

interface unit is waiting for a response to one bus request, it can issue numerous additional requests.

The bus interface unit accesses the close-coupled L2 cache through a 64-bit cache bus. This bus

is also transaction oriented, supporting up to four concurrent cache accesses, and operates at

the full clock speed of the processor.
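
The benefit of separating requests from responses can be illustrated with a small model. The following is a minimal sketch, not a description of the actual bus protocol: the class name `SplitTransactionBus`, the fixed response latency, and the rule of at most one new request per cycle are assumptions made for illustration. Only the limit of four concurrent accesses comes from the text above.

```python
from collections import deque


class SplitTransactionBus:
    """Toy model of a bus where a request and its response are separate operations."""

    def __init__(self, max_outstanding, latency):
        self.max_outstanding = max_outstanding  # concurrent accesses allowed
        self.latency = latency                  # assumed fixed response delay, in cycles
        self.in_flight = deque()                # (cycle the response is ready, address)
        self.cycle = 0

    def try_issue(self, address):
        """Issue a request unless the bus already holds the maximum number in flight."""
        if len(self.in_flight) >= self.max_outstanding:
            return False
        self.in_flight.append((self.cycle + self.latency, address))
        return True

    def tick(self):
        """Advance one cycle and return the addresses whose responses arrived."""
        self.cycle += 1
        done = []
        while self.in_flight and self.in_flight[0][0] <= self.cycle:
            done.append(self.in_flight.popleft()[1])
        return done


def run(addresses, max_outstanding, latency):
    """Return (total cycles, completion order) for a stream of requests."""
    bus = SplitTransactionBus(max_outstanding, latency)
    pending = deque(addresses)
    completed = []
    while pending or bus.in_flight:
        # Assumption: at most one new request is issued per cycle.
        if pending and bus.try_issue(pending[0]):
            pending.popleft()
        completed.extend(bus.tick())
    return bus.cycle, completed


if __name__ == "__main__":
    print(run(range(4), max_outstanding=4, latency=5))
    print(run(range(4), max_outstanding=1, latency=5))
```

Under these assumptions, four requests finish sooner when up to four may be outstanding than when each request must wait for the previous response, because the response latencies overlap.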

Access to the L1 caches is through internal buses, also at full clock speed. The 8-KByte L1

instruction cache is four-way set associative; the 8-KByte L1 data cache is dual-ported and

two-way set associative, supporting one load and one store operation per cycle.
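
The cache sizes and associativities above determine how an address is divided into tag, set index, and line offset. The sketch below works through that arithmetic. The 32-byte cache line size is an assumption, since this section does not state it, and the function names `cache_geometry` and `split_address` are hypothetical.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Return (number of sets, offset bits, index bits) for a set-associative cache."""
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1  # log2 of a power-of-two line size
    index_bits = sets.bit_length() - 1         # log2 of a power-of-two set count
    return sets, offset_bits, index_bits


def split_address(addr, offset_bits, index_bits):
    """Split an address into (tag, set index, byte offset within the line)."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


if __name__ == "__main__":
    LINE_BYTES = 32  # assumed line size, not given in this section

    # 8-KByte, four-way set associative instruction cache
    print(cache_geometry(8 * 1024, 4, LINE_BYTES))
    # 8-KByte, two-way set associative data cache
    print(cache_geometry(8 * 1024, 2, LINE_BYTES))
```

With a 32-byte line, the four-way instruction cache has 64 sets and the two-way data cache has 128 sets, so the two caches use 6 and 7 index bits respectively.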

Coherency between the caches and system memory is maintained using the MESI (modified,

exclusive, shared, invalid) cache protocol. This protocol fosters cache coherency in single- and

multiple-processor systems. It is also able to detect coherency problems created by

self-modifying code.
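
The four MESI states can be viewed as a small state machine kept for each cache line. The following sketch shows the textbook form of the protocol for a single line in one cache. It is not taken from this manual and omits bus signaling and write-back details; the event names and the `others_have_copy` parameter are assumptions chosen for illustration.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def next_state(state, event, others_have_copy=False):
    """Return the next MESI state of one cache line (textbook transitions).

    event is one of:
      "local_read"  / "local_write"  - access by this cache's own processor
      "snoop_read"  / "snoop_write"  - access by another agent, seen on the bus
    """
    if event == "local_read":
        if state is State.INVALID:
            # A miss fills the line; it is exclusive only if no other cache holds it.
            return State.SHARED if others_have_copy else State.EXCLUSIVE
        return state
    if event == "local_write":
        # Any local write leaves this cache with the only, modified copy.
        return State.MODIFIED
    if event == "snoop_read":
        # Another reader forces a valid line to shared (a modified line is
        # written back first, which this sketch does not model).
        return State.INVALID if state is State.INVALID else State.SHARED
    if event == "snoop_write":
        # Another writer invalidates this copy.
        return State.INVALID
    raise ValueError(f"unknown event: {event}")


if __name__ == "__main__":
    s = State.INVALID
    for ev in ("local_read", "local_write", "snoop_read", "snoop_write"):
        s = next_state(s, ev)
        print(ev, "->", s.value)
```

In this simplified view, a write by another agent to a line that this cache holds always ends in the invalid state, which is the general mechanism by which stale copies are discarded.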

Memory requests from the processor's execution units go through the memory interface unit and

the memory order buffer. These units have been designed to support a smooth flow of memory

access requests through the cache and system memory hierarchy to prevent memory access
