PROGRAMMING WITH THE STREAMING SIMD EXTENSIONS
tion is not necessary when integrating a Streaming SIMD Extensions module with existing
MMX technology modules or existing x87-FP modules. Streaming SIMD Extensions also do
not affect the floating-point tag word (FTW), floating-point control word (FCW), floating-point
status word (FSW) or floating-point exception state (FIP, FOP, FCS, FDS and FDP).
The SIMD integer instructions that are included in Streaming SIMD Extensions behave identically
to the original MMX instructions in the presence of x87-FP instructions; this includes:
• Transition from x87-FP to MMX technology (TOS=0, FP valid bits set to all valid).
• MMX instructions write ones (1s) to the exponent part of the corresponding x87-FP register.
• Use of EMMS for transition from MMX technology to x87-FP.
The Streaming SIMD Extensions instructions that follow this behavior are: CVTPI2PS, CVTPS2PI,
CVTTPS2PI, MASKMOVQ, MOVNTQ, PEXTRW, PINSRW, PMOVMSKB, PMULHUW,
PSHUFW.
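The transition rule above can be illustrated with a minimal C sketch using compiler intrinsics. It assumes an x86 target and a compiler that exposes the MMX and Streaming SIMD Extensions intrinsics through <xmmintrin.h> (GCC and Clang do). The function name rounded_pair_scaled is hypothetical, and the use of long double to reach the x87 unit is an assumption about the compiler, not something this manual specifies.

```c
#include <xmmintrin.h>  /* SSE intrinsics; also provides the MMX intrinsics */

/* Hypothetical helper: converts the two low floats of v to 32-bit integers
   with CVTPS2PI (which writes an MMX register), adds them, and then scales
   the sum with floating-point arithmetic. */
long double rounded_pair_scaled(const float v[4], long double scale)
{
    __m128 x  = _mm_loadu_ps(v);       /* unaligned load of four floats      */
    __m64  lo = _mm_cvtps_pi32(x);     /* CVTPS2PI: result lands in MMX reg  */

    int a = _mm_cvtsi64_si32(lo);                      /* low doubleword  */
    int b = _mm_cvtsi64_si32(_mm_srli_si64(lo, 32));   /* high doubleword */

    /* EMMS: required before any x87-FP code because CVTPS2PI follows the
       MMX technology rules for the x87 tag word. */
    _mm_empty();

    /* Assumption: long double arithmetic is compiled to x87-FP instructions,
       as it typically is with GCC and Clang on x86. */
    return (long double)(a + b) * scale;
}
```

With the default MXCSR rounding mode (round to nearest), the inputs 1.4 and 2.6 convert to 1 and 3. Omitting the _mm_empty() call would leave the x87 tag word marked valid, which can cause the following x87-FP code to see a stack overflow condition.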
9.5.3.1. CACHEABILITY HINT INSTRUCTIONS
The Pentium® III processor's cacheability control instructions enable the programmer to control
caching and prefetching of data. When correctly used, these instructions can significantly
improve application performance.
The PREFETCH instruction can minimize the latency of data access in performance-critical
sections of application code by allowing data to be fetched in advance of actual usage. The
instruction fetches 32 aligned bytes (or more, depending on the implementation) containing the
addressed byte, to a location in the processor cache hierarchy as specified by the temporal
locality hint (Table 9-7). In this table, cache level 0 is closest to the processor and cache level 2
is farthest from the processor. The hints specify fetch of either temporal or non-temporal data.
Subsequent accesses to temporal data are treated like normal accesses, while those to non-
temporal data will continue to minimize cache pollution. If the data is already present in a level
of the cache hierarchy that is closer to the processor, the PREFETCH instruction will not result
in any data movement.
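As a rough illustration of fetching data in advance of its use, the following C sketch issues one PREFETCH per 32-byte line while summing an array. It assumes an x86 compiler that provides _mm_prefetch and the _MM_HINT_* constants in <xmmintrin.h>. The function name sum_with_prefetch and the PREFETCH_DISTANCE value are hypothetical; a suitable distance depends on the processor and the workload and would have to be measured.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* _mm_prefetch and the _MM_HINT_* constants */

/* Hypothetical tuning value: how many elements ahead of the current
   element to prefetch (16 floats = two 32-byte lines). */
#define PREFETCH_DISTANCE 16

/* Sums n floats. If non_temporal is nonzero the data is prefetched with the
   non-temporal hint (intended for data that is read once); otherwise a
   temporal hint is used. */
float sum_with_prefetch(const float *data, size_t n, int non_temporal)
{
    float total = 0.0f;

    for (size_t i = 0; i < n; i++) {
        /* Issue one prefetch per 8 floats (32 bytes), and only for
           addresses that are still inside the array. */
        if ((i % 8) == 0 && i + PREFETCH_DISTANCE < n) {
            const char *ahead = (const char *)&data[i + PREFETCH_DISTANCE];
            /* The hint must be a compile-time constant, hence two calls. */
            if (non_temporal)
                _mm_prefetch(ahead, _MM_HINT_NTA);
            else
                _mm_prefetch(ahead, _MM_HINT_T0);
        }
        total += data[i];
    }
    return total;
}
```

The prefetch is only a hint: it does not change the computed result, so the function returns the same sum with either hint or with the prefetch removed. Whether it improves performance depends on the access pattern and on how much work separates the prefetch from the use of the data.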