PROGRAMMING WITH THE STREAMING SIMD EXTENSIONS
Table 9-7. Cache Hints
The PREFETCH instruction does not change the user-visible semantics of a program, although
it may affect the program's performance. Its operation is implementation-dependent: an
implementation may map several hints to the same behavior (for example, T0, T1, and T2 may
behave identically) or ignore the instruction altogether. Programmers must therefore tune an
application for each implementation to take advantage of these instructions. Prefetch
instructions do not generate exceptions or faults, and the processor may throttle excessive
use of them. For more detailed information on prefetch hints, refer to Chapter 6, Optimizing
Cache Utilization for Pentium® III Processors, in the Intel Architecture Optimization
Reference Manual (Order Number 245127-001).
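As an illustration of a prefetch hint in use, the following minimal sketch issues one from C
through the _mm_prefetch intrinsic declared in xmmintrin.h. The function name
sum_with_prefetch and the distance PREFETCH_AHEAD are hypothetical and chosen only for this
example; a suitable distance has to be found by tuning on each implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 */

/* Hypothetical tuning knob: how many elements ahead to prefetch. */
#define PREFETCH_AHEAD 64

/* Sums n floats while hinting that data further along the array will be
   needed soon. The hint only affects performance; the returned sum is the
   same whether the processor honors the hint or ignores it. */
float sum_with_prefetch(const float *a, size_t n)
{
    assert(a != NULL || n == 0);
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        /* Stay inside the array so the pointer arithmetic is valid C.
           A tuned loop would issue one prefetch per cache line, not one
           per element. */
        if (i + PREFETCH_AHEAD < n)
            _mm_prefetch((const char *)(a + i + PREFETCH_AHEAD), _MM_HINT_T0);
        sum += a[i];
    }
    return sum;
}
```

Replacing _MM_HINT_T0 with _MM_HINT_T1, _MM_HINT_T2, or _MM_HINT_NTA selects one of the other
hints; on an implementation that maps several hints to the same behavior, the change may have
no measurable effect.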
Some common usage models that may be affected in this way by weakly-ordered stores are:
• library functions, which use weakly-ordered memory to write results
• compiler-generated code, which also benefits from writing weakly-ordered results
• hand-crafted code
The degree to which a consumer of data knows that the data is weakly-ordered can vary for these
cases. As a result, the SFENCE instruction should be used to ensure ordering between routines
that produce weakly-ordered data and routines that consume this data. The SFENCE instruction
provides a performance-efficient way to ensure ordering, by guaranteeing that every store
instruction that precedes the store fence instruction in program order is globally visible before
any store instruction that follows the fence.
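The producer side of this pattern might be sketched in C as follows. This is an illustrative
sketch, not code from this manual: the function name produce and the ready flag are
hypothetical, and the streaming-store intrinsic _mm_stream_ps is used here as one example
source of weakly-ordered stores. The _mm_sfence intrinsic emits the SFENCE instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* _mm_set1_ps, _mm_stream_ps, _mm_sfence */

/* Hypothetical producer routine: fills dst[0..n-1] with value using
   streaming (weakly-ordered) stores, then raises a ready flag.
   Assumes dst is 16-byte aligned and n is a multiple of 4. */
void produce(float *dst, size_t n, float value, volatile int *ready)
{
    assert(((uintptr_t)dst & 15u) == 0);
    assert(n % 4 == 0);

    __m128 v = _mm_set1_ps(value);
    for (size_t i = 0; i < n; i += 4)
        _mm_stream_ps(dst + i, v);   /* weakly-ordered stores of the data */

    /* Every store before the fence becomes globally visible before any
       store after it, so the flag cannot be observed ahead of the data. */
    _mm_sfence();

    *ready = 1;
}
```

A consumer that observes the flag set can then read the data without knowing how it was
written. In a real multithreaded program the flag would also need whatever language-level
synchronization the compiler requires; the volatile flag here only keeps the sketch short.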
HINTS   ACTIONS
T0      Temporal data - fetch data into all levels of cache hierarchy
        (L1 or L2 on Pentium® III)
T1      Temporal data - fetch data into level 2 cache and higher
        (L2 on Pentium® III)
T2      Temporal data - fetch data into level 2 cache and higher
        (L2 on Pentium® III)
NTA     Non-temporal data - fetch data into location close to the processor,
        minimizing cache pollution (for level 1 cache)
        (L1 on Pentium® III)