GUIDELINES FOR WRITING FPU EXCEPTIONS HANDLERS
E.3.5. Considerations When FPU Shared Between Tasks
The IA allows speculative deferral of floating-point state swaps on task switches. This feature
allows postponing an FPU state swap until an FPU instruction is actually encountered in another
task. Since kernel tasks rarely use floating-point, and some applications do not use floating-point
or use it infrequently, the amount of time saved by avoiding unnecessary stores of the
floating-point state is significant. Speculative deferral of FPU saves does, however, place an extra burden
on the kernel in three key ways:
1. The kernel must keep track of which thread owns the FPU, which may be different from
the currently executing thread.
2. The kernel must associate any floating-point exceptions with the generating task. This
requires special handling since floating-point exceptions are delivered asynchronously with
other system activity.
3. There are conditions under which spurious floating-point exception interrupts are
generated, which the kernel must recognize and discard.
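The first two points above can be sketched in C as follows. This is a minimal illustration, not kernel code: the names thread, fpu_bookkeeping, and charge_fp_exception are hypothetical, and the third point (spurious interrupts) is deliberately not modeled because the conditions are not described in this passage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread record (names are illustrative only). */
struct thread {
    int id;
    int fp_exceptions;          /* floating-point exceptions charged to this thread */
};

/* Hypothetical kernel bookkeeping. The running thread and the FPU owner
 * are tracked separately because, with deferred saves, they can differ. */
struct fpu_bookkeeping {
    struct thread *current;     /* thread executing on the processor now */
    struct thread *fpu_owner;   /* thread whose state is in the FPU, or NULL */
};

/* Charge a floating-point exception to the thread that owns the FPU,
 * not to whichever thread happens to be running when it is delivered.
 * Returns the thread charged, or NULL if no thread owns the FPU. */
static struct thread *charge_fp_exception(struct fpu_bookkeeping *k)
{
    assert(k != NULL);
    if (k->fpu_owner == NULL)
        return NULL;
    k->fpu_owner->fp_exceptions++;
    return k->fpu_owner;
}
```

In this sketch, if thread A owns the FPU while thread B is running, the exception is recorded against A and B is left untouched.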
E.3.5.1. SPECULATIVELY DEFERRING FPU SAVES, GENERAL OVERVIEW
In order to support multitasking, each thread in the system needs a save area for the general-purpose
registers, and each task that is allowed to use floating-point needs an FPU save area large
enough to hold the entire FPU stack and associated FPU state such as the control word and status
word. (Refer to Section 7.3.9., "Saving the FPU's State" in Chapter 7, "Floating-Point Unit", for
a complete description of the FPU save image.) If the processor and the operating system support
Streaming SIMD Extensions, the save area should be large enough and aligned correctly to
hold FPU and Streaming SIMD Extensions state.
On a task switch, the general-purpose registers are swapped out to their save area for the suspending
thread, and the registers of the resuming thread are loaded. The FPU state does not need
to be saved at this point. If the resuming thread does not use the FPU before it is itself suspended,
then both a save and a load of the FPU state have been avoided. It is often the case that several
threads may be executed without any usage of the FPU.
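The saving described above can be illustrated with a small user-space simulation in C. It is only a sketch under stated assumptions: sim_cpu, sim_init, sim_switch_to, and sim_use_fpu are hypothetical names, threads are identified by plain integers, and the counters stand in for the real register and FPU state transfers. A thread's first use of the FPU is counted as a load.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simulated processor; counters replace real state transfers. */
struct sim_cpu {
    int current;         /* id of the running thread */
    int fpu_owner;       /* id of the thread whose state is in the FPU, -1 if none */
    unsigned gpr_swaps;  /* general-purpose register swaps performed */
    unsigned fpu_saves;  /* FPU state saves performed */
    unsigned fpu_loads;  /* FPU state loads (or first-time initializations) */
};

static void sim_init(struct sim_cpu *c, int first_thread)
{
    assert(c != NULL);
    c->current = first_thread;
    c->fpu_owner = -1;
    c->gpr_swaps = 0;
    c->fpu_saves = 0;
    c->fpu_loads = 0;
}

/* Task switch: only the general-purpose registers are swapped.
 * The FPU keeps holding the state of its previous owner. */
static void sim_switch_to(struct sim_cpu *c, int next_thread)
{
    assert(c != NULL);
    c->gpr_swaps++;
    c->current = next_thread;
}

/* The running thread executes a floating-point instruction.
 * The FPU state is swapped only if another thread owns the FPU. */
static void sim_use_fpu(struct sim_cpu *c)
{
    assert(c != NULL);
    if (c->fpu_owner == c->current)
        return;                   /* state already in place, nothing to do */
    if (c->fpu_owner != -1)
        c->fpu_saves++;           /* save the previous owner's state */
    c->fpu_loads++;               /* load (or initialize) this thread's state */
    c->fpu_owner = c->current;
}
```

In this model, if thread 0 uses the FPU, the kernel then switches through threads 1 and 2 and back to thread 0, and thread 0 uses the FPU again, three general-purpose register swaps take place but no FPU save and no additional FPU load.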
The processor supports speculative deferral of FPU saves via interrupt 7, Device Not Available
(DNA), used in conjunction with CR0 bit 3, the Task Switched bit (TS). (Refer to Section 2.5.,
"Control Registers", in Chapter 2, "System Architecture Overview" of the Intel Architecture
Software Developer's Manual, Volume 3.) Every task switch via the hardware-supported task switching
mechanism (refer to Section 6.3., "Task Switching" in Chapter 6, "Task Management" of the
Intel Architecture Software Developer's Manual, Volume 3) sets TS. Multi-threaded kernels that
use software task switching (Note I) can set the TS bit by reading CR0, ORing a 1 into bit 3 (Note II), and
writing back CR0. Any subsequent floating-point instructions (now being executed in a new
thread context) will fault via interrupt 7 before execution. This allows a DNA handler to save
the old floating-point context and reload the FPU state for the current thread. The handler should
NOTES
I. In a software task switch, the operating system uses a sequence of instructions to save the suspending
thread's state and restore the resuming thread's state, instead of the single long non-interruptible task
switch operation provided by the IA.