INSTRUCTION SET REFERENCE
3.1.3.2. MMX TECHNOLOGY INTRINSICS
The MMX technology intrinsics are based on a new __m64 data type to represent the specific
contents of an MMX technology register. You can specify values in bytes, short integers,
32-bit values, or a 64-bit object. The __m64 data type, however, is not a basic ANSI C data type,
and therefore you must observe the following usage restrictions:
• Use __m64 data only on the left-hand side of an assignment, as a return value, or as a
  parameter. You cannot use it with other arithmetic expressions ("+", ">>", and so on).
• Use __m64 objects in aggregates, such as unions (for example, to access the byte elements)
  and structures; the address of an __m64 object may be taken.
• Use __m64 data only with the MMX technology intrinsics described in this guide and the
  Intel C/C++ Compiler Users Guide for Win32* Systems With Streaming SIMD Extension
  Support (Order Number 718195-00B). Refer to Appendix C, Compiler Intrinsics and Functional
  Equivalents for more information on using intrinsics.
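The restrictions above can be illustrated with a minimal sketch. It assumes an x86 compiler that
provides the MMX intrinsics through the <mmintrin.h> header; the union type m64_words and the
function add_words are hypothetical names introduced only for this illustration.

```c
#include <mmintrin.h>  /* MMX intrinsics: __m64, _mm_set_pi16, _mm_add_pi16, _mm_empty */

/* Hypothetical helper type: a union gives access to the four 16-bit elements. */
typedef union {
    __m64 m;
    short s[4];
} m64_words;

/* Hypothetical helper: adds four packed 16-bit integers element by element. */
void add_words(const short a[4], const short b[4], short out[4])
{
    m64_words r;
    int i;

    /* _mm_set_pi16 takes the highest element first, so a[0] becomes the low word. */
    __m64 va = _mm_set_pi16(a[3], a[2], a[1], a[0]);
    __m64 vb = _mm_set_pi16(b[3], b[2], b[1], b[0]);

    /* Per the restrictions, combine __m64 values with an intrinsic, not with "+". */
    r.m = _mm_add_pi16(va, vb);

    /* Read the individual elements back through the union. */
    for (i = 0; i < 4; i++) {
        out[i] = r.s[i];
    }

    /* Clear the MMX state so that later x87 floating-point code is not affected. */
    _mm_empty();
}
```

Calling add_words with {1, 2, 3, 4} and {10, 20, 30, 40} is expected to store {11, 22, 33, 44} in
the output array. The __m64 values appear only as assignment targets and as intrinsic arguments.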
3.1.3.3. SIMD FLOATING-POINT INTRINSICS
The __m128 data type is used to represent the contents of an xmm register, which is either four
packed single-precision floating-point values or one scalar single-precision number. The
__m128 data type is not a basic ANSI C data type and therefore some restrictions are placed on
its usage:
• Use __m128 only on the left-hand side of an assignment, as a return value, or as a
  parameter. Do not use it in other arithmetic expressions such as "+" and ">>".
• Do not initialize __m128 with literals; there is no way to express 128-bit constants.
• Use __m128 objects in aggregates, such as unions (for example, to access the float
  elements) and structures. The address of an __m128 object may be taken.
• Use __m128 data only with the intrinsics described in this users guide. Refer to
  Appendix C, Compiler Intrinsics and Functional Equivalents for more information on using
  intrinsics.
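As a hedged illustration of these restrictions, the following sketch builds its constants with
intrinsics instead of literals and reads the result through a union. It assumes an x86 compiler
that provides the SIMD floating-point intrinsics through <xmmintrin.h>; the names m128_floats
and scale_bias4 are hypothetical and exist only for this example.

```c
#include <xmmintrin.h>  /* SIMD floating-point intrinsics: __m128, _mm_set1_ps, ... */

/* Hypothetical helper type: a union gives access to the four float elements. */
typedef union {
    __m128 v;
    float  f[4];
} m128_floats;

/* Hypothetical helper: computes in[i] * scale + bias for four floats. */
void scale_bias4(const float in[4], float scale, float bias, float out[4])
{
    m128_floats r;
    int i;

    /* No 128-bit literals exist, so constants are built with intrinsics. */
    __m128 vscale = _mm_set1_ps(scale);  /* scale copied into all four elements */
    __m128 vbias  = _mm_set1_ps(bias);   /* bias copied into all four elements  */

    /* _mm_loadu_ps places no alignment requirement on the source array. */
    __m128 vin = _mm_loadu_ps(in);

    /* Arithmetic goes through intrinsics rather than "*" and "+". */
    r.v = _mm_add_ps(_mm_mul_ps(vin, vscale), vbias);

    /* Read the individual elements back through the union. */
    for (i = 0; i < 4; i++) {
        out[i] = r.f[i];
    }
}
```

With the input {1, 2, 3, 4}, a scale of 2, and a bias of 0.5, the output array is expected to
hold {2.5, 4.5, 6.5, 8.5}.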
The compiler aligns __m128 local data to 16-byte boundaries on the stack. Global __m128 data is
also 16-byte aligned. (To align float arrays, you can use the alignment declspec described in the
following section.) Because the new instruction set treats the SIMD floating-point registers in
the same way whether you are using packed or scalar data, there is no __m32 data type to
represent scalar data as you might expect. For scalar operations, you should use the __m128
objects and the scalar forms of the intrinsics; the compiler and the processor implement these
operations with 32-bit memory references.
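The use of __m128 objects for scalar work can be sketched as follows. This is an illustration
under stated assumptions, not a definitive implementation: it assumes an x86 compiler that
provides <xmmintrin.h>, and the function names add_scalar and add_packed_and_scalar are
hypothetical.

```c
#include <xmmintrin.h>  /* _mm_load_ss, _mm_store_ss, _mm_add_ss, _mm_add_ps, ... */

/* Hypothetical helper: adds two floats using only scalar (ss) intrinsics. */
float add_scalar(const float *x, const float *y)
{
    float result;

    /* _mm_load_ss reads 32 bits into the lowest element and zeroes the rest. */
    __m128 a = _mm_load_ss(x);
    __m128 b = _mm_load_ss(y);

    /* _mm_store_ss writes only the lowest 32 bits back to memory. */
    _mm_store_ss(&result, _mm_add_ss(a, b));
    return result;
}

/* Hypothetical helper: applies the packed and the scalar add to the same inputs. */
void add_packed_and_scalar(float packed[4], float scalar[4])
{
    /* _mm_set_ps takes the highest element first: element 0 is 1.0f. */
    __m128 a = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
    __m128 b = _mm_set1_ps(10.0f);

    /* Packed form: all four elements are added. */
    _mm_storeu_ps(packed, _mm_add_ps(a, b));

    /* Scalar form: only element 0 is added; elements 1 to 3 are copied from a. */
    _mm_storeu_ps(scalar, _mm_add_ss(a, b));
}
```

After add_packed_and_scalar runs, packed is expected to hold {11, 12, 13, 14} and scalar is
expected to hold {11, 2, 3, 4}. Both results live in ordinary __m128 objects; only the intrinsic
chosen determines whether one element or all four are processed.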
The suffixes ps and ss are used to denote packed single-precision and scalar single-precision
operations, respectively. The packed floats are represented in right-to-left order, with the
lowest word (right-most) being used for scalar operations: [z, y, x, w]. To explain how memory
storage reflects this, consider the following example.