INSTRUCTION SET REFERENCE

3.1.3.2. MMX TECHNOLOGY INTRINSICS

The MMX technology intrinsics are based on a new __m64 data type to represent the specific contents of an MMX technology register. You can specify values in bytes, short integers, 32-bit values, or a 64-bit object. The __m64 data type, however, is not a basic ANSI C data type, and therefore you must observe the following usage restrictions:

• Use __m64 data only on the left-hand side of an assignment, as a return value, or as a parameter. You cannot use it with other arithmetic expressions ("+", ">>", and so on).

• Use __m64 objects in aggregates, such as unions (for example, to access the byte elements) and structures. The address of an __m64 object may be taken.

• Use __m64 data only with the MMX technology intrinsics described in this guide and the Intel C/C++ Compiler User's Guide for Win32* Systems With Streaming SIMD Extension Support (Order Number 718195-00B). Refer to Appendix C, Compiler Intrinsics and Functional Equivalents for more information on using intrinsics.
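
The restrictions above can be illustrated with a short sketch. It assumes a compiler that provides the MMX technology intrinsics through the mmintrin.h header; the union type m64_bytes and the function add_bytes are hypothetical names introduced only for this illustration. The sketch reaches the byte elements through a union and performs the addition with the _mm_add_pi8 intrinsic rather than the "+" operator.

```c
#include <mmintrin.h>  /* MMX technology intrinsics: __m64, _mm_add_pi8, _mm_empty */

/* Hypothetical helper type: a union that exposes the eight byte elements
   of an __m64 object. */
typedef union {
    __m64         v;
    unsigned char b[8];
} m64_bytes;

/* Hypothetical example function: out[i] = a[i] + b[i], with wraparound
   (modulo 256) in each byte. */
void add_bytes(const unsigned char a[8], const unsigned char b[8],
               unsigned char out[8])
{
    m64_bytes x, y, r;
    int i;

    for (i = 0; i < 8; i++) {
        x.b[i] = a[i];
        y.b[i] = b[i];
    }

    /* __m64 appears only as a parameter, a return value, and the
       left-hand side of an assignment; the arithmetic is an intrinsic. */
    r.v = _mm_add_pi8(x.v, y.v);

    for (i = 0; i < 8; i++) {
        out[i] = r.b[i];
    }

    _mm_empty();  /* clear the MMX state before any x87 floating-point code */
}
```

A caller passes three 8-byte arrays; because _mm_add_pi8 wraps around, a sum such as 250 + 10 is stored as 4.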

3.1.3.3. SIMD FLOATING-POINT INTRINSICS

The __m128 data type is used to represent the contents of an XMM register, which holds either four packed single-precision floating-point values or one scalar single-precision number. The __m128 data type is not a basic ANSI C data type, and therefore some restrictions are placed on its usage:

• Use __m128 only on the left-hand side of an assignment, as a return value, or as a parameter. Do not use it in other arithmetic expressions such as "+" and ">>".

• Do not initialize __m128 with literals; there is no way to express 128-bit constants.

• Use __m128 objects in aggregates, such as unions (for example, to access the float elements) and structures. The address of an __m128 object may be taken.

• Use __m128 data only with the intrinsics described in this user's guide. Refer to Appendix C, Compiler Intrinsics and Functional Equivalents for more information on using intrinsics.
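
The following sketch shows one way to stay within these restrictions. It assumes a compiler that provides the SIMD floating-point intrinsics through the xmmintrin.h header; the union type m128_floats and the function average4 are hypothetical names used only for illustration. The constant is built with the _mm_set_ps1 intrinsic instead of a literal, the arithmetic uses _mm_add_ps and _mm_mul_ps instead of operators, and the float elements are read back through a union.

```c
#include <xmmintrin.h>  /* SIMD floating-point intrinsics: __m128, _mm_add_ps, ... */

/* Hypothetical helper type: a union that exposes the four float elements
   of an __m128 object. */
typedef union {
    __m128 v;
    float  f[4];
} m128_floats;

/* Hypothetical example function: out[i] = (a[i] + b[i]) * 0.5f for i = 0..3. */
void average4(const float a[4], const float b[4], float out[4])
{
    m128_floats r;
    int i;

    /* Unaligned loads, so the caller's arrays need no special alignment. */
    __m128 x = _mm_loadu_ps(a);
    __m128 y = _mm_loadu_ps(b);

    /* No 128-bit literal exists; broadcast a float into all four elements. */
    __m128 half = _mm_set_ps1(0.5f);

    /* Packed add and multiply through intrinsics, not "+" or "*". */
    r.v = _mm_mul_ps(_mm_add_ps(x, y), half);

    for (i = 0; i < 4; i++) {
        out[i] = r.f[i];
    }
}
```

The unaligned load is a deliberate choice for the sketch; where the input arrays are known to be 16-byte aligned, the aligned load intrinsic _mm_load_ps could be used instead.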

The compiler aligns __m128 local data to 16-byte boundaries on the stack. Global __m128 data is also 16-byte aligned. (To align float arrays, you can use the alignment declspec described in the following section.) Because the new instruction set treats the SIMD floating-point registers in the same way whether you are using packed or scalar data, there is no __m32 data type to represent scalar data as you might expect. For scalar operations, you should use the __m128 objects and the scalar forms of the intrinsics; the compiler and the processor implement these operations with 32-bit memory references.
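
As a minimal sketch of scalar usage, assuming the intrinsics are available through the xmmintrin.h header, the hypothetical function scalar_mul_add below keeps each scalar in an __m128 object and uses only the scalar (ss) forms of the intrinsics.

```c
#include <xmmintrin.h>  /* _mm_set_ss, _mm_mul_ss, _mm_add_ss, _mm_store_ss */

/* Hypothetical example function: returns a * b + c, computed with the
   scalar forms of the intrinsics. */
float scalar_mul_add(float a, float b, float c)
{
    float result;

    /* _mm_set_ss places the value in the lowest element and zeroes the
       upper three elements. */
    __m128 va = _mm_set_ss(a);
    __m128 vb = _mm_set_ss(b);
    __m128 vc = _mm_set_ss(c);

    /* The ss intrinsics operate on the lowest element only; the upper
       elements are passed through from the first operand. */
    __m128 r = _mm_add_ss(_mm_mul_ss(va, vb), vc);

    /* Store only the lowest element: a 32-bit memory reference. */
    _mm_store_ss(&result, r);
    return result;
}
```

For example, scalar_mul_add(2.0f, 3.0f, 4.0f) yields 10.0f.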

The suffixes ps and ss denote packed single-precision and scalar single-precision operations, respectively. The packed floats are represented in right-to-left order, with the lowest word (right-most) being used for scalar operations: [z, y, x, w]. To explain how memory storage reflects this, consider the following example.
