COMPILER INTRINSICS AND FUNCTIONAL EQUIVALENTS
LDMXCSR
void _mm_setcsr(unsigned int i)
Sets the control register to the value
specified.
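As an illustration of the LDMXCSR entry, the sketch below sets one bit of the control register and later restores the saved value. The helper names `enable_flush_to_zero` and `restore_mxcsr` are hypothetical; `_mm_getcsr` and the `_MM_FLUSH_ZERO_ON` constant are not listed in this part of the table and are assumed to be available from `<xmmintrin.h>`, as they are in common x86 compilers.

```c
#include <xmmintrin.h>

/* Hypothetical helper: turns on flush-to-zero mode and returns the
   previous control register value so the caller can restore it. */
unsigned int enable_flush_to_zero(void)
{
    unsigned int old = _mm_getcsr();        /* read the current MXCSR */
    _mm_setcsr(old | _MM_FLUSH_ZERO_ON);    /* write it back with the FTZ bit set */
    return old;
}

/* Hypothetical helper: restores a control register value saved earlier. */
void restore_mxcsr(unsigned int saved)
{
    _mm_setcsr(saved);
}
```

Saving and restoring the old value keeps the mode change local to the code that needs it, since the control register affects every later SSE floating-point operation on the thread.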
MASKMOVQ
void _m_maskmovq(__m64 d, __m64 n, char *p)
void _mm_maskmove_si64(__m64 d, __m64 n, char *p)
Conditionally stores byte elements of d to
address p. The high bit of each byte in the
selector n determines whether the
corresponding byte in d is stored.
MAXPS
__m128 _mm_max_ps(__m128 a, __m128 b)
Computes the maximums of the four SP FP
values of a and b.
MAXSS
__m128 _mm_max_ss(__m128 a, __m128 b)
Computes the maximum of the lower SP
FP values of a and b; the upper three SP
FP values are passed through from a.
MINPS
__m128 _mm_min_ps(__m128 a, __m128 b)
Computes the minimums of the four SP FP
values of a and b.
MINSS
__m128 _mm_min_ss(__m128 a, __m128 b)
Computes the minimum of the lower SP FP
values of a and b; the upper three SP FP
values are passed through from a.
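The MAXPS and MINPS entries above combine naturally into a per-element clamp. The following is a minimal sketch under that assumption; `clamp4` and `clamp4_array` are hypothetical names, and the unaligned load/store intrinsics are used only so the wrapper accepts ordinary arrays.

```c
#include <xmmintrin.h>

/* Hypothetical helper: clamps each of the four floats in v to [lo, hi]. */
__m128 clamp4(__m128 v, float lo, float hi)
{
    __m128 vlo = _mm_set1_ps(lo);                  /* {lo, lo, lo, lo} */
    __m128 vhi = _mm_set1_ps(hi);                  /* {hi, hi, hi, hi} */
    return _mm_min_ps(_mm_max_ps(v, vlo), vhi);    /* raise to lo, then cap at hi */
}

/* Hypothetical wrapper for plain arrays; no alignment is required here
   because it uses the unaligned load and store intrinsics. */
void clamp4_array(const float in[4], float out[4], float lo, float hi)
{
    _mm_storeu_ps(out, clamp4(_mm_loadu_ps(in), lo, hi));
}
```

The scalar forms `_mm_max_ss` and `_mm_min_ss` could be substituted when only the lowest element should be clamped and the upper three passed through from the first operand.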
MOVAPS
__m128 _mm_load_ps(float *p)
Loads four SP FP values. The address
must be 16-byte-aligned.
void _mm_store_ps(float *p, __m128 a)
Stores four SP FP values. The address
must be 16-byte-aligned.
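The alignment requirement of the two MOVAPS intrinsics can be made explicit in code. The sketch below is one possible way to do so; `scale4_aligned` is a hypothetical name, and the `assert` checks are an illustrative choice for documenting the precondition rather than something the intrinsics themselves perform.

```c
#include <assert.h>
#include <stdint.h>
#include <xmmintrin.h>

/* Hypothetical helper: multiplies the four floats at src by factor and
   writes them to dst. Both pointers must be 16-byte aligned. */
void scale4_aligned(const float *src, float *dst, float factor)
{
    assert(((uintptr_t)src & 15u) == 0);     /* precondition of _mm_load_ps */
    assert(((uintptr_t)dst & 15u) == 0);     /* precondition of _mm_store_ps */

    __m128 v = _mm_load_ps(src);             /* aligned load of four floats */
    v = _mm_mul_ps(v, _mm_set1_ps(factor));  /* scale all four elements */
    _mm_store_ps(dst, v);                    /* aligned store of four floats */
}
```

A caller would typically declare its buffers with an alignment specifier such as C11 `_Alignas(16)` or obtain them from an aligned allocator.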
MOVHLPS
__m128 _mm_movehl_ps(__m128 a, __m128 b)
Moves the upper 2 SP FP values of b to the
lower 2 SP FP values of the result. The
upper 2 SP FP values of a are passed
through to the result.
MOVHPS
__m128 _mm_loadh_pi(__m128 a, __m64 *p)
Sets the upper two SP FP values with 64
bits of data loaded from the address p; the
lower two values are passed through from
a.
void _mm_storeh_pi(__m64 *p, __m128 a)
Stores the upper two SP FP values of a to
the address p.
MOVLPS
__m128 _mm_loadl_pi(__m128 a, __m64 *p)
Sets the lower two SP FP values with 64
bits of data loaded from the address p; the
upper two values are passed through from
a.
void _mm_storel_pi(__m64 *p, __m128 a)
Stores the lower two SP FP values of a to
the address p.
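The MOVHPS and MOVLPS intrinsics above move 64 bits (two SP FP values) at a time, which suits data laid out as pairs. The sketch below assumes two 2-D points stored as separate `float[2]` arrays; `pack_two_points` and `store_second_point` are hypothetical names, and the pointer casts to `__m64 *` follow the signatures given in the table.

```c
#include <xmmintrin.h>

/* Hypothetical helper: packs two 2-D points into one register,
   giving {p0[0], p0[1], p1[0], p1[1]}. */
__m128 pack_two_points(const float p0[2], const float p1[2])
{
    __m128 v = _mm_setzero_ps();
    v = _mm_loadl_pi(v, (const __m64 *)p0);   /* lower two values from p0 */
    v = _mm_loadh_pi(v, (const __m64 *)p1);   /* upper two values from p1 */
    return v;
}

/* Hypothetical helper: writes the upper two values of v back to memory. */
void store_second_point(float out[2], __m128 v)
{
    _mm_storeh_pi((__m64 *)out, v);
}
```

Each load replaces only one half of the register, so the order of the two calls does not matter; the half that is not loaded is passed through from the first argument.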
MOVLHPS
__m128 _mm_movelh_ps(__m128 a, __m128 b)
Moves the lower 2 SP FP values of b to the
upper 2 SP FP values of the result. The
lower 2 SP FP values of a are passed
through to the result.
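The register-to-register moves MOVHLPS and MOVLHPS can be illustrated with two small helpers. This is a sketch only: `join_low_halves` and `hsum4` are hypothetical names, and the horizontal sum additionally relies on `_mm_add_ps`, `_mm_add_ss`, `_mm_shuffle_ps`, and `_mm_cvtss_f32`, which are not described in this part of the table.

```c
#include <xmmintrin.h>

/* Hypothetical helper: returns {a0, a1, b0, b1}, i.e. the lower halves
   of a and b placed side by side. */
__m128 join_low_halves(__m128 a, __m128 b)
{
    return _mm_movelh_ps(a, b);
}

/* Hypothetical helper: adds the four SP FP values of v together. */
float hsum4(__m128 v)
{
    __m128 hi   = _mm_movehl_ps(v, v);        /* {v2, v3, v2, v3} */
    __m128 sum2 = _mm_add_ps(v, hi);          /* {v0+v2, v1+v3, ...} */
    __m128 odd  = _mm_shuffle_ps(sum2, sum2,
                                 _MM_SHUFFLE(1, 1, 1, 1)); /* broadcast element 1 */
    return _mm_cvtss_f32(_mm_add_ss(sum2, odd)); /* (v0+v2) + (v1+v3) */
}
```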
MOVMSKPS
int _mm_movemask_ps(__m128 a)
Creates a 4-bit mask from the most
significant bits of the four SP FP values.
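Because the most significant bit of an SP FP value is its sign bit, and SSE comparisons produce all-ones or all-zeros per element, the mask is useful both for sign tests and for branching on comparison results. The sketch below shows both uses; `sign_mask4` and `any_below` are hypothetical names, and `_mm_cmplt_ps` is a comparison intrinsic assumed from elsewhere in the intrinsic set.

```c
#include <xmmintrin.h>

/* Hypothetical helper: bit i of the result is the sign bit of v[i]. */
int sign_mask4(const float v[4])
{
    return _mm_movemask_ps(_mm_loadu_ps(v));
}

/* Hypothetical helper: returns 1 if any element of v is less than limit. */
int any_below(const float v[4], float limit)
{
    __m128 lt = _mm_cmplt_ps(_mm_loadu_ps(v), _mm_set1_ps(limit));
    return _mm_movemask_ps(lt) != 0;   /* a set bit marks a true comparison */
}
```

For example, the input {-1, 2, -3, 4} has negative values in elements 0 and 2, so `sign_mask4` returns binary 0101, which is 5.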
MOVNTPS
void _mm_stream_ps(float *p, __m128 a)
Stores the data in a to the address p
without polluting the caches. The address
must be 16-byte-aligned.
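A typical use of the MOVNTPS intrinsic is filling a large buffer that will not be read again soon. The following is a minimal sketch under that assumption; `stream_fill` is a hypothetical name, and the closing `_mm_sfence` call is an added precaution (not described in this table) to order the non-temporal stores before later stores.

```c
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical helper: writes value to count floats starting at dst
   using non-temporal stores. dst must be 16-byte aligned and count is
   assumed to be a multiple of 4; any remainder is left untouched. */
void stream_fill(float *dst, size_t count, float value)
{
    __m128 v = _mm_set1_ps(value);           /* {value, value, value, value} */
    for (size_t i = 0; i + 4 <= count; i += 4)
        _mm_stream_ps(dst + i, v);           /* store without polluting the caches */
    _mm_sfence();                            /* order the streaming stores */
}
```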
MOVNTQ
void _mm_stream_pi(__m64 *p, __m64 a)
Stores the data in a to the address p
without polluting the caches.
Table C-1. Simple Intrinsics
Columns: Mnemonic, Intrinsic, Description