COMPILER INTRINSICS AND FUNCTIONAL EQUIVALENTS

PSRLW

__m64 _m_psrlw (__m64 m, __m64 count)

__m64 _mm_srl_pi16 (__m64 m, __m64 count)

Shift the four 16-bit values in m right by the amount specified by count, shifting in zeroes.

__m64 _m_psrlwi (__m64 m, int count)

__m64 _mm_srli_pi16(__m64 m, int count)

Shift the four 16-bit values in m right by the amount specified by count, shifting in zeroes. For the best performance, count should be a constant.
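
To make the lane-wise behavior concrete, the following portable C sketch models what PSRLW computes. The function name psrlw_model and the choice to carry the packed operand in a uint64_t (lane 0 in the low 16 bits) are assumptions made for this illustration; it is a scalar reference model, not the compiler's implementation of _mm_srl_pi16.

```c
#include <stdint.h>

/* Hypothetical scalar model of PSRLW (name and uint64_t packing are
   assumptions for illustration). Each of the four 16-bit lanes of m is
   shifted right independently, with zeroes shifted in from the left. */
uint64_t psrlw_model(uint64_t m, uint64_t count)
{
    /* The instruction clears the whole result when count exceeds 15. */
    if (count > 15)
        return 0;

    uint64_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint16_t v = (uint16_t)(m >> (16 * lane));        /* extract lane */
        uint16_t shifted = (uint16_t)(v >> count);        /* logical shift */
        result |= (uint64_t)shifted << (16 * lane);       /* reinsert lane */
    }
    return result;
}
```

Note that bits never move from one lane into its neighbor: a bit shifted out of the bottom of a lane is discarded.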

PSRLD

__m64 _m_psrld (__m64 m, __m64 count)

__m64 _mm_srl_pi32 (__m64 m, __m64 count)

Shift the two 32-bit values in m right by the amount specified by count, shifting in zeroes.

__m64 _m_psrldi (__m64 m, int count)

__m64 _mm_srli_pi32 (__m64 m, int count)

Shift the two 32-bit values in m right by the amount specified by count, shifting in zeroes. For the best performance, count should be a constant.

PSRLQ

__m64 _m_psrlq (__m64 m, __m64 count)

__m64 _mm_srl_si64 (__m64 m, __m64 count)

Shift the 64-bit value in m right by the amount specified by count, shifting in zeroes.

__m64 _m_psrlqi (__m64 m, int count)

__m64 _mm_srli_si64 (__m64 m, int count)

Shift the 64-bit value in m right by the amount specified by count, shifting in zeroes. For the best performance, count should be a constant.
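
Unlike the word and doubleword forms, PSRLQ treats the operand as a single 64-bit quantity, so bits do cross what would otherwise be lane boundaries. A minimal scalar sketch, assuming the operand is held in a uint64_t and using the hypothetical name psrlq_model:

```c
#include <stdint.h>

/* Hypothetical scalar model of PSRLQ (name is an assumption for
   illustration): one 64-bit logical right shift with zero fill. */
uint64_t psrlq_model(uint64_t m, uint64_t count)
{
    /* In C, shifting a 64-bit value by 64 or more is undefined, so the
       out-of-range case is handled explicitly. The instruction itself
       produces all zeroes when count exceeds 63. */
    return (count > 63) ? 0 : (m >> count);
}
```

For example, shifting 0x0000000000010000 right by 1 yields 0x0000000000008000 here, whereas a lane-wise 16-bit shift of the same value would discard that bit and yield zero.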

PSUBB

__m64 _m_psubb(__m64 m1, __m64 m2)

__m64 _mm_sub_pi8(__m64 m1, __m64 m2)

Subtract the eight 8-bit values in m2 from the eight 8-bit values in m1.

PSUBW

__m64 _m_psubw(__m64 m1, __m64 m2)

__m64 _mm_sub_pi16(__m64 m1, __m64 m2)

Subtract the four 16-bit values in m2 from the four 16-bit values in m1.

PSUBD

__m64 _m_psubd(__m64 m1, __m64 m2)

__m64 _mm_sub_pi32(__m64 m1, __m64 m2)

Subtract the two 32-bit values in m2 from the two 32-bit values in m1.
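
The non-saturating subtract forms wrap around within each lane, and a borrow never propagates into the neighboring lane. The sketch below models the byte form in portable C; psubb_model and the uint64_t packing (lane 0 in the low byte) are assumptions for illustration only, not the intrinsic's implementation.

```c
#include <stdint.h>

/* Hypothetical scalar model of PSUBB (name and packing are assumptions
   for illustration): eight independent 8-bit subtractions that wrap
   modulo 256. */
uint64_t psubb_model(uint64_t m1, uint64_t m2)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 8; lane++) {
        uint8_t a = (uint8_t)(m1 >> (8 * lane));
        uint8_t b = (uint8_t)(m2 >> (8 * lane));
        uint8_t d = (uint8_t)(a - b);            /* wraps; no borrow out */
        result |= (uint64_t)d << (8 * lane);
    }
    return result;
}
```

With m1 = 0x0010 and m2 = 0x0101, lane 0 gives 0x0F and lane 1 wraps to 0xFF, while the upper six lanes stay zero. An ordinary 64-bit subtraction of the same operands would instead borrow through every upper byte.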

PSUBSB

__m64 _m_psubsb(__m64 m1, __m64 m2)

__m64 _mm_subs_pi8(__m64 m1, __m64 m2)

Subtract the eight signed 8-bit values in m2 from the eight signed 8-bit values in m1 and saturate.

PSUBSW

__m64 _m_psubsw(__m64 m1, __m64 m2)

__m64 _mm_subs_pi16(__m64 m1, __m64 m2)

Subtract the four signed 16-bit values in m2 from the four signed 16-bit values in m1 and saturate.
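
Signed saturation means a difference that falls outside the signed range of the lane is clamped to the nearest representable value instead of wrapping. A minimal portable sketch of the 16-bit form follows; psubsw_model, the helper sext16, and the uint64_t packing are all hypothetical names and choices for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend a 16-bit lane to 32 bits without
   relying on implementation-defined narrowing conversions. */
static int32_t sext16(uint16_t u)
{
    return (int32_t)(u ^ 0x8000u) - 0x8000;
}

/* Hypothetical scalar model of PSUBSW (name and packing are assumptions
   for illustration): four signed 16-bit subtractions, each clamped to
   the range [-32768, 32767]. */
uint64_t psubsw_model(uint64_t m1, uint64_t m2)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        int32_t a = sext16((uint16_t)(m1 >> (16 * lane)));
        int32_t b = sext16((uint16_t)(m2 >> (16 * lane)));
        int32_t d = a - b;                       /* exact in 32 bits */
        if (d > INT16_MAX) d = INT16_MAX;        /* clamp high */
        if (d < INT16_MIN) d = INT16_MIN;        /* clamp low */
        result |= (uint64_t)(uint16_t)d << (16 * lane);
    }
    return result;
}
```

For instance, -32768 minus 1 stays at -32768 (0x8000), and 32767 minus -1 stays at 32767 (0x7FFF), where a wrapping subtract would flip the sign in both cases.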

PSUBUSB

__m64 _m_psubusb(__m64 m1, __m64 m2)

__m64 _mm_subs_pu8(__m64 m1, __m64 m2)

Subtract the eight unsigned 8-bit values in m2 from the eight unsigned 8-bit values in m1 and saturate.

PSUBUSW

__m64 _m_psubusw(__m64 m1, __m64 m2)

__m64 _mm_subs_pu16(__m64 m1, __m64 m2)

Subtract the four unsigned 16-bit values in m2 from the four unsigned 16-bit values in m1 and saturate.
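
For the unsigned forms, saturation means a lane whose subtrahend is larger than its minuend produces zero rather than wrapping. The following portable sketch models the byte form; psubusb_model and the uint64_t packing are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical scalar model of PSUBUSB (name and packing are assumptions
   for illustration): eight unsigned 8-bit subtractions, each clamped at
   zero. */
uint64_t psubusb_model(uint64_t m1, uint64_t m2)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 8; lane++) {
        uint8_t a = (uint8_t)(m1 >> (8 * lane));
        uint8_t b = (uint8_t)(m2 >> (8 * lane));
        uint8_t d = (a > b) ? (uint8_t)(a - b) : 0;   /* clamp at zero */
        result |= (uint64_t)d << (8 * lane);
    }
    return result;
}
```

One common use of this behavior: since at most one of the two orderings is nonzero in any lane, psubusb_model(x, y) | psubusb_model(y, x) gives the per-lane absolute difference of x and y.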

PUNPCKHBW

__m64 _m_punpckhbw (__m64 m1, __m64 m2)

__m64 _mm_unpackhi_pi8(__m64 m1, __m64 m2)

Interleave the four 8-bit values from the high half of m1 with the four 8-bit values from the high half of m2, taking the least significant element of the result from m1.
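
The interleaving order is easiest to see in a scalar model. In the sketch below, punpckhbw_model and the uint64_t packing (byte 0 in the low 8 bits) are assumptions made for illustration; the code is a reference model, not how the intrinsic is implemented.

```c
#include <stdint.h>

/* Hypothetical scalar model of PUNPCKHBW (name and packing are
   assumptions for illustration). Bytes 4..7 of m1 and m2 are
   interleaved; even result bytes come from m1, odd ones from m2. */
uint64_t punpckhbw_model(uint64_t m1, uint64_t m2)
{
    uint64_t result = 0;
    for (int i = 0; i < 4; i++) {
        uint64_t a = (m1 >> (8 * (4 + i))) & 0xFF;   /* byte 4+i of m1 */
        uint64_t b = (m2 >> (8 * (4 + i))) & 0xFF;   /* byte 4+i of m2 */
        result |= a << (16 * i);                     /* result byte 2i   */
        result |= b << (16 * i + 8);                 /* result byte 2i+1 */
    }
    return result;
}
```

With m1 = 0xA7A6A5A4A3A2A1A0 and m2 = 0xB7B6B5B4B3B2B1B0, the result is 0xB7A7B6A6B5A5B4A4: reading from the least significant byte upward, A4, B4, A5, B5, A6, B6, A7, B7.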

PUNPCKHWD

__m64 _m_punpckhwd (__m64 m1, __m64 m2)

__m64 _mm_unpackhi_pi16(__m64 m1, __m64 m2)

Interleave the two 16-bit values from the high half of m1 with the two 16-bit values from the high half of m2, taking the least significant element of the result from m1.

PUNPCKHDQ

__m64 _m_punpckhdq (__m64 m1, __m64 m2)

__m64 _mm_unpackhi_pi32(__m64 m1, __m64 m2)

Interleave the 32-bit value from the high half of m1 with the 32-bit value from the high half of m2, taking the least significant element of the result from m1.

PUNPCKLBW

__m64 _m_punpcklbw (__m64 m1, __m64 m2)

__m64 _mm_unpacklo_pi8 (__m64 m1, __m64 m2)

Interleave the four 8-bit values from the low half of m1 with the four 8-bit values from the low half of m2, taking the least significant element of the result from m1.
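
A frequent use of the low unpack is widening: when m2 is all zeroes, each low byte of m1 is paired with a zero byte above it, which zero-extends the four bytes to 16-bit values. The scalar sketch below illustrates this; punpcklbw_model and the uint64_t packing are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Hypothetical scalar model of PUNPCKLBW (name and packing are
   assumptions for illustration). Bytes 0..3 of m1 and m2 are
   interleaved; even result bytes come from m1, odd ones from m2. */
uint64_t punpcklbw_model(uint64_t m1, uint64_t m2)
{
    uint64_t result = 0;
    for (int i = 0; i < 4; i++) {
        uint64_t a = (m1 >> (8 * i)) & 0xFF;   /* byte i of m1 */
        uint64_t b = (m2 >> (8 * i)) & 0xFF;   /* byte i of m2 */
        result |= a << (16 * i);               /* result byte 2i   */
        result |= b << (16 * i + 8);           /* result byte 2i+1 */
    }
    return result;
}
```

Calling punpcklbw_model(0xA7A6A5A4A3A2A1A0, 0) returns 0x00A300A200A100A0, that is, the four low bytes widened to the 16-bit values 0x00A0, 0x00A1, 0x00A2, and 0x00A3.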

Table C-1. Simple Intrinsics

Mnemonic    Intrinsic    Description
