The Visual C++ Team Blog has published an exhaustive inventory of the intrinsic functions added to Visual C++ 9 (2008).

Intrinsic functions give low-level access to the instruction sets of recent CISC processors in a more or less portable way. For example, although x64 builds do not support inline assembly, intrinsic functions remain available there.

It also seems that the compiler, and especially the optimizer, handles them more easily than inline assembly, so they are likely to perform better as well.

The added intrinsics are listed below:

SSE
CVTSI2SS – Converts a 64-bit signed integer to a floating point value and inserts it into a 128-bit parameter. Intrinsics: _mm_cvtsi64_ss
CVTSS2SI – Extracts a 32-bit floating point value and rounds it to a 64-bit integer. Intrinsics: _mm_cvtss_si64
CVTTSS2SI – Extracts a 32-bit floating point value and truncates it to a 64-bit integer. Intrinsics: _mm_cvttss_si64

SSE2
CVTSD2SI – Extracts the lowest 64-bit floating point value and rounds it to an integer. Intrinsics: _mm_cvtsd_si64
CVTSI2SD – Extracts the lowest 64-bit integer and converts it to a floating point value. Intrinsics: _mm_cvtsi64_sd
CVTTSD2SI – Extracts a 64-bit floating point value and truncates it to a 64-bit integer. Intrinsics: _mm_cvttsd_si64
MOVNTI – Writes 64 bits to a specified memory location. Intrinsics: _mm_stream_si64
MOVQ – Moves a 64-bit integer either to or from a 128-bit parameter. Intrinsics: _mm_cvtsi64_si128, _mm_cvtsi128_si64
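
As a minimal sketch of how the 64-bit conversion intrinsics above might be used, the helpers below wrap _mm_cvtsd_si64, _mm_cvttsd_si64 and the _mm_cvtsi64_si128 / _mm_cvtsi128_si64 pair. The function names are hypothetical and chosen only for illustration. The code assumes an x64 target, where these intrinsics are available.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cassert>

// Round a double to a 64-bit integer using the current MXCSR rounding mode
// (round-to-nearest-even unless it has been changed). Maps to CVTSD2SI.
long long round_to_i64(double x) {
    assert(x == x && "NaN has no meaningful integer value");
    return _mm_cvtsd_si64(_mm_set_sd(x));
}

// Convert a double to a 64-bit integer by truncating toward zero.
// Maps to CVTTSD2SI.
long long truncate_to_i64(double x) {
    assert(x == x && "NaN has no meaningful integer value");
    return _mm_cvttsd_si64(_mm_set_sd(x));
}

// Move a 64-bit integer into a 128-bit register and back again
// (MOVQ in both directions). The value comes back unchanged.
long long roundtrip_through_xmm(long long v) {
    return _mm_cvtsi128_si64(_mm_cvtsi64_si128(v));
}
```

With the default rounding mode, round_to_i64(2.5) yields 2 and round_to_i64(3.5) yields 4, whereas truncate_to_i64(-2.7) yields -2.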

SSSE3
PABSB / PABSW / PABSD – Gets the absolute value of signed integers. Intrinsics: _mm_abs_epi8, _mm_abs_epi16, _mm_abs_epi32, _mm_abs_pi8, _mm_abs_pi16, _mm_abs_pi32
PALIGNR – Combines two parameters and right-shifts the result. Intrinsics: _mm_alignr_epi8, _mm_alignr_pi8
PHADDSW – Adds two parameters that contain 16-bit signed integers, saturating the result at the maximum value for 16 bits. Intrinsics: _mm_hadds_epi16, _mm_hadds_pi16
PHADDW / PHADDD – Adds two parameters that contain signed integers. Intrinsics: _mm_hadd_epi16, _mm_hadd_epi32, _mm_hadd_pi16, _mm_hadd_pi32
PHSUBSW – Subtracts two parameters that contain 16-bit signed integers, saturating the result at the maximum value for 16 bits. Intrinsics: _mm_hsubs_epi16, _mm_hsubs_pi16
PHSUBW / PHSUBD – Subtracts two parameters that contain signed integers. Intrinsics: _mm_hsub_epi16, _mm_hsub_epi32, _mm_hsub_pi16, _mm_hsub_pi32
PMADDUBSW – Multiplies and adds together 8-bit integers. Intrinsics: _mm_maddubs_epi16, _mm_maddubs_pi16
PMULHRSW – Multiplies 16-bit signed integers and right shifts the results. Intrinsics: _mm_mulhrs_epi16, _mm_mulhrs_pi16
PSHUFB – Selects and shuffles 8-bit chunks from a 128-bit parameter. Intrinsics: _mm_shuffle_epi8, _mm_shuffle_pi8
PSIGNB / PSIGNW / PSIGND – Negates, zeroes, or preserves signed integers. Intrinsics: _mm_sign_epi8, _mm_sign_epi16, _mm_sign_epi32, _mm_sign_pi8, _mm_sign_pi16, _mm_sign_pi32
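
To illustrate the PABSD and PHADDD entries above, here is a small sketch that sums the absolute values of four integers. The helper name sum_abs4 and the SKETCH_TARGET macro are assumptions made for this example. The macro only exists so that GCC and Clang can compile the function without extra command-line flags; under Visual C++ it expands to nothing.

```cpp
#include <tmmintrin.h>  // declares _mm_abs_epi32 and _mm_hadd_epi32
#include <cassert>
#include <cstdint>

// Hypothetical helper macro: enables the instruction set per function on
// GCC/Clang; Visual C++ needs no such annotation.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_TARGET(isa) __attribute__((target(isa)))
#else
#define SKETCH_TARGET(isa)
#endif

// Sum of |v[0]| + |v[1]| + |v[2]| + |v[3]|.
SKETCH_TARGET("ssse3")
int sum_abs4(const std::int32_t* v) {
    assert(v != nullptr);
    __m128i x = _mm_loadu_si128(reinterpret_cast<const __m128i*>(v));
    x = _mm_abs_epi32(x);      // PABSD: per-element absolute value
    x = _mm_hadd_epi32(x, x);  // PHADDD: [a0+a1, a2+a3, a0+a1, a2+a3]
    x = _mm_hadd_epi32(x, x);  // PHADDD again: every lane holds the total
    return _mm_cvtsi128_si32(x);  // read the lowest lane
}
```

The two horizontal adds are used here only to show the instruction; for a reduction this small, a plain loop may well be just as fast.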

SSE4A
EXTRQ – Extracts specified bits from the parameter. Intrinsics: _mm_extract_si64, _mm_extracti_si64
INSERTQ – Inserts specified bits into a given parameter. Intrinsics: _mm_insert_si64, _mm_inserti_si64
MOVNTSD / MOVNTSS – Writes bits directly to a specified memory location without polluting the caches. Intrinsics: _mm_stream_sd, _mm_stream_ss

SSE4.1
DPPD / DPPS – Calculates the dot product of two parameters. Intrinsics: _mm_dp_pd, _mm_dp_ps
EXTRACTPS – Extracts a specified 32-bit floating point value from the parameter. Intrinsics: _mm_extract_ps
INSERTPS – Inserts a 32-bit floating point value into a 128-bit parameter and potentially zeroes out some bits. Intrinsics: _mm_insert_ps
MOVNTDQA – Loads 128 bits of data from a specified memory location. Intrinsics: _mm_stream_load_si128
MPSADBW – Calculates eight offset sums of absolute differences. Intrinsics: _mm_mpsadbw_epu8
PACKUSDW – Converts 32-bit signed integers to unsigned 16-bit integers using unsigned saturation. Intrinsics: _mm_packus_epi32
PBLENDW / BLENDPD / BLENDPS / PBLENDVB / BLENDVPD / BLENDVPS – Blends two parameters together using various chunk sizes. Intrinsics: _mm_blend_epi16, _mm_blend_pd, _mm_blend_ps, _mm_blendv_epi8, _mm_blendv_pd, _mm_blendv_ps
PCMPEQQ – Compares 64-bit integers for equality. Intrinsics: _mm_cmpeq_epi64
PEXTRB / PEXTRW / PEXTRD / PEXTRQ – Extracts an integer from the input parameter. Intrinsics: _mm_extract_epi8, _mm_extract_epi16, _mm_extract_epi32, _mm_extract_epi64
PHMINPOSUW – Selects the minimum 16-bit unsigned integer and determines its index. Intrinsics: _mm_minpos_epu16
PINSRB / PINSRD / PINSRQ – Inserts an integer into a 128-bit parameter. Intrinsics: _mm_insert_epi8, _mm_insert_epi32, _mm_insert_epi64
PMAXSB / PMAXSD – Takes signed integers from two parameters and selects the maximum. Intrinsics: _mm_max_epi8, _mm_max_epi32
PMAXUW / PMAXUD – Takes unsigned integers from two parameters and selects the maximum. Intrinsics: _mm_max_epu16, _mm_max_epu32
PMINSB / PMINSD – Takes signed integers from two parameters and selects the minimum. Intrinsics: _mm_min_epi8, _mm_min_epi32
PMINUW / PMINUD – Takes unsigned integers from two parameters and selects the minimum. Intrinsics: _mm_min_epu16, _mm_min_epu32
PMOVSXBW / PMOVSXBD / PMOVSXBQ / PMOVSXWD / PMOVSXWQ / PMOVSXDQ – Converts signed integers of one size to a larger size. Intrinsics: _mm_cvtepi8_epi16, _mm_cvtepi8_epi32, _mm_cvtepi8_epi64, _mm_cvtepi16_epi32, _mm_cvtepi16_epi64, _mm_cvtepi32_epi64
PMOVZXBW / PMOVZXBD / PMOVZXBQ / PMOVZXWD / PMOVZXWQ / PMOVZXDQ – Converts unsigned integers of one size to a larger size. Intrinsics: _mm_cvtepu8_epi16, _mm_cvtepu8_epi32, _mm_cvtepu8_epi64, _mm_cvtepu16_epi32, _mm_cvtepu16_epi64, _mm_cvtepu32_epi64
PMULDQ – Multiplies 32-bit signed integers and stores the result as 64-bit signed integers. Intrinsics: _mm_mul_epi32
PMULLD – Multiplies 32-bit signed integers and stores the low 32 bits of each product. Intrinsics: _mm_mullo_epi32
PTEST – Calculates a bitwise test of two 128-bit parameters and returns a value based on the CF and ZF bits of the CC flags register. Intrinsics: _mm_testc_si128, _mm_testnzc_si128, _mm_testz_si128
ROUNDPD / ROUNDPS – Rounds floating point values. Intrinsics: _mm_ceil_pd, _mm_ceil_ps, _mm_floor_pd, _mm_floor_ps, _mm_round_pd, _mm_round_ps
ROUNDSD / ROUNDSS – Combines two parameters, rounding a floating point value from one of them. Intrinsics: _mm_ceil_sd, _mm_ceil_ss, _mm_floor_sd, _mm_floor_ss, _mm_round_sd, _mm_round_ss
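
The DPPS and ROUNDSD entries above can be sketched as follows. The helper names dot4 and floor_scalar, and the SKETCH_TARGET macro, are hypothetical and exist only for this illustration. The macro lets GCC and Clang build the functions without extra flags and expands to nothing under Visual C++.

```cpp
#include <smmintrin.h>  // SSE4.1 intrinsics
#include <cassert>

// Hypothetical helper macro: enables the instruction set per function on
// GCC/Clang; Visual C++ needs no such annotation.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_TARGET(isa) __attribute__((target(isa)))
#else
#define SKETCH_TARGET(isa)
#endif

// Dot product of two 4-element float vectors using DPPS.
SKETCH_TARGET("sse4.1")
float dot4(const float* a, const float* b) {
    assert(a != nullptr && b != nullptr);
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    // Mask 0xF1: the high nibble (F) multiplies all four lanes,
    // the low nibble (1) writes the sum into lane 0 only.
    __m128 r = _mm_dp_ps(va, vb, 0xF1);
    return _mm_cvtss_f32(r);
}

// Floor of a single double using ROUNDSD via _mm_floor_sd.
SKETCH_TARGET("sse4.1")
double floor_scalar(double x) {
    __m128d v = _mm_set_sd(x);
    // The low element of the second argument is rounded; the upper
    // element is copied from the first argument and ignored here.
    return _mm_cvtsd_f64(_mm_floor_sd(v, v));
}
```

Code like this should only run on processors that report SSE4.1 support, so a real program would normally check for it first.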

SSE4.2
CRC32 – Calculates the CRC-32C checksum of a parameter. Intrinsics: _mm_crc32_u8, _mm_crc32_u16, _mm_crc32_u32, _mm_crc32_u64
PCMPESTRI / PCMPESTRM – Compares two parameters of specified length. Intrinsics: _mm_cmpestra, _mm_cmpestrc, _mm_cmpestri, _mm_cmpestrm, _mm_cmpestro, _mm_cmpestrs, _mm_cmpestrz
PCMPGTQ – Compares 64-bit signed integers for greater-than. Intrinsics: _mm_cmpgt_epi64
PCMPISTRI / PCMPISTRM – Compares two string parameters of implicit (null-terminated) length. Intrinsics: _mm_cmpistra, _mm_cmpistrc, _mm_cmpistri, _mm_cmpistrm, _mm_cmpistro, _mm_cmpistrs, _mm_cmpistrz
POPCNT – Counts the number of bits set to 1. Intrinsics: _mm_popcnt_u32, _mm_popcnt_u64, __popcnt16, __popcnt, __popcnt64
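
As a rough sketch of the CRC32 and POPCNT entries above, the helpers below compute a byte-at-a-time CRC-32C and a bit count. The names crc32c and bits_set and the SKETCH_TARGET macro are assumptions for this example. The initial value and final inversion follow the usual CRC-32C convention, which the instruction itself does not apply.

```cpp
#include <nmmintrin.h>  // SSE4.2 intrinsics, including _mm_popcnt_u32
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper macro: enables the instruction set per function on
// GCC/Clang; Visual C++ needs no such annotation.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_TARGET(isa) __attribute__((target(isa)))
#else
#define SKETCH_TARGET(isa)
#endif

// CRC-32C (Castagnoli) of a buffer, one byte per CRC32 instruction.
SKETCH_TARGET("sse4.2")
std::uint32_t crc32c(const void* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    const unsigned char* p = static_cast<const unsigned char*>(data);
    unsigned int crc = 0xFFFFFFFFu;      // conventional initial value
    for (std::size_t i = 0; i < len; ++i) {
        crc = _mm_crc32_u8(crc, p[i]);   // fold in one byte
    }
    return crc ^ 0xFFFFFFFFu;            // conventional final inversion
}

// Number of bits set to 1 in a 32-bit value.
SKETCH_TARGET("sse4.2,popcnt")
int bits_set(std::uint32_t x) {
    return _mm_popcnt_u32(x);
}
```

For large buffers, _mm_crc32_u32 or _mm_crc32_u64 would process more bytes per instruction; the byte-wise loop is kept here for clarity.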

Advanced Bit Manipulation
LZCNT – Counts the number of zeroes at the start of a parameter. Intrinsics: __lzcnt16, __lzcnt, __lzcnt64
POPCNT – Counts the number of bits set to 1. Intrinsics: _mm_popcnt_u32, _mm_popcnt_u64, __popcnt16, __popcnt, __popcnt64

Other new intrinsics
_InterlockedCompareExchange128 – Performs an atomic 128-bit compare-and-exchange.
_mm_castpd_ps / _mm_castpd_si128 / _mm_castps_pd / _mm_castps_si128 / _mm_castsi128_pd / _mm_castsi128_ps – Reinterprets between 32-bit floating point values (ps), 64-bit floating point values (pd), and integer data (si128) without changing any bits.
_mm_cvtsd_f64 – Extracts the lowest 64-bit floating point value from the parameter.
_mm_cvtss_f32 – Extracts a 32-bit floating point value.
__rdtscp – Generates RDTSCP. Writes TSC_AUX[31:0] to memory and returns the 64-bit Time Stamp Counter result.
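
The cast and extraction intrinsics in this last group generate no conversion; they only change how the compiler views the register. A small sketch, with the hypothetical helper names float_bits and first_lane, shows the idea:

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (also pulls in the SSE ones)
#include <cassert>
#include <cstdint>

// Return the raw IEEE-754 bit pattern of a float by reinterpreting the
// register as integer data; no numeric conversion takes place.
std::uint32_t float_bits(float x) {
    __m128 f = _mm_set_ss(x);          // x in lane 0, other lanes zero
    __m128i i = _mm_castps_si128(f);   // same bits, viewed as integers
    return static_cast<std::uint32_t>(_mm_cvtsi128_si32(i));
}

// Load four floats and return the lowest one with _mm_cvtss_f32.
float first_lane(const float* v) {
    assert(v != nullptr);
    return _mm_cvtss_f32(_mm_loadu_ps(v));
}
```

For instance, float_bits(1.0f) gives 0x3F800000, the usual single-precision encoding of 1.0.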