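Table of Contents

SIMD: Faster, Higher, Stronger

Notes and organized reference material on Single Instruction, Multiple Data (SIMD) technology and its practical use.

Introduction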

SIMD

Streaming SIMD Extensions (SSE)

Advanced Vector Extensions (AVX)

NEON

Neon is the implementation of Arm’s Advanced SIMD architecture.

The purpose of Neon is to accelerate data manipulation by providing:

  • Thirty-two 128-bit vector registers (on AArch64; AArch32 Neon provides sixteen 128-bit registers), each capable of containing multiple lanes of data.
  • SIMD instructions to operate simultaneously on those multiple lanes of data.
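
To make the idea of "lanes" concrete, here is a minimal scalar model in C of a 128-bit register viewed as four 32-bit float lanes. The names `vec128_f32`, `lane_count`, and `vec_add_f32` are hypothetical and exist only for illustration; they are not part of any Arm API, and the loop only imitates what a single SIMD add instruction does across all lanes at once.

```c
#include <stddef.h>

/* Hypothetical model of one 128-bit register holding four 32-bit float lanes. */
typedef struct {
    float lane[4];
} vec128_f32;

/* Number of lanes a register of register_bits holds for elements of element_bits. */
size_t lane_count(size_t register_bits, size_t element_bits)
{
    return register_bits / element_bits;
}

/* Scalar imitation of a SIMD add: every lane is added independently.
 * Real SIMD hardware performs these four additions with one instruction. */
vec128_f32 vec_add_f32(vec128_f32 a, vec128_f32 b)
{
    vec128_f32 r;
    for (size_t i = 0; i < 4; ++i) {
        r.lane[i] = a.lane[i] + b.lane[i];
    }
    return r;
}
```

With this model, a 128-bit register holds 16 lanes of 8-bit data, 8 lanes of 16-bit data, 4 lanes of 32-bit data, or 2 lanes of 64-bit data.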

Applications that can benefit from Neon technology include multimedia and signal processing, 3D graphics, speech, image processing, and other applications where fixed-point and floating-point performance is critical.

As a programmer, there are a number of ways you can make use of Neon technology:

  • Neon-enabled open source libraries such as the Arm Compute Library provide one of the easiest ways to take advantage of Neon.
  • Auto-vectorization features in your compiler can automatically optimize your code to take advantage of Neon.
  • Neon intrinsics are function calls that the compiler replaces with appropriate Neon instructions. This gives you direct, low-level access to the exact Neon instructions you want, all from C or C++ code.
  • For very high performance, hand-coded Neon assembler can be the best approach for experienced programmers.

In short, the four approaches, from lowest level to highest level, are: assembler, intrinsics, automatic vectorization, and Neon-optimized libraries.
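
As an illustration of the intrinsics approach, the following is a minimal sketch of element-wise float addition. The function name `add_f32` is hypothetical. The intrinsics `vld1q_f32`, `vaddq_f32`, and `vst1q_f32` come from `<arm_neon.h>`; the `__ARM_NEON` guard is an assumption made here so that the same file still compiles, using only the scalar loop, on targets without Neon.

```c
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for n floats. */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;

#if defined(__ARM_NEON)
    /* Process four 32-bit floats per iteration in one 128-bit register. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);      /* load 4 lanes from a */
        float32x4_t vb = vld1q_f32(b + i);      /* load 4 lanes from b */
        vst1q_f32(dst + i, vaddq_f32(va, vb));  /* add lane-wise and store */
    }
#endif

    /* Scalar tail for leftover elements; full fallback on non-Neon targets. */
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}
```

A plain loop like the scalar tail is also the kind of code a compiler's auto-vectorizer may turn into Neon instructions when optimization is enabled, which is why it is often worth measuring the auto-vectorized version before writing intrinsics by hand.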

Practice