Auto-vectorization in AArch64.

- November 23, 2022

Auto-vectorization:

Hi everyone, in this post I am writing about the concept of automatic vectorization (auto-vectorization), a compiler optimization that can reduce the number of cycles a program spends in loops. Instead of a scalar implementation that handles one element at a time, the code is converted to perform vector operations, where a single instruction operates on multiple operands. The GCC compiler is advanced enough to vectorize suitable loops on its own to optimize the code and boost performance. To learn more about auto-vectorization please visit this link.
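To make the idea concrete, here is a minimal sketch of the kind of loop that is a good candidate for auto-vectorization. The function name add_arrays and its parameters are my own, made up for illustration. Every iteration is independent, and the restrict qualifiers promise the compiler that the arrays do not overlap, which usually makes it easier for the vectorizer to do its job.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example function (not from any library).
 * A plain scalar loop: one addition per iteration as written.
 * With vectorization enabled, the compiler may instead add
 * several elements with a single vector instruction. */
void add_arrays(int32_t *restrict dst, const int32_t *restrict a,
                const int32_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```

The source stays scalar; whether the loop actually gets vectorized depends on the compiler, the flags, and the target.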

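Auto-vectorization can be enabled with optimization flags such as -O3 or -ftree-vectorize (in GCC, -O3 already turns on -ftree-vectorize). There are three vector extensions to AArch64: SIMD (Single Instruction, Multiple Data, which Arm calls Advanced SIMD), SVE, and SVE2.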

SIMD:
Single Instruction, Multiple Data is a type of parallel processing. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously. To learn more about SIMD please visit this link.
We can build the program using the following command for an Armv8 system:

gcc -g -O3 -c -march=armv8-a ...
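As a rough mental model of what "the same operation on multiple data points" means, the sketch below rewrites an element-wise add by hand in groups of four. The function name add_arrays_lanes and the LANES constant are hypothetical. Four lanes is an assumption that matches a 128-bit Advanced SIMD register holding 32-bit integers. This is plain C, not real vector code, and it is not necessarily what GCC generates.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: a 128-bit vector holds four 32-bit integers. */
#define LANES 4

/* Hypothetical model of a vectorized loop, written in plain C. */
void add_arrays_lanes(int32_t *dst, const int32_t *a,
                      const int32_t *b, size_t n)
{
    size_t i = 0;

    /* Main body: each outer iteration stands in for ONE vector add
     * that processes LANES elements at once. */
    for (; i + LANES <= n; i += LANES) {
        for (size_t lane = 0; lane < LANES; lane++) {
            dst[i + lane] = a[i + lane] + b[i + lane];
        }
    }

    /* Scalar tail: leftover elements when n is not a multiple of LANES. */
    for (; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```

With a fixed vector width, the leftover elements need a separate scalar tail loop like the one at the end.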

SVE & SVE2:

SVE stands for Scalable Vector Extension. It was introduced as an optional extension to Armv8.2-A, and Scalable Vector Extension 2 (SVE2) followed as part of Armv9-A. Both provide a variable-width SIMD capability: the vector length is chosen by the hardware implementation (from 128 to 2048 bits) instead of being fixed in the instruction set. SVE2 is a superset of SVE, and the main difference between SVE2 and SVE is the functional coverage of the instruction set. SVE and SVE2 both enable the collection and processing of a large amount of data. To learn more about SVE and SVE2 visit this link.
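To illustrate what variable width means for a loop, here is a small emulation in plain C. It is only a sketch of the idea. The name add_arrays_vla, the vl parameter and the MAX_LANES bound are hypothetical, and real SVE code uses predicate registers and instructions such as WHILELT rather than a bool array. The point is that the same code works for any vector length, and the predicate switches off the lanes that fall past the end of the array, so no separate tail loop is needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bound for this sketch only: 2048 bits / 32-bit elements = 64 lanes. */
#define MAX_LANES 64

/* Hypothetical emulation of a vector-length-agnostic loop.
 * vl is the number of 32-bit lanes the "hardware" provides; on a real
 * SVE system it is not known when the program is compiled. */
void add_arrays_vla(int32_t *dst, const int32_t *a, const int32_t *b,
                    size_t n, size_t vl)
{
    bool active[MAX_LANES];

    if (vl == 0 || vl > MAX_LANES) {
        return; /* unsupported lane count in this sketch */
    }

    for (size_t i = 0; i < n; i += vl) {
        /* Build the predicate: a lane is active while i + lane < n. */
        for (size_t lane = 0; lane < vl; lane++) {
            active[lane] = (i + lane) < n;
        }
        /* One "vector add": only the active lanes load, add and store. */
        for (size_t lane = 0; lane < vl; lane++) {
            if (active[lane]) {
                dst[i + lane] = a[i + lane] + b[i + lane];
            }
        }
    }
}
```

Calling it with vl set to 4 or to 8 should give the same result, which is the property that lets one SVE binary run on machines with different vector lengths.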

To use SVE we can use the following command on an AArch64 system:

gcc -g -O3 -c -march=armv8-a+sve ...

To use SVE2 we can use the following command.

gcc -g -O3 -c -march=armv8-a+sve2 ...

Some older systems do not support SVE or SVE2. We can still build programs with SVE on them and then run those programs under an emulator; use the following command to use the emulator - qemu-aarch64.
