Add SVE2 BgrToYuv420pV2 optimization#423

Merged
ermig1979 merged 2 commits into dev from cursor/sve2-bgr-yuv420p-v2-3523
Jun 13, 2026
@ermig1979 ermig1979 commented Jun 13, 2026

Summary

  • Add SVE2 implementation for BgrToYuv420pV2.
  • Route SimdBgrToYuv420pV2 through SVE2 when available.
  • Add SVE2 coverage to BgrToYuv420pV2AutoTest, VS project entries, and release notes.
  • Fix SVE2 chroma averaging to use adjacent byte lanes for 2x2 UV samples.
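
For readers unfamiliar with the 2x2 UV sampling mentioned in the last bullet, here is a minimal scalar sketch of BGR to YUV420p conversion. It is not the SVE2 code from this PR and not the library's Base implementation: the `sketch` namespace, the function name, and the BT.601 limited-range integer weights are all assumptions chosen for illustration. It only shows the idea that each U and V sample is derived from the four BGR pixels of one 2x2 block, that is, two horizontally adjacent pixels on each of two consecutive rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch // hypothetical namespace, not part of the Simd library
{
    // Assumed BT.601 limited-range integer weights (8-bit fixed point).
    inline uint8_t ToY(int b, int g, int r)
    {
        return uint8_t(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
    }

    // The 128 offset is folded in before the shift so the shifted value is never negative.
    inline uint8_t ToU(int b, int g, int r)
    {
        return uint8_t((-38 * r - 74 * g + 112 * b + 128 + (128 << 8)) >> 8);
    }

    inline uint8_t ToV(int b, int g, int r)
    {
        return uint8_t((112 * r - 94 * g - 18 * b + 128 + (128 << 8)) >> 8);
    }

    // Scalar reference: full-resolution Y plane, one U and one V sample per 2x2 block.
    void BgrToYuv420p(const uint8_t* bgr, size_t bgrStride, size_t width, size_t height,
        uint8_t* y, size_t yStride, uint8_t* u, size_t uStride, uint8_t* v, size_t vStride)
    {
        assert(width % 2 == 0 && height % 2 == 0); // this sketch handles even sizes only
        for (size_t row = 0; row < height; row += 2)
        {
            const uint8_t* bgr0 = bgr + row * bgrStride; // upper row of the block
            const uint8_t* bgr1 = bgr0 + bgrStride;      // lower row of the block
            uint8_t* y0 = y + row * yStride;
            uint8_t* y1 = y0 + yStride;
            uint8_t* uRow = u + (row / 2) * uStride;
            uint8_t* vRow = v + (row / 2) * vStride;
            for (size_t col = 0; col < width; col += 2)
            {
                int sum[3] = { 0, 0, 0 }; // accumulated B, G, R over the 2x2 block
                for (size_t dx = 0; dx < 2; ++dx)
                {
                    const uint8_t* p0 = bgr0 + (col + dx) * 3;
                    const uint8_t* p1 = bgr1 + (col + dx) * 3;
                    y0[col + dx] = ToY(p0[0], p0[1], p0[2]);
                    y1[col + dx] = ToY(p1[0], p1[1], p1[2]);
                    for (int c = 0; c < 3; ++c)
                        sum[c] += p0[c] + p1[c];
                }
                // Rounded average of the four pixels, then a single chroma pair.
                int b = (sum[0] + 2) >> 2, g = (sum[1] + 2) >> 2, r = (sum[2] + 2) >> 2;
                uRow[col / 2] = ToU(b, g, r);
                vRow[col / 2] = ToV(b, g, r);
            }
        }
    }
}
```

A vectorized version has to pair up the same neighbours: if the averaging step combines lanes that are not horizontally adjacent bytes of the same channel, the Y plane is still correct but the chroma planes are not. A scalar reference of this kind is a convenient baseline for catching that sort of mismatch.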

Validation

  • cmake ./prj/cmake -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=g++ -DCMAKE_C_COMPILER=gcc -DSIMD_TOOLCHAIN="g++" -DSIMD_TARGET="" -DSIMD_AVX512VNNI=ON -DSIMD_AMXBF16=ON -DSIMD_TEST_FLAGS="-march=native" -DSIMD_SHARED=ON && cmake --build build --parallel $(nproc)
  • export LD_LIBRARY_PATH="$(pwd):$LD_LIBRARY_PATH" && ./Test "-r=.." -fi=BgrToYuv420pV2 -tt=1 -ts=1
  • QEMU AArch64/SVE2 standalone Base-vs-SVE2 comparison for 640x480: errors=0
  • QEMU AArch64/SVE2 matrix comparison across tails and all YUV types: cases=148 errors=0
  • aarch64-linux-gnu-g++ -fsyntax-only -std=c++17 -march=armv9-a+sve2 -msve-vector-bits=scalable -I./src ./src/Simd/SimdSve2BgrToYuvV2.cpp
  • aarch64-linux-gnu-g++ -fsyntax-only -std=c++17 -march=armv9-a+sve2 -msve-vector-bits=scalable -I./src ./src/Simd/SimdLib.cpp
  • aarch64-linux-gnu-g++ -fsyntax-only -std=c++17 -march=armv9-a+sve2 -msve-vector-bits=scalable -I./src ./src/Test/TestAnyToYuv.cpp
cursoragent and others added 2 commits June 13, 2026 16:09
Co-authored-by: Ihar Yermalayeu <ermig1979@gmail.com>
Co-authored-by: Ihar Yermalayeu <ermig1979@gmail.com>
@ermig1979 ermig1979 marked this pull request as ready for review June 13, 2026 16:36
@ermig1979 ermig1979 merged commit 885910a into dev Jun 13, 2026
1 check passed
2 participants