feat(hygon-gemm): add Hygon backend support for Add/Gemm #31

Open
gongchensu wants to merge 4 commits into InfiniTensor:master from gongchensu:feat/hygon-gemm

@gongchensu gongchensu commented Mar 24, 2026

Summary

  • Add Hygon backend infrastructure:
    • src/hygon/device_.h
    • src/hygon/runtime_.h
    • src/hygon/runtime_utils.h
    • src/hygon/blas.h
    • src/hygon/blas_utils.h
  • Add Hygon build integration through WITH_HYGON:
    • Update backend/device detection in CMakeLists.txt.
    • Register Hygon sources and backend wiring in src/CMakeLists.txt.
    • Preserve the existing CUDA-like backend mutual-exclusion behavior.
  • Add Hygon operator implementations:
    • Add in src/hygon/add/kernel.h.
    • Gemm in src/hygon/gemm/cublas.h.
  • Update Python binding device-name resolution in src/pybind11_utils.h so CUDA-compatible PyTorch device names can map to the active InfiniOps backend, including Hygon.
  • Update Hygon-facing example and documentation hooks:
    • examples/CMakeLists.txt
    • examples/runtime_api.h
    • README.md
  • Update .gitignore for generated local build artifacts.

Motivation

This PR introduces initial Hygon backend support to InfiniOps.

Hygon DCU platforms expose a CUDA/HIP-compatible programming model, so InfiniOps can reuse the existing CUDA-style operator organization while adding Hygon-specific runtime, device, BLAS, and build-system integration. This enables InfiniOps to compile and dispatch selected operators on Hygon hardware without changing the public operator API.

The first supported operators are Add and Gemm, which provide a minimal but useful backend foundation:

  • Add validates the basic Hygon kernel dispatch path.
  • Gemm validates the Hygon BLAS integration path.
  • The shared Hygon runtime/device utilities prepare the backend for additional operators in follow-up PRs.
  • The Python binding update allows user-facing PyTorch device names to resolve to the active compiled backend, including Hygon.
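
For a concrete picture of what the Gemm path is expected to compute, here is a minimal CPU reference, assuming the standard BLAS definition C = alpha * A * B + beta * C on row-major, non-transposed matrices. The function name `ReferenceGemm` and the row-major layout are assumptions made for this sketch. It is not taken from this PR, and the project's tests may compare against a different reference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not part of this PR):
//   C = alpha * A * B + beta * C
// A is m x k, B is k x n, C is m x n; all row-major and non-transposed.
inline void ReferenceGemm(std::size_t m, std::size_t n, std::size_t k,
                          float alpha, const std::vector<float>& a,
                          const std::vector<float>& b, float beta,
                          std::vector<float>& c) {
  // Reject buffers whose sizes do not match the stated shapes.
  assert(a.size() == m * k);
  assert(b.size() == k * n);
  assert(c.size() == m * n);

  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      // Dot product of row i of A with column j of B.
      float acc = 0.0f;
      for (std::size_t p = 0; p < k; ++p) {
        acc += a[i * k + p] * b[p * n + j];
      }
      c[i * n + j] = alpha * acc + beta * c[i * n + j];
    }
  }
}
```

A backend result for the same inputs could then be compared element by element against the output of this function, within a tolerance suited to the dtype.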

This PR is intentionally scoped to backend infrastructure plus Add and Gemm. Follow-up PRs can add more Hygon operator implementations on top of the same backend layer.

No linked issue.

Type of Change

  • feat — new feature / new operator / new platform
  • fix — bug fix
  • perf — performance improvement (no behavioral change)
  • refactor — code restructuring without behavior change
  • test — adding or fixing tests only
  • docs — documentation only
  • build / ci — build system or CI configuration
  • chore — tooling, formatting, or other non-code changes
  • Breaking change (requires a ! in the Conventional Commits prefix or a BREAKING CHANGE: footer)

Platforms Affected

  • CPU (WITH_CPU)
  • NVIDIA (WITH_NVIDIA)
  • Iluvatar (WITH_ILUVATAR)
  • MetaX (WITH_METAX)
  • Cambricon (WITH_CAMBRICON)
  • Moore (WITH_MOORE)
  • Ascend (WITH_ASCEND)
  • Hygon (WITH_HYGON)
  • PyTorch C++ bindings (WITH_TORCH)
  • Build system / CMake / CI
  • Python bindings / user-facing API

Test Results on Supported Platforms

Platform: Hygon
  • Built: TODO: paste build result, e.g. Successfully installed InfiniOps-0.1.0
  • pytest Result: TODO: paste result for tests/test_add.py and tests/test_gemm.py
  • Notes / Hardware: TODO: Hygon DCU model / driver / runtime version

Platform: clang-format
  • Built: N/A
  • pytest Result: Passed: git ls-files | rg '\.(h|cc|cuh|mlu)$' | xargs clang-format --dry-run --Werror
  • Notes / Hardware: Local clang-format version 21.1.2

Full `pytest` output (optional)
TODO: paste full or trimmed pytest output here.

Benchmark / Performance Impact

No benchmark numbers are included in this PR.

This PR adds initial Hygon backend support and validates functionality through Add and Gemm. It does not change existing CPU, NVIDIA, Iluvatar, MetaX, Cambricon, Moore, or Ascend implementations. Performance tuning for Hygon kernels and BLAS usage can be handled in follow-up PRs once the backend integration is established.

Notes for Reviewers

  • The Hygon backend is added as a CUDA-like backend and participates in the existing CUDA-like backend mutual-exclusion logic.
  • src/hygon/gemm/cublas.h intentionally follows the existing BLAS-backed GEMM structure used by CUDA-like platforms.
  • src/pybind11_utils.h now accepts backend-specific internal device names in addition to PyTorch device names, so a CUDA-compatible PyTorch device type can resolve to the enabled InfiniOps backend.
  • This PR is split into three reviewable commits:
    • feat(hygon): add Hygon backend infrastructure
    • feat(hygon-add): add Hygon backend support for Add
    • feat(hygon-gemm): add Hygon backend support for Gemm
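
The device-name resolution described in the notes above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in src/pybind11_utils.h: the `Backend` enum, the functions `InternalName` and `ResolveDeviceName`, and the internal name spellings such as "hygon" are all hypothetical, and the sketch assumes the CPU backend is always built.

```cpp
#include <string>

// Hypothetical backend tags for this sketch; the real InfiniOps types may differ.
enum class Backend { kUnknown, kCpu, kNvidia, kHygon };

// Hypothetical internal device name for each backend (spellings are assumed).
inline std::string InternalName(Backend backend) {
  switch (backend) {
    case Backend::kCpu:    return "cpu";
    case Backend::kNvidia: return "nvidia";
    case Backend::kHygon:  return "hygon";
    default:               return "";
  }
}

// Resolves a device name to a backend, given the single CUDA-like backend
// enabled at build time. Returns Backend::kUnknown when the name does not
// apply to this build.
inline Backend ResolveDeviceName(const std::string& name,
                                 Backend active_cuda_like) {
  if (name == "cpu") return Backend::kCpu;
  // A CUDA-compatible PyTorch device type maps to whichever CUDA-like
  // backend is active, which may be Hygon.
  if (name == "cuda") return active_cuda_like;
  // A backend-specific internal name is accepted only for the active backend.
  if (active_cuda_like != Backend::kUnknown &&
      name == InternalName(active_cuda_like)) {
    return active_cuda_like;
  }
  return Backend::kUnknown;
}
```

In a Hygon build of this sketch, both "cuda" and "hygon" resolve to the Hygon backend, while "nvidia" is rejected because that backend is not the active one.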

Checklist

Every contributor must verify every item below before requesting
review. Tick each box only after the check has actually been performed —
do not tick speculatively. If an item truly does not apply, replace the
checkbox with N/A and briefly explain why in an inline comment.

Title, Branch, and Commits

  • PR title follows Conventional Commits (e.g. feat(hygon): add Add and Gemm backend support).
  • Branch name follows <type>/xxx-yyyy-zzzz where <type> matches the PR title's Conventional Commits type and words are joined with hyphens (see CONTRIBUTING.md §Branches).
  • Each commit message follows Conventional Commits.
  • Small PR is a single squashable commit; or, for a large PR, every commit is meaningful, well-formed, and independently reviewable (see CONTRIBUTING.md §Pull Requests #1).
  • No stray merge commits from master — the branch is rebased cleanly on top of the current master.
  • No fixup! / squash! / wip commits remain.

Scope and Design

  • Changes are minimal: this PR only adds Hygon backend infrastructure plus Hygon Add and Gemm support (CONTRIBUTING.md §Code/General #1).
  • No dead code, commented-out blocks, debug prints, printf/std::cout/print(...) left behind, or TODO without an owner and issue link.
  • No unrelated formatting churn that would obscure the diff.
  • Public API changes are intentional, documented, and reflected in affected callers/tests.

General Code Hygiene (applies to all languages)

C++ Specific (if C++ files changed)

Python Specific (if Python files changed)

  • N/A — no Python source files were changed in this PR.
  • N/A — ruff check is not applicable to this PR because no Python source files were changed.
  • N/A — ruff format --check is not applicable to this PR because no Python source files were changed.
  • N/A — no Python comments or error messages were added.
  • N/A — no Python control-flow formatting was changed.
  • N/A — no Python docstrings were added or changed.
  • N/A — no Python type hints were added or changed.

Testing

  • pytest was run locally on Hygon hardware, and the results are recorded in the "Test Results" table above (CONTRIBUTING.md §Pull Requests #3).
  • For any platform that could not be tested, an explicit reason is given in the table and a reviewer with access has been tagged.
  • New functionality has matching coverage through existing tests under tests/, especially tests/test_add.py and tests/test_gemm.py (CONTRIBUTING.md §Adding an Operator #3).
  • Tests use pytest.mark.parametrize correctly through the existing project test patterns.
  • Where appropriate, existing pytest.mark.auto_act_and_assert coverage is reused and the test returns a Payload whose func and ref share the same calling convention.
  • Default dtype / device parameterization is relied on, or overridden with an explicit pytest.mark.parametrize when necessary.
  • N/A — no new flaky test was added.
  • N/A — this is a new backend feature, not a bug fix requiring a regression test that fails on master.

Build, CI, and Tooling

  • The project builds cleanly from a fresh directory with pip install .[dev] on Hygon.
  • compile_commands.json still regenerates (CMake option CMAKE_EXPORT_COMPILE_COMMANDS=ON in pyproject.toml — required by the code-lint skill and clang-tidy -p).
  • Hygon has been added to auto-detection in CMakeLists.txt under if(AUTO_DETECT_DEVICES) and to if(AUTO_DETECT_BACKENDS) where applicable.
  • Only one CUDA-like GPU backend is selectable at a time — the existing mutual-exclusion check in CMakeLists.txt is not broken.
  • clang-format.yml is green locally.
  • N/A — ruff.yml is not applicable to this PR because no Python source files were changed.
  • No new runtime dependency was added without updating pyproject.toml's [project.optional-dependencies].
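
The mutual-exclusion rule in the list above is enforced in CMakeLists.txt; the C++ sketch below only models that rule for illustration. This description does not list which backends count as CUDA-like, so the caller passes the relevant flags explicitly, and the names `CountEnabled` and `IsValidSelection` are hypothetical.

```cpp
#include <cstddef>
#include <initializer_list>

// Counts how many of the given build flags are enabled.
inline std::size_t CountEnabled(std::initializer_list<bool> flags) {
  std::size_t count = 0;
  for (bool flag : flags) {
    if (flag) ++count;
  }
  return count;
}

// A selection is valid when at most one CUDA-like backend is enabled.
// Zero is allowed so that a build without any CUDA-like backend still passes.
inline bool IsValidSelection(std::initializer_list<bool> cuda_like_flags) {
  return CountEnabled(cuda_like_flags) <= 1;
}
```

For example, enabling only the Hygon flag passes the check, while enabling it together with any other CUDA-like flag fails it.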

Documentation

  • README.md, examples, or inline docs were updated where Hygon behavior, build flags, or developer workflow changed.
  • New Hygon dispatch helpers and runtime utilities follow the existing backend layout and naming conventions.
  • N/A — no user-visible breaking change is introduced.

Security and Safety

  • No secrets, access tokens, internal URLs, customer data, or personal hardware identifiers have been committed.
  • Third-party code is license-compatible and attributed where applicable.
  • No unsafe pointer arithmetic, uninitialized reads, or missing bounds checks were intentionally introduced.

@gongchensu gongchensu self-assigned this Mar 24, 2026

gongchensu commented Mar 25, 2026

Hygon build and operator tests:
[image]

@gongchensu gongchensu force-pushed the feat/hygon-gemm branch 2 times, most recently from 9b9dda2 to e397d93 Compare March 26, 2026 09:03
@gongchensu gongchensu marked this pull request as draft April 13, 2026 08:19
@gongchensu gongchensu changed the base branch from feat/dev-infra to master April 27, 2026 06:27
@gongchensu gongchensu requested a review from baominghelly April 27, 2026 08:40
@gongchensu gongchensu force-pushed the feat/hygon-gemm branch 2 times, most recently from aa6ed00 to 8fd111a Compare April 28, 2026 02:25
@gongchensu gongchensu marked this pull request as ready for review April 28, 2026 02:44
@gongchensu gongchensu requested a review from a team April 28, 2026 02:44