feat(hygon-gemm): add Hygon backend support for Add/Gemm by gongchensu · Pull Request #31 · InfiniTensor/InfiniOps

gongchensu · 2026-03-24T01:50:34Z

Summary

- Add Hygon backend infrastructure:
  - src/hygon/device_.h
  - src/hygon/runtime_.h
  - src/hygon/runtime_utils.h
  - src/hygon/blas.h
  - src/hygon/blas_utils.h
- Add Hygon build integration through WITH_HYGON:
  - Update backend/device detection in CMakeLists.txt.
  - Register Hygon sources and backend wiring in src/CMakeLists.txt.
  - Preserve the existing CUDA-like backend mutual-exclusion behavior.
- Add Hygon operator implementations:
  - The Add operator in src/hygon/add/kernel.h.
  - The Gemm operator in src/hygon/gemm/cublas.h.
- Update Python binding device-name resolution in src/pybind11_utils.h so CUDA-compatible PyTorch device names can map to the active InfiniOps backend, including Hygon.
- Update Hygon-facing example and documentation hooks:
  - examples/CMakeLists.txt
  - examples/runtime_api.h
  - README.md
- Update .gitignore for generated local build artifacts.

Motivation

This PR introduces initial Hygon backend support to InfiniOps.

Hygon DCU platforms expose a CUDA/HIP-compatible programming model, so InfiniOps can reuse the existing CUDA-style operator organization while adding Hygon-specific runtime, device, BLAS, and build-system integration. This enables InfiniOps to compile and dispatch selected operators on Hygon hardware without changing the public operator API.

The first supported operators are Add and Gemm, which provide a minimal but useful backend foundation:

- Add validates the basic Hygon kernel dispatch path.
- Gemm validates the Hygon BLAS integration path.
- The shared Hygon runtime/device utilities prepare the backend for additional operators in follow-up PRs.
- The Python binding update allows user-facing PyTorch device names to resolve to the active compiled backend, including Hygon.

This PR is intentionally scoped to backend infrastructure plus Add and Gemm. Follow-up PRs can add more Hygon operator implementations on top of the same backend layer.

No linked issue.

Type of Change

feat — new feature / new operator / new platform
fix — bug fix
perf — performance improvement (no behavioral change)
refactor — code restructuring without behavior change
test — adding or fixing tests only
docs — documentation only
build / ci — build system or CI configuration
chore — tooling, formatting, or other non-code changes
Breaking change (requires a ! in the Conventional Commits prefix or a BREAKING CHANGE: footer)

Platforms Affected

Test Results on Supported Platforms

| Platform | Built | `pytest` Result | Notes / Hardware |
| --- | --- | --- | --- |
| Hygon | TODO: paste build result, e.g. `Successfully installed InfiniOps-0.1.0` | TODO: paste result for `tests/test_add.py` and `tests/test_gemm.py` | TODO: Hygon DCU model / driver / runtime version |
| clang-format | N/A | Passed: `git ls-files \| rg '\.(h\|cc\|cuh\|mlu)$' \| xargs clang-format --dry-run --Werror` | Local `clang-format` version `21.1.2` |

Full `pytest` output (optional)

TODO: paste full or trimmed pytest output here.

Benchmark / Performance Impact

No benchmark numbers are included in this PR.

This PR adds initial Hygon backend support and validates functionality through Add and Gemm. It does not change existing CPU, NVIDIA, Iluvatar, MetaX, Cambricon, Moore, or Ascend implementations. Performance tuning for Hygon kernels and BLAS usage can be handled in follow-up PRs once the backend integration is established.

Notes for Reviewers

The Hygon backend is added as a CUDA-like backend and participates in the existing CUDA-like backend mutual-exclusion logic.
src/hygon/gemm/cublas.h intentionally follows the existing BLAS-backed GEMM structure used by CUDA-like platforms.
src/pybind11_utils.h now accepts backend-specific internal device names in addition to PyTorch device names, so a CUDA-compatible PyTorch device type can resolve to the enabled InfiniOps backend.
This PR is split into three reviewable commits:
- feat(hygon): add Hygon backend infrastructure
- feat(hygon-add): add Hygon backend support for Add
- feat(hygon-gemm): add Hygon backend support for Gemm
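
The device-name resolution mentioned in the notes above can be pictured with a small sketch. It is written in Python for brevity, whereas the real logic lives in C++ in src/pybind11_utils.h. The function name `resolve_device`, the internal backend name strings, and the membership of `CUDA_LIKE_BACKENDS` are assumptions made for illustration and are not taken from the PR.

```python
# Hypothetical sketch of device-name resolution; all names are illustrative.

# Assumed internal names of the CUDA-like backends. Only one of them is
# expected to be compiled in at a time.
CUDA_LIKE_BACKENDS = {"nvidia", "iluvatar", "metax", "hygon"}


def resolve_device(name: str, active_backend: str) -> str:
    """Map a user-facing device name to an internal backend name."""
    # Strip an optional device index: "cuda:0" becomes "cuda".
    device_type = name.split(":", 1)[0].lower()

    if device_type == "cpu":
        return "cpu"
    # A CUDA-compatible PyTorch device type resolves to whichever
    # CUDA-like backend is active in this build.
    if device_type == "cuda" and active_backend in CUDA_LIKE_BACKENDS:
        return active_backend
    # Backend-specific internal names are accepted directly.
    if device_type == active_backend:
        return active_backend
    raise ValueError(f"Unsupported device name: `{name}`.")
```

Under these assumptions, both `resolve_device("cuda:0", "hygon")` and `resolve_device("hygon", "hygon")` select the Hygon backend, which is the behavior the note describes.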

Checklist

Every contributor must verify every item below before requesting
review. Tick each box only after the check has actually been performed —
do not tick speculatively. If an item truly does not apply, replace the
checkbox with N/A and briefly explain why in an inline comment.

Title, Branch, and Commits

PR title follows Conventional Commits (e.g. feat(hygon): add Add and Gemm backend support).
Branch name follows <type>/xxx-yyyy-zzzz where <type> matches the PR title's Conventional Commits type and words are joined with hyphens (see CONTRIBUTING.md §Branches).
Each commit message follows Conventional Commits.
Small PR is a single squashable commit; or, for a large PR, every commit is meaningful, well-formed, and independently reviewable (see CONTRIBUTING.md §Pull Requests #1).
No stray merge commits from master — the branch is rebased cleanly on top of the current master.
No fixup! / squash! / wip commits remain.

Scope and Design

Changes are minimal: this PR only adds Hygon backend infrastructure plus Hygon Add and Gemm support (CONTRIBUTING.md §Code/General #1).
No dead code, commented-out blocks, debug prints, printf/std::cout/print(...) left behind, or TODO without an owner and issue link.
No unrelated formatting churn that would obscure the diff.
Public API changes are intentional, documented, and reflected in affected callers/tests.

General Code Hygiene (applies to all languages)

The code is self-explanatory; comments were added only where the why is non-obvious (CONTRIBUTING.md §Code/General #2).
Every modified or added file ends with a single trailing newline (CONTRIBUTING.md §Code/General #3).
No trailing whitespace, tab/space mixing, or stray BOMs.
Identifiers in comments and error messages are wrapped in backticks where applicable (e.g. the `seqlens_k` tensor) (CONTRIBUTING.md §Code/General #4).
All comments and error messages are in English (CONTRIBUTING.md §Code/General #5).
Comments and error messages are complete sentences (capitalized first letter, terminal punctuation) unless the language/framework convention says otherwise (CONTRIBUTING.md §Code/General #6; Python #2).

C++ Specific (if C++ files changed)

Python Specific (if Python files changed)

N/A — no Python source files were changed in this PR.
N/A — ruff check is not applicable to this PR because no Python source files were changed.
N/A — ruff format --check is not applicable to this PR because no Python source files were changed.
N/A — no Python comments or error messages were added.
N/A — no Python control-flow formatting was changed.
N/A — no Python docstrings were added or changed.
N/A — no Python type hints were added or changed.

Testing

pytest was run locally on Hygon hardware, and the results are recorded in the "Test Results" table above (CONTRIBUTING.md §Pull Requests #3).
For any platform that could not be tested, an explicit reason is given in the table and a reviewer with access has been tagged.
New functionality has matching coverage through existing tests under tests/, especially tests/test_add.py and tests/test_gemm.py (CONTRIBUTING.md §Adding an Operator #3).
Tests use pytest.mark.parametrize correctly through the existing project test patterns.
Where appropriate, existing pytest.mark.auto_act_and_assert coverage is reused and the test returns a Payload whose func and ref share the same calling convention.
Default dtype / device parameterization is relied on, or overridden with an explicit pytest.mark.parametrize when necessary.
N/A — no new flaky test was added.
N/A — this is a new backend feature, not a bug fix requiring a regression test that fails on master.

Build, CI, and Tooling

The project builds cleanly from a fresh directory with pip install .[dev] on Hygon.
compile_commands.json still regenerates (CMake option CMAKE_EXPORT_COMPILE_COMMANDS=ON in pyproject.toml — required by the code-lint skill and clang-tidy -p).
Hygon has been added to auto-detection in CMakeLists.txt under if(AUTO_DETECT_DEVICES) and to if(AUTO_DETECT_BACKENDS) where applicable.
Only one CUDA-like GPU backend is selectable at a time — the existing mutual-exclusion check in CMakeLists.txt is not broken.
clang-format.yml is green locally.
N/A — ruff.yml is not applicable to this PR because no Python source files were changed.
No new runtime dependency was added without updating pyproject.toml's [project.optional-dependencies].
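
The mutual-exclusion rule for CUDA-like backends referenced in this section can be summarized with a short sketch. The real check is CMake logic in CMakeLists.txt; this Python version only models the rule. `WITH_HYGON` comes from this PR, while the other option names and the function name `select_cuda_like_backend` are assumptions made for illustration.

```python
# Hypothetical model of the mutual-exclusion rule; the real check is in CMake.
from typing import Dict, Optional

# WITH_HYGON is named in this PR; the remaining option names are assumed.
CUDA_LIKE_OPTIONS = ("WITH_NVIDIA", "WITH_ILUVATAR", "WITH_METAX", "WITH_HYGON")


def select_cuda_like_backend(options: Dict[str, bool]) -> Optional[str]:
    """Return the single enabled CUDA-like option, or None if none is enabled."""
    enabled = [name for name in CUDA_LIKE_OPTIONS if options.get(name, False)]
    if len(enabled) > 1:
        # Mirrors a configure-time failure when several backends are requested.
        raise ValueError(
            "Only one CUDA-like backend may be enabled, got: " + ", ".join(enabled)
        )
    return enabled[0] if enabled else None
```

With this model, enabling only `WITH_HYGON` selects Hygon, and enabling it together with another CUDA-like option is rejected rather than silently resolved.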

Documentation

README.md, examples, or inline docs were updated where Hygon behavior, build flags, or developer workflow changed.
New Hygon dispatch helpers and runtime utilities follow the existing backend layout and naming conventions.
N/A — no user-visible breaking change is introduced.

Security and Safety

No secrets, access tokens, internal URLs, customer data, or personal hardware identifiers have been committed.
Third-party code is license-compatible and attributed where applicable.
No unsafe pointer arithmetic, uninitialized reads, or missing bounds checks were intentionally introduced.

gongchensu · 2026-03-25T07:28:37Z

Hygon build and operator tests:

gongchensu self-assigned this Mar 24, 2026

gongchensu force-pushed the feat/hygon-gemm branch from a56e674 to 2290578 Compare March 25, 2026 06:51

gongchensu force-pushed the feat/hygon-gemm branch 2 times, most recently from 9b9dda2 to e397d93 Compare March 26, 2026 09:03

gongchensu marked this pull request as draft April 13, 2026 08:19

gongchensu force-pushed the feat/hygon-gemm branch from e397d93 to c8d8b56 Compare April 27, 2026 06:26

gongchensu changed the base branch from feat/dev-infra to master April 27, 2026 06:27

gongchensu force-pushed the feat/hygon-gemm branch from c8d8b56 to ceb9a4c Compare April 27, 2026 07:53

gongchensu requested a review from baominghelly April 27, 2026 08:40

gongchensu force-pushed the feat/hygon-gemm branch 2 times, most recently from aa6ed00 to 8fd111a Compare April 28, 2026 02:25

gongchensu marked this pull request as ready for review April 28, 2026 02:44

gongchensu requested a review from a team April 28, 2026 02:44

gongchensu added 4 commits April 28, 2026 11:15

feat(hygon): add Hygon backend infrastructure

13e9119

ci(hygon): add Hygon test runner config

593ca40

feat(hygon-add): add Hygon backend support for Add

5c7c351

feat(hygon-gemm): add Hygon backend support for Gemm

b1117d4

gongchensu force-pushed the feat/hygon-gemm branch from 8fd111a to b1117d4 Compare April 28, 2026 03:18

gongchensu wants to merge 4 commits into InfiniTensor:master from gongchensu:feat/hygon-gemm
