fix(perf): resolve concrete EP for the analyzer instead of ep=None (#931) #941

Merged: xieofxie merged 1 commit into main from hualxie/perf_analyze_ep on Jun 23, 2026
@xieofxie

Contributor

Summary

Fixes #931.

In `winml perf` without `--ep`, the EP was never resolved before the build, so the static analyzer ran with `ep=None` and aggregated across all EPs (logging "analyze_onnx called with ep=None — results will aggregate all EPs"), even though inference runs on a single device's EP.

This resolves a concrete device + EP from the request and passes it down to the build:

  • PerfBenchmark resolves device + EP internally (_resolve_device_ep), at the start of _load_model so an unavailable/invalid combo fails fast before the export/optimize/quantize/compile pipeline runs (previously it only surfaced at session.compile()). BenchmarkConfig keeps only the raw request; the resolved values live on the instance and drive from_pretrained/from_onnx.
  • _perf_modules derives a concrete EP from the resolved device when none is given. Explicit EPs are kept verbatim (downstream stages normalize aliases).
  • The CLI perf() no longer pre-resolves — it just builds the config and dispatches.
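
The ordering described in the bullets above can be sketched as follows. This is a minimal, self-contained illustration and not the project's code: the class bodies, the `available` mapping, the `"cpu"` default, and the `build_ran` flag are all assumptions made for the example. Only the names `PerfBenchmark`, `BenchmarkConfig`, `_resolve_device_ep`, and `_load_model` come from the description.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class BenchmarkConfig:
    """Stand-in config: stores only the raw request, never resolved values."""
    device: Optional[str] = None
    ep: Optional[str] = None


class PerfBenchmark:
    """Stand-in benchmark that resolves device + EP before any build work."""

    def __init__(self, config: BenchmarkConfig, available: Dict[str, List[str]]):
        self.config = config
        self._available = available  # hypothetical: device name -> usable EPs
        self._resolved_device: Optional[str] = None
        self._resolved_ep: Optional[str] = None
        self.build_ran = False  # illustration only: records whether the build started

    def _resolve_device_ep(self) -> None:
        device = self.config.device or "cpu"  # assumed default device
        if device not in self._available:
            raise ValueError(f"device {device!r} is not available")
        eps = self._available[device]
        self._resolved_device = device
        # An explicit EP is kept verbatim; otherwise derive one from the device.
        self._resolved_ep = self.config.ep or (eps[0] if eps else None)

    def _load_model(self) -> str:
        self._resolve_device_ep()  # fail fast, before the expensive pipeline
        self.build_ran = True  # stands in for export/optimize/quantize/compile
        return f"built for {self._resolved_ep} on {self._resolved_device}"


bench = PerfBenchmark(BenchmarkConfig(), {"cpu": ["OpenVINOExecutionProvider"]})
print(bench._load_model())  # built for OpenVINOExecutionProvider on cpu
```

Because resolution runs first, a request for a device missing from `available` raises before `build_ran` is ever set, which is the fail-fast property the description calls out.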

WinMLAutoModel stays permissive: ep=None remains a valid library mode (aggregate across EPs). The fix makes perf — which targets one device — always pass an explicit EP, which is what the analyzer warning asked for.

Verification

End-to-end A/B on a real CPU build (--no-skip-build, no --ep):

|        | analyzer target                  | `ep=None` warning |
|--------|----------------------------------|-------------------|
| before | None on cpu                      | fires             |
| after  | OpenVINOExecutionProvider on cpu | gone              |

Unit tests (tests/unit/commands/test_perf_cli.py, test_perf_module.py, 51 passed) cover: derived concrete EP reaches from_onnx; explicit EP passes through verbatim; an unavailable combo raises before the build is kicked off.
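
For readers without the test files at hand, the three cases listed above might look roughly like the sketch below. It does not reproduce the project's tests: `run_perf`, its `resolve` and `build` parameters, and the `"openvino"` alias are hypothetical stand-ins, and `unittest.mock` is used only to observe what reaches the build step.

```python
from unittest import mock


def run_perf(ep, resolve, build):
    """Hypothetical stand-in for the perf flow: resolve first, then build."""
    resolved = ep if ep is not None else resolve()
    return build(ep=resolved)


def check_derived_ep_reaches_build():
    # No EP requested: the derived EP must be what the build receives.
    build = mock.Mock(return_value="session")
    run_perf(None, resolve=lambda: "OpenVINOExecutionProvider", build=build)
    build.assert_called_once_with(ep="OpenVINOExecutionProvider")


def check_explicit_ep_is_verbatim():
    # Explicit EP: passed through unchanged, and no derivation happens.
    build = mock.Mock()
    resolve = mock.Mock()
    run_perf("openvino", resolve=resolve, build=build)
    build.assert_called_once_with(ep="openvino")
    resolve.assert_not_called()


def check_unavailable_combo_fails_before_build():
    # Resolution failure: the error surfaces and the build is never started.
    build = mock.Mock()
    resolve = mock.Mock(side_effect=ValueError("no EP for device"))
    try:
        run_perf(None, resolve=resolve, build=build)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    build.assert_not_called()
```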

Follow-up

#939 tracks folding _perf_modules into PerfBenchmark so the two resolution sites become one.

In `winml perf` without `--ep`, the EP was never resolved before the build,
so the static analyzer ran with `ep=None` and aggregated across all EPs (and
logged "analyze_onnx called with ep=None — results will aggregate all EPs"),
even though inference runs on a single device's EP.

Resolve a concrete device + EP from the request and pass it down to the build:

- PerfBenchmark resolves device + EP internally (_resolve_device_ep), at the
  start of _load_model so an unavailable/invalid combo fails fast before the
  export/optimize/quantize/compile pipeline runs. BenchmarkConfig keeps only the
  raw request; the resolved values live on the instance and drive from_pretrained
  / from_onnx.
- _perf_modules derives a concrete EP from the resolved device when none is given
  (explicit EPs are kept verbatim; downstream stages normalize aliases).
- The CLI no longer pre-resolves: it just builds the config / dispatches.

Verified end-to-end on a CPU build: analyzer target goes from "None on cpu" to
"OpenVINOExecutionProvider on cpu" and the ep=None warning no longer fires.

Follow-up #939 tracks folding _perf_modules into PerfBenchmark to unify the two
resolution sites.

@xieofxie requested a review from a team as a code owner on June 23, 2026 06:06

@DingmaomaoBJTU left a comment

Collaborator

Clean fix. The EP resolution logic is correct, idempotent, and placed at the right point (before the expensive build pipeline). Tests cover all the important cases: derived EP reaches from_onnx, explicit EP passes verbatim, unavailable combo fails fast. One minor nit: when resolve_eps returns [], _resolved_ep silently stays None — a comment making that invariant explicit would help future readers.
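
One way to make the invariant from the nit explicit is sketched below. `derive_ep` is a hypothetical helper written for this illustration; the actual code in `perf.py` is not shown in this conversation, so the shape of the branch is an assumption based on the comment above.

```python
from typing import List, Optional


def derive_ep(explicit_ep: Optional[str], resolved_eps: List[str]) -> Optional[str]:
    """Hypothetical helper mirroring the branch the review comment describes."""
    if explicit_ep is not None:
        return explicit_ep  # explicit request is kept verbatim
    if resolved_eps:
        return resolved_eps[0]
    # Invariant: an empty resolution result leaves the EP as None. Per the PR
    # summary, ep=None is still a valid library mode (aggregate across EPs).
    return None
```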

Comment thread src/winml/modelkit/commands/perf.py
Comment thread src/winml/modelkit/commands/perf.py
@xieofxie merged commit 6977294 into main on Jun 23, 2026
9 checks passed
@xieofxie deleted the hualxie/perf_analyze_ep branch on June 23, 2026 06:48
Development

Successfully merging this pull request may close these issues.

bug: in perf, no ep means analyze all but actually only run one ep
