Skip to content

chore: type checking pattern folder#932

Open
xieofxie wants to merge 6 commits into
mainfrom
hualxie/type_pattern
Open

chore: type checking pattern folder#932
xieofxie wants to merge 6 commits into
mainfrom
hualxie/type_pattern

Conversation

@xieofxie

@xieofxie xieofxie commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Summary

Cleans the pattern folder (including op_input_gen) to 0 mypy errors under the strict [tool.mypy] config and adds winml.modelkit.pattern to the required "Type check (required)" gate in lint.yml (now 24 packages, 342 source files).

The cardinal preference throughout: narrow before you widen — fix at the source, not the call site. Two places deviate from that and instead carry a deliberate, documented choice. They're called out below because they're the non-obvious ones a reviewer should weigh in on.

Design decision 1 — object (not Any) value type for get_input_and_infinite_attribute_combinations

The op input generators return heterogeneous combination dicts:

def get_input_and_infinite_attribute_combinations(self) -> list[dict[str, object]]:
    return [
        {"data": InputShapeConstraint((1,)), "shape": InputValueConstraint(...)},
        {"input": InputShapeConstraint(shape), "axis": axis},  # int values too
        ...
    ]

Values are a mix of InputShapeConstraint, InputValueConstraint, int, None, etc. — there is no single common base. The choice was between dict[str, Any] and dict[str, object]; object was chosen (the more type-safe option):

  • object forces every consumer to narrow with isinstance/cast before dereferencing — mypy will not let you call .shape / .value on a bare object. That's the point: the dict is genuinely heterogeneous, so the read sites should be explicit about which variant they expect.
  • Any would silently disable checking at every read site — a value typed Any accepts any attribute access, so a real bug (wrong key, wrong variant) survives type-checking.

Cost of object (worth knowing): because dict is invariant in its value type, dict[str, object] is not interchangeable with dict[str, InputConstraint]. That forced explicit local annotations (combinations: list[dict[str, object]] = [...]) and isinstance/cast narrowing at the consumption sites across ~48 generator files. We accepted that churn in exchange for the consumers staying honest.

Why not a precise union — InputConstraint | int | None | …? InputConstraint is a real ABC base (covering InputValueConstraint / InputShapeConstraint / VariadicInputConstraint), so a union looks tempting. It loses to object for four reasons:

  1. The set isn't closed. Constraints are covered, but the infinite-attribute values are open — axis is int, other ops put float / bool / str / lists, and get_finite_attribute_sets is genuinely typed dict[str, list[Any]]. A correct union would have to enumerate every scalar any of ~48 generators ever inserts, and stay synced forever; miss one and that generator stops type-checking. object is the honest "this set is open."
  2. It buys nothing at the read sites — the dominant cost. Even with InputConstraint | int | None, mypy still rejects v.shape without narrowing (int / None have no .shape), so every consumer still needs isinstance / cast. The union only tightens the write side; the read side — where the bulk of the burden is — is identical to object.
  3. dict value-invariance makes a union interoperate with nothing. dict[str, InputConstraint | int | None] is its own invariant type — not assignable to dict[str, object] nor dict[str, InputConstraint]. object is at least the conventional sink general-purpose helpers are written against.
  4. A type A = … alias has a literal-inference catch. Under list[dict[str, A]], mypy checks each dict-literal value against A; the first generator with a value outside A errors at that literal, making the alias a perpetually-tightened bottleneck across 48 files. object accepts every literal with zero per-file coupling.

In short: if these dicts held only constraints, dict[str, InputConstraint] would be the correct tight type. They mix constraints with open-ended scalar attribute values, so the real choice is Any vs object — and object wins by forcing honest narrowing.

A Sequence[Mapping[str, object]] return (covariant, read-only) was also considered and is arguably cleaner, but it's a broader change to the consumer contracts; deferred to keep this PR a typing-only pass.

Design decision 2 — # type: ignore[misc] on the dynamic base class in gelu_patterns.py

class Gelu1PatternInputGenerator(PatternInputGenerator, get_runtime_checker_op("Gelu")):  # type: ignore[misc]  # dynamic base class (runtime-checker op)
    ...

get_runtime_checker_op("Gelu") is a runtime registry lookup that returns type[OpInputGenerator]. mypy cannot model a class whose base is a runtime call expression — it can't compute the MRO / inherited members / super() resolution at definition time. This is the documented "Unsupported dynamic base class" limitation; [misc] is the error code for it.

Casting does not help (verified):

Attempt Result
get_runtime_checker_op("Gelu") directly Unsupported dynamic base class [misc]
cast("type[OpInputGenerator]", ...) as base Unsupported dynamic base class "cast" [misc]
typed module var _Base: type[OpInputGenerator] = ... Variable "_Base" is not valid as a type + Invalid base class [misc]

The return type is already precisely type[OpInputGenerator] — there is nothing to cast. The [misc] is about using a runtime value as a base class at all, not about its type being unknown. cast() is itself a call expression, so it's rejected identically.

The only way to remove the ignore is to drop the dynamic base and inherit from the concrete class statically (from .op_input_gen.unary_like_input_generator import GeluInputGenerator). That was rejected on purpose: it couples the pattern module to a concrete class + import path and abandons the registry's late-binding and domain resolution (get_runtime_checker_op("Gelu", domain="com.microsoft") → a different class). The registry indirection is the intended architecture, so the narrow, single-line # type: ignore[misc] — with an inline reason — is the correct handling.

@xieofxie xieofxie marked this pull request as ready for review June 23, 2026 04:06
@xieofxie xieofxie requested a review from a team as a code owner June 23, 2026 04:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant