Skip to content

actionGrammar: cursor-based wildcard alternatives + shortestWildcard option#2248

Open
curtisman wants to merge 7 commits intomicrosoft:mainfrom
curtisman:wild
Open

actionGrammar: cursor-based wildcard alternatives + shortestWildcard option#2248
curtisman wants to merge 7 commits intomicrosoft:mainfrom
curtisman:wild

Conversation

@curtisman
Copy link
Copy Markdown
Member

Summary

Wildcard-matching refactor in actionGrammar: replaces the previous "clone-into-pending" approach for wildcard-length alternatives with a per-state linked-list of resumable wildcard frames. Adds a shortestWildcard option for callers that don't need every ambiguous wildcard placement enumerated, and opts the cache into it.

What changed

actionGrammar

  • Cursor-based wildcard alternatives. Longer-wildcard alternatives are now stored on the live state as a linked-list (MatchState.wildcardFrames) and restored in place via tryExtendWildcard. Eliminates the isWildcardExpansion / ruleIndex LIFO pruning hack from the old design.
  • Single-owner frame invariant, enforced at the type level.
    • PendingMatchState is the base shape (work-list element type and snapshot payload).
    • MatchState = PendingMatchState & { wildcardFrames? } — only the live state has frames.
    • forkMatchState(state) returns PendingMatchState, so spawn sites (optional-skip, nested-rule alternatives, repeat continuation) cannot leak frame ownership into siblings.
    • pending: PendingMatchState[] — the work-list element type makes a future spawn site that uses a raw spread a compile error.
  • Snapshot completeness. captureSnapshot() is the single source of truth for both wildcard frames and spawn/snapshot helpers; the strict SnapshotState mapped type (with -?) makes a missing field a compile error. Regression spec covers fields assigned after a snapshot is taken (wildcardFrameSnapshot.spec.ts).
  • shortestWildcard option (GrammarMatchOptions / GrammarCompletionOptions):
    • Matcher: stop enumerating wildcard alternatives after the first success.
    • Completion: stop only when an exact-match parse is found (matched && cleanFinalize); partial / dirty attempts keep extending so completion candidates remain available even when no full parse exists.

cache

  • Opt into shortestWildcard: true in GrammarStoreImpl partial-prefix completion path (only the shortest exact-match parse is needed to drive prefix / afterWildcard detection).

Tests (~1,250 lines)

  • shortestWildcard.spec.ts — option semantics across many existing match scenarios.
  • grammarCompletionShortestWildcard.spec.ts — completion-side coverage of the option.
  • wildcardFrameSnapshot.spec.ts — regression cases for snapshot completeness.
  • wildcardFrameInteractions.spec.ts — default-mode parity coverage of the per-state frame mechanism.
  • wildcardFrameSpawnInvariant.spec.ts (new in latest commit) — exercises every spawn path and asserts spawned states arrive frame-free.

Verification

  • pnpm --filter action-grammar test — 2,276 tests pass.
  • pnpm --filter agent-cache test — 197 tests pass.

Risk

Low. The refactor is isolated to actionGrammar internals plus the cache opt-in. Default-mode behavior (matcher enumerates every wildcard alternative) is unchanged for all existing callers; shortestWildcard is purely opt-in.

Replace the clone-and-push wildcard alternative mechanism with a
per-state stack of WildcardFrame snapshots:

- captureWildcard pushes a snapshot onto state.wildcardFrames before
  consuming the shortest viable wildcard length.
- tryExtendWildcard restores the most recent frame in place; called
  by matchStateOrExtend on failure and by matchGrammar to enumerate
  alternatives in default mode.
- Removed isWildcardExpansion, MatchState.ruleIndex, and the LIFO
  cross-rule pruning block that depended on processing order.
- grammarCompletion mirrors matchGrammar's extend-on-success/failure
  loop in place rather than draining frames into pending, keeping
  pending growth proportional to genuinely independent paths.
- Added GrammarCompletionOptions.shortestWildcard mirroring the
  matcher option for use cases that don't need ambiguous wildcard
  enumeration.

Adds completion-path coverage for shortestWildcard.
Cache only needs the shortest-wildcard match per state to surface
completion candidates; longer-wildcard alternatives produce
spurious completions for keywords already present inside the input.
Passing shortestWildcard:true keeps pending-stack growth and
candidate enumeration proportional to genuinely independent paths.
pushWildcardFrame now normalizes optional MatchState fields (values,
pendingWildcard, parent, inRepeat, lastMatchedPartInfo) to be own
properties on the snapshot — even when undefined.  Without this, a
plain spread omits inherited-undefined fields and Object.assign in
tryExtendWildcard leaves stale post-snapshot state behind (e.g. a
parent set when entering a nested rule after the snapshot was
taken), which can over-decrement nestedLevel during finalizeNestedRule
and corrupt subsequent matches.

Also store wildcard frames as a persistent linked list (head + prev)
rather than a mutable array shared by sibling spread states.  Pushing
allocates a new head node and rebinds the owning state's pointer;
existing nodes are never mutated, so shallow MatchState copies (from
optional-skip and alternative-rules spreads) safely share the tail
without explicit copy-on-write.

Documentation updates:
- Tighten GrammarMatchOptions.shortestWildcard JSDoc to clarify it
  collapses only the wildcard-length axis; rule/optional/repeat
  enumeration is unchanged.
- Add a Mutation contract note on MatchState.wildcardFrames and an
  IMPORTANT note on tryExtendWildcard documenting that it swaps almost
  every field via Object.assign and that external snapshots must drop
  wildcardFrames.

Add wildcardFrameSnapshot.spec.ts with two regression cases that
crash with RangeError under the pre-fix snapshot (optional-skip and
nested-rule alternatives where the only valid parse requires
extending the wildcard after a nested-rule entry).
…iver

- Split type-only re-exports in index.ts (GrammarMatchResult,
  GrammarMatchOptions, GrammarCompletionResult).
- Replace ad-hoc spread-with-undefineds in pushWildcardFrame with an
  explicit per-field snapshot guarded by a Required<Omit<...>> mapped
  type, so adding a new MatchState field becomes a compile-time error.
- Introduce snapshotMatchState() helper that drops wildcardFrames at
  one chokepoint; grammarCompletion uses it instead of spreading.
- Replace private matchStateOrExtend / exported tryExtendWildcard with
  a single exported runStateWithExtensions generator shared by
  matchGrammar and collectCandidates.
- Add regression tests asserting unique results in default mode for
  multi-anchor two-wildcard grammars; document a known
  duplicate-emission case for nested-rule alternatives.
…n sites

Sibling MatchStates spawned for nested-rule alternatives, optional-skip,
and repeat continuation previously inherited the parent's wildcardFrames
head pointer via shallow spread.  Each sibling could then independently
extend the same frame and re-explore the same downstream traversal,
producing duplicate match results (regression test:
'wildcard before a nested-rule with alternatives').

Route every spawn site through snapshotMatchState, which drops
wildcardFrames on the forked copy.  The inline-continued state retains
ownership; siblings start with an empty chain.  Each (placement x
alternative) combination is still visited because the inline state's
extension re-spawns all alternatives at each new placement.

Upgrade the previously-tolerant test assertion from toContain to a
strict uniqueness + exact-match check.
- Extract captureSnapshot() as single source of truth for both
  pushWildcardFrame() and snapshot/fork helpers; SnapshotState mapped
  type with -? makes a missing field a compile error.
- Flip type hierarchy: PendingMatchState is the base shape;
  MatchState extends it with the mutating wildcardFrames chain.
- Introduce forkMatchState() for spawn sites (return type
  PendingMatchState statically excludes wildcardFrames, enforcing the
  single-owner invariant at the type level).  Drop the runtime check
  in matchGrammar.
- Thread PendingMatchState[] through matchState / finalizeNestedRule
  / initialMatchState so spawn sites cannot push frame-owning
  states.
- grammarCompletion: only break the wildcard extend-loop on
  exact-match success (matched && cleanFinalize); partial / dirty
  attempts keep extending so completion candidates remain available
  even when no full parse exists.
- Add wildcardFrameSpawnInvariant.spec.ts exercising the three spawn
  paths (optional-skip, nested-rule alternative, repeat
  continuation).
- cache/grammarStore: add intent comment on shortestWildcard opt-in.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant