Better delta semantics for capability changes #182

Merged
pengfei-threemoonslab merged 3 commits into main from codex/better-delta-semantics
Jun 6, 2026

Conversation

@pengfei-threemoonslab

Contributor

Summary

  • Add a shared semantic capability delta engine for scope/resource, effect, risk, authority, controls, schema, and evidence-only changes.
  • Extend public capability_change members in report schema v0.23 with semantic metadata while preserving the existing buckets and release gate.
  • Reuse the same classifier in experimental capability-lock diffs, keeping lock exports compatible while bumping the diff artifact to v0.2.
  • Refresh schemas, docs, sample JSON goldens, and tests for the v0.23 report contract.
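
As a rough illustration of the classification described above, here is a minimal sketch of a delta classifier. It is not the engine added in this PR: the category names come from the first summary bullet, while `CATEGORY_BY_PREFIX`, `SemanticChange`, `classify_changes`, and the dotted field names are hypothetical and chosen only for the example.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumption: each flattened capability field starts with a prefix that
# identifies its semantic category. Anything unrecognized is treated as an
# evidence-only change.
CATEGORY_BY_PREFIX = {
    "scope": "scope",
    "resource": "scope",
    "effect": "effect",
    "risk": "risk",
    "authority": "authority",
    "controls": "controls",
    "schema": "schema",
}


@dataclass(frozen=True)
class SemanticChange:
    field: str
    category: str
    before: object
    after: object


def classify_changes(before: dict, after: dict) -> list[SemanticChange]:
    """Compare two flattened capability records and label each changed field."""
    changes = []
    for field in sorted(set(before) | set(after)):
        old, new = before.get(field), after.get(field)
        if old == new:
            continue  # unchanged fields produce no semantic change
        prefix = field.split(".", 1)[0]
        category = CATEGORY_BY_PREFIX.get(prefix, "evidence_only")
        changes.append(SemanticChange(field, category, old, new))
    return changes
```

Keeping one classifier like this and calling it from both the report path and the lock-diff path is one way to keep the two outputs from disagreeing about what kind of change occurred.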

Validation

  • python -m pytest tests/test_capability_domain.py tests/test_capability_delta.py tests/test_capability_lock.py tests/test_verifier_blocks.py tests/test_cross_block_consistency.py tests/test_schema_boundaries.py tests/test_public_surface_contract.py -q
  • python -m pytest tests/test_action_scope_domain.py tests/test_action_surface_diff.py tests/test_verifier_scenarios.py tests/test_github_action_outputs.py -q
  • python -m pytest tests/test_reports.py tests/test_docs_links.py -q
  • python scripts/generate_schemas.py --check
  • ruff check .
  • AGENTS_SHIPGATE_AGENT_MODE=1 PYTHONPATH=src python -m agents_shipgate verify --workspace . --config shipgate.yaml --ci-mode advisory --format json

Shipgate verifier

The verifier reported release_decision.decision=review_required and merge_verdict=human_review_required because this PR edits release trust-root agent instruction files (AGENTS.md, skills/agents-shipgate/SKILL.md). No release blockers were reported in advisory mode.
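
The trust-root rule behind this result can be sketched as follows. This is a hypothetical illustration, not the verifier's code: the two paths and the decision strings are taken from this PR, but `TRUST_ROOT_PATHS`, `trust_root_review`, the `reason_paths` key, and the shape of the returned dictionary are assumptions.

```python
from __future__ import annotations

# Trust-root paths named in this PR; a real configuration may list more.
TRUST_ROOT_PATHS = frozenset({"AGENTS.md", "skills/agents-shipgate/SKILL.md"})


def trust_root_review(changed_paths: list[str]) -> dict | None:
    """Return a review-required result if any trust-root file was edited.

    Returns None when no trust root is touched; this sketch makes no claim
    about what the real verifier reports in that case.
    """
    touched = sorted(set(changed_paths) & TRUST_ROOT_PATHS)
    if not touched:
        return None
    return {
        "release_decision": {"decision": "review_required"},
        "merge_verdict": "human_review_required",
        "reason_paths": touched,  # hypothetical field for illustration
    }
```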

Contributor Author

Addressed the follow-up review notes in 6601231:

  • Hoisted the duplicated effect rank into shared core/action_semantics.py and updated both action-surface escalation detection and capability semantic classification to use it.
  • Added capability_semantic_change_sort_key and reused it in the delta engine, report model normalization, and _dedup_members merge path.
  • Simplified _semantic_subject to the intended public-report contract: semantic report rows remain action-scoped and field-level scope/control/schema detail lives in semantic_changes[]; added assertions for that behavior.
  • Trimmed sample golden diffs so they only carry the intended report_schema_version: "0.23" bump.
  • Confirmed idempotency=False semantics: it remains visible as controls.safeguard_idempotency=False, but does not become positive effect.idempotency_known evidence; added a test and comment.
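
The following sketch illustrates three of the ideas in the notes above: a single shared effect rank, a single sort key, and the `idempotency=False` rule. The effect names, their ordering, the dictionary-based change rows, and the helper names are hypothetical; the real definitions live in core/action_semantics.py and the delta engine and may differ. Emitting `effect.idempotency_known` when `idempotency` is `True` is also an assumption.

```python
from __future__ import annotations

# Hypothetical effect names and ordering, defined once so every caller
# compares effects the same way.
EFFECT_RANK = {"read": 0, "write": 1, "delete": 2}


def effect_rank(effect: str) -> int:
    # Unknown effects rank highest so they are never treated as harmless.
    return EFFECT_RANK.get(effect, len(EFFECT_RANK))


def is_escalation(before: str, after: str) -> bool:
    """True when the new effect ranks strictly higher than the old one."""
    return effect_rank(after) > effect_rank(before)


def semantic_change_sort_key(change: dict) -> tuple:
    # One key reused by every code path keeps output ordering deterministic.
    return (change.get("category", ""), change.get("field", ""))


def idempotency_fields(idempotency: bool | None) -> dict:
    """Map a declared idempotency flag onto report fields.

    False stays visible as a control value but is not positive evidence
    that the effect's idempotency is known.
    """
    fields = {}
    if idempotency is not None:
        fields["controls.safeguard_idempotency"] = idempotency
    if idempotency is True:
        fields["effect.idempotency_known"] = True  # assumption, see lead-in
    return fields
```

Sharing the rank and the sort key means that action-surface escalation detection and semantic classification cannot drift apart, which is the motivation given for hoisting them.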

Validation rerun:

  • python -m pytest tests/test_capability_delta.py tests/test_verifier_blocks.py tests/test_action_surface_diff.py tests/test_reports.py -q
  • ruff check .
  • python -m pytest tests/test_capability_domain.py tests/test_capability_delta.py tests/test_capability_lock.py tests/test_verifier_blocks.py tests/test_cross_block_consistency.py tests/test_schema_boundaries.py tests/test_public_surface_contract.py -q
  • python -m pytest tests/test_action_scope_domain.py tests/test_action_surface_diff.py tests/test_verifier_scenarios.py tests/test_github_action_outputs.py -q
  • python scripts/generate_schemas.py --check
  • AGENTS_SHIPGATE_AGENT_MODE=1 PYTHONPATH=src python -m agents_shipgate verify --workspace . --config shipgate.yaml --base origin/main --head HEAD --ci-mode advisory --format json

Shipgate result remains advisory exit 0 with release_decision.decision=review_required / merge_verdict=human_review_required because the PR still edits trust-root instruction files.

@pengfei-threemoonslab changed the title from "[codex] Better delta semantics for capability changes" to "Better delta semantics for capability changes" on Jun 6, 2026
@pengfei-threemoonslab marked this pull request as ready for review on June 6, 2026 05:08

Contributor Author

Fixed the failing CI test job in ef4ce77.

Root cause: the checked-in Claude skill source had been updated for report schema v0.23, but the bundled adoption-kits/claude-code-skill/SKILL.md renderer source still emitted the v0.22 report contract. test_claude_code_skill_source_matches_renderer caught that drift.

Fix:

  • Updated the Claude adoption-kit SKILL source to the v0.23 report contract text.
  • Updated the intentional Claude render hash snapshot.
  • Added the previous SKILL render hash to the adoption-kit metadata prior hash list.
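
The kind of drift check described in the root cause and fix can be sketched like this. It is a hypothetical illustration, not the repository's test: the use of SHA-256, the function names, and the message strings are assumptions.

```python
from __future__ import annotations

import hashlib


def render_hash(text: str) -> str:
    # Assumption: the snapshot stores a SHA-256 hex digest of the rendered text.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def check_render(
    rendered: str,
    checked_in: str,
    current_hash: str,
    prior_hashes: list[str],
) -> list[str]:
    """Return a list of drift problems; an empty list means everything agrees."""
    problems = []
    if rendered != checked_in:
        problems.append("checked-in skill source differs from renderer output")
    digest = render_hash(rendered)
    if digest != current_hash:
        if digest in prior_hashes:
            # The renderer still emits an older, previously recorded render.
            problems.append("renderer output matches a prior hash; renderer source is stale")
        else:
            problems.append("render hash changed without a snapshot update")
    return problems
```

Recording the previous hash in a prior-hash list lets a check distinguish a stale renderer from an unreviewed new render, which is why the fix updates both the snapshot and the list.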

Validation:

  • PYTHONPATH=src python -m pytest tests/test_agent_instructions_renderers.py::test_claude_code_skill_source_matches_renderer tests/test_agent_instructions_renderers.py::test_claude_code_skill_render_hashes_change_intentionally -q
  • PYTHONPATH=src python -m pytest tests/test_agent_instructions_renderers.py -q
  • ruff check .
  • python scripts/generate_schemas.py --check
  • PYTHONPATH=src python -m pytest -n auto -m "not perf" --ignore=tests/test_adapter_static_only.py --cov=agents_shipgate --cov-report=term-missing --cov-fail-under=85
  • AGENTS_SHIPGATE_AGENT_MODE=1 PYTHONPATH=src python -m agents_shipgate verify --workspace . --config shipgate.yaml --base origin/main --head HEAD --ci-mode advisory --format json

Shipgate remains advisory exit 0 with release_decision.decision=review_required / merge_verdict=human_review_required because the PR edits agent-instruction trust roots.

@pengfei-threemoonslab merged commit 379638d into main on Jun 6, 2026
2 checks passed
@pengfei-threemoonslab deleted the codex/better-delta-semantics branch on June 6, 2026 05:25

1 participant