feat(hermes_governance): opt-in skill-budget governance plugin by rinadelph · Pull Request #374 · mpfaffenberger/code_puppy

rinadelph · 2026-05-31T18:26:22Z

Summary

Adds an opt-in hermes_governance plugin that ports a "Hermes-style" governance loop into Code Puppy: a skill-budget gate on tool calls, nudges that steer the agent toward creating/reusing skills, and background curation of stale agent-created skills. Implemented entirely through callback hooks — no core files are modified.

Opinionated by design / disabled by default. Enforcement is fully opt-in and a no-op unless explicitly armed. Opening as a draft for discussion on whether this belongs in core vs. as an external plugin.

What's included

code_puppy/plugins/hermes_governance/:

enforcer.py — pre_tool_call/post_tool_call gate that enforces a skill budget and emits nudges.
budget.py — the budget primitive (onboarding budget → expanded budget after first skill use).
carrier.py / carrier_processor.py — state rides in the conversation via wrap_pydantic_agent; reset per run on agent_run_start.
curator.py — session_end background curation that archives stale agent-created skills.
nudges.py — system-reminder injection on user_prompt_submit.
skill_manage.py — registers a skill_manage tool for on-demand skill lifecycle ops.
config.py — configuration surface (see below).
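To make the budget.py description concrete, here is a minimal sketch of a two-stage budget. It is not the plugin's actual code: the class name `SkillBudget`, its fields, and the choice to count one unit per tool call are all assumptions made for illustration. Only the idea (a small onboarding budget that expands after the first skill use) and the two config key names come from this PR.

```python
from dataclasses import dataclass


@dataclass
class SkillBudget:
    """Hypothetical budget primitive; names and semantics are assumptions."""

    onboarding_budget: int = 5  # value from the /set example, not a confirmed default
    max_budget: int = 90  # value from the /set example, not a confirmed default
    calls_used: int = 0
    skill_used: bool = False

    @property
    def limit(self) -> int:
        # Until a skill has been used, only the onboarding budget applies.
        return self.max_budget if self.skill_used else self.onboarding_budget

    def record_skill_use(self) -> None:
        # The first skill use unlocks the expanded budget.
        self.skill_used = True

    def try_spend(self) -> bool:
        """Count one tool call if budget remains; return False when exhausted."""
        if self.calls_used >= self.limit:
            return False
        self.calls_used += 1
        return True
```

Under these assumptions, a pre_tool_call gate would call `try_spend()` and block the tool call when it returns False, while skill use would call `record_skill_use()`.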

Control surface

No standalone slash command. Governance is controlled entirely through puppy.cfg keys (auto-exposed in /set tab-completion):

/set hermes_governance_enabled=true          # arm the gate + nudges
/set hermes_governance_enabled=false         # disarm (default)
/set hermes_governance_onboarding_budget=5
/set hermes_governance_max_budget=90

Design

Plugin-only; no edits to core. All hooks fail gracefully and are no-ops while disarmed.
Every referenced hook (pre_tool_call, post_tool_call, wrap_pydantic_agent, agent_run_start, session_end, user_prompt_submit, register_tools) is part of the documented callback surface.
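The "fail gracefully, no-op while disarmed" property could be enforced with a small wrapper around each hook. The sketch below is illustrative only: the `guarded` decorator, the `settings` dict, and the synchronous hook signature are assumptions, not the plugin's or Code Puppy's actual API.

```python
import functools
import logging
from typing import Any, Callable

logger = logging.getLogger("hermes_governance_sketch")


def guarded(is_enabled: Callable[[], bool]) -> Callable:
    """Wrap a hook so it does nothing while disarmed and never raises."""

    def decorator(hook: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(hook)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            if not is_enabled():
                return None  # disarmed: skip the hook body entirely
            try:
                return hook(*args, **kwargs)
            except Exception:
                # Log and swallow so a plugin bug cannot break the agent run.
                logger.exception("hook %s failed; ignoring", hook.__name__)
                return None

        return wrapper

    return decorator


# Hypothetical stand-in for the puppy.cfg lookup.
settings = {"hermes_governance_enabled": False}


@guarded(lambda: settings["hermes_governance_enabled"])
def on_pre_tool_call(tool_name: str) -> str:
    if tool_name == "boom":
        raise RuntimeError("simulated plugin bug")
    return f"checked {tool_name}"
```

With the flag left at false the wrapped hook returns None without running; once the flag is set to true it runs, and any exception is logged rather than propagated.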

Testing

ruff check / ruff format clean.
All files under the project's 600-line cap.

Risk

Low when disabled (default). When enabled, it intentionally gates tool calls — that's the feature. Happy to adjust scope/behaviour based on review.

Plugin porting Hermes' governance loop into Code Puppy via callback hooks. No core edits, no standalone slash command; controlled entirely through puppy.cfg keys (auto-exposed in /set tab-completion):

/set hermes_governance_enabled=true
/set hermes_governance_onboarding_budget=5
/set hermes_governance_max_budget=90

- pre/post_tool_call enforcer gates tool use against a skill budget and emits nudges to push the agent toward creating/using skills.
- State rides in the conversation via a carrier processor (wrap_pydantic_agent), reset per run on agent_run_start.
- session_end runs Hermes-style background curation that archives stale agent-created skills.
- Registers a skill_manage tool for on-demand skill lifecycle ops.

Every hook fails gracefully and is a no-op while enforcement is disarmed.

The outgoing wire body's tools array was cp_-prefixed, but historical tool_use blocks in messages[*].content[*] were not: pydantic_patches strips the prefix from call.tool_name in place, and that mutation persists into _message_history. The mismatch can wedge follow-up turns once history accumulates. This change prefixes both the live tools catalog and every tool_use block in the messages array. tool_result blocks reference their call by tool_use_id (not by name), so they are left untouched. Adds three regression tests covering history-only prefixing, mixed catalog+history, and idempotence when already prefixed.
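A minimal sketch of the prefixing step, assuming an Anthropic Messages-style request body with a `tools` list and `messages[*].content[*]` blocks. The function name `prefix_tool_names` and the copy-then-mutate approach are assumptions for illustration, not the code in this commit.

```python
from copy import deepcopy
from typing import Any

PREFIX = "cp_"


def _prefixed(name: str) -> str:
    # Idempotent: names that already carry the prefix are returned unchanged.
    return name if name.startswith(PREFIX) else PREFIX + name


def prefix_tool_names(body: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the body with tool names prefixed in catalog and history."""
    out = deepcopy(body)
    for tool in out.get("tools", []):
        if "name" in tool:
            tool["name"] = _prefixed(tool["name"])
    for message in out.get("messages", []):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain string content cannot hold tool_use blocks
        for block in content:
            # tool_result blocks are keyed by tool_use_id, so only tool_use changes.
            if isinstance(block, dict) and block.get("type") == "tool_use":
                block["name"] = _prefixed(block["name"])
    return out
```

One caveat of this startswith-based idempotence check: a tool whose unprefixed name already begins with cp_ would be left as-is, so the sketch assumes no such tool exists.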

Appending the carrier as a standalone ModelRequest produced two consecutive user messages on the wire (the real prompt + the <<<HERMES_GOVERNANCE_STATE>>> blob). Claude Code OAuth's endpoint silently stalls on consecutive user turns instead of erroring, which hung the agent on the very first call when both plugins were armed. write_state now merges the carrier as an extra UserPromptPart on the last existing ModelRequest, falling back to a standalone message only when no user turn exists yet. _strip_carriers already supported this inline layout, so the original design accommodates it — we just weren't taking advantage on write. Adds six regression tests pinning the new layout contract: merge-into-last, mid-conversation merge, fallback-when-empty, no-duplicate-on-rewrite, find_state-reads-inline-carrier, and strip-preserves-real-content.
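The layout contract described above can be sketched as follows. The `ModelRequest`, `ModelResponse`, and `UserPromptPart` classes here are simplified stand-in dataclasses, not the pydantic_ai types, and this `write_state` and `_strip_carriers` are illustrative re-creations rather than the plugin's functions; only the marker string and the merge-or-fallback behaviour come from the commit message.

```python
from dataclasses import dataclass, field

MARKER = "<<<HERMES_GOVERNANCE_STATE>>>"


@dataclass
class UserPromptPart:  # stand-in, not pydantic_ai's class
    content: str


@dataclass
class ModelRequest:  # stand-in: one user-role message on the wire
    parts: list = field(default_factory=list)


@dataclass
class ModelResponse:  # stand-in: one assistant-role message
    parts: list = field(default_factory=list)


def _is_carrier(part) -> bool:
    return isinstance(part, UserPromptPart) and part.content.startswith(MARKER)


def _strip_carriers(history: list) -> None:
    """Remove carrier parts in place, dropping requests that held only a carrier."""
    kept = []
    for msg in history:
        if isinstance(msg, ModelRequest):
            had_parts = bool(msg.parts)
            msg.parts = [p for p in msg.parts if not _is_carrier(p)]
            if had_parts and not msg.parts:
                continue
        kept.append(msg)
    history[:] = kept


def write_state(history: list, payload: str) -> None:
    """Store state as an extra part on the last request; never duplicate it."""
    _strip_carriers(history)
    carrier = UserPromptPart(MARKER + payload)
    for msg in reversed(history):
        if isinstance(msg, ModelRequest):
            msg.parts.append(carrier)  # merge: no second consecutive user turn
            return
    history.append(ModelRequest(parts=[carrier]))  # fallback: no user turn yet
```

Merging into the existing request keeps the number of user-role messages unchanged, which is what avoids the consecutive-user-turn stall described in the commit message.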

rinadelph marked this pull request as ready for review May 31, 2026 18:51

rinadelph force-pushed the feat/hermes-governance branch from 53b3eeb to 7d843fa Compare May 31, 2026 21:23

rinadelph force-pushed the feat/hermes-governance branch from 7d843fa to c2f01d7 Compare May 31, 2026 21:36

mpfaffenberger added 3 commits May 31, 2026 19:02

chore: ruff --fix unused MagicMock import in test_run_stats

a397a9f

rinadelph wants to merge 4 commits into mpfaffenberger:main from rinadelph:feat/hermes-governance
