Vision (CRO)
The maestro's orchestration (the 'what to do') could itself be a skill, written as a tree that the maestro interprets, rather than hardcoded Python: 'one skill explains to the maestro what to do'. This is recursive with the skill-tree (#458): the maestro skill is just another node, so the same harvest -> inject -> curate machinery improves how the loop orchestrates, not only how it does a ticket. Self-improvement at two levels.
The boundary that makes it safe (non-negotiable)
Two layers, kept separate:
Mechanism (stays deterministic Python): side-effecting primitives (spawn worker, run critic, git/gh/sqlite) AND the safety gates (risk:high -> park, critic -> block merge, spend caps, merge-gate). These are NEVER interpreted. The loop is money-safe because they are hardcoded.
Policy (becomes an interpreted skill tree): the soft judgment, i.e. next-ticket selection, triage promotion order, repair-vs-new priority, frontier-complete/next-arc. This is the part that 'a skill explains to the maestro'.
The result is a thin interpreter over hard primitives, driven by an editable and evolvable policy skill. forge-loop is already halfway there: the brainstormer is LLM policy; the tick is deterministic mechanism. This epic pushes more policy into declarative, learned skills WITHOUT touching mechanism/gates.
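The two-layer split can be sketched as follows. This is a minimal illustration, not forge-loop's actual code: `Ticket`, `merge_gate`, `default_policy`, and `tick` are hypothetical names, and the "highest priority first" rule is an assumed stand-in for whatever the tick does today. The point it shows is structural: the gate takes no policy argument, so a policy can only influence which ticket is picked, never the verdict.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Ticket:
    id: int
    risk: str            # hypothetical labels: "low" or "high"
    critic_passed: bool
    priority: int


# Mechanism: a hardcoded gate. It never receives a policy or skill as input.
def merge_gate(ticket: Ticket) -> str:
    if ticket.risk == "high":
        return "park"
    if not ticket.critic_passed:
        return "block"
    return "merge"


# Policy: soft judgment, expressed as a swappable function.
Policy = Callable[[Sequence[Ticket]], Optional[Ticket]]


def default_policy(tickets: Sequence[Ticket]) -> Optional[Ticket]:
    # Assumed default rule: pick the highest-priority ticket.
    return max(tickets, key=lambda t: t.priority, default=None)


def tick(tickets: Sequence[Ticket],
         policy: Policy = default_policy) -> Optional[Tuple[int, str]]:
    chosen = policy(tickets)          # soft decision, replaceable
    if chosen is None:
        return None
    return chosen.id, merge_gate(chosen)  # hard decision, always applied
```

With tickets `Ticket(1, "low", True, 1)` and `Ticket(2, "high", True, 5)`, the default policy selects ticket 2 and the gate parks it. Swapping in a different policy changes the selection only.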
Safe incremental arc (each slice falsifiable, gates untouched)
S1 (zero behavior risk): extract the CURRENT hardcoded tick policy into an explicit, declarative maestro-policy skill (a tree/document). This is the literal 'maestro as a skill', written down. No runtime change yet; the tick still executes Python. Adversarial check: on a fixed fixture, the policy doc round-trips to the same decisions the tick makes.
S2: introduce a policy-interpreter seam for ONE soft decision (next-ticket selection). The tick consults the policy skill, with the hardcoded rule as fallback; prove parity and measure. Gates stay hard.
Invariant for every slice: removing or garbling a policy skill degrades to the hardcoded default, and a safety gate is NEVER read from a skill (enforced by a test showing that a tampered policy cannot bypass the merge/risk gate).
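The S2 seam and the invariant above could look roughly like this. Everything here is an assumption made for illustration: the JSON policy shape (`next_ticket.order_by`), the ordering vocabulary in `KEY_FUNCS`, the "repairs first, then lowest id" default, and the function names. The sketch shows a policy consulted for one soft decision, a fallback to the hardcoded rule when the policy is missing or garbled, and a gate that ignores the policy entirely.

```python
import json
from typing import Any, Dict, List, Optional

Ticket = Dict[str, Any]

# Hypothetical vocabulary of ordering keys the interpreter understands.
KEY_FUNCS = {
    "repair_first": lambda t: 0 if t.get("repair") else 1,
    "lowest_id": lambda t: t["id"],
    "highest_priority": lambda t: -t.get("priority", 0),
}


def hardcoded_next_ticket(tickets: List[Ticket]) -> Optional[Ticket]:
    # Assumed stand-in for the existing rule: repairs first, then lowest id.
    repairs = [t for t in tickets if t.get("repair")]
    return min(repairs or tickets, key=lambda t: t["id"], default=None)


def parse_order(raw_policy: Optional[str]) -> Optional[List[str]]:
    """Return a validated ordering, or None if the policy is unusable."""
    try:
        order = json.loads(raw_policy)["next_ticket"]["order_by"]
    except (TypeError, ValueError, KeyError):
        return None
    if not isinstance(order, list) or not order:
        return None
    if not all(isinstance(n, str) and n in KEY_FUNCS for n in order):
        return None
    return order


def select_next_ticket(tickets: List[Ticket],
                       raw_policy: Optional[str]) -> Optional[Ticket]:
    order = parse_order(raw_policy)
    if order is None:                      # missing or garbled: degrade
        return hardcoded_next_ticket(tickets)
    if not tickets:
        return None
    return min(tickets, key=lambda t: tuple(KEY_FUNCS[n](t) for n in order))


def may_merge(ticket: Ticket) -> bool:
    # Hard gate: takes no policy input, so no policy field can switch it off.
    return ticket.get("risk") != "high" and ticket.get("critic_passed") is True
```

A parity test would run `select_next_ticket` with a policy equivalent to the default, and `hardcoded_next_ticket`, over the same fixed fixture and assert identical picks. A tamper test would feed a policy carrying extra fields such as `"merge_gate": "off"` and assert that `may_merge` still rejects a high-risk ticket.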
Relation to #458
Reuses the skill-card store, tags, supersession, and harvest/inject/curate. Adds a 'kind' or area namespace to distinguish orchestration-policy skills from repo-procedure skills. Built by hand and carefully (this is the loop's own brain; the self-loop stays stopped), to the adversarial bar, not as a big-bang rewrite.
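One way the 'kind' namespace might be modelled is sketched below. The field names (`kind`, `tags`, `superseded_by`) and the `SkillCard` shape are assumptions made for illustration; the real skill-card schema from #458 is not shown in this issue and may differ.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Optional, Tuple


class SkillKind(Enum):
    REPO_PROCEDURE = "repo-procedure"
    ORCHESTRATION_POLICY = "orchestration-policy"


@dataclass(frozen=True)
class SkillCard:
    id: str
    kind: SkillKind
    tags: Tuple[str, ...] = ()
    superseded_by: Optional[str] = None   # id of the card replacing this one


def active_cards(cards: Iterable[SkillCard], kind: SkillKind) -> List[SkillCard]:
    """Cards of one kind that have not been superseded."""
    return [c for c in cards if c.kind is kind and c.superseded_by is None]
```

Keeping the kind as an explicit field means the maestro only ever loads `ORCHESTRATION_POLICY` cards as policy, while ticket workers continue to receive `REPO_PROCEDURE` cards, with the same store and supersession rules serving both.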