hadamrd · hadamrd · Jun 27, 2026 · Jun 27, 2026 · Jun 27, 2026
diff --git a/docs/design/81-skill-tree.md b/docs/design/81-skill-tree.md
@@ -0,0 +1,92 @@
+# Skill-Tree — learned procedural memory the maestro grows and retrieves
+
+Status: in implementation (branch `feat/skill-tree`). Epic: hadamrd/forge-loop#458.
+
+## Problem
+
+Every worker starts cold: it re-derives repo conventions, file maps, and test
+recipes on every ticket — burning tokens, making inconsistent choices, and
+repeating known mistakes. The biggest faster+cheaper+more-accurate lever is to
+**retrieve learned procedure instead of re-deriving it**.
+
+## What already exists (substrate — reuse, do not duplicate)
+
+- `memory/store.py` `SqliteMemoryStore` at `.forge/memory.db` — `put` / `get` /
+  `list_active(kind=)` / `supersede`.
+- `memory/models.py` `MemoryItem` (kind/title/body/provenance/tags/superseded_by),
+  `MemoryKind.PROCEDURAL`, `derive_skill_key(failing_signal, target)`,
+  `skill_tag` / `skill_from_tags`, tag-prefix pattern (`axis:`, `skill:`).
+- `runner/learning.py` `record_procedural_skill(...)` — writes a procedural card
+  and supersedes the prior card of the same skill-key (lineage preserved).
+- Episodic memory is already harvested on merge (`record_merged_outcomes` via
+  `runner/tick.py::_record_merged_memory`) and injected into briefs
+  (`worker_brief.py::_render_prior_episodes`).
+
+**Gap:** procedural skills are never harvested, never retrieved/injected, and
+have no tree structure.
+
+## Design
+
+### Tree structure (leaves + internal nodes)
+
+Each card carries an **area path** as a tag: `area:<path>` (e.g.
+`area:pulsar-node/http-route`). The `/`-delimited path IS the tree.
+
+- **Leaf** = a concrete procedure card (has both a `skill:<sig>` tag and an
+  `area:<path>` tag). Body schema (plain text sections): `trigger`, `procedure`
+  (file map + template + test recipe), `pitfalls`.
+- **Internal node** = a generalized card for an area subtree (has an
+  `area-node` marker tag + `area:<path>`), distilled from ≥N descendant leaves.
+
+New helpers in `memory/models.py` (mirror `axis_tag`): `AREA_TAG_PREFIX`,
+`area_tag(path)`, `area_from_tags(tags)`, `AREA_NODE_TAG`.
+
+### Lifecycle
+
+1. **Harvest** (`skill_librarian.py` + `runner/learning.py`): after a
+   critic-clean merge, a one-shot LLM "librarian" distills the merged diff +
+   acceptance criteria into a `SkillCard(area, trigger, procedure, pitfalls,
+   failing_signal, target, confidence)`. `harvest_skills_from_merge(...)` is pure
+   and injectable (the LLM call is a passed-in callable) so the wiring is unit
+   tested without network. It calls `record_procedural_skill` with the `area:`
+   tag and provenance carrying the merged SHA in `evidence_refs`. Emits
+   `skill_harvested`.
+2. **Retrieve + inject** (`memory/skills.py` + `worker_brief.py`):
+   `retrieve_skills_for(store, query, *, k)` ranks active procedural cards by
+   area/title/tag token overlap with the ticket, walking most-specific-leaf →
+   ancestor-node, capped at `k` (token budget). `render_skill_section(...)`
+   formats them; injected into `make_brief` and `make_repair_brief`. Emits
+   `skill_injected` (rank) per card — enables measuring the turns/tokens win.
+   v1 retrieval is deterministic keyword/area ranking; a clean seam is left for
+   semantic (Lumen) retrieval as a fast-follow (NOT in this PR).
+3. **Curate/evolve** (`memory/skills.py`): `expire_stale_skills(...)` supersedes
+   cards whose proof-SHA is absent from the current history / older than a
+   horizon; `promote_internal_nodes(store, *, min_leaves)` distills an
+   internal-node card once an area has ≥`min_leaves` leaves. Emits
+   `skill_expired` / `skill_promoted`.
+
+### Traps engineered around
+
+- **Stale > empty:** every card carries provenance SHA + confidence; expiry is
+  mandatory. Adversarial test: an expired card is not retrieved.
+- **Retrieval is the hard part:** ranking is the real work; honest v1 is
+  keyword/area, semantic is a seam.
+- **Contradiction sprawl:** one librarian writes; supersession (existing)
+  preserves lineage; no free-for-all.
+
+## Events (new, `events.py`)
+
+`SkillHarvestedEvent`, `SkillInjectedEvent`, `SkillPromotedEvent`,
+`SkillExpiredEvent`.
+
+## Testing
+
+Pure modules (models helpers, ranking, expiry, promotion, harvest-wiring with an
+injected fake librarian) are fully unit-tested against a real
+`SqliteMemoryStore` (`tmp_path`). Each behavior has an adversarial test: remove
+the harvest call / the inject / the expiry filter and a test goes red. The
+LLM-dependent distiller is isolated behind a callable seam and tested with a fake.
+
+## Out of scope (this PR)
+
+Semantic/Lumen retrieval (seam only); a UI surface; cross-repo skill sharing.
diff --git a/src/forge_loop/events.py b/src/forge_loop/events.py
@@ -445,6 +445,53 @@ class WorkerPolicyEnforcedEvent(EventBase):
     withheld_secrets: list[str] = Field(default_factory=list)
 
 
+@register_event
+class SkillHarvestedEvent(EventBase):
+    """A procedural skill card was distilled from a critic-clean merge."""
+
+    KIND: ClassVar[str] = "skill_harvested"
+    issue: int = Field(ge=1)
+    area: str = ""
+    skill_key: str = ""
+    memory_id: str = ""
+    sha: str = ""
+    confidence: float = Field(ge=0.0, le=1.0, default=1.0)
+
+
+@register_event
+class SkillInjectedEvent(EventBase):
+    """A skill card was retrieved and injected into a worker brief.
+
+    ``rank`` is its position in the retrieval ranking (0 = best match) so the
+    measurement track can correlate which cards actually drove a cheaper run.
+    """
+
+    KIND: ClassVar[str] = "skill_injected"
+    issue: int = Field(ge=1)
+    memory_id: str = ""
+    area: str = ""
+    rank: int = Field(ge=0, default=0)
+
+
+@register_event
+class SkillPromotedEvent(EventBase):
+    """Leaves under an area were generalised into an internal-node card."""
+
+    KIND: ClassVar[str] = "skill_promoted"
+    area: str = ""
+    memory_id: str = ""
+    leaf_count: int = Field(ge=1, default=1)
+
+
+@register_event
+class SkillExpiredEvent(EventBase):
+    """A stale skill card was superseded/retired (its proof SHA aged out)."""
+
+    KIND: ClassVar[str] = "skill_expired"
+    memory_id: str = ""
+    reason: str = ""
+
+
 # ---------------------------------------------------------------------------
 # Emit + back-compat shim. ``emit`` is the typed path; ``append_event_with_
 # registry_check`` is the back-compat wrapper called by state.append_event.

diff --git a/src/forge_loop/memory/models.py b/src/forge_loop/memory/models.py
@@ -81,6 +81,50 @@ def skill_from_tags(tags: tuple[str, ...]) -> str:
     return ""
 
 
+#: Tag prefix carrying a procedural skill's *area path* — its address in the
+#: skill tree (e.g. ``area:pulsar-node/http-route``). The ``/``-delimited path
+#: IS the tree: a leaf sits at a full path; an internal node summarises a
+#: subtree prefix. Mirrors ``AXIS_TAG_PREFIX`` / ``SKILL_TAG_PREFIX``.
+AREA_TAG_PREFIX = "area:"
+
+#: Marker tag distinguishing an *internal node* card (a generalised pattern
+#: distilled over an area subtree) from a *leaf* card (a concrete procedure).
+AREA_NODE_TAG = "area-node"
+
+#: Marker tag retiring a skill card whose proof commit has aged out of history
+#: (rebased/squashed away). An expired card stays in the store for provenance
+#: but is filtered out of retrieval — a confidently-stale recipe is worse than
+#: none. Distinct from supersession, which means "replaced by a better version".
+EXPIRED_TAG = "expired"
+
+
+def area_tag(path: str) -> str:
+    """Render an ``area:<path>`` tag for ``path`` (stripped)."""
+    return f"{AREA_TAG_PREFIX}{path.strip()}"
+
+
+def area_from_tags(tags: tuple[str, ...]) -> str:
+    """Extract the area path from an ``area:<path>`` tag, or ``""`` if absent."""
+    for tag in tags:
+        if tag.startswith(AREA_TAG_PREFIX):
+            return tag[len(AREA_TAG_PREFIX) :]
+    return ""
+
+
+def area_ancestors(path: str) -> tuple[str, ...]:
+    """Return the area path and its ancestors, most-specific first.
+
+    ``"a/b/c"`` → ``("a/b/c", "a/b", "a")`` — the retrieval walk order: the most
+    specific leaf area first, then its ancestor internal nodes. A blank path
+    yields ``()``. Leading/trailing slashes are ignored.
+    """
+    cleaned = path.strip().strip("/")
+    if not cleaned:
+        return ()
+    segments = cleaned.split("/")
+    return tuple("/".join(segments[:i]) for i in range(len(segments), 0, -1))
+
+
 def derive_memory_id(source_key: str, *, prefix: str) -> str:
     """Derive a stable ``memory_id`` from a source key.
 

diff --git a/src/forge_loop/memory/skills.py b/src/forge_loop/memory/skills.py
@@ -0,0 +1,194 @@
+"""Retrieve and render learned procedural skills (the skill tree) for a brief.
+
+Retrieval is deliberately deterministic for v1: a card is ranked by token
+overlap between the ticket text and the card's *area path* (weighted heavily,
+since the area IS the tree address) and its title/body. A clean seam is left for
+a future semantic (embedding) ranker — :func:`retrieve_skills_for` is the single
+entry point a Lumen-backed implementation would replace, with the same shape.
+
+Cards that overlap nothing with the query are NOT returned: injecting an
+irrelevant or stale recipe is worse than injecting none.
+"""
+
+from __future__ import annotations
+
+import re
+from collections import defaultdict
+from collections.abc import Callable
+from dataclasses import dataclass, replace
+
+from forge_loop.memory.models import (
+    AREA_NODE_TAG,
+    EXPIRED_TAG,
+    MemoryItem,
+    MemoryKind,
+    area_from_tags,
+    area_tag,
+)
+from forge_loop.memory.store import MemoryStore
+
+__all__ = [
+    "PromotedNode",
+    "expire_stale_skills",
+    "promote_internal_nodes",
+    "render_skill_section",
+    "retrieve_skills_for",
+]
+
+_COMMIT_PREFIX = "commit:"
+
+#: An area-path token match counts for more than an incidental body-word match:
+#: the tree address is the strongest relevance signal.
+_AREA_WEIGHT = 3
+_MIN_TOKEN_LEN = 3
+_TOKEN_RE = re.compile(r"[a-z0-9]+")
+
+
+def _tokens(text: str) -> set[str]:
+    return {tok for tok in _TOKEN_RE.findall(text.lower()) if len(tok) >= _MIN_TOKEN_LEN}
+
+
+def _area_tokens(area: str) -> set[str]:
+    # Split the path on "/" and "-" so "pulsar-node/http-route" contributes
+    # {pulsar, node, http, route}.
+    return _tokens(area.replace("/", " ").replace("-", " "))
+
+
+def _score(item: MemoryItem, query_tokens: set[str]) -> int:
+    area = area_from_tags(item.tags)
+    area_score = _AREA_WEIGHT * len(query_tokens & _area_tokens(area))
+    text_score = len(query_tokens & _tokens(f"{item.title} {item.body}"))
+    return area_score + text_score
+
+
+def retrieve_skills_for(store: MemoryStore, query: str, *, k: int = 3) -> tuple[MemoryItem, ...]:
+    """Return up to ``k`` active procedural skill cards most relevant to ``query``.
+
+    Ranks active ``PROCEDURAL`` cards by area-weighted token overlap; cards with
+    zero overlap are excluded. Ties break by confidence (desc) then insertion
+    order (stable). An empty/blank query returns ``()``.
+    """
+    query_tokens = _tokens(query)
+    if not query_tokens:
+        return ()
+    scored = [
+        (score, item)
+        for item in store.list_active(kind=MemoryKind.PROCEDURAL)
+        if EXPIRED_TAG not in item.tags and (score := _score(item, query_tokens)) > 0
+    ]
+    # Stable sort: equal (score, confidence) keeps list_active's insertion order.
+    scored.sort(key=lambda si: (si[0], si[1].provenance.confidence), reverse=True)
+    return tuple(item for _score_value, item in scored[:k])
+
+
+def render_skill_section(items: tuple[MemoryItem, ...]) -> str:
+    """Render retrieved skill cards as a brief section, or ``""`` if none."""
+    if not items:
+        return ""
+    lines = [
+        "## Learned skills for this repo (retrieved from past merges)",
+        (
+            "Distilled from prior successful, reviewed changes. Apply when "
+            "relevant, but verify against current code — they can go stale."
+        ),
+        "",
+    ]
+    for item in items:
+        area = area_from_tags(item.tags) or "general"
+        provenance = ", ".join(item.provenance.evidence_refs) or "—"
+        lines.append(f"### {area} (confidence {item.provenance.confidence:.2f})")
+        lines.append(item.body)
+        lines.append(f"_provenance: {provenance}_")
+        lines.append("")
+    return "\n".join(lines).rstrip() + "\n"
+
+
+def _commit_shas(item: MemoryItem) -> tuple[str, ...]:
+    return tuple(
+        ref[len(_COMMIT_PREFIX) :]
+        for ref in item.provenance.evidence_refs
+        if ref.startswith(_COMMIT_PREFIX)
+    )
+
+
+def expire_stale_skills(store: MemoryStore, *, live_shas: set[str]) -> tuple[str, ...]:
+    """Retire procedural cards whose proof commit is no longer in history.
+
+    A card carries its proof as ``commit:<sha>`` evidence. When none of a card's
+    proof commits are in ``live_shas`` (the set of commits currently reachable in
+    the repo), the change it was distilled from has been rebased/squashed away —
+    the recipe may no longer match the code, so the card is tagged
+    :data:`~forge_loop.memory.models.EXPIRED_TAG` and dropped from retrieval. A
+    card with no commit provenance cannot be judged and is left untouched.
+    Returns the ``memory_id`` of each card expired this pass.
+    """
+    expired: list[str] = []
+    for item in store.list_active(kind=MemoryKind.PROCEDURAL):
+        if EXPIRED_TAG in item.tags:
+            continue
+        shas = _commit_shas(item)
+        if not shas:
+            continue
+        if any(sha in live_shas for sha in shas):
+            continue
+        store.put(replace(item, tags=(*item.tags, EXPIRED_TAG)))
+        expired.append(item.memory_id)
+    return tuple(expired)
+
+
+@dataclass(frozen=True)
+class PromotedNode:
+    """Descriptor of one internal node promoted from sibling leaves."""
+
+    memory_id: str
+    area: str
+    leaf_count: int
+
+
+def promote_internal_nodes(
+    store: MemoryStore,
+    *,
+    min_leaves: int,
+    summarize: Callable[[tuple[MemoryItem, ...]], str],
+) -> tuple[PromotedNode, ...]:
+    """Generalise sibling leaves into an internal-node card per parent area.
+
+    Groups active *leaf* cards (a leaf has an ``area:`` tag and is neither an
+    :data:`~forge_loop.memory.models.AREA_NODE_TAG` node nor expired) by their
+    parent area path. When a parent gathers at least ``min_leaves`` leaves, the
+    injected ``summarize`` distils their shared pattern into the body of an
+    internal-node card filed at the parent path (tagged ``AREA_NODE_TAG``).
+    Internal nodes never count as leaves, so promotion does not cascade upward.
+    Idempotent: re-running upserts the node in place (stable per parent).
+    """
+    from forge_loop.runner.learning import record_procedural_skill
+
+    groups: dict[str, list[MemoryItem]] = defaultdict(list)
+    for item in store.list_active(kind=MemoryKind.PROCEDURAL):
+        if AREA_NODE_TAG in item.tags or EXPIRED_TAG in item.tags:
+            continue
+        area = area_from_tags(item.tags)
+        if not area:
+            continue
+        parent = "/".join(area.split("/")[:-1])
+        if parent:
+            groups[parent].append(item)
+
+    promoted: list[PromotedNode] = []
+    for parent, leaves in groups.items():
+        if len(leaves) < min_leaves:
+            continue
+        body = summarize(tuple(leaves))
+        memory_id = record_procedural_skill(
+            store,
+            failing_signal="",
+            target=parent,
+            title=f"{parent} (pattern)",
+            body=body,
+            source_key=f"node:{parent}",
+            source_task_ref=f"area:{parent}",
+            extra_tags=(area_tag(parent), AREA_NODE_TAG),
+            confidence=0.7,
+        )
+        promoted.append(PromotedNode(memory_id, parent, len(leaves)))
+    return tuple(promoted)