chore: root reorg + GEPA skill optimization by jonathanpeterwu · Pull Request #12 · stackmemoryai/stackmemory

jonathanpeterwu · 2026-04-19T13:53:23Z

Summary

Reorganize root: move SPEC.md, RELEASE_NOTES.md, vision.md, tomorrow.md into docs/
Clean up .gitignore and remove stale lint artifacts
GEPA v3: phase-level prompt optimization with auto-targeting from outcomes.jsonl
GEPA skill optimization: audit hook, 5 skill targets, eval harness for .md files
Fix session tests: mock canonicalStateStore
Add /learn context to CLAUDE.md maintenance section

Test plan

npm run lint passes
npm run test:run passes
npm run build passes
Verify skill-audit hook fires on /next or other skill invocations
Run node scripts/gepa/optimize.js skill-stats after a few skill calls

This is a merge commit of the virtual branches in your workspace.

Due to GitButler managing multiple virtual branches, you cannot switch back and forth between git branches and virtual branches easily.

If you switch to another branch, GitButler will need to be reinitialized. If you commit on this branch, GitButler will throw it away.

Here are the branches that are currently applied:
- sbc-branch-1 (refs/gitbutler/sbc-branch-1), branch head: 2bba656

For more information about what we're doing here, check out our docs: https://docs.gitbutler.com/features/branch-management/integration-branch

…480)
- Add CrossProjectSearch engine with FTS5/BM25 ranking across N databases
- Project registry (~/.stackmemory/projects.json) with CRUD + auto-discovery
- Read-only SQLite connections for safety, LIKE fallback for non-FTS databases
- 4 MCP tools: sm_cross_search, sm_cross_discover, sm_cross_register, sm_cross_list
- CLI: `stackmemory search --all-projects "query"` for cross-project search
- 17 tests: registry CRUD, multi-db FTS5 search, ranking, LIKE fallback, graceful skip
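To illustrate the cross-database ranking step, here is a minimal sketch of merging per-project hits into one list. The row shape (`project`, `snippet`, `rank`) and the function name `mergeCrossProjectResults` are assumptions for illustration, not the CrossProjectSearch API. It relies only on the fact that SQLite's FTS5 `bm25()` returns smaller (more negative) values for better matches.

```javascript
// Hypothetical row shape: { project, snippet, rank }, where rank is the
// value of SQLite's bm25() for that row. Smaller rank means a better match,
// so an ascending sort puts the best hits first.
function mergeCrossProjectResults(perProjectResults, limit = 10) {
  return perProjectResults
    .flat()
    .sort((a, b) => a.rank - b.rank)
    .slice(0, limit);
}

const merged = mergeCrossProjectResults(
  [
    [
      { project: "alpha", snippet: "retry logic", rank: -3.2 },
      { project: "alpha", snippet: "retry docs", rank: -1.1 },
    ],
    [{ project: "beta", snippet: "retry queue", rank: -2.4 }],
    [], // a project whose database was skipped contributes nothing
  ],
  2
);

console.log(merged.map((r) => `${r.project}: ${r.snippet}`));
```

BM25 scores depend on each database's own term statistics, so comparing raw ranks across projects is an approximation; the real engine may normalize scores differently.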

Consolidate duplicate docs, relocate wandering files, and tighten .gitignore for agent scratch dirs.
- Move SPEC.md, RELEASE_NOTES.md, tomorrow.md, vision.md to docs/ (replacing stale docs/ copies with the up-to-date root versions)
- Move mcp_review_config.json to config/
- Untrack .lint-fix-log.json (ephemeral lint artifact)
- Delete stale .tsbuildinfo-* and .lint-errors.log
- Ignore agent scratch dirs (.ralph/, .swarm/, .bjarne/, .entire/, .opencode/, .git.backup/) and local trees (archive/, site/, voyager/, plugins/)
- Update README.md Vision link to docs/vision.md

Session tests mocked fs/promises but not the canonical-store module. The canonicalStateStore singleton inherited the mocked fs, so pathExists returned true while readFile returned undefined, which crashed JSON.parse. Mock the entire canonical-store module with stubs for upsertSession, appendEvent, and endSession.
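The failure mode and the shape of the fix can be sketched in plain JavaScript. This is an illustration only: `parseState` and `createCanonicalStoreStub` are hypothetical names, and the actual tests presumably register the stub through the test runner's module-mocking facility rather than building it by hand.

```javascript
// Failure mode: the file "exists", but the mocked readFile resolves to
// undefined, and JSON.parse(undefined) throws a SyntaxError.
function parseState(raw) {
  return JSON.parse(raw);
}

let crashed = false;
try {
  parseState(undefined);
} catch (err) {
  crashed = err instanceof SyntaxError;
}

// Hypothetical stand-in for the canonical-store module: each method
// records its call and resolves without touching the filesystem.
function createCanonicalStoreStub() {
  const calls = [];
  const record = (name) => async (...args) => {
    calls.push({ name, args });
  };
  return {
    calls,
    upsertSession: record("upsertSession"),
    appendEvent: record("appendEvent"),
    endSession: record("endSession"),
  };
}

const stub = createCanonicalStoreStub();
stub.upsertSession({ id: "s1" });
stub.endSession("s1");

console.log(crashed, stub.calls.map((c) => c.name));
```

Because the stub never reads from disk, the session code under test no longer depends on what the mocked fs returns.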

Split conductor prompt-template.md into 5 phase files (system, understand, implement, validate, deliver). GEPA now auto-targets the worst-performing phase from outcomes.jsonl instead of mutating the entire template as a monolith.
- Phase-aware prompt building in orchestrator with DSPy bridge
- Assertion-based retry injects phase-specific error guidance
- promptVersions hash map in AgentOutcomeEntry for attribution
- Stop hook fires GEPA session accumulator (auto-optimize at threshold)
- after-run.sh triggers GEPA + DSPy (every 50 runs) automatically
- Gold sets mined from 71 outcomes across 4 phases
- eval-phases.js harness validates mutations before applying
- npm run gepa:eval / gepa:mine scripts
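One way the auto-targeting could work is to compute a per-phase success rate from outcomes.jsonl and pick the lowest. The sketch below assumes each line is a JSON object with a `phase` string and a boolean `success`; the real AgentOutcomeEntry fields and the scoring GEPA uses may differ, and `worstPhase` is a hypothetical name.

```javascript
// Assumed record shape: one JSON object per line with `phase` and a
// boolean `success`. Returns the phase with the lowest success rate.
function worstPhase(jsonl) {
  const stats = new Map();
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue; // skip blank lines
    const { phase, success } = JSON.parse(line);
    const s = stats.get(phase) ?? { runs: 0, passed: 0 };
    s.runs += 1;
    if (success) s.passed += 1;
    stats.set(phase, s);
  }
  let worst = null;
  for (const [phase, s] of stats) {
    const rate = s.passed / s.runs;
    if (worst === null || rate < worst.rate) {
      worst = { phase, rate, runs: s.runs };
    }
  }
  return worst;
}

const sample = [
  '{"phase":"understand","success":true}',
  '{"phase":"implement","success":false}',
  '{"phase":"implement","success":true}',
  '{"phase":"validate","success":false}',
  '{"phase":"validate","success":false}',
  '{"phase":"deliver","success":true}',
].join("\n");

const target = worstPhase(sample);
console.log(target);
```

In this sample the `validate` phase fails both of its runs, so it would be the one selected for mutation.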

Add GEPA support for optimizing Claude Code slash command .md files:
- skill-audit.js hook logs Skill tool calls to skill-audit.jsonl
- 5 skill targets in config (start, stop, learn, next, summary)
- skill-tasks.jsonl with 8 eval tasks for skill quality
- skill-stats and run-skills CLI commands
- getSkillAuditContext() feeds usage data into mutation prompts
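As a rough picture of how an audit log can feed usage statistics, the sketch below formats JSONL entries and aggregates them per skill. The entry schema (`ts`, `skill`, `ok`) and both function names are assumptions; the PR does not show the hook's real schema or the output of skill-stats.

```javascript
// Assumed entry shape: { ts, skill, ok }, one JSON object per line.
function formatAuditEntry(skill, ok, now = new Date()) {
  return JSON.stringify({ ts: now.toISOString(), skill, ok }) + "\n";
}

// Count calls and failures per skill from the JSONL text.
function skillStats(jsonl) {
  const counts = {};
  for (const line of jsonl.split("\n").filter(Boolean)) {
    const { skill, ok } = JSON.parse(line);
    counts[skill] ??= { calls: 0, failures: 0 };
    counts[skill].calls += 1;
    if (!ok) counts[skill].failures += 1;
  }
  return counts;
}

const log =
  formatAuditEntry("next", true) +
  formatAuditEntry("next", false) +
  formatAuditEntry("learn", true);

const stats = skillStats(log);
console.log(stats);
```

A summary like this is the kind of usage data that could be injected into a mutation prompt, for example to prioritize skills with many calls or a high failure count.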

- Add API key validation at startup (fail fast before burning budget)
- Fix callJudge() to log errors, use config timeout (120s vs 30s)
- Add ASI feedback field to judge schema (CoT + actionable suggestions)
- Persist judge feedback to results/feedback-{gen}.json
- Inject ASI feedback into mutation prompts via getRecentFeedback()
- Add extractCodeBlocks() for regex judge (focus on code, not prose)
- Add 10 new regex criterion patterns (shows_branch, concise_output, etc)
- Support custom regex from eval task definitions
- Add elitism tiebreaker (prefer baseline/incumbent on score ties)
- Add crossover operator (recombine sections from two parent variants)
- Add eval response cache (record/replay for deterministic baselines)
- Expand skill eval tasks from 8 to 30 with adversarial cases
- Add held-out eval partition (train/test split for Goodhart detection)
- Increase population 4→8, add crossoverCount=2, judge timeout 120s
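Three of the items above (elitism tiebreaker, crossover, held-out partition) can be sketched as small pure functions. Everything here is an assumption made for illustration: the function names, the `isIncumbent` flag, splitting variants on `## ` headings, and the every-fifth-task split ratio are not taken from the GEPA code.

```javascript
// Elitism tiebreaker: highest score wins; on a tie, prefer the incumbent.
function selectBest(candidates) {
  return candidates.reduce((best, c) => {
    if (c.score > best.score) return c;
    if (c.score === best.score && c.isIncumbent && !best.isIncumbent) return c;
    return best;
  });
}

// Crossover: split two markdown variants at "## " headings and take each
// section from parent A or parent B according to pickA(index).
function splitSections(md) {
  return md.split(/^(?=## )/m);
}

function crossover(parentA, parentB, pickA = (i) => i % 2 === 0) {
  const a = splitSections(parentA);
  const b = splitSections(parentB);
  const n = Math.min(a.length, b.length);
  const child = [];
  for (let i = 0; i < n; i++) child.push(pickA(i) ? a[i] : b[i]);
  return child.join("");
}

// Held-out partition: every Nth task goes to the test set and is never
// used to guide mutations, so a train/test gap hints at overfitting.
function holdOutSplit(tasks, everyNth = 5) {
  const train = [];
  const test = [];
  tasks.forEach((t, i) => ((i + 1) % everyNth === 0 ? test : train).push(t));
  return { train, test };
}

const winner = selectBest([
  { name: "v1", score: 0.8 },
  { name: "baseline", score: 0.8, isIncumbent: true },
  { name: "v2", score: 0.7 },
]);

const child = crossover(
  "## Goal\nA goal\n## Steps\nA steps\n",
  "## Goal\nB goal\n## Steps\nB steps\n"
);

const split = holdOutSplit(Array.from({ length: 30 }, (_, i) => `task-${i + 1}`));

console.log(winner.name, JSON.stringify(child), split.train.length, split.test.length);
```

Preferring the incumbent on ties keeps the prompt stable unless a variant is strictly better, which avoids churn from judge noise.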

- Keep cache/ in .gitignore
- Remove tracked generation files (now gitignored)
- Take theirs for .before-optimize.md

StackMemory Bot (CLI) and others added 20 commits April 8, 2026 12:25

feat(conductor): GitButler virtual branch mode for workspace management

2bba656

fix(conductor): state filter + labels flatten for issue dispatch

6c756c7

fix(linear): flatten labels in getIssues response

48d1d68

feat(shared-state): add canonical instance coordination

39c1b39

feat: add deterministic harness smoke tooling

10db093

docs: add design principles architecture note

6bf62e9

chore: update gepa baselines and clean GitButler hooks

b6c3afb

fix(conductor): harden lane mode cleanup

2f8ed5f

chore: handoff checkpoint on chore/root-reorg

b1ca885

chore: handoff checkpoint on chore/root-reorg

19171e3

chore(gepa): update baseline generations with current CLAUDE.md

acc477e

fix(gepa): judge CLI fallback + filter phase variants for skill targets

80bd13a

merge: resolve conflicts with origin/main

046113b

jonathanpeterwu merged commit b9cedca into main Apr 20, 2026
4 of 7 checks passed

jonathanpeterwu merged 20 commits into main from chore/root-reorg

2 participants
