Improve Claude setup based on NTL battle-testing#2847
Improve Claude setup based on NTL battle-testing#2847Dominik1999 wants to merge 6 commits intommagician-claude-setupfrom
Conversation
Consolidated pre-push hook: - Merged pre-push-test.sh into pre-push-review.sh (tests run first, then reviewers only if tests pass - saves reviewer tokens on broken code) - Severity-based blocking: nits and notes don't block push, only Critical/Important/Warning findings do - Proper diff base resolution: tries upstream, falls back to default branch via gh, then HEAD~1 - Handles reviewer crashes gracefully (empty/malformed output = block) - Escape hatch: SKIP_PRE_PUSH=1 Input guards on all hooks: - Each hook validates the command matches before acting (defense in depth against settings.json filter regressions) Improved settings.json: - Removed separate pre-push-test.sh entry (now in pre-push-review.sh) - Added proper `if` filter on pre-pr-draft.sh (was firing on all Bash) - Removed post-pr-create-ci-monitor.sh (superseded) Lessons system: - Added .claude/lessons.md for codifying review feedback - Added .claude/commands/codify-lesson.md slash command - Updated CLAUDE.md to reference in-repo lessons file instead of global ~/.claude path Agent improvements: - Code reviewer and security reviewer use gh-based default branch resolution instead of hardcoded branch names Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| # Internal guard: only fire for actual git commit invocations. Defense in depth | ||
| # against settings.json filter regressions. | ||
| COMMAND=$(jq -r '.tool_input.command // empty' 2>/dev/null) | ||
| echo "$COMMAND" | grep -qE '(^|[[:space:]])git[[:space:]]+(-c[[:space:]]+[^ ]+[[:space:]]+)*commit([[:space:]]|$)' || exit 0 |
There was a problem hiding this comment.
TBH I'm not sure these internal guards add value; were there cases when the hook fired on wrong invocations?
to me this just looks like AI trying very hard to add boilerplate code to the scripts which make them much less human-readable.
There was a problem hiding this comment.
The guards exist because the settings.json if-filter on pre-pr-draft.sh was missing in the original PR. Then it fired on all Bash commands. The guard caught that.
But now that the if-filters are correct, the guards are indeed defense-in-depth that adds ~3 lines of noise per hook. Let's remove the guards and rely on the settings.json filters. If a filter regresses, we'll notice it quickly.
| # Internal guard: only fire for actual git commit invocations. Defense in depth | ||
| # against settings.json filter regressions. | ||
| COMMAND=$(jq -r '.tool_input.command // empty' 2>/dev/null) | ||
| echo "$COMMAND" | grep -qE '(^|[^a-zA-Z0-9_-])gh[[:space:]]+pr[[:space:]]+create([[:space:]]|$)' || exit 0 | ||
| REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null) |
There was a problem hiding this comment.
same comments, let's remove them.
| # CLAUDE.md | ||
|
|
||
| > Per-project lessons: ~/.claude/projects/protocol/lessons.md | ||
| > Per-project lessons: `.claude/lessons.md`. Read at the start of every non-trivial task. Append new entries with `/codify-lesson <description>`. |
There was a problem hiding this comment.
also, do we need LESSONS.md? (I think I originally had it because I was testing this setup, but TBH it's a non-standard Claude file)
I think the codification of review rules should happen in SKILLS.md instead
There was a problem hiding this comment.
Should we use SKILLS.md or maybe better CLAUDE.md to codify them? Not sure if accurate but Claude told me that skills are slash commands, not soft rules.
There was a problem hiding this comment.
A skill can be invoked explicitly, but Claude also reads the skills and sees which ones should be applied.
| "type": "command", | ||
| "if": "Bash(*git push*)", | ||
| "command": ".claude/hooks/pre-push-review.sh" |
There was a problem hiding this comment.
I think there's value in keeping the test & review hooks (and thus scripts) separate.
First of all, they are conceptually different, and second, we might decide that we should e.g. run tests on every commit not just before push
There was a problem hiding this comment.
Then let's separate them again.
- Remove input guards from all hooks (boilerplate, unclear value) - Restore pre-push-test.sh as separate hook (keep tests and reviews conceptually separate per review feedback) - Remove lessons.md and codify-lesson.md (use SKILLS.md instead) - Revert CLAUDE.md lessons reference - Keep: severity-based blocking, diff base resolution, crash handling, proper if filters, agent branch resolution via gh Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mmagician
left a comment
There was a problem hiding this comment.
I'd have another pass at trying to minimize the boilerplate code
| fi | ||
|
|
||
| exit 0 | ||
| exit 0 No newline at end of file |
| # Resolve the unreleased version dynamically. Strategy: | ||
| # 1. PR's own milestone title (most authoritative) | ||
| # 2. lowest open milestone with a version-like title | ||
| # 3. give up; tell the user |
There was a problem hiding this comment.
our PRs dont have milestones.
There was a problem hiding this comment.
Good catch on the milestones, removed the resolution function entirely. The agent now finds the unreleased section by reading CHANGELOG.md directly. Kept the PR_URL / PR_NUMBER / CWD extraction since those are needed to invoke the agent in the right repo and reference the PR in the verdict messages, but happy to drop those too if you'd prefer the agent discover the PR itself.
There was a problem hiding this comment.
removed the resolution function entirely.
Have these changes been pushed? I don't see any new commits here
There was a problem hiding this comment.
I forgot to push, now it should be there
| PR_URL=$(printf '%s' "$INPUT" | jq -r '.tool_response // empty' \ | ||
| | grep -oE 'https://github\.com/[^\s"]+/pull/[0-9]+' | head -1) | ||
| PR_NUMBER=$(printf '%s' "$PR_URL" | grep -oE '[0-9]+$') | ||
| CWD=$(printf '%s' "$INPUT" | jq -r '.cwd // empty') | ||
|
|
||
| if [ -z "$PR_URL" ]; then | ||
| exit 0 | ||
| fi | ||
| [ -z "$PR_URL" ] || [ -z "$PR_NUMBER" ] || [ -z "$CWD" ] && exit 0 | ||
|
|
||
| # Extract PR number | ||
| PR_NUMBER=$(echo "$PR_URL" | grep -oP '\d+$') | ||
| # ---------------------------------------------------------------------------- | ||
| # Resolve the unreleased version dynamically. Strategy: | ||
| # 1. PR's own milestone title (most authoritative) | ||
| # 2. lowest open milestone with a version-like title | ||
| # 3. give up; tell the user |
There was a problem hiding this comment.
this looks like unnecessary code - again, was the changelog agent failing to add a correct entry?
There was a problem hiding this comment.
see above comment and new file.
| SUGGESTED="${SUGGESTED} --draft" | ||
| fi | ||
|
|
||
| REASON=$(printf 'PRs must be created as drafts. Re-run with --draft:\n\n %s\n\nPromote to ready-for-review later with: gh pr ready <num>\nBypass this hook with: SKIP_DRAFT_CHECK=1 gh pr create ...' "$SUGGESTED") |
There was a problem hiding this comment.
We should not by telling the agent how to bypass the hook...
There was a problem hiding this comment.
Good point, removed. The env var still works for human maintainers but the deny reason no longer mentions it.
There was a problem hiding this comment.
The env var still works for human maintainers
But human maintainers don't use these workflows to create PRs and are not meant to
There was a problem hiding this comment.
Removed the escape hatch completely. The last changes I did manually and not with Claude anymore ...
Summary
Improvements to the Claude-first setup based on running it against 6 PRs on the note-transport-service repo. These changes address issues found during real use.
Consolidated pre-push hook
pre-push-test.shintopre-push-review.sh: tests run first as a fail-fast gate. Reviewers only spawn if tests pass, saving reviewer API tokens on broken code.@{upstream}, falls back to default branch viagh repo view, thenHEAD~1. The original version failed when no upstream was set.SKIP_PRE_PUSH=1bypasses everything for emergencies.Input guards on all hooks
Each hook now validates the actual command (via
jq+ regex on stdin) before acting. This is defense in depth againstsettings.jsonfilter regressions - if a hook fires on the wrong command, it exits 0 instead of misbehaving.Settings.json fixes
pre-push-test.shentry (now integrated intopre-push-review.sh)iffilter onpre-pr-draft.sh(was previously matching ALL Bash commands)post-pr-create-ci-monitor.sh(superseded)Lessons system
.claude/lessons.md- structured file for codifying review feedback across categories (conventions, architecture, testing, security, process).claude/commands/codify-lesson.md-/codify-lessonslash command that appends lessons with proper formatting and proposes promotion paths (keep as lesson, promote to hook, or promote to CLAUDE.md)CLAUDE.mdto reference in-repo lessons file instead of global~/.claudepathAgent improvements
gh repo view --json defaultBranchReffor diff base resolution instead of hardcoded branch names. Works across repos with different default branches.What this looked like in practice
On the NTL repo, these improvements caught real issues during automated review:
u32 -> i32conversion that would panic on valid inputs (block numbers >i32::MAX)note_metadatabypassedmax_note_size)limit=0edge case that silently returned empty responsesThe severity-based blocking was essential - without it, every push was blocked by doc comment backtick nits while real bugs slipped through in earlier iterations.
🤖 Generated with Claude Code