Improve Claude setup based on NTL battle-testing by Dominik1999 · Pull Request #2847 · 0xMiden/protocol

Dominik1999 · 2026-04-29T12:07:42Z

Summary

Improvements to the Claude-first setup based on running it against 6 PRs on the note-transport-service repo. These changes address issues found during real use.

Consolidated pre-push hook

Merged pre-push-test.sh into pre-push-review.sh: tests run first as a fail-fast gate. Reviewers only spawn if tests pass, saving reviewer API tokens on broken code.
Severity-based blocking: nits and notes surface in the output but don't block push. Only Critical/Important/Warning findings block. This prevents trivial style comments from halting the workflow.
Robust diff base resolution: tries @{upstream}, falls back to default branch via gh repo view, then HEAD~1. The original version failed when no upstream was set.
Crash handling: empty or malformed reviewer output is treated as a block rather than silently passing.
Escape hatch: SKIP_PRE_PUSH=1 bypasses everything for emergencies.

Input guards on all hooks

Each hook now validates the actual command (via jq + regex on stdin) before acting. This is defense in depth against settings.json filter regressions - if a hook fires on the wrong command, it exits 0 instead of misbehaving.

Settings.json fixes

Removed the separate pre-push-test.sh entry (now integrated into pre-push-review.sh)
Added missing if filter on pre-pr-draft.sh (was previously matching ALL Bash commands)
Removed post-pr-create-ci-monitor.sh (superseded)

Lessons system

Added .claude/lessons.md - structured file for codifying review feedback across categories (conventions, architecture, testing, security, process)
Added .claude/commands/codify-lesson.md - /codify-lesson slash command that appends lessons with proper formatting and proposes promotion paths (keep as lesson, promote to hook, or promote to CLAUDE.md)
Updated CLAUDE.md to reference in-repo lessons file instead of global ~/.claude path

Agent improvements

Code reviewer and security reviewer now use gh repo view --json defaultBranchRef for diff base resolution instead of hardcoded branch names. Works across repos with different default branches.

What this looked like in practice

On the NTL repo, these improvements caught real issues during automated review:

A u32 -> i32 conversion that would panic on valid inputs (block numbers > i32::MAX)
Missing size validation on a new proto field (note_metadata bypassed max_note_size)
A limit=0 edge case that silently returned empty responses

The severity-based blocking was essential - without it, every push was blocked by doc comment backtick nits while real bugs slipped through in earlier iterations.

🤖 Generated with Claude Code

Consolidated pre-push hook: - Merged pre-push-test.sh into pre-push-review.sh (tests run first, then reviewers only if tests pass - saves reviewer tokens on broken code) - Severity-based blocking: nits and notes don't block push, only Critical/Important/Warning findings do - Proper diff base resolution: tries upstream, falls back to default branch via gh, then HEAD~1 - Handles reviewer crashes gracefully (empty/malformed output = block) - Escape hatch: SKIP_PRE_PUSH=1 Input guards on all hooks: - Each hook validates the command matches before acting (defense in depth against settings.json filter regressions) Improved settings.json: - Removed separate pre-push-test.sh entry (now in pre-push-review.sh) - Added proper `if` filter on pre-pr-draft.sh (was firing on all Bash) - Removed post-pr-create-ci-monitor.sh (superseded) Lessons system: - Added .claude/lessons.md for codifying review feedback - Added .claude/commands/codify-lesson.md slash command - Updated CLAUDE.md to reference in-repo lessons file instead of global ~/.claude path Agent improvements: - Code reviewer and security reviewer use gh-based default branch resolution instead of hardcoded branch names Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

mmagician · 2026-04-30T08:37:17Z

+# Internal guard: only fire for actual git commit invocations. Defense in depth
+# against settings.json filter regressions.
+COMMAND=$(jq -r '.tool_input.command // empty' 2>/dev/null)
+echo "$COMMAND" | grep -qE '(^|[[:space:]])git[[:space:]]+(-c[[:space:]]+[^ ]+[[:space:]]+)*commit([[:space:]]|$)' || exit 0


TBH I'm not sure these internal guards add value; were there cases when the hook fired on wrong invocations?
to me this just looks like AI trying very hard to add boilerplate code to the scripts which make them much less human-readable.

The guards exist because the settings.json if-filter on pre-pr-draft.sh was missing in the original PR. Then it fired on all Bash commands. The guard caught that.

But now that the if-filters are correct, the guards are indeed defense-in-depth that adds ~3 lines of noise per hook. Let's remove the guards and rely on the settings.json filters. If a filter regresses, we'll notice it quickly.

mmagician · 2026-04-30T08:37:32Z

+# Internal guard: only fire for actual git commit invocations. Defense in depth
+# against settings.json filter regressions.
+COMMAND=$(jq -r '.tool_input.command // empty' 2>/dev/null)
+echo "$COMMAND" | grep -qE '(^|[^a-zA-Z0-9_-])gh[[:space:]]+pr[[:space:]]+create([[:space:]]|$)' || exit 0
+REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)


same comments, let's remove them.

mmagician · 2026-04-30T08:41:04Z

 # CLAUDE.md

-> Per-project lessons: ~/.claude/projects/protocol/lessons.md
+> Per-project lessons: `.claude/lessons.md`. Read at the start of every non-trivial task. Append new entries with `/codify-lesson <description>`.


also, do we need LESSONS.md? (I think I originally had it because I was testing this setup, but TBH it's a non-standard Claude file)

I think the codification of review rules should happen in SKILLS.md instead

Should we use SKILLS.md or maybe better CLAUDE.md to codify them? Not sure if accurate but Claude told me that skills are slash commands, not soft rules.

A skill can be invoked explicitly, but Claude also reads the skills and sees which ones should be applied.

mmagician · 2026-04-30T08:43:33Z

+                "type": "command",
+                "if": "Bash(*git push*)",
+                "command": ".claude/hooks/pre-push-review.sh"


I think there's value in keeping the test & review hooks (and thus scripts) separate.
First of all, they are conceptually different, and second, we might decide that we should e.g. run tests on every commit not just before push

Then let's separate them again.

- Remove input guards from all hooks (boilerplate, unclear value) - Restore pre-push-test.sh as separate hook (keep tests and reviews conceptually separate per review feedback) - Remove lessons.md and codify-lesson.md (use SKILLS.md instead) - Revert CLAUDE.md lessons reference - Keep: severity-based blocking, diff base resolution, crash handling, proper if filters, agent branch resolution via gh Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

mmagician

I'd have another pass at trying to minimize the boilerplate code

mmagician · 2026-04-30T15:26:04Z

 fi

-exit 0
+exit 0


why no newline :(

mmagician · 2026-04-30T15:28:33Z

+# Resolve the unreleased version dynamically. Strategy:
+#   1. PR's own milestone title (most authoritative)
+#   2. lowest open milestone with a version-like title
+#   3. give up; tell the user


our PRs dont have milestones.

Good catch on the milestones, removed the resolution function entirely. The agent now finds the unreleased section by reading CHANGELOG.md directly. Kept the PR_URL / PR_NUMBER / CWD extraction since those are needed to invoke the agent in the right repo and reference the PR in the verdict messages, but happy to drop those too if you'd prefer the agent discover the PR itself.

removed the resolution function entirely.

Have these changes been pushed? I don't see any new commits here

I forgot to push, now it should be there

mmagician · 2026-04-30T15:29:42Z

+PR_URL=$(printf '%s' "$INPUT" | jq -r '.tool_response // empty' \
+          | grep -oE 'https://github\.com/[^\s"]+/pull/[0-9]+' | head -1)
+PR_NUMBER=$(printf '%s' "$PR_URL" | grep -oE '[0-9]+$')
+CWD=$(printf '%s' "$INPUT" | jq -r '.cwd // empty')

-if [ -z "$PR_URL" ]; then
-  exit 0
-fi
+[ -z "$PR_URL" ] || [ -z "$PR_NUMBER" ] || [ -z "$CWD" ] && exit 0

-# Extract PR number
-PR_NUMBER=$(echo "$PR_URL" | grep -oP '\d+$')
+# ----------------------------------------------------------------------------
+# Resolve the unreleased version dynamically. Strategy:
+#   1. PR's own milestone title (most authoritative)
+#   2. lowest open milestone with a version-like title
+#   3. give up; tell the user


this looks like unnecessary code - again, was the changelog agent failing to add a correct entry?

see above comment and new file.

mmagician · 2026-04-30T15:31:05Z

+  SUGGESTED="${SUGGESTED} --draft"
+fi
+
+REASON=$(printf 'PRs must be created as drafts. Re-run with --draft:\n\n  %s\n\nPromote to ready-for-review later with: gh pr ready <num>\nBypass this hook with: SKIP_DRAFT_CHECK=1 gh pr create ...' "$SUGGESTED")


We should not by telling the agent how to bypass the hook...

Good point, removed. The env var still works for human maintainers but the deny reason no longer mentions it.

The env var still works for human maintainers

But human maintainers don't use these workflows to create PRs and are not meant to

Removed the escape hatch completely. The last changes I did manually and not with Claude anymore ...

mmagician requested changes Apr 30, 2026

View reviewed changes

Dominik1999 requested a review from mmagician May 4, 2026 15:23

Dominik1999 added 4 commits May 4, 2026 17:52

Adding new lines at the end of the file

c7294aa

Removing unecessary code from post-pr-create-changelog.sh

c59498c

removing code from pre-pr-draft.sh

cd6f5cd

remove escape hatch from pre-pr-draft.sh

283e867

               fi
-              exit 0
+              exit 0

                
                    No newline at end of file

Conversation

Dominik1999 commented Apr 29, 2026

Summary

Consolidated pre-push hook

Input guards on all hooks

Settings.json fixes

Lessons system

Agent improvements

What this looked like in practice

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mmagician left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants