Skip to content

Improve Claude setup based on NTL battle-testing#2847

Draft
Dominik1999 wants to merge 6 commits intommagician-claude-setupfrom
dominik1999-claude/improve-claude-setup
Draft

Improve Claude setup based on NTL battle-testing#2847
Dominik1999 wants to merge 6 commits intommagician-claude-setupfrom
dominik1999-claude/improve-claude-setup

Conversation

@Dominik1999
Copy link
Copy Markdown
Collaborator

Summary

Improvements to the Claude-first setup based on running it against 6 PRs on the note-transport-service repo. These changes address issues found during real use.

Consolidated pre-push hook

  • Merged pre-push-test.sh into pre-push-review.sh: tests run first as a fail-fast gate. Reviewers only spawn if tests pass, saving reviewer API tokens on broken code.
  • Severity-based blocking: nits and notes surface in the output but don't block push. Only Critical/Important/Warning findings block. This prevents trivial style comments from halting the workflow.
  • Robust diff base resolution: tries @{upstream}, falls back to default branch via gh repo view, then HEAD~1. The original version failed when no upstream was set.
  • Crash handling: empty or malformed reviewer output is treated as a block rather than silently passing.
  • Escape hatch: SKIP_PRE_PUSH=1 bypasses everything for emergencies.

Input guards on all hooks

Each hook now validates the actual command (via jq + regex on stdin) before acting. This is defense in depth against settings.json filter regressions - if a hook fires on the wrong command, it exits 0 instead of misbehaving.

Settings.json fixes

  • Removed the separate pre-push-test.sh entry (now integrated into pre-push-review.sh)
  • Added missing if filter on pre-pr-draft.sh (was previously matching ALL Bash commands)
  • Removed post-pr-create-ci-monitor.sh (superseded)

Lessons system

  • Added .claude/lessons.md - structured file for codifying review feedback across categories (conventions, architecture, testing, security, process)
  • Added .claude/commands/codify-lesson.md - /codify-lesson slash command that appends lessons with proper formatting and proposes promotion paths (keep as lesson, promote to hook, or promote to CLAUDE.md)
  • Updated CLAUDE.md to reference in-repo lessons file instead of global ~/.claude path

Agent improvements

  • Code reviewer and security reviewer now use gh repo view --json defaultBranchRef for diff base resolution instead of hardcoded branch names. Works across repos with different default branches.

What this looked like in practice

On the NTL repo, these improvements caught real issues during automated review:

  • A u32 -> i32 conversion that would panic on valid inputs (block numbers > i32::MAX)
  • Missing size validation on a new proto field (note_metadata bypassed max_note_size)
  • A limit=0 edge case that silently returned empty responses

The severity-based blocking was essential - without it, every push was blocked by doc comment backtick nits while real bugs slipped through in earlier iterations.

🤖 Generated with Claude Code

Consolidated pre-push hook:
- Merged pre-push-test.sh into pre-push-review.sh (tests run first,
  then reviewers only if tests pass - saves reviewer tokens on broken code)
- Severity-based blocking: nits and notes don't block push, only
  Critical/Important/Warning findings do
- Proper diff base resolution: tries upstream, falls back to default
  branch via gh, then HEAD~1
- Handles reviewer crashes gracefully (empty/malformed output = block)
- Escape hatch: SKIP_PRE_PUSH=1

Input guards on all hooks:
- Each hook validates the command matches before acting (defense in depth
  against settings.json filter regressions)

Improved settings.json:
- Removed separate pre-push-test.sh entry (now in pre-push-review.sh)
- Added proper `if` filter on pre-pr-draft.sh (was firing on all Bash)
- Removed post-pr-create-ci-monitor.sh (superseded)

Lessons system:
- Added .claude/lessons.md for codifying review feedback
- Added .claude/commands/codify-lesson.md slash command
- Updated CLAUDE.md to reference in-repo lessons file instead of
  global ~/.claude path

Agent improvements:
- Code reviewer and security reviewer use gh-based default branch
  resolution instead of hardcoded branch names

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread .claude/hooks/pre-commit-lint.sh Outdated
Comment on lines +2 to +5
# Internal guard: only fire for actual git commit invocations. Defense in depth
# against settings.json filter regressions.
COMMAND=$(jq -r '.tool_input.command // empty' 2>/dev/null)
echo "$COMMAND" | grep -qE '(^|[[:space:]])git[[:space:]]+(-c[[:space:]]+[^ ]+[[:space:]]+)*commit([[:space:]]|$)' || exit 0
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I'm not sure these internal guards add value; were there cases when the hook fired on wrong invocations?
to me this just looks like AI trying very hard to add boilerplate code to the scripts which make them much less human-readable.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guards exist because the settings.json if-filter on pre-pr-draft.sh was missing in the original PR. Then it fired on all Bash commands. The guard caught that.

But now that the if-filters are correct, the guards are indeed defense-in-depth that adds ~3 lines of noise per hook. Let's remove the guards and rely on the settings.json filters. If a filter regresses, we'll notice it quickly.

Comment thread .claude/hooks/pre-pr-draft.sh Outdated
Comment on lines +2 to +6
# Internal guard: only fire for actual git commit invocations. Defense in depth
# against settings.json filter regressions.
COMMAND=$(jq -r '.tool_input.command // empty' 2>/dev/null)
echo "$COMMAND" | grep -qE '(^|[^a-zA-Z0-9_-])gh[[:space:]]+pr[[:space:]]+create([[:space:]]|$)' || exit 0
REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comments, let's remove them.

Comment thread .claude/CLAUDE.md Outdated
# CLAUDE.md

> Per-project lessons: ~/.claude/projects/protocol/lessons.md
> Per-project lessons: `.claude/lessons.md`. Read at the start of every non-trivial task. Append new entries with `/codify-lesson <description>`.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, do we need LESSONS.md? (I think I originally had it because I was testing this setup, but TBH it's a non-standard Claude file)

I think the codification of review rules should happen in SKILLS.md instead

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use SKILLS.md or maybe better CLAUDE.md to codify them? Not sure if accurate but Claude told me that skills are slash commands, not soft rules.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A skill can be invoked explicitly, but Claude also reads the skills and sees which ones should be applied.

Comment thread .claude/settings.json Outdated
Comment on lines +13 to +15
"type": "command",
"if": "Bash(*git push*)",
"command": ".claude/hooks/pre-push-review.sh"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's value in keeping the test & review hooks (and thus scripts) separate.
First of all, they are conceptually different, and second, we might decide that we should e.g. run tests on every commit not just before push

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then let's separate them again.

- Remove input guards from all hooks (boilerplate, unclear value)
- Restore pre-push-test.sh as separate hook (keep tests and reviews
  conceptually separate per review feedback)
- Remove lessons.md and codify-lesson.md (use SKILLS.md instead)
- Revert CLAUDE.md lessons reference
- Keep: severity-based blocking, diff base resolution, crash handling,
  proper if filters, agent branch resolution via gh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@mmagician mmagician left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd have another pass at trying to minimize the boilerplate code

Comment thread .claude/hooks/pre-commit-lint.sh Outdated
fi

exit 0
exit 0 No newline at end of file
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why no newline :(

Comment on lines +29 to +32
# Resolve the unreleased version dynamically. Strategy:
# 1. PR's own milestone title (most authoritative)
# 2. lowest open milestone with a version-like title
# 3. give up; tell the user
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

our PRs dont have milestones.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on the milestones, removed the resolution function entirely. The agent now finds the unreleased section by reading CHANGELOG.md directly. Kept the PR_URL / PR_NUMBER / CWD extraction since those are needed to invoke the agent in the right repo and reference the PR in the verdict messages, but happy to drop those too if you'd prefer the agent discover the PR itself.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed the resolution function entirely.

Have these changes been pushed? I don't see any new commits here

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to push, now it should be there

Comment on lines +21 to +32
PR_URL=$(printf '%s' "$INPUT" | jq -r '.tool_response // empty' \
| grep -oE 'https://github\.com/[^\s"]+/pull/[0-9]+' | head -1)
PR_NUMBER=$(printf '%s' "$PR_URL" | grep -oE '[0-9]+$')
CWD=$(printf '%s' "$INPUT" | jq -r '.cwd // empty')

if [ -z "$PR_URL" ]; then
exit 0
fi
[ -z "$PR_URL" ] || [ -z "$PR_NUMBER" ] || [ -z "$CWD" ] && exit 0

# Extract PR number
PR_NUMBER=$(echo "$PR_URL" | grep -oP '\d+$')
# ----------------------------------------------------------------------------
# Resolve the unreleased version dynamically. Strategy:
# 1. PR's own milestone title (most authoritative)
# 2. lowest open milestone with a version-like title
# 3. give up; tell the user
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like unnecessary code - again, was the changelog agent failing to add a correct entry?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above comment and new file.

Comment thread .claude/hooks/pre-pr-draft.sh Outdated
SUGGESTED="${SUGGESTED} --draft"
fi

REASON=$(printf 'PRs must be created as drafts. Re-run with --draft:\n\n %s\n\nPromote to ready-for-review later with: gh pr ready <num>\nBypass this hook with: SKIP_DRAFT_CHECK=1 gh pr create ...' "$SUGGESTED")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not by telling the agent how to bypass the hook...

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, removed. The env var still works for human maintainers but the deny reason no longer mentions it.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The env var still works for human maintainers

But human maintainers don't use these workflows to create PRs and are not meant to

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the escape hatch completely. The last changes I did manually and not with Claude anymore ...

Comment thread .claude/hooks/pre-pr-draft.sh Outdated
@Dominik1999 Dominik1999 requested a review from mmagician May 4, 2026 15:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants