Skip to content

Docker tag monitor: one PR per config-key + push trigger#1318

Open
Oseltamivir wants to merge 15 commits into
mainfrom
docker-monitor-per-config-key-prs
Open

Docker tag monitor: one PR per config-key + push trigger#1318
Oseltamivir wants to merge 15 commits into
mainfrom
docker-monitor-per-config-key-prs

Conversation

@Oseltamivir
Copy link
Copy Markdown
Collaborator

Switch the auto-issue guidance from grouping PRs by framework/image family to one PR per top-level config-key. Add a push trigger scoped to this workflow file so changes can be validated without waiting on the weekly cron.

Switch the auto-issue guidance from grouping PRs by framework/image
family to one PR per top-level config-key. Add a push trigger scoped
to this workflow file so changes can be validated without waiting on
the weekly cron.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace PAT_WITH_WORKFLOW_SCOPE with CLAUDE_PAT so the action can read
collaborator permissions for actor checks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment on lines +4 to +6
push:
paths:
- '.github/workflows/docker-tag-monitor.yml'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The new push trigger (lines 4-6) has no branches: filter and no github.event_name guard on the "Create issue for updates" step, so it fires on every push to any branch that touches this workflow file — including the merge of this PR to main. The inputs.dry_run != true gate at line 167 cannot help: for push events, the inputs context is empty (only populated for workflow_dispatch/workflow_call), so null != true is true and a real @claude-tagged issue gets filed using secrets.CLAUDE_PAT. Fix by adding && github.event_name != 'push' to the create-issue step, or scope the push trigger to a non-default branch.

Extended reasoning...

What the bug is. The PR adds a push trigger filtered only by path (.github/workflows/docker-tag-monitor.yml). There is no branches: filter, so any push to any branch that modifies this file fires the workflow. The "Create issue for updates" step at line 166 is gated by:\n\nyaml\nif: steps.check.outputs.has_updates == 'true' && inputs.dry_run != true\n\n\nIn GitHub Actions semantics, the inputs context is populated only for workflow_dispatch and workflow_call events. For push (and schedule) events, inputs.dry_run is null. The expression null != true evaluates to true, so the gate is satisfied and the step runs with GH_TOKEN: ${{ secrets.CLAUDE_PAT }}, hitting Docker Hub and creating/commenting on a real issue with the body @claude Please update the configurations:.\n\nStep-by-step proof that this fires on the merge of this PR itself.\n1. PR #1318 is merged to main. The merge commit modifies .github/workflows/docker-tag-monitor.yml, so the new push-trigger path filter matches.\n2. The workflow runs with github.event_name == 'push'. There is no branches: filter, so main qualifies.\n3. The "Check for new Docker image releases" step queries Docker Hub. If any monitored repo (lmsysorg/sglang, vllm/vllm-openai, etc.) has released a higher-version tag in the last 7 days that matches a suffix already in use, has_updates=true is set.\n4. The "Create issue for updates" step evaluates its if: 'true' == 'true' && null != truetrue.\n5. The step uses CLAUDE_PAT to either open a new docker-update issue with title [Auto] Docker Image Updates Available - <date> or comment on an existing one — and the body @-mentions claude, which is the trigger for downstream Claude automation.\n\nThe "Dry run output" step at line 187 has the opposite gate (inputs.dry_run == true); since null == true is false, it never runs on push either. So there is no path where a push event executes as a true dry-run — every push is a production run.\n\nAddressing the refutation. A reasonable counter-reading is that "validate without waiting on the cron" means "run it for real, like cron does, but on demand," and that the author deliberately omitted any push-side dry-run guard because dry_run was only ever meant as a workflow_dispatch opt-in, with the existing open-issue dedup at lines 185–189 acting as the spam mitigation. That reading is internally consistent. The reason this still warrants a comment, not silence: (a) the gate inputs.dry_run != true looks like it should protect non-dispatch runs and silently does not — that's a footgun even if intentional; (b) the push trigger has no branches: filter, so every feature-branch push that edits this file also fires a real run, which is rarely what "validate via push" implies; and (c) the merge of this very PR will trigger a real Docker-Hub check and possibly an @claude-tagged issue with CLAUDE_PAT, which is a one-shot side effect worth surfacing to the author before merge so they can confirm it's intended.\n\nHow to fix. If the author wants push to be truly validating, the minimal patch is one of:\n\nyaml\n# Option A: force-bypass issue creation on push\n- name: Create issue for updates\n if: steps.check.outputs.has_updates == 'true' && inputs.dry_run != true && github.event_name != 'push'\n\n\nor restrict the trigger to a non-default branch:\n\nyaml\non:\n push:\n branches: [docker-monitor-test]\n paths: ['.github/workflows/docker-tag-monitor.yml']\n\n\nIf instead the author intends push to behave like cron (real issues, dedup handles spam), an explicit comment in the workflow stating that — and adding branches: [main] to the push trigger so feature-branch edits don't fire real runs — would make the intent unambiguous.

Oseltamivir and others added 13 commits May 12, 2026 12:52
The action rejected runs initiated by the 'claude' bot because only
'Klaud-Cold' was in allowed_bots. Add 'claude' so bot-triggered
flows (e.g. auto-created issues) can run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch allowed_bots from an explicit list to '*' so any bot-triggered
flow can run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
No-op comment bump to re-fire the path-filtered push trigger after the
Claude Code action token + allowed_bots fixes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Temporary diagnostic for the 403 from anthropics/claude-code-action's
actor permission check. Calls the same endpoint with the secret value
so we can confirm whether the in-Actions PAT differs from the one
verified manually.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace path filter with branch filter so every push to
docker-monitor-per-config-key-prs re-runs the monitor end-to-end,
including changes outside the workflow file itself (e.g. claude.yml).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a push trigger scoped to docker-monitor-per-config-key-prs so we
can exercise the branch's claude.yml without going through the
default-branch issues.opened path. Revert before merging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Print PAT identity, scopes/expiry headers, repo visibility, and the
collaborator permission check. Lets us compare the in-CI secret value
against the manually-verified local token.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reproduces the exact API call anthropics/claude-code-action makes,
using the same secret value the action receives. If this returns 200
while the action 403s, the action is using a different token under
the hood.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Push-event run confirmed CLAUDE_PAT works (HTTP 200) for the same
endpoint that 403s inside anthropics/claude-code-action. The action
sets its own GITHUB_TOKEN env (to the default workflow token) alongside
our OVERRIDE_GITHUB_TOKEN, and the actor-check path reads GITHUB_TOKEN
first. Step-level env wins, so force GITHUB_TOKEN to CLAUDE_PAT.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Push-event test confirmed that overriding GITHUB_TOKEN with CLAUDE_PAT
on the Claude Code action step resolves the actor-permission 403.
Drop the temporary push trigger, the if-guard escape hatch, and the
diagnostic curl step now that the fix is verified.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Snapshot the CI tracker (/api/ci) when building the auto-update issue
body and render a markdown table of self-hosted runner SKU status.
Compute a comma-joined allowed-SKUs list (idle > 0 AND not backlogged
AND not all-offline) and inline it into the @claude prompt so Claude
only opens PRs for config-keys whose hardware segment is currently
available. Also tighten the single-node directive (skip multinode:
true). Falls back gracefully if the tracker is unreachable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add push trigger scoped to docker-monitor-per-config-key-prs so the
monitor + tracker-gated issue body can be exercised end-to-end without
waiting on the weekly cron. Revert before merging to main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- ALLOWED_SKUS now requires pressureLevel == "clear" (was: != backlogged),
  so steady SKUs no longer pass the gate. Available column in the
  rendered table stays as raw capacity.
- Step 3 of the @claude prompt now explicitly tells Claude to run
  gh pr create after pushing a branch — the previous wording let it
  stop at git push, which only emits a Create-PR hint, never an actual PR.
- Every PR opened from the auto-issue must carry the
  full-sweep-enabled label (downstream automation keys off it).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

1 participant