Summary
Five-plus deploy failures over 48 hours on a Linux App Service (Node 22, Next.js 14.2.35), all 409 Conflict via OneDeploy, all triggered when concurrency.cancel-in-progress: true cancelled a prior in-flight workflow run.
Pattern: cancel-in-progress fires → Azure-side deploy state machine doesn't fully clear → next deploy hits 409. Manual recovery requires Azure Portal "Restart" + workflow re-run.
This appears to be a race between GitHub Actions' cancellation propagation and App Service's internal deploy lock release.
Reproducer
- App Service: Linux, Node 22, ~152MB artifact (Next.js standalone build + node_modules.tar.gz)
- Workflow uses azure/webapps-deploy@v3 with concurrency.cancel-in-progress: true
- Push commit A → deploy starts
- Push commit B before A finishes → A is cancelled mid-OneDeploy
- B's deploy step fails immediately with Conflict (CODE: 409)
- Workflow re-runs of B also 409 until App Service is manually Restarted
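
For reference, the workflow shape behind the steps above looks roughly like the sketch below. This is a minimal illustration, not the actual workflow file: the workflow name, trigger branch, concurrency group name, app name, artifact path, and the use of publish-profile authentication are all placeholders or assumptions.

```yaml
# Minimal sketch of the workflow shape in the reproducer.
# Anything marked "hypothetical" or "assumption" is a placeholder, not our real config.
name: deploy                          # hypothetical workflow name

on:
  push:
    branches: [main]                  # assumption: deploys run on pushes to main

concurrency:
  group: deploy-${{ github.ref }}     # hypothetical group name
  cancel-in-progress: true            # a newer push cancels the in-flight run

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Build steps omitted; they produce the ~152MB artifact.
      - uses: azure/webapps-deploy@v3
        with:
          app-name: my-app            # hypothetical App Service name
          publish-profile: ${{ secrets.AZURE_WEBAPP_PUBLISH_PROFILE }}  # assumption: publish-profile auth
          package: ./release.zip      # hypothetical artifact path
```

With this shape, pushing commit B while A's deploy step is still running cancels A's run, which is the point at which the 409s begin.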
Occurrences (workflow run IDs)
- 27731256128 (2026-06-17, commit 9ca627f) — 409 after cancel
- 27741202083 (2026-06-18, commit 78ae3e6) — 409 after cancel
- 27783177387 (2026-06-18, commit 654db62) — 409 twice in a row (initial + rerun-failed)
What we've ruled out
- Azure Service Health: no incidents in our region during any occurrence
- Build artifact: identical artifact deploys successfully on retry (after manual Restart)
- Code: pattern reproduces across distinct commits / unrelated changes
- Plan SKU / region noise: pattern persists across multiple days
Questions for the team
- Is there a known timing race between GHA cancellation and App Service deploy lock release?
- Is there an API to force-clear the App Service deploy lock without a full Restart?
- Should cancel-in-progress: true be considered unsafe with webapps-deploy@v3 on Linux? Should we document a recommended alternative?
- Even on successful runs the deploy step takes 28-38 minutes for a 152MB artifact — is that expected for OneDeploy on Linux? Service Health shows no degradation.
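
On the recommended-alternative question, one candidate is sketched below. It is untested against this failure, and the group name is hypothetical. With cancel-in-progress: false, GitHub Actions lets the in-flight run finish and holds at most one pending run per concurrency group, so a newer push replaces an older pending run instead of interrupting OneDeploy.

```yaml
# Candidate alternative (untested against the 409 pattern).
concurrency:
  group: deploy-${{ github.ref }}     # hypothetical group name
  cancel-in-progress: false           # newer run waits; the in-flight deploy is not cancelled
```

The trade-off is latency: with deploy steps taking 28-38 minutes, commit B would wait for A to finish before its own deploy starts.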
Happy to share full GHA logs, app_log_stream excerpts, and the workflow file on request.