Skip to content

Forward SIGTERM to bundle steps for graceful shutdown#3603

Draft
kichristensen wants to merge 3 commits into
getporter:mainfrom
kichristensen:issue3602
Draft

Forward SIGTERM to bundle steps for graceful shutdown#3603
kichristensen wants to merge 3 commits into
getporter:mainfrom
kichristensen:issue3602

Conversation

@kichristensen

Copy link
Copy Markdown
Contributor

What does this change

When a user cancels a porter install/upgrade/uninstall with Ctrl+C, the running bundle step now receives SIGTERM and has up to 30 seconds to clean up before SIGKILL is sent. Outputs written during that window (e.g. a terraform state file) are captured and persisted as in a normal run.

Previously the cascade looked like this:

  1. User presses Ctrl+C → SIGINT
  2. Porter host cancels context → cnab-go calls docker stop (SIGTERM to container PID 1)
  3. Porter runtime (PID 1) only listened for os.Interrupt, so the OS terminated it immediately on SIGTERM
  4. Mixin process (e.g. terraform) was SIGKILL'd by the kernel
  5. No outputs captured — state lost

After this change:

  1. User presses Ctrl+C → SIGINT or SIGTERM
  2. Porter runtime handles it, cancels context gracefully
  3. Mixin runner sends SIGTERM to the mixin process and waits up to 30 s
  4. Mixin (terraform) saves state, writes outputs, exits
  5. Mixin outputs are read and applied even on non-zero exit
  6. Finalize packs the state bag using a non-cancelled context
  7. Best-effort save of any outputs returned by cnab-go

What issue does it fix

Closes #3602

Notes for the reviewer

Upstream dependency: full end-to-end output capture also requires a fix to cnab-go. In driver/docker/docker.go, the case <-ctx.Done() branch of getContainerResult calls ContainerStop but then returns driver.OperationResult{} (empty) without calling fetchOutputs. That means outputs written inside the container to /cnab/app/outputs/ are never copied back to the host. The Porter-side changes here lay the groundwork so that fix lands cleanly.

Windows: configureGracefulShutdown is a no-op on Windows (build-tagged) since SIGTERM is not a real process signal there; the existing SIGKILL behaviour is preserved.

Grace period: 30 seconds was chosen to match common terraform state-write times; this could be made configurable in a follow-up.

Checklist

  • Did you write tests?
  • Did you write documentation?
  • Did you change porter.yaml or a storage document record? Update the corresponding schema file.
  • If this is your first pull request, please add your name to the bottom of our Contributors list. Thank you for making Porter better! 🙇‍♀️

@lbergnehr

Copy link
Copy Markdown
Contributor

Does the fix in cnab-go only apply to explicit outputs not being captured, ot does it also apply to state, in which case I guess that one would have to be fixed as well in order for this to fully work?

@kichristensen

kichristensen commented Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

State goes through the same path. Finalize (running on the Porter runtime side with a non-cancelled context) packs all state variables into /cnab/app/outputs/porter-state — the same /cnab/app/outputs/ directory that cnab-go's fetchOutputs reads from. So fixing cnab-go to call fetchOutputs in the ctx.Done() branch instead of returning an empty OperationResult{} is required for this to fully work — it will capture both explicit outputs and state in one shot.

…r#3602)

When a user cancels porter with SIGINT, the cancellation now propagates
gracefully to the running bundle step instead of hard-killing it.

- Porter runtime (PID 1 in the invocation image) now handles SIGTERM in
  addition to SIGINT, so Docker's container stop signal reaches it.
- The mixin runner sends SIGTERM to the mixin process on context
  cancellation and waits up to 30 s before falling back to SIGKILL,
  giving steps like terraform time to flush state.
- Mixin outputs are read even when the mixin exits with a non-zero code,
  so state written before shutdown is not silently discarded.
- Finalize (which packs the state bag) now runs with a non-cancelled
  context so state is always persisted after a graceful stop.
- Docker containers are configured with a 30 s stop timeout so SIGKILL
  is not sent before the step has had a chance to clean up.
- action.go makes a best-effort save of any outputs returned by cnab-go
  after context cancellation.

Note: full end-to-end output capture also requires an upstream fix to
cnab-go so that fetchOutputs is called after ContainerStop in the
context-cancelled path of getContainerResult.

Signed-off-by: Kim Christensen <kimworking@gmail.com>
…ture)

- pkg/signals: verify InterruptSignals includes os.Interrupt on all platforms
- pkg/pkgmgmt/client: integration-style test that cancel() sends SIGTERM
  (not SIGKILL) to the child process on POSIX
- pkg/cnab/provider: verify Docker driver is configured with a graceful
  stop timeout on container.Config.StopTimeout
- pkg/runtime: verify mixin outputs are captured even when the mixin exits
  with an error (e.g. after receiving SIGTERM and flushing state)

Signed-off-by: Kim Christensen <kimworking@gmail.com>

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves cancellation behavior for Porter bundle executions so that a cancelled step receives SIGTERM and has time to flush state/outputs before being force-killed, enabling better state persistence (e.g. terraform state) during Ctrl+C / container stop scenarios.

Changes:

  • Forward SIGTERM on context cancellation for mixin/package executions (POSIX) with a 30s grace period before force-kill; keep current behavior on Windows.
  • Capture and apply mixin outputs even when a step exits with a non-zero status, and ensure runtime finalization (state bag packing) still runs after cancellation.
  • Configure the CNAB Docker driver to use a 30s container stop timeout and attempt best-effort output persistence on cancellation.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
pkg/signals/signal.go Exposes the set of OS signals used to trigger graceful shutdown.
pkg/signals/signal_test.go Adds a unit test to ensure interrupt signals include os.Interrupt.
cmd/porter/main.go Updates interrupt handling to subscribe to SIGTERM as well as SIGINT (platform-dependent).
pkg/pkgmgmt/client/runner.go Hooks graceful shutdown configuration into the package/mixin runner command creation.
pkg/pkgmgmt/client/runner_posix.go Implements SIGTERM forwarding + 30s grace period on POSIX via exec.Cmd cancellation behavior.
pkg/pkgmgmt/client/runner_windows.go Adds Windows no-op implementation for graceful shutdown forwarding.
pkg/pkgmgmt/client/runner_graceful_shutdown_test.go Adds a POSIX-only test validating SIGTERM forwarding on cancellation.
pkg/runtime/runtime.go Ensures outputs are read/applied even on mixin failure; runs Finalize even when cancelled.
pkg/runtime/runtime_test.go Adds a test validating output capture/application after mixin failure.
pkg/runtime/helpers.go Exposes TestConfig from the test runtime helper for use in new tests.
pkg/cnab/provider/driver.go Configures Docker driver containers with a 30s stop timeout for graceful shutdown.
pkg/cnab/provider/driver_test.go Adds coverage asserting Docker driver stop timeout configuration is applied.
pkg/cnab/provider/action.go Attempts best-effort saving of operation outputs when execution ends due to context cancellation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/signals/signal.go
Comment thread pkg/pkgmgmt/client/runner_posix.go
Comment thread pkg/runtime/runtime.go
Comment thread pkg/cnab/provider/action.go
Comment thread pkg/runtime/runtime_test.go Outdated
Comment thread pkg/pkgmgmt/client/runner_windows.go Outdated
@kichristensen kichristensen changed the title feat: forward SIGTERM to bundle steps for graceful shutdown Forward SIGTERM to bundle steps for graceful shutdown Jun 13, 2026
Signed-off-by: Kim Christensen <kimworking@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bundle step should get a cancel signal in order to allow graceful shutdown with cleanup during porter SIGINT

3 participants