Skip to content

shim/task: wait for IO drain before closing stream connections#211

Merged
dmcgowan merged 1 commit into
containerd:mainfrom
dmcgowan:fix-stdio-dropped-packets
Jun 2, 2026
Merged

shim/task: wait for IO drain before closing stream connections#211
dmcgowan merged 1 commit into
containerd:mainfrom
dmcgowan:fix-stdio-dropped-packets

Conversation

@dmcgowan
Copy link
Copy Markdown
Member

@dmcgowan dmcgowan commented Jun 1, 2026

ioShutdown was closing the vsock stream connections before the host copy goroutines had finished draining them. A goroutine blocked in io.CopyBuffer's Read would see 'use of closed network connection' and return early, silently dropping bytes that were still buffered in the kernel socket receive queue (close-before-drain race).

Fix: wait for ioDone (closed when both stdout and stderr copy goroutines return) before closing streams. In normal operation ioDone always fires on its own — the container process exiting closes the runc pipe, vminitd sees EOF and closes its vsock conn, which propagates EOF to the host copy goroutines. ctx.Done() is only reached in exceptional cases (VM crash, vminitd hang, kernel wedge), where it acts as a deadline-bounded fallback to avoid pinning cleanup indefinitely.

Copilot AI review requested due to automatic review settings June 1, 2026 07:31
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a close-before-drain race in the shim IO shutdown path by ensuring stdout/stderr copy goroutines can finish draining buffered data before the underlying vsock stream connections are closed.

Changes:

  • Wait for ioDone (stdout/stderr copy completion) before closing the stream connections during IO shutdown.
  • Preserve ctx.Done() as an exceptional-case escape hatch, returning the context error when cancellation/deadline occurs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/shim/task/io.go
ioShutdown was closing the vsock stream connections before the host
copy goroutines had finished draining them. A goroutine blocked in
io.CopyBuffer's Read would see 'use of closed network connection' and
return early, silently dropping bytes that were still buffered in the
kernel socket receive queue (close-before-drain race).

Fix: wait for ioDone (closed when both stdout and stderr copy goroutines
return) before closing streams. In normal operation ioDone always fires
on its own — the container process exiting closes the runc pipe, vminitd
sees EOF and closes its vsock conn, which propagates EOF to the host copy
goroutines. ctx.Done() is only reached in exceptional cases (VM crash,
vminitd hang, kernel wedge), where it acts as a deadline-bounded
fallback to avoid pinning cleanup indefinitely.

Signed-off-by: Derek McGowan <derek@mcg.dev>
@dmcgowan dmcgowan force-pushed the fix-stdio-dropped-packets branch from b0831a2 to 918929f Compare June 1, 2026 07:51
@dmcgowan
Copy link
Copy Markdown
Member Author

dmcgowan commented Jun 1, 2026

Shim tests currently fail without this change https://github.com/dmcgowan/shimtest/actions/runs/26741679025/job/78806983484

Comment thread internal/shim/task/io.go
@dmcgowan dmcgowan merged commit 1ca01a5 into containerd:main Jun 2, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants