fix(erofs): write per-instance VMDK descriptor to bundle dir; use atomic write#212
Conversation
There was a problem hiding this comment.
Pull request overview
Adds self-healing for the cached merged_fs.vmdk descriptor in the erofs snapshotter: descriptors are now validated on each mount, regenerated when stale, written with extent paths relative to the descriptor directory, and persisted via an atomic temp-file + rename.
Changes:
internal/erofs/vmdk.go: NewParseVMDKDescriptorExtents;DumpVMDKDescriptortakes abaseDirto resolve relative paths for stat-ing;DumpVMDKDescriptorToFilewrites atomically via.tmp+ rename.internal/shim/task/mount.go: Computes per-descriptor relative device paths, validates the existing descriptor via newvmdkDescriptorIsStale, and regenerates on stale/parse failure.internal/erofs/vmdk_test.go: Adds round-trip, stale-detection, relative-path, large-extent, and atomic-write tests.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| internal/erofs/vmdk.go | Adds parser, atomic write, baseDir-aware stat, and relative-path support for descriptor writing. |
| internal/shim/task/mount.go | Builds relative extent paths, validates the cached descriptor, and triggers regeneration on stale/parse failure. |
| internal/erofs/vmdk_test.go | New unit tests covering round-trip parse, stale detection scenarios, relative-path resilience, and atomic write. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
b27d0b0 to
d5a569c
Compare
|
Ugh, this is still way too fragile |
|
Agreed on both fronts — generating per-instance and never persisting is the cleaner shape, and matches what the rest of the producer code in mount.go does (the bundle dir is already the natural home for ephemeral per-VM state). I'll rework this PR to:
|
d5a569c to
c51364d
Compare
…mic write The merged_fs.vmdk descriptor was written to the snapshotter directory and reused across mounts via an existence check. If the backing extent files moved or were cleaned up, the cached descriptor became permanently stale with no recovery path. Switch to a simpler, more correct design: - Write the descriptor to the per-instance OCI bundle directory so it is naturally scoped to one container lifetime and torn down with the bundle. No descriptor is ever shared across instances. - Regenerate unconditionally on every transformMounts call. There is no cache to validate, so there is no stale state possible. - Use an atomic write (os.CreateTemp + fsync + os.Rename) so concurrent writers each get a unique temporary file name and a reader never observes a partially-written descriptor. Thread bundleDir through setupMounts and transformMounts so the erofs branch can place the file at filepath.Join(bundleDir, "merged_fs.vmdk"). No new public API is introduced. Signed-off-by: Austin Vazquez <austinvazquez@users.noreply.github.com>
c51364d to
c9d57bc
Compare
hsiangkao
left a comment
There was a problem hiding this comment.
If vmdks are generated in the bundle dirs, I think there is no concurrent write anyway.
but leaving write+fsync+rename is also fine.
The
merged_fs.vmdkdescriptor was written to the snapshotter directory and reused across mounts via a bare existence check. If the backing extent files moved or were cleaned up, the cached descriptor became permanently stale with no recovery path.Switch to a simpler, more correct design:
filepath.Join(bundleDir, "merged_fs.vmdk")) so the file is scoped to one container lifetime and torn down with the bundle. No descriptor is shared across instances.transformMountscall — no existence check, no cache, no stale state possible.os.CreateTemp+fsync+os.Renameso concurrent writers each get a unique temporary filename and a reader never observes a partially-written file.bundleDiris threaded throughsetupMountsandtransformMounts; the erofs branch writes the descriptor tofilepath.Join(bundleDir, "merged_fs.vmdk").No new public API is introduced.