Improve Linux backend: process_rw partial-transfer fix, state(), stable module handles, envar addresses #8
Open
killerra wants to merge 3 commits into
- Implement Process::state() via /proc/<pid>/stat: zombie/dead states map to ProcessState::Dead (with exit_code when readable), a vanished PID maps to Dead, and permission errors stay Unknown.
- Use the module base address as the opaque module handle instead of an index into a freshly parsed maps list, eliminating the race between module_address_list_callback and module_by_address.
- Resolve primary_module_address() through /proc/<pid>/exe instead of assuming the 0th mapping, falling back to the first mapping.
- Report real in-process addresses for environment variables using env_start from /proc/<pid>/stat, and return the env block address from environment_block_address() (plugin API 2).
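The state mapping in the first bullet can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in this PR: the `State` enum and the `parse_state_char`, `classify`, and `state_of` helpers are hypothetical names, and exit-code extraction is left out.

```rust
use std::fs;
use std::io::ErrorKind;

/// Hypothetical stand-in for memflow's `ProcessState` (no exit code here).
#[derive(Debug, PartialEq)]
pub enum State {
    Alive,
    Dead,
    Unknown,
}

/// Extracts the one-character state field from a `/proc/<pid>/stat` line.
/// The comm field is parenthesised and may itself contain spaces or ')',
/// so the state is located relative to the last ')'.
pub fn parse_state_char(stat: &str) -> Option<char> {
    let rest = &stat[stat.rfind(')')? + 1..];
    rest.split_whitespace().next()?.chars().next()
}

/// Zombie (Z) and dead (X/x) map to Dead; any other state counts as Alive.
pub fn classify(stat: &str) -> State {
    match parse_state_char(stat) {
        Some('Z') | Some('X') | Some('x') => State::Dead,
        Some(_) => State::Alive,
        None => State::Unknown,
    }
}

/// A vanished PID (stat file not found) maps to Dead; other I/O errors,
/// including permission errors, stay Unknown.
pub fn state_of(pid: u32) -> State {
    match fs::read_to_string(format!("/proc/{pid}/stat")) {
        Ok(stat) => classify(&stat),
        Err(e) if e.kind() == ErrorKind::NotFound => State::Dead,
        Err(_) => State::Unknown,
    }
}
```

Searching for the last `)` matters because a process can set its own name to something like `a) Z`, which would otherwise be misread as the state field.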
After a partial process_vm_readv/writev, the retry syscall resumes at iov offset `offset`, but the result-dispatch loop iterated the local/remote iovecs and temp_meta from index 0. On every pass after the first, already-reported entries were re-reported with the wrong metadata and local slices, the entries the retry actually transferred were never reported, byte accounting used the wrong iov_lens, and out_fail flagged the wrong element. This corrupted result attribution for any batched read/write spanning an unmapped hole.

Align dispatch and accounting with the syscall window by skipping the first `win` entries on all three iterators.

Add a regression test that batches [valid, unmapped, valid] reads against our own PID; it fails against the previous code (first region duplicated, third dropped).
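The window alignment can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not the code in src/linux/mem.rs: `Entry`, `fake_readv`, and `dispatch` are hypothetical names, the fake syscall transfers only whole entries, entry lengths are assumed nonzero, and a single iterator stands in for the three real ones.

```rust
/// Hypothetical stand-in for one element of a batched transfer.
#[derive(Clone, Copy)]
pub struct Entry {
    pub len: usize,
    pub mapped: bool, // whether the remote range is readable
}

/// Simulated syscall: transfers whole entries from the start of `window`
/// and stops at the first unmapped one, returning the bytes transferred.
fn fake_readv(window: &[Entry]) -> usize {
    window.iter().take_while(|e| e.mapped).map(|e| e.len).sum()
}

/// Returns (indices reported as transferred, indices reported as failed).
pub fn dispatch(entries: &[Entry]) -> (Vec<usize>, Vec<usize>) {
    let (mut ok, mut fail) = (Vec::new(), Vec::new());
    let mut win = 0;
    while win < entries.len() {
        // The retry starts at the current window offset...
        let mut remaining = fake_readv(&entries[win..]);
        let mut advanced = 0;
        // ...so dispatch must skip the same `win` entries to stay aligned.
        for (idx, e) in entries.iter().enumerate().skip(win) {
            advanced += 1;
            if remaining >= e.len {
                remaining -= e.len;
                ok.push(idx);
            } else {
                // First entry the syscall did not cover: flag it, move past it.
                fail.push(idx);
                break;
            }
        }
        win += advanced;
    }
    (ok, fail)
}
```

In this model, a `[valid, unmapped, valid]` batch reports entries 0 and 2 as transferred and entry 1 as failed. Removing `.skip(win)` reproduces the symptom described above: entry 0 is reported twice and entry 2 is never reported.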
Follow-ups from the Linux backend soundness audit:
- Decode the waitpid(2)-style status word from /proc/<pid>/stat into a
real exit code in Process::state(): normal exits report the exit(3)
code, signal deaths report the negated signal number. Previously the
raw status word leaked through (exit(3) surfaced as Dead(768)).
- Derive OsInfo.arch and ProcessInfo::{sys_arch,proc_arch} from the
compile target (x86_64/x86/aarch64) instead of hardcoding x86-64, and
make the process module list callback emit the same arch field that
module_by_address keys on.
- Replace OS kernel-module handles (indices into a name-sorted snapshot
that shift on module load/unload) with a stable hash of the module
name (std DefaultHasher, fixed-seeded), resolved against the live
snapshot.
- Report the real process state in process_info_by_pid via the shared
process_state() helper instead of hardcoding Alive.
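The status-word decoding in the first follow-up can be sketched as below. `decode_wait_status` is a hypothetical name, and the sketch assumes the word describes a terminated process (stop/continue encodings are not handled).

```rust
/// Decodes a waitpid(2)-style status word into the convention described
/// above: a normal exit yields the exit(3) code, a signal death yields the
/// negated signal number. Assumes the process has terminated.
pub fn decode_wait_status(status: i32) -> i32 {
    let sig = status & 0x7f; // low 7 bits: terminating signal, 0 if none
    if sig == 0 {
        (status >> 8) & 0xff // bits 8..15: exit code (WEXITSTATUS)
    } else {
        -sig // killed by a signal (WTERMSIG), reported negated
    }
}
```

For example, a raw status of 768 is `3 << 8`, so it decodes to 3, which is the `Dead(768)` versus `Dead(3)` difference noted above. The core-dump flag (bit 7) is masked out, so a status of 139 decodes to -11.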
1d0b0d9 to 7a0104c
Summary
This PR closes three gaps in the Linux backend relative to the Windows backend, all in `src/linux/process.rs`:
- `Process::state()` (previously always `ProcessState::Unknown`): reads `/proc/<pid>/stat`. Zombie (`Z`) and dead (`X`/`x`) states map to `ProcessState::Dead` with `exit_code` when the kernel exposes it (requires ptrace-read permission, kernel >= 3.5); a vanished PID maps to `Dead(0)` since the exit code is unrecoverable after the process is reaped; permission errors remain `Unknown`.
- Stable module handles: the module base address is now the opaque module handle, replacing an index into a freshly parsed `/proc/<pid>/maps` list. This mirrors how the Windows backend uses `HMODULE` and removes the race where mappings changing between `module_address_list_callback` and `module_by_address` made indices resolve to the wrong module. `primary_module_address()` now resolves `/proc/<pid>/exe` against the maps (falling back to the first file-backed mapping) instead of assuming the 0th entry.
- Environment variable addresses (plugin API 2): `/proc/<pid>/environ` is byte-identical to the `env_start..env_end` memory range from `/proc/<pid>/stat`, so each `EnvVarInfo.address` is derived from `env_start` + byte offset. `environment_block_address()` now returns `env_start` instead of the `Address::NULL` stub.

Testing
- `cargo build`, `cargo clippy`, and `cargo fmt --check` are clean against memflow 0.2.4.
- Tests: a live process reports `Alive`; an unreaped child reports `Dead(0)`; `primary_module_address()` resolves to the actual executable mapping and round-trips through `module_by_address`; module list addresses match mapping bases.
- The `#[cfg(memflow_plugin_api = "2")]` envar path compiles out against released memflow (plugin API 1), same as the pre-existing envar code in this repo, so it is not covered by local builds or CI. The changes only use the same `EnvVarInfo` surface as the existing Windows API-2 code plus `Stat::env_start` from procfs 0.15.1.

Audit follow-ups (second push)
A soundness review of the rest of the Linux backend surfaced one serious pre-existing bug plus several metadata gaps, addressed in two additional commits:
- `process_rw` (`src/linux/mem.rs`): after a partial `process_vm_readv`/`writev`, the retry syscall resumed at the iov window offset, but result dispatch iterated the iovecs and metadata from index 0, re-reporting already-delivered entries with wrong metadata/buffers, dropping the entries the retry actually transferred, and flagging the wrong element as failed. Any batched read spanning an unmapped hole was affected. Comes with a regression test (batched `[valid, unmapped, valid]` read against our own PID) that fails on the previous code.
- Decode the wait status in `state()` so `exit(3)` reports `Dead(3)` rather than `Dead(768)`; signal deaths report the negated signal number. Unit-tested.
- Derive `OsInfo.arch` and `ProcessInfo::{sys_arch,proc_arch}` from the compile target, and make the module list callback emit the same arch field `module_by_address` keys on.
- `process_info_by_pid` reports the real process state via the shared helper instead of hardcoding `Alive`.

Deferred (noted, not in this PR): CI coverage for the plugin-API-2 cfg paths, and a pidfd-based mitigation for the inherent PID-recycling race.
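For reference, the envar address derivation from the Summary (`env_start` + byte offset into the environ block) can be sketched as follows. `envar_addresses` is a hypothetical helper, not the function in this PR, and it takes the raw bytes of `/proc/<pid>/environ` as a slice rather than reading the file.

```rust
/// Pairs each NUL-terminated entry of an environ block with its in-process
/// address, assuming the block begins at `env_start` in the target's
/// address space and the bytes match that memory range exactly.
pub fn envar_addresses(env_start: u64, environ: &[u8]) -> Vec<(u64, &[u8])> {
    let mut out = Vec::new();
    let mut offset = 0usize;
    for entry in environ.split(|&b| b == 0) {
        if !entry.is_empty() {
            out.push((env_start + offset as u64, entry));
        }
        // Advance past the entry and its NUL terminator.
        offset += entry.len() + 1;
    }
    out
}
```

With `env_start = 0x1000` and a block of `A=1\0BB=22\0`, the entries land at `0x1000` and `0x1004`.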