feat(fuzz): add egfx_multi_frame oracle and target#1336
Open
Greg Lamberson (glamberson) wants to merge 1 commit into
Open
feat(fuzz): add egfx_multi_frame oracle and target#1336Greg Lamberson (glamberson) wants to merge 1 commit into
Greg Lamberson (glamberson) wants to merge 1 commit into
Conversation
This was referenced May 28, 2026
Implements target 5 of Devolutions#1316 (egfx fuzz-coverage umbrella): a multi-frame oracle that drives a sequence of Arbitrary-derived GfxPdus through a single GraphicsPipelineClient + surface-cache pair across each fuzz iteration, exposing cross-PDU state corruption that single-shot fuzzers cannot reach. Harness shape (per the design note posted as issuecomment-4559459227 on Devolutions#1316). The oracle: 1. Constructs one GraphicsPipelineClient (no-op handler, no H.264 decoder) at iteration start. The surface cache, state machine, and ZGFX decompressor history are owned by this single instance. 2. Initialises the channel by calling DvcProcessor::start so the client moves into its post-start state. 3. Uses arbitrary::Unstructured to derive Vec<GfxPdu> from the raw fuzz input. Each GfxPdu variant is Arbitrary via the cascade in PR Devolutions#1334. 4. For each PDU: encodes back to wire bytes via encode_vec, wraps in a single uncompressed ZGFX segment via wrap_uncompressed, and feeds the wire bytes to the client's public DvcProcessor::process entry point. Errors and panics propagate to libFuzzer naturally. The harness deliberately routes through the public process() entry rather than the private handle_pdu so the dispatch path exercises the same ZGFX-decompress + GfxPdu-decode + per-variant-dispatch + state machine + surface cache flow that production traffic takes. The uncompressed ZGFX wrapper bypasses the ZGFX decoder layer (already covered by egfx_zgfx_decompress in a separate PR) and concentrates fuzz pressure on the dispatch + state machine. What this catches: panics or sanitizer reports along the dispatch + state machine path under adversarially-ordered PDU sequences; inconsistent surface-cache state under attacker-controlled CreateSurface / DeleteSurface / Map* orderings; corrupted frame-id state from interleaved StartFrame / EndFrame / FrameAcknowledge sequences; ZGFX-wrapper integration bugs separate from the standalone ZGFX coverage. What this does NOT catch: cross-frame H.264 decoder state corruption. The client is constructed with h264_decoder: None, so H264-bearing PDUs (WireToSurface1 with AVC codecs) skip the H.264 decoder. The standalone egfx_avc420_decode and egfx_avc444_decode targets cover the H.264 wrapper. Wiring a real (or mock) H.264 decoder into this harness can be a follow-up if frame-to-frame H.264 state coverage surfaces as a gap. Bug discovery during development. The first smoke fuzz on the in-progress harness surfaced a bit_field::set_bits panic in Timestamp::encode where derive(Arbitrary) generated u8/u16 values exceeding the encoder's 6-bit / 10-bit pack widths. The fix landed in PR Devolutions#1334 (the cascade prerequisite) as manual Arbitrary impls for Timestamp, QuantQuality, and the Encoding bitflag, mapping each field to its wire-allowed range. This is the "sanitize via mask" shape from Devolutions#1122's body. Validation. 15-minute rigorous libFuzzer + ASan smoke fuzz on the final SHA: 3,659,157 iterations in 901 seconds (~4,061 exec/s sustained), peak RSS 96 MB, zero panics, zero sanitizer reports, zero OOMs. Corpus grew to ~4,558 entries via libFuzzer's coverage discovery. The regression-replay test check_egfx_multi_frame passes against the seed-empty corpus entry. Depends on PR Devolutions#1334 (Arbitrary cascade across ironrdp-egfx public PDU types). CI on this PR's branch reds until Devolutions#1334 merges or this PR's branch rebases onto master post-merge. That's the correct mechanical signal of the dependency per the no-exclude-then-follow-up rule. Refs Devolutions#1316.
223058f to
d6911a8
Compare
Greg Lamberson (glamberson)
added a commit
to lamco-admin/IronRDP
that referenced
this pull request
Jun 8, 2026
Implements target 6 of Devolutions#1316 (egfx fuzz-coverage umbrella): the egfx_surface_state oracle. Sibling of egfx_multi_frame (target 5, PR Devolutions#1336) with two distinct properties: 1. The PDU stream is narrowed to a SurfaceLifecyclePdu enum that contains only the seven state-affecting variants: ResetGraphics, CreateSurface, DeleteSurface, MapSurfaceToOutput, StartFrame, EndFrame, FrameAcknowledge. Each iteration concentrates fuzz pressure on the surface and frame state machine rather than spreading across all 23 GfxPdu variants. 2. The oracle maintains a parallel ExpectedState model alongside the client. For each successfully-encoded PDU the model is updated to mirror the client's documented state transition (CreateSurface inserts a surface unless width or height is zero; DeleteSurface removes; MapSurfaceToOutput updates is_mapped and origin; ResetGraphics clears all surfaces; EndFrame increments the running frame counter). After dispatch the client's observable state via the public get_surface and total_frames_decoded getters is asserted against the model. This catches a different bug class from egfx_multi_frame's panic / sanitizer oracle: logic bugs in state transitions that produce wrong but non-crashing observable state. Examples covered: client retains a surface after DeleteSurface, client clobbers is_mapped on a surface that did not receive MapSurfaceToOutput, client fails to clear all surfaces on ResetGraphics, client increments total_frames_decoded on a PDU other than EndFrame. The model encodes the implementation's current documented behavior (e.g., zero-dimension CreateSurface is silently skipped per handle_create_surface), so the oracle catches drift from that behavior. Spec-compliance verification is a separate concern that would require a parallel spec-only model. What this does NOT catch: state transitions that exist only in private fields without a public getter (current_frame_id, frames_queued). Exposing observable access to those fields is a separate egfx-public-API change, out of scope here. Validation. 15-minute rigorous libFuzzer + ASan smoke fuzz on the final SHA: 2,770,751 iterations in 901 seconds (~3,075 exec/s sustained), peak RSS 102 MB, zero panics, zero sanitizer reports, zero state-model assertion failures. Corpus grew to ~1,840 entries via libFuzzer's coverage discovery. The check_egfx_surface_state regression-replay test passes against the seed-empty corpus entry. Stack dependency. This PR is stacked on PR Devolutions#1336 (egfx_multi_frame, target 5) which is stacked on PR Devolutions#1334 (Arbitrary cascade). CI on this branch will red until both ancestors merge or this branch rebases onto master post-merge. With this PR the Devolutions#1316 umbrella has its initial six target slots filled (1 merged, 2+2b+3+5+6 open). Refs Devolutions#1316. Depends on Devolutions#1334 and Devolutions#1336.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
egfx_multi_frameoracle and target.issuecomment-4559459227on egfx: extend fuzz coverage to OpenH264-wrapper, ZGFX, multi-frame state, and encoder round-trip #1316): anArbitrary-derivedVec<GfxPdu>drives a singleGraphicsPipelineClient+ surface-cache pair across each fuzz iteration, exposing cross-PDU state corruption that single-shot fuzzers cannot reach.DvcProcessor::processentry. The harness deliberately routes throughprocess()rather than the privatehandle_pduso production traffic's full path is exercised (ZGFX-decompress + GfxPdu-decode + per-variant dispatch + state machine + surface cache).Dependency
This PR depends on #1334 (Arbitrary cascade across
ironrdp-egfxpublic PDU types). The harness usesArbitrary-derivedVec<GfxPdu>, which requires the cascade. CI on this branch will red until #1334 merges or this branch rebases onto master post-merge. That is the correct mechanical signal of the dependency per the no-exclude-then-follow-up rule.Validation
cargo xtask ciclean locally on1.89.0: fmt, lints, tests, typos, locks all green.check_egfx_multi_frameregression-replay test passes against the seed-empty corpus entry.What this catches
CreateSurface/DeleteSurface/Map*orderings.StartFrame/EndFrame/FrameAcknowledgesequences.egfx_zgfx_decompress(PR feat(fuzz): add egfx_zgfx_decompress oracle and harden the ZGFX decoder against attacker-controlled input #1333).What this does NOT catch
h264_decoder: None, soWireToSurface1PDUs carrying AVC payloads skip the decoder. The standaloneegfx_avc420_decode(PR feat(fuzz): add egfx_avc420_decode oracle and target #1326) andegfx_avc444_decode(PR feat(fuzz): add egfx_avc444_decode oracle and target #1327) targets cover the H.264 wrapper. Wiring a real or mock H.264 decoder into this harness can be a follow-up if frame-to-frame H.264 state coverage surfaces as a gap.Notes
bit_field::set_bitspanic inTimestamp::encodewherederive(Arbitrary)generatedu8/u16values exceeding the encoder's 6-bit and 10-bit pack widths. The fix landed preemptively in PR feat(egfx): cascade Arbitrary derives across ironrdp-egfx public PDU types #1334 as manualArbitraryimpls forTimestamp,QuantQuality, and theEncodingbitflag (mapping each field to its wire-allowed range). That follows thesanitize via maskshape from ImplementArbitraryfor more PDU structs #1122's body.as _imports forArbitraryandDvcProcessorper clippy'sunused_trait_nameslint.NoOpHandleris an empty struct: everyGraphicsPipelineHandlermethod has a default impl in the trait.Refs #1316. Depends on #1334.