perf(engine): segment-scope SDR→HDR preflight#445
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
miguel-heygen
left a comment
There was a problem hiding this comment.
Review: perf(engine): segment-scope SDR→HDR preflight
Correct approach — mirrors the established VFR segment-scope pattern. -ss before -i (input seeking), -t after -i, smart metadata reuse avoids extra ffprobe, shallow copy prevents caller mutation. The claimed perf numbers (87% drop in hdrPreflightMs) are consistent with scoping a 30-min transcode to 2 seconds.
P1 — Important (inherited, not a regression)
mediaStart beyond source duration → silent empty output. If entry.video.mediaStart > metadata.durationSeconds, -ss seeks past EOF and FFmpeg silently produces a 0-byte file. The subsequent entry.videoPath = convertedPath then points to a broken file. Same edge case exists in the VFR path. Since this code is being touched, worth adding:
if (startTime >= metadata.durationSeconds) { errors.push(...); continue; }P2 — Should fix
No validation that duration > 0 before passing to FFmpeg. If end === start, segDuration = 0, FFmpeg receives -t 0, and silently produces an empty output. One-line guard:
if (duration <= 0) {
throw new Error(`SDR→HDR conversion: duration must be positive (got ${duration})`);
}P3 — Suggestions
- Test lives inside the "on a VFR source" describe block but is about HDR preflight scoping. A separate
describeblock (or renaming to something broader like "extractAllVideoFrames preflight normalization") would improve readability. - Test uses
extractVideoMetadatawhile production code usesextractMediaMetadata. If the latter is canonical going forward, the test should match. - Benchmark reproducibility — the PR body references
/tmp/hf-fixtures/hdr-sdr-mixed-scopewhich isn't committed. Consider adding a script reference for anyone wanting to reproduce the numbers.
Verdict: Approve with minor suggestions
Core change is correct and well-tested. Push for the duration > 0 guard — it's a one-line fix that prevents a silent failure mode.
76948c9 to
5cfabff
Compare
…F mediaStart Address Miguel's review on #445 — harden against two silent-failure modes in convertSdrToHdr that produce 0-byte output files: - duration <= 0: throw explicit Error inside convertSdrToHdr so every caller (current and future) is protected. FFmpeg's `-t 0` silently writes an empty output that the downstream extractor treats as a valid (empty) file; the throw surfaces the misuse. - startTime >= metadata.durationSeconds at the HDR call site: skip the entry and push a descriptive error instead of invoking ffmpeg with `-ss` past EOF (also yields 0-byte output). Matches Miguel's suggested caller-side shape with errors.push + continue. This edge case exists in the VFR path too; kept scope tight here since that's inherited and not a regression.
1be4ec0 to
b66273d
Compare
|
Rebased onto the updated #444 head and pushed the P2 + P1 fixes from <https://github.com/heygen-com/hyperframes/pull/445#pullrequestreview-4164954688|your review>:
Scope note on P1: the same edge case exists in the VFR path too as you pointed out, but I kept the fix tight to HDR since that's what this PR actually touches. Happy to mirror it into the VFR path in a follow-up so the symmetry holds and we don't paper over the inherited issue forever. On the P3s:
Engine tests still green (459/459) with the new guards in place. |
… Phase 3 Address Miguel's P3 on #445 (#445 (comment)): The previous `errors.push({...}); continue;` in the HDR preflight only short-circuited the normalization step — it did NOT remove the invalid entry from `resolvedVideos`, so Phase 3 still called extractVideoFramesRange with the same past-EOF mediaStart and surfaced a second raw FFmpeg error ("Nothing was written into output file") for the same clip. Miguel reproduced this with a synthetic 2s SDR + 1s HDR case and mediaStart=5. Fix: track the skipped-at-HDR-preflight indices in a set, then splice them out of every parallel array (resolvedVideos, videoMetadata, videoColorSpaces) after the HDR preflight loop completes. Iterate backwards so the indices stay stable while splicing. Phase 2b (VFR preflight) and Phase 3 (extract) now see only validated entries, so there's exactly one descriptive error per bad clip instead of two. Downstack #446 adds a fourth parallel array (cacheKeyInputs) and will extend this splice to cover that array during its rebase. The same inherited bug still exists on the VFR path — leaving that for a follow-up PR since it's out of scope for this HDR-preflight fix and was noted as inherited (not a regression) in the original review.
Downstack extension of the #445 past-EOF splice fix. The parallel-array splice that removes HDR-preflight-skipped entries before Phase 3 was introduced on #445 to splice (resolvedVideos, videoMetadata, videoColorSpaces). This commit adds the fourth parallel array (cacheKeyInputs) that #446 introduces, so the cache lookup at `cacheKeyInputs[i]` in tryCachedExtract doesn't point at a stale slot after the splice. All 480 engine tests pass.
miguel-heygen
left a comment
There was a problem hiding this comment.
Re-checked the live head after the past-EOF follow-up. The new splice removes skipped HDR entries before later phases, my synthetic past-EOF repro now yields exactly one descriptive error, and the latest regression CI is green.
## What Adds per-phase timings and counters to `extractAllVideoFrames` and surfaces them on the producer's `RenderPerfSummary` as `videoExtractBreakdown` alongside a new `tmpPeakBytes` workDir size sample. ## Why Phase 2 video extraction has five distinct sub-phases (resolve, HDR probe, HDR preflight, VFR probe, VFR preflight, per-video extract) and today they collapse into a single `videoExtractMs` stage timing. That makes every subsequent perf PR in this stack immeasurable — you can't tell whether a win came from cache hits, preflight scope reduction, or pure extraction speed. This PR is foundational for PR #445 (segment-scope HDR preflight) and PR #446 (content-addressed extraction cache). ## How - New `ExtractionPhaseBreakdown` type with `resolveMs`, `hdrProbeMs`, `hdrPreflightMs/Count`, `vfrProbeMs`, `vfrPreflightMs/Count`, `extractMs`, `cacheHits`, `cacheMisses`. Populated inline with `Date.now()` wrappers — overhead is sub-millisecond on every phase. - Returned on `ExtractionResult.phaseBreakdown`. - Producer extends `RenderPerfSummary` with `videoExtractBreakdown?: ExtractionPhaseBreakdown` and `tmpPeakBytes?: number`. `tmpPeakBytes` is sampled from the workDir right before cleanup via a new recursive-size helper that swallows errors (purely observational — a missing workDir must never fail the render). No changes to the capture-lifecycle resource tracking — earlier versions of this instrumentation plumbed injector LRU stats through `RenderOrchestrator`, which conflicted hard with upstream #371 (`buildHdrCaptureOptions` refactor). Dropped that piece for a marginal observability loss. ## Test plan Validation on `packages/producer/tests/vfr-screen-recording`: ```json "videoExtractBreakdown": { "resolveMs": 0, "hdrProbeMs": 0, "hdrPreflightMs": 0, "hdrPreflightCount": 0, "vfrProbeMs": 0, "vfrPreflightMs": 166, "vfrPreflightCount": 1, "extractMs": 97, "cacheHits": 0, "cacheMisses": 0 }, "tmpPeakBytes": 4578598 ``` Total elapsed within noise of pre-PR baseline (2665 → 2673 → 3228ms across hosts). - [x] Unit test: phase-breakdown assertion added to `videoFrameExtractor.test.ts` - [x] Lint + format (oxlint + oxfmt) - [x] Typecheck (engine + producer) - [x] Manual perf validation against VFR fixture
convertSdrToHdr now takes startTime + duration and limits the re-encode to the segment the composition actually uses. Mirrors the VFR→CFR fix: a 30-minute SDR source contributing a 2-second clip was transcoded in full pre-fix (a >100× waste); post-fix only the used segment is touched and entry.video.mediaStart is zeroed so downstream extraction seeks from 0. Phase 2 now captures the full VideoMetadata per resolvedVideos entry (previously only colorSpace) so segDuration can fall back to the source's natural duration when video.end is Infinity without an extra ffprobe. Validation: /tmp/hf-fixtures/hdr-sdr-mixed-scope hdrPreflightMs: >1000 → 150 (-87%) videoExtractMs: 1272 → 237 (-82%) tmpPeakBytes: 8.2MB → 4.5MB (-45%) Adds a regression test that synthesizes a 10s SDR + 2s HDR fixture inline and asserts the converted SDR file's duration matches the 2s used segment, not the 10s source.
…F mediaStart Address Miguel's review on #445 — harden against two silent-failure modes in convertSdrToHdr that produce 0-byte output files: - duration <= 0: throw explicit Error inside convertSdrToHdr so every caller (current and future) is protected. FFmpeg's `-t 0` silently writes an empty output that the downstream extractor treats as a valid (empty) file; the throw surfaces the misuse. - startTime >= metadata.durationSeconds at the HDR call site: skip the entry and push a descriptive error instead of invoking ffmpeg with `-ss` past EOF (also yields 0-byte output). Matches Miguel's suggested caller-side shape with errors.push + continue. This edge case exists in the VFR path too; kept scope tight here since that's inherited and not a regression.
… Phase 3 Address Miguel's P3 on #445 (#445 (comment)): The previous `errors.push({...}); continue;` in the HDR preflight only short-circuited the normalization step — it did NOT remove the invalid entry from `resolvedVideos`, so Phase 3 still called extractVideoFramesRange with the same past-EOF mediaStart and surfaced a second raw FFmpeg error ("Nothing was written into output file") for the same clip. Miguel reproduced this with a synthetic 2s SDR + 1s HDR case and mediaStart=5. Fix: track the skipped-at-HDR-preflight indices in a set, then splice them out of every parallel array (resolvedVideos, videoMetadata, videoColorSpaces) after the HDR preflight loop completes. Iterate backwards so the indices stay stable while splicing. Phase 2b (VFR preflight) and Phase 3 (extract) now see only validated entries, so there's exactly one descriptive error per bad clip instead of two. Downstack #446 adds a fourth parallel array (cacheKeyInputs) and will extend this splice to cover that array during its rebase. The same inherited bug still exists on the VFR path — leaving that for a follow-up PR since it's out of scope for this HDR-preflight fix and was noted as inherited (not a regression) in the original review.
17d1622 to
a083658
Compare
5cfabff to
31354d5
Compare
… Phase 3 Address Miguel's P3 on #445 (#445 (comment)): The previous `errors.push({...}); continue;` in the HDR preflight only short-circuited the normalization step — it did NOT remove the invalid entry from `resolvedVideos`, so Phase 3 still called extractVideoFramesRange with the same past-EOF mediaStart and surfaced a second raw FFmpeg error ("Nothing was written into output file") for the same clip. Miguel reproduced this with a synthetic 2s SDR + 1s HDR case and mediaStart=5. Fix: track the skipped-at-HDR-preflight indices in a set, then splice them out of every parallel array (resolvedVideos, videoMetadata, videoColorSpaces) after the HDR preflight loop completes. Iterate backwards so the indices stay stable while splicing. Phase 2b (VFR preflight) and Phase 3 (extract) now see only validated entries, so there's exactly one descriptive error per bad clip instead of two. Downstack #446 adds a fourth parallel array (cacheKeyInputs) and will extend this splice to cover that array during its rebase. The same inherited bug still exists on the VFR path — leaving that for a follow-up PR since it's out of scope for this HDR-preflight fix and was noted as inherited (not a regression) in the original review.
a083658 to
6af6fa4
Compare
Downstack extension of the #445 past-EOF splice fix. The parallel-array splice that removes HDR-preflight-skipped entries before Phase 3 was introduced on #445 to splice (resolvedVideos, videoMetadata, videoColorSpaces). This commit adds the fourth parallel array (cacheKeyInputs) that #446 introduces, so the cache lookup at `cacheKeyInputs[i]` in tryCachedExtract doesn't point at a stale slot after the splice. All 480 engine tests pass.
Merge activity
|
Downstack extension of the #445 past-EOF splice fix. The parallel-array splice that removes HDR-preflight-skipped entries before Phase 3 was introduced on #445 to splice (resolvedVideos, videoMetadata, videoColorSpaces). This commit adds the fourth parallel array (cacheKeyInputs) that #446 introduces, so the cache lookup at `cacheKeyInputs[i]` in tryCachedExtract doesn't point at a stale slot after the splice. All 480 engine tests pass.
Downstack extension of the #445 past-EOF splice fix. The parallel-array splice that removes HDR-preflight-skipped entries before Phase 3 was introduced on #445 to splice (resolvedVideos, videoMetadata, videoColorSpaces). This commit adds the fourth parallel array (cacheKeyInputs) that #446 introduces, so the cache lookup at `cacheKeyInputs[i]` in tryCachedExtract doesn't point at a stale slot after the splice. All 480 engine tests pass.
## What Adds a content-addressed cache for extracted video frames, keyed on the tuple `(path, mtime, size, mediaStart, duration, fps, format)`. Repeat renders of the same composition (studio edit → re-render, preview → final) skip the ffmpeg extraction entirely. ## Why Video frame extraction is the dominant non-capture phase for video-heavy compositions. Studio iteration workflows extract the same frames over and over — each render burns ffmpeg time that adds no value. Validated on `/tmp/hf-fixtures/cfr-sdr-cache`: ``` Cold (miss): extractMs=69, videoExtractMs=70, totalElapsedMs=2052 Warm (hit): extractMs=1, videoExtractMs=2, totalElapsedMs=1964 cacheHits: 0→1, cacheMisses: 1→0 ``` The fixture is tiny (3s CFR SDR @ 30fps), so the wall-clock delta is small; the extraction-time delta (69→1ms, 98%) scales linearly with source length. For heavy-iteration workflows (a user rendering the same composition while tuning encoding params), extraction time goes to zero on every repeat render. Depends on #444 (instrumentation surface) and #445 (segment-scope HDR preflight — otherwise cache keys would be unstable across renders on mixed-HDR compositions). ## How - New `packages/engine/src/services/extractionCache.ts`: - SHA-256 key over a stable JSON encoding of `(path, mtime_ms, size, mediaStart, duration, fps, format)`. Infinity duration is normalized to `-1` so unresolved natural-duration sources still produce stable keys. - Truncates to 16 hex chars in the entry directory name — 64 bits of entropy is plenty at cache scale and keeps `ls` output short. - `hfcache-v2-` schema prefix — bumping it invalidates old entries (callers own gc policy; the cache owns keys). - `.hf-complete` dotfile sentinel. An entry dir without the sentinel is treated as a miss (covers crash-mid-extract and abandoned writes); the next render re-extracts over the partial frames with `-y`. - `FRAME_FILENAME_PREFIX = "frame_"` shared with the extractor — future refactors only need to touch one place to rename frames. - `EngineConfig.extractCacheDir` (env: `HYPERFRAMES_EXTRACT_CACHE_DIR`) gates the feature. Undefined disables caching — extraction runs into the render's workDir and cleanup removes it on render end, preserving the prior behaviour exactly. No default root is chosen by the engine; the caller (CLI, app, studio) owns the location policy. - `ExtractedFrames.ownedByLookup` flag prevents `FrameLookupTable.cleanup` from rm'ing a shared cache dir at render end. Set to `true` on both hits and misses (misses own the directory they wrote into, but hand it over to the cache rather than deleting it). - Phase 3 extractor flow: 1. Snapshot `(videoPath, mediaStart, start, end)` per resolved video BEFORE Phase 2a/2b preflight mutates them — so cache keys are stable across renders that use workDir-local normalized files (those files have fresh mtimes every render). 2. Compute key, `lookupCacheEntry`. 3. On hit: rebuild `ExtractedFrames` from the cache dir plus the Phase 2-probed `VideoMetadata` — no re-ffprobe. 4. On miss: `ensureCacheEntryDir`, extract with `extractVideoFramesRange(..., outputDirOverride)`, then `markCacheEntryComplete` (the sentinel write is the last step so a crash leaves the dir un-sentineled). - `extractVideoFramesRange` gains an `outputDirOverride` parameter so cache-miss writes land directly in the keyed dir (no `join(outputDir, videoId)` wrapping). ## Test plan - [x] 19 unit tests in `extractionCache.test.ts` covering key determinism, mtime/size invalidation, format/fps/mediaStart/duration invalidation, Infinity normalization, sentinel semantics, missing-file tolerance - [x] 2 integration tests in `videoFrameExtractor.test.ts`: - "reuses extracted frames on a warm cache hit" — asserts `cacheHits=1`, `extractMs<50ms` on second call against a CFR SDR fixture - "invalidates the cache when fps changes" — different fps on second call forces a new miss - [x] End-to-end validation with `HYPERFRAMES_EXTRACT_CACHE_DIR` set, two runs of the same fixture - [x] Lint + format (oxlint + oxfmt) - [x] Typecheck (engine + producer)
…#454) ## What Forwards the new per-phase extraction breakdown and `tmpPeakBytes` fields from `RenderPerfSummary` (added in #444 and #446) to PostHog via the CLI's existing `render_complete` telemetry event. ## Why The CLI already ships `render_complete` events to PostHog (`packages/cli/src/telemetry/client.ts`), but `events.ts:trackRenderComplete` only carried a subset of `RenderPerfSummary` — top-level timings, composition dims, and memory snapshots. After #444 added per-phase extraction breakdown (`videoExtractBreakdown`) and #446 added cache hit/miss counters, the data lives on `job.perfSummary` at render-complete but never reaches PostHog dashboards. Without this, any PostHog insight built around "how often are we hitting the cache?", "what's the median HDR preflight cost?", or "where in the extract phase do compositions spend time?" has to be answered by Datadog log scraping instead. ## How - **`packages/cli/src/telemetry/events.ts`** — extend `trackRenderComplete` props with 17 new optional fields: `tmpPeakBytes`, the six named stage timings, and the ten `videoExtractBreakdown` fields. All sent as flat properties (`extract_cache_hits`, `stage_capture_ms`, etc.) — PostHog insights query flat keys more ergonomically than nested objects. - **`packages/cli/src/commands/render.ts`** — wire `job.perfSummary.videoExtractBreakdown` / `stages` / `tmpPeakBytes` into the `trackRenderMetrics` → `trackRenderComplete` hand-off. - Naming: `extract_phase3_ms` deliberately disambiguates from `stage_video_extract_ms` — the former is just the parallel ffmpeg extract inside Phase 3; the latter is the full stage (resolve + probe + preflight + extract). - All new fields are optional. The Docker-subprocess branch of `render.ts` that doesn't have a local `perfSummary` still compiles and ships events without them. ## Test plan - [x] `bun run --cwd packages/cli test` — 161/161 pass - [x] `bunx tsc -p packages/cli/tsconfig.json --noEmit` — no errors - [x] `bunx oxlint` + `bunx oxfmt` — clean - [ ] Once merged, verify PostHog receives the new properties on a real render event (run `hyperframes render` against a fixture and watch PostHog ingestion — telemetry auto-disables in CI, so this requires a local dev render with `HYPERFRAMES_NO_TELEMETRY` unset). ## Stack Depends on #444 (adds the `videoExtractBreakdown` + `tmpPeakBytes` fields to `RenderPerfSummary`) and transitively on #445 → #446. ## Future work (not in this PR) - The HeyGen internal producer server (`hyperframes-internal/packages/producer/src/server.ts`) logs `perfSummary` to Datadog via `log.info` but has no PostHog integration. Production renders are the bulk of the traffic — separate PR to either ship perfSummary to PostHog from the internal server, or materialize Datadog log-based metrics for per-phase timings.

What
Scopes the SDR→HDR preflight re-encode to the segment the composition actually uses, mirroring the existing VFR→CFR segment-scope fix.
Why
convertSdrToHdrwas re-encoding entire source files, so a 30-minute SDR screen recording contributing a 2-second clip in a mixed HDR/SDR composition ate multi-second preflight time that produced frames no one would ever read. Validated on a mixed 30s-SDR + 2s-HDR fixture:hdrPreflightMsdrops 87% (1162→148ms),videoExtractMsdrops 82% (1272→231ms),tmpPeakBytesdrops 45% (8.2MB→4.5MB).Depends on #444 (phase-level instrumentation) for the measurement surface.
How
convertSdrToHdrgainsstartTimeanddurationparameters ahead of the upstreamtargetTransferarg added by feat(engine): wire options.hdr through chunkEncoder + dynamic SDR→HDR transfer #370. New signature:convertSdrToHdr(input, output, startTime, duration, targetTransfer, signal, config).-ss $start -t $durationis added to the ffmpeg args.VideoMetadataperresolvedVideosentry (previously justcolorSpace) so the caller can computesegDurationfromvideo.end - video.startwith a fallback tometadata.durationSeconds - video.mediaStartfor unbounded (Infinity) clips — without firing another ffprobe.entry.video.mediaStartis zeroed out via shallow-copy (doesn't mutate the caller'sVideoElement) so downstream extraction seeks from 0 instead of the original offset. Mirrors what the VFR→CFR path already does.Test plan
Validation on
/tmp/hf-fixtures/hdr-sdr-mixed-scope: