Skip to content

perf(engine): content-addressed extraction cache for video frames#446

Merged
jrusso1020 merged 4 commits intomainfrom
perf/v2/extraction-cache
Apr 24, 2026
Merged

perf(engine): content-addressed extraction cache for video frames#446
jrusso1020 merged 4 commits intomainfrom
perf/v2/extraction-cache

Conversation

@jrusso1020
Copy link
Copy Markdown
Collaborator

@jrusso1020 jrusso1020 commented Apr 23, 2026

What

Adds a content-addressed cache for extracted video frames, keyed on the tuple (path, mtime, size, mediaStart, duration, fps, format). Repeat renders of the same composition (studio edit → re-render, preview → final) skip the ffmpeg extraction entirely.

Why

Video frame extraction is the dominant non-capture phase for video-heavy compositions. Studio iteration workflows extract the same frames over and over — each render burns ffmpeg time that adds no value.

Validated on /tmp/hf-fixtures/cfr-sdr-cache:

Cold (miss): extractMs=69,  videoExtractMs=70,  totalElapsedMs=2052
Warm (hit):  extractMs=1,   videoExtractMs=2,   totalElapsedMs=1964
cacheHits: 0→1, cacheMisses: 1→0

The fixture is tiny (3s CFR SDR @ 30fps), so the wall-clock delta is small; the extraction-time delta (69→1ms, 98%) scales linearly with source length. For heavy-iteration workflows (a user rendering the same composition while tuning encoding params), extraction time goes to zero on every repeat render.

Depends on #444 (instrumentation surface) and #445 (segment-scope HDR preflight — otherwise cache keys would be unstable across renders on mixed-HDR compositions).

How

  • New packages/engine/src/services/extractionCache.ts:
    • SHA-256 key over a stable JSON encoding of (path, mtime_ms, size, mediaStart, duration, fps, format). Infinity duration is normalized to -1 so unresolved natural-duration sources still produce stable keys.
    • Truncates to 16 hex chars in the entry directory name — 64 bits of entropy is plenty at cache scale and keeps ls output short.
    • hfcache-v2- schema prefix — bumping it invalidates old entries (callers own gc policy; the cache owns keys).
    • .hf-complete dotfile sentinel. An entry dir without the sentinel is treated as a miss (covers crash-mid-extract and abandoned writes); the next render re-extracts over the partial frames with -y.
    • FRAME_FILENAME_PREFIX = "frame_" shared with the extractor — future refactors only need to touch one place to rename frames.
  • EngineConfig.extractCacheDir (env: HYPERFRAMES_EXTRACT_CACHE_DIR) gates the feature. Undefined disables caching — extraction runs into the render's workDir and cleanup removes it on render end, preserving the prior behaviour exactly. No default root is chosen by the engine; the caller (CLI, app, studio) owns the location policy.
  • ExtractedFrames.ownedByLookup flag prevents FrameLookupTable.cleanup from rm'ing a shared cache dir at render end. Set to true on both hits and misses (misses own the directory they wrote into, but hand it over to the cache rather than deleting it).
  • Phase 3 extractor flow:
    1. Snapshot (videoPath, mediaStart, start, end) per resolved video BEFORE Phase 2a/2b preflight mutates them — so cache keys are stable across renders that use workDir-local normalized files (those files have fresh mtimes every render).
    2. Compute key, lookupCacheEntry.
    3. On hit: rebuild ExtractedFrames from the cache dir plus the Phase 2-probed VideoMetadata — no re-ffprobe.
    4. On miss: ensureCacheEntryDir, extract with extractVideoFramesRange(..., outputDirOverride), then markCacheEntryComplete (the sentinel write is the last step so a crash leaves the dir un-sentineled).
  • extractVideoFramesRange gains an outputDirOverride parameter so cache-miss writes land directly in the keyed dir (no join(outputDir, videoId) wrapping).

Test plan

  • 19 unit tests in extractionCache.test.ts covering key determinism, mtime/size invalidation, format/fps/mediaStart/duration invalidation, Infinity normalization, sentinel semantics, missing-file tolerance
  • 2 integration tests in videoFrameExtractor.test.ts:
    • "reuses extracted frames on a warm cache hit" — asserts cacheHits=1, extractMs<50ms on second call against a CFR SDR fixture
    • "invalidates the cache when fps changes" — different fps on second call forces a new miss
  • End-to-end validation with HYPERFRAMES_EXTRACT_CACHE_DIR set, two runs of the same fixture
  • Lint + format (oxlint + oxfmt)
  • Typecheck (engine + producer)

Copy link
Copy Markdown
Collaborator Author

jrusso1020 commented Apr 23, 2026

jrusso1020 added a commit that referenced this pull request Apr 23, 2026
Review feedback on PR #446:

- Move the readdir/filter/sort rebuild of ExtractedFrames on cache hit
  into a new `rehydrateCacheEntry` helper inside the cache module — the
  extractor no longer imports FRAME_FILENAME_PREFIX or assembles frame
  paths by hand, so cache layout stays owned by the cache.
- Extract `resolveSegmentDuration` helper to replace two copies of the
  "requested duration → fallback to source natural" logic in Phase 3.
- Collapse the nested Phase 3 cache logic into a `tryCachedExtract`
  helper that returns `ExtractedFrames | null`. Reads top-down now.
- Move source-file `statSync` out of `canonicalKeyBlob` into a `readKeyStat`
  helper the extractor calls once per video in Phase 1. Eliminates the
  redundant statSync on every `computeCacheKey` / `lookupCacheEntry`.
  `CacheKeyInput` now requires `mtimeMs` / `size` — cleaner contract.
- Drop the redundant `existsSync` in `ensureCacheEntryDir`: `mkdirSync`
  with `recursive:true` is already idempotent.
- Tighten `CacheKeyInput.format` from `string` to `CacheFrameFormat`
  (`"jpg" | "png"`). Catches typos at compile time.
- Add an explicit concurrency caveat on `markCacheEntryComplete` — the
  lookup→populate→mark sequence is non-atomic; document the tradeoff
  in code rather than only in the commit history.
- Stop re-exporting cache internals (SCHEMA_PREFIX, COMPLETE_SENTINEL,
  FRAME_FILENAME_PREFIX, cacheEntryDirName, computeCacheKey, ensureCacheEntryDir,
  lookupCacheEntry, markCacheEntryComplete, CacheEntry, CacheKeyInput,
  CacheLookup) from the engine's public index.ts — they're only used by
  the colocated extractor and its tests, which import relatively.

Behavior unchanged — same cold/warm validation on /tmp/hf-fixtures/cfr-sdr-cache:
  cold: extractMs=71, cacheMisses=1
  warm: extractMs=0,  cacheHits=1   (was 1 → 0 post-refactor)

All 32 cache+extractor tests pass.
Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: perf(engine): content-addressed extraction cache

Solid implementation. The opt-in gating (extractCacheDir / HYPERFRAMES_EXTRACT_CACHE_DIR, disabled by default) ensures zero behavioral change when unset. Pre-preflight key snapshot, sentinel-based atomicity, ownedByLookup cleanup separation, schema versioning (hfcache-v2-) — all well-designed. The simplify pass cleaned things up meaningfully (rehydrateCacheEntry, resolveSegmentDuration, flattened Phase 3).

Test quality is good — 19 unit tests cover key determinism, invalidation on each tuple component, Infinity normalization, sentinel semantics, and missing-file tolerance. 2 integration tests validate cold/warm paths with real FFmpeg.

P1 — Important (non-blocking, track as follow-up)

Hash truncation to 64 bits — silent corruption on collision. The 16 hex-char truncation gives ~2^32 birthday bound. Safe at local scale, but a collision silently serves wrong frames with no error. Consider writing the full 64-char hash into the .hf-complete sentinel and verifying on lookup — turns silent corruption into a cache miss. One-line improvement.

No eviction mechanism. Cache grows unbounded. Each segment stores potentially hundreds of extracted frames. Should be tracked as follow-up — a hyperframes cache clean --max-age CLI command or purgeStaleEntries(rootDir, maxAgeMs) utility. The schema prefix makes version-based cleanup straightforward.

P2 — Should fix

Concurrent renders can race on partial frame reads. If render A is writing frame_00050.jpg while render B reads the directory in rehydrateCacheEntry, B could see a truncated file. The sentinel prevents serving an incomplete set of frames, but not a partially-written individual frame. Unlikely in single-user usage, but worth documenting that concurrent use of the same cache dir is unsupported, or using atomic rename(tmp, final) per frame.

readKeyStat returns {mtimeMs: 0, size: 0} for missing files. Two missing files would share the same (0, 0) stat tuple, and the downstream extractor would fail with "file not found" anyway — but the zero-stat key pollutes the cache with dead entries. Returning null and skipping the cache path would prevent orphaned entries.

P3 — Suggestions

  • Integration test cleanup uses inline rmSync rather than afterEach — leaked on assertion failure. The unit tests in extractionCache.test.ts correctly use afterEach.
  • canonicalKeyBlob uses abbreviated property names (p, m, s, ms, d, f, fmt). Since the blob is only hashed (never stored), there's no benefit to compression. Full names would aid debugging. Only change if a v3 bump is planned.
  • rehydrateCacheEntry doesn't validate the %05d pattern in filenames. A stray file matching prefix/suffix would shift indices. Minor since the directory is cache-controlled.
  • mtime reliability on network filesystems — worth a note in the extractCacheDir doc that NFS/SMB mounts may have coarser granularity.

Verdict: Approve with suggestions

The implementation is well-designed and the dependency chain (#444#445#446) is clean. Main follow-ups: hash collision mitigation (one-line sentinel check) and eviction strategy (CLI command or utility). None of the issues are blocking for merge.

@jrusso1020 jrusso1020 force-pushed the perf/v2/hdr-segment-scope branch from 1be4ec0 to b66273d Compare April 23, 2026 22:04
jrusso1020 added a commit that referenced this pull request Apr 23, 2026
Review feedback on PR #446:

- Move the readdir/filter/sort rebuild of ExtractedFrames on cache hit
  into a new `rehydrateCacheEntry` helper inside the cache module — the
  extractor no longer imports FRAME_FILENAME_PREFIX or assembles frame
  paths by hand, so cache layout stays owned by the cache.
- Extract `resolveSegmentDuration` helper to replace two copies of the
  "requested duration → fallback to source natural" logic in Phase 3.
- Collapse the nested Phase 3 cache logic into a `tryCachedExtract`
  helper that returns `ExtractedFrames | null`. Reads top-down now.
- Move source-file `statSync` out of `canonicalKeyBlob` into a `readKeyStat`
  helper the extractor calls once per video in Phase 1. Eliminates the
  redundant statSync on every `computeCacheKey` / `lookupCacheEntry`.
  `CacheKeyInput` now requires `mtimeMs` / `size` — cleaner contract.
- Drop the redundant `existsSync` in `ensureCacheEntryDir`: `mkdirSync`
  with `recursive:true` is already idempotent.
- Tighten `CacheKeyInput.format` from `string` to `CacheFrameFormat`
  (`"jpg" | "png"`). Catches typos at compile time.
- Add an explicit concurrency caveat on `markCacheEntryComplete` — the
  lookup→populate→mark sequence is non-atomic; document the tradeoff
  in code rather than only in the commit history.
- Stop re-exporting cache internals (SCHEMA_PREFIX, COMPLETE_SENTINEL,
  FRAME_FILENAME_PREFIX, cacheEntryDirName, computeCacheKey, ensureCacheEntryDir,
  lookupCacheEntry, markCacheEntryComplete, CacheEntry, CacheKeyInput,
  CacheLookup) from the engine's public index.ts — they're only used by
  the colocated extractor and its tests, which import relatively.

Behavior unchanged — same cold/warm validation on /tmp/hf-fixtures/cfr-sdr-cache:
  cold: extractMs=71, cacheMisses=1
  warm: extractMs=0,  cacheHits=1   (was 1 → 0 post-refactor)

All 32 cache+extractor tests pass.
jrusso1020 added a commit that referenced this pull request Apr 23, 2026
…che caveats

Address Miguel's review on #446:

- P2: readKeyStat now returns null when the source file is missing instead
  of the {mtimeMs: 0, size: 0} zero-tuple sentinel. Two unrelated missing
  paths would otherwise share the same cache key and pollute the cache
  with orphaned entries. The cacheKeyInputs builder propagates the null
  forward; tryCachedExtract already skips the cache path for null entries
  so the extractor surfaces the real file-not-found error downstream.
- P2: document the single-writer caveat on EngineConfig.extractCacheDir.
  Concurrent renders targeting the same cache dir can race on partial
  frame reads — the .hf-complete sentinel prevents serving an incomplete
  entry but individual frames are written non-atomically. Also flagged
  the coarser-mtime-on-network-filesystems hazard for future ops.
- Updated the existing "tolerates a missing source" test to assert the
  new contract (readKeyStat returns null) rather than the old "zero-stat
  sentinel" behaviour.

All 480 engine tests pass.
@jrusso1020 jrusso1020 force-pushed the perf/v2/extraction-cache branch from 8a31b2c to 3c36324 Compare April 23, 2026 22:10
@jrusso1020
Copy link
Copy Markdown
Collaborator Author

Rebased onto updated #445 and pushed 3c36324. Addressed both P2 items from <https://github.com/heygen-com/hyperframes/pull/446#pullrequestreview-4164956477|your review>:

  1. readKeyStat zero-tuple pollution. Return type is now {mtimeMs: number; size: number} | null. The cacheKeyInputs builder in videoFrameExtractor.ts propagates the null (entries with missing files become null in the array), and tryCachedExtract already had the if (!keyInput) return null; guard — so missing files now cleanly skip the cache path and let the extractor surface the real file-not-found error downstream. No shared (0, 0) tuple, no orphaned entries. The existing "tolerates a missing source" test now asserts the new contract (readKeyStat returns null) rather than exercising the old zero-stat sentinel behaviour.
  2. Concurrent-renders caveat. Documented on EngineConfig.extractCacheDir — single-writer is the intended model, the .hf-complete sentinel prevents serving an incomplete entry but individual frames are written non-atomically so a concurrent reader could observe a truncated frame. Also flagged the coarser-mtime-on-NFS/SMB hazard for ops. Didn't go the per-frame rename(tmp, final) route since the fix surface was bigger than the doc surface and single-writer is already the expected model for the CLI / app / studio callers you enumerated.

On the P1 follow-ups (non-blocking per your review):

  • Hash-collision mitigation via writing the full 64-char hash into .hf-complete and verifying on lookup — totally fair, queued as a follow-up. The current 16-hex-char truncation has the birthday bound at local scale so I'm not losing sleep, but "silent corruption on collision → cache miss" is the right shape.
  • Eviction mechanism — also queued. Thinking hyperframes cache clean --max-age <duration> or a purgeStaleEntries(rootDir, maxAgeMs) utility that walks the schema-prefixed dirs. The hfcache-v2- prefix makes version-based cleanup a bonus trivial case.

On the P3s: test cleanup style, canonicalKeyBlob property-name expansion, frame-pattern validation, all noted. Fold them into the follow-up PR alongside the hash-collision fix so the core semantic work here isn't further churned.

480/480 engine tests green after the refactor.

jrusso1020 added a commit that referenced this pull request Apr 23, 2026
The CLI already ships `render_complete` events to PostHog but only carries
a subset of `RenderPerfSummary` (duration_ms, capture_avg_ms, etc.). With
#444 adding per-phase breakdown and #446 adding cache counters, the data
is sitting in `job.perfSummary.videoExtractBreakdown` and
`job.perfSummary.tmpPeakBytes` but never reaching PostHog.

This forwards the new fields as flat properties (flat is easier to query
with PostHog insights than nested objects):

- tmp_peak_bytes
- stage_compile_ms, stage_video_extract_ms, stage_audio_process_ms,
  stage_capture_ms, stage_encode_ms, stage_assemble_ms
- extract_resolve_ms, extract_hdr_probe_ms, extract_hdr_preflight_ms,
  extract_hdr_preflight_count, extract_vfr_probe_ms, extract_vfr_preflight_ms,
  extract_vfr_preflight_count, extract_phase3_ms, extract_cache_hits,
  extract_cache_misses

`extract_phase3_ms` disambiguates from `stage_video_extract_ms`: the former
is just the parallel ffmpeg extract inside Phase 3, the latter is the full
stage (resolve + probe + preflight + extract).

All fields optional — the Docker subprocess call site (no perfSummary
available) still compiles and ships events without them.
jrusso1020 added a commit that referenced this pull request Apr 23, 2026
… Phase 3

Address Miguel's P3 on #445 (#445 (comment)):

The previous `errors.push({...}); continue;` in the HDR preflight only
short-circuited the normalization step — it did NOT remove the invalid
entry from `resolvedVideos`, so Phase 3 still called
extractVideoFramesRange with the same past-EOF mediaStart and surfaced
a second raw FFmpeg error ("Nothing was written into output file") for
the same clip. Miguel reproduced this with a synthetic 2s SDR + 1s HDR
case and mediaStart=5.

Fix: track the skipped-at-HDR-preflight indices in a set, then splice
them out of every parallel array (resolvedVideos, videoMetadata,
videoColorSpaces) after the HDR preflight loop completes. Iterate
backwards so the indices stay stable while splicing. Phase 2b (VFR
preflight) and Phase 3 (extract) now see only validated entries, so
there's exactly one descriptive error per bad clip instead of two.

Downstack #446 adds a fourth parallel array (cacheKeyInputs) and will
extend this splice to cover that array during its rebase.

The same inherited bug still exists on the VFR path — leaving that for
a follow-up PR since it's out of scope for this HDR-preflight fix and
was noted as inherited (not a regression) in the original review.
jrusso1020 added a commit that referenced this pull request Apr 23, 2026
Review feedback on PR #446:

- Move the readdir/filter/sort rebuild of ExtractedFrames on cache hit
  into a new `rehydrateCacheEntry` helper inside the cache module — the
  extractor no longer imports FRAME_FILENAME_PREFIX or assembles frame
  paths by hand, so cache layout stays owned by the cache.
- Extract `resolveSegmentDuration` helper to replace two copies of the
  "requested duration → fallback to source natural" logic in Phase 3.
- Collapse the nested Phase 3 cache logic into a `tryCachedExtract`
  helper that returns `ExtractedFrames | null`. Reads top-down now.
- Move source-file `statSync` out of `canonicalKeyBlob` into a `readKeyStat`
  helper the extractor calls once per video in Phase 1. Eliminates the
  redundant statSync on every `computeCacheKey` / `lookupCacheEntry`.
  `CacheKeyInput` now requires `mtimeMs` / `size` — cleaner contract.
- Drop the redundant `existsSync` in `ensureCacheEntryDir`: `mkdirSync`
  with `recursive:true` is already idempotent.
- Tighten `CacheKeyInput.format` from `string` to `CacheFrameFormat`
  (`"jpg" | "png"`). Catches typos at compile time.
- Add an explicit concurrency caveat on `markCacheEntryComplete` — the
  lookup→populate→mark sequence is non-atomic; document the tradeoff
  in code rather than only in the commit history.
- Stop re-exporting cache internals (SCHEMA_PREFIX, COMPLETE_SENTINEL,
  FRAME_FILENAME_PREFIX, cacheEntryDirName, computeCacheKey, ensureCacheEntryDir,
  lookupCacheEntry, markCacheEntryComplete, CacheEntry, CacheKeyInput,
  CacheLookup) from the engine's public index.ts — they're only used by
  the colocated extractor and its tests, which import relatively.

Behavior unchanged — same cold/warm validation on /tmp/hf-fixtures/cfr-sdr-cache:
  cold: extractMs=71, cacheMisses=1
  warm: extractMs=0,  cacheHits=1   (was 1 → 0 post-refactor)

All 32 cache+extractor tests pass.
@jrusso1020 jrusso1020 force-pushed the perf/v2/extraction-cache branch from 3c36324 to a0f49f2 Compare April 23, 2026 22:43
jrusso1020 added a commit that referenced this pull request Apr 23, 2026
…che caveats

Address Miguel's review on #446:

- P2: readKeyStat now returns null when the source file is missing instead
  of the {mtimeMs: 0, size: 0} zero-tuple sentinel. Two unrelated missing
  paths would otherwise share the same cache key and pollute the cache
  with orphaned entries. The cacheKeyInputs builder propagates the null
  forward; tryCachedExtract already skips the cache path for null entries
  so the extractor surfaces the real file-not-found error downstream.
- P2: document the single-writer caveat on EngineConfig.extractCacheDir.
  Concurrent renders targeting the same cache dir can race on partial
  frame reads — the .hf-complete sentinel prevents serving an incomplete
  entry but individual frames are written non-atomically. Also flagged
  the coarser-mtime-on-network-filesystems hazard for future ops.
- Updated the existing "tolerates a missing source" test to assert the
  new contract (readKeyStat returns null) rather than the old "zero-stat
  sentinel" behaviour.

All 480 engine tests pass.
jrusso1020 added a commit that referenced this pull request Apr 23, 2026
Downstack extension of the #445 past-EOF splice fix. The parallel-array
splice that removes HDR-preflight-skipped entries before Phase 3 was
introduced on #445 to splice (resolvedVideos, videoMetadata,
videoColorSpaces). This commit adds the fourth parallel array
(cacheKeyInputs) that #446 introduces, so the cache lookup at
`cacheKeyInputs[i]` in tryCachedExtract doesn't point at a stale slot
after the splice.

All 480 engine tests pass.
jrusso1020 added a commit that referenced this pull request Apr 23, 2026
The CLI already ships `render_complete` events to PostHog but only carries
a subset of `RenderPerfSummary` (duration_ms, capture_avg_ms, etc.). With
#444 adding per-phase breakdown and #446 adding cache counters, the data
is sitting in `job.perfSummary.videoExtractBreakdown` and
`job.perfSummary.tmpPeakBytes` but never reaching PostHog.

This forwards the new fields as flat properties (flat is easier to query
with PostHog insights than nested objects):

- tmp_peak_bytes
- stage_compile_ms, stage_video_extract_ms, stage_audio_process_ms,
  stage_capture_ms, stage_encode_ms, stage_assemble_ms
- extract_resolve_ms, extract_hdr_probe_ms, extract_hdr_preflight_ms,
  extract_hdr_preflight_count, extract_vfr_probe_ms, extract_vfr_preflight_ms,
  extract_vfr_preflight_count, extract_phase3_ms, extract_cache_hits,
  extract_cache_misses

`extract_phase3_ms` disambiguates from `stage_video_extract_ms`: the former
is just the parallel ffmpeg extract inside Phase 3, the latter is the full
stage (resolve + probe + preflight + extract).

All fields optional — the Docker subprocess call site (no perfSummary
available) still compiles and ships events without them.
Copy link
Copy Markdown
Collaborator

@miguel-heygen miguel-heygen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-checked the live head. The downstack splice extension keeps cacheKeyInputs aligned after the #445 fix, local focused engine validation is clean, and the latest regression CI is green.

jrusso1020 added a commit that referenced this pull request Apr 24, 2026
## What

Adds per-phase timings and counters to `extractAllVideoFrames` and surfaces them on the producer's `RenderPerfSummary` as `videoExtractBreakdown` alongside a new `tmpPeakBytes` workDir size sample.

## Why

Phase 2 video extraction has five distinct sub-phases (resolve, HDR probe, HDR preflight, VFR probe, VFR preflight, per-video extract) and today they collapse into a single `videoExtractMs` stage timing. That makes every subsequent perf PR in this stack immeasurable — you can't tell whether a win came from cache hits, preflight scope reduction, or pure extraction speed.

This PR is foundational for PR #445 (segment-scope HDR preflight) and PR #446 (content-addressed extraction cache).

## How

- New `ExtractionPhaseBreakdown` type with `resolveMs`, `hdrProbeMs`, `hdrPreflightMs/Count`, `vfrProbeMs`, `vfrPreflightMs/Count`, `extractMs`, `cacheHits`, `cacheMisses`. Populated inline with `Date.now()` wrappers — overhead is sub-millisecond on every phase.
- Returned on `ExtractionResult.phaseBreakdown`.
- Producer extends `RenderPerfSummary` with `videoExtractBreakdown?: ExtractionPhaseBreakdown` and `tmpPeakBytes?: number`. `tmpPeakBytes` is sampled from the workDir right before cleanup via a new recursive-size helper that swallows errors (purely observational — a missing workDir must never fail the render).

No changes to the capture-lifecycle resource tracking — earlier versions of this instrumentation plumbed injector LRU stats through `RenderOrchestrator`, which conflicted hard with upstream #371 (`buildHdrCaptureOptions` refactor). Dropped that piece for a marginal observability loss.

## Test plan

Validation on `packages/producer/tests/vfr-screen-recording`:
```json
"videoExtractBreakdown": {
  "resolveMs": 0, "hdrProbeMs": 0, "hdrPreflightMs": 0, "hdrPreflightCount": 0,
  "vfrProbeMs": 0, "vfrPreflightMs": 166, "vfrPreflightCount": 1,
  "extractMs": 97, "cacheHits": 0, "cacheMisses": 0
},
"tmpPeakBytes": 4578598
```
Total elapsed within noise of pre-PR baseline (2665 → 2673 → 3228ms across hosts).

- [x] Unit test: phase-breakdown assertion added to `videoFrameExtractor.test.ts`
- [x] Lint + format (oxlint + oxfmt)
- [x] Typecheck (engine + producer)
- [x] Manual perf validation against VFR fixture
jrusso1020 added a commit that referenced this pull request Apr 24, 2026
… Phase 3

Address Miguel's P3 on #445 (#445 (comment)):

The previous `errors.push({...}); continue;` in the HDR preflight only
short-circuited the normalization step — it did NOT remove the invalid
entry from `resolvedVideos`, so Phase 3 still called
extractVideoFramesRange with the same past-EOF mediaStart and surfaced
a second raw FFmpeg error ("Nothing was written into output file") for
the same clip. Miguel reproduced this with a synthetic 2s SDR + 1s HDR
case and mediaStart=5.

Fix: track the skipped-at-HDR-preflight indices in a set, then splice
them out of every parallel array (resolvedVideos, videoMetadata,
videoColorSpaces) after the HDR preflight loop completes. Iterate
backwards so the indices stay stable while splicing. Phase 2b (VFR
preflight) and Phase 3 (extract) now see only validated entries, so
there's exactly one descriptive error per bad clip instead of two.

Downstack #446 adds a fourth parallel array (cacheKeyInputs) and will
extend this splice to cover that array during its rebase.

The same inherited bug still exists on the VFR path — leaving that for
a follow-up PR since it's out of scope for this HDR-preflight fix and
was noted as inherited (not a regression) in the original review.
@jrusso1020 jrusso1020 force-pushed the perf/v2/hdr-segment-scope branch from 17d1622 to a083658 Compare April 24, 2026 03:55
jrusso1020 added a commit that referenced this pull request Apr 24, 2026
… Phase 3

Address Miguel's P3 on #445 (#445 (comment)):

The previous `errors.push({...}); continue;` in the HDR preflight only
short-circuited the normalization step — it did NOT remove the invalid
entry from `resolvedVideos`, so Phase 3 still called
extractVideoFramesRange with the same past-EOF mediaStart and surfaced
a second raw FFmpeg error ("Nothing was written into output file") for
the same clip. Miguel reproduced this with a synthetic 2s SDR + 1s HDR
case and mediaStart=5.

Fix: track the skipped-at-HDR-preflight indices in a set, then splice
them out of every parallel array (resolvedVideos, videoMetadata,
videoColorSpaces) after the HDR preflight loop completes. Iterate
backwards so the indices stay stable while splicing. Phase 2b (VFR
preflight) and Phase 3 (extract) now see only validated entries, so
there's exactly one descriptive error per bad clip instead of two.

Downstack #446 adds a fourth parallel array (cacheKeyInputs) and will
extend this splice to cover that array during its rebase.

The same inherited bug still exists on the VFR path — leaving that for
a follow-up PR since it's out of scope for this HDR-preflight fix and
was noted as inherited (not a regression) in the original review.
@jrusso1020 jrusso1020 force-pushed the perf/v2/hdr-segment-scope branch from a083658 to 6af6fa4 Compare April 24, 2026 03:55
jrusso1020 added a commit that referenced this pull request Apr 24, 2026
Review feedback on PR #446:

- Move the readdir/filter/sort rebuild of ExtractedFrames on cache hit
  into a new `rehydrateCacheEntry` helper inside the cache module — the
  extractor no longer imports FRAME_FILENAME_PREFIX or assembles frame
  paths by hand, so cache layout stays owned by the cache.
- Extract `resolveSegmentDuration` helper to replace two copies of the
  "requested duration → fallback to source natural" logic in Phase 3.
- Collapse the nested Phase 3 cache logic into a `tryCachedExtract`
  helper that returns `ExtractedFrames | null`. Reads top-down now.
- Move source-file `statSync` out of `canonicalKeyBlob` into a `readKeyStat`
  helper the extractor calls once per video in Phase 1. Eliminates the
  redundant statSync on every `computeCacheKey` / `lookupCacheEntry`.
  `CacheKeyInput` now requires `mtimeMs` / `size` — cleaner contract.
- Drop the redundant `existsSync` in `ensureCacheEntryDir`: `mkdirSync`
  with `recursive:true` is already idempotent.
- Tighten `CacheKeyInput.format` from `string` to `CacheFrameFormat`
  (`"jpg" | "png"`). Catches typos at compile time.
- Add an explicit concurrency caveat on `markCacheEntryComplete` — the
  lookup→populate→mark sequence is non-atomic; document the tradeoff
  in code rather than only in the commit history.
- Stop re-exporting cache internals (SCHEMA_PREFIX, COMPLETE_SENTINEL,
  FRAME_FILENAME_PREFIX, cacheEntryDirName, computeCacheKey, ensureCacheEntryDir,
  lookupCacheEntry, markCacheEntryComplete, CacheEntry, CacheKeyInput,
  CacheLookup) from the engine's public index.ts — they're only used by
  the colocated extractor and its tests, which import relatively.

Behavior unchanged — same cold/warm validation on /tmp/hf-fixtures/cfr-sdr-cache:
  cold: extractMs=71, cacheMisses=1
  warm: extractMs=0,  cacheHits=1   (was 1 → 0 post-refactor)

All 32 cache+extractor tests pass.
jrusso1020 added a commit that referenced this pull request Apr 24, 2026
…che caveats

Address Miguel's review on #446:

- P2: readKeyStat now returns null when the source file is missing instead
  of the {mtimeMs: 0, size: 0} zero-tuple sentinel. Two unrelated missing
  paths would otherwise share the same cache key and pollute the cache
  with orphaned entries. The cacheKeyInputs builder propagates the null
  forward; tryCachedExtract already skips the cache path for null entries
  so the extractor surfaces the real file-not-found error downstream.
- P2: document the single-writer caveat on EngineConfig.extractCacheDir.
  Concurrent renders targeting the same cache dir can race on partial
  frame reads — the .hf-complete sentinel prevents serving an incomplete
  entry but individual frames are written non-atomically. Also flagged
  the coarser-mtime-on-network-filesystems hazard for future ops.
- Updated the existing "tolerates a missing source" test to assert the
  new contract (readKeyStat returns null) rather than the old "zero-stat
  sentinel" behaviour.

All 480 engine tests pass.
jrusso1020 added a commit that referenced this pull request Apr 24, 2026
Downstack extension of the #445 past-EOF splice fix. The parallel-array
splice that removes HDR-preflight-skipped entries before Phase 3 was
introduced on #445 to splice (resolvedVideos, videoMetadata,
videoColorSpaces). This commit adds the fourth parallel array
(cacheKeyInputs) that #446 introduces, so the cache lookup at
`cacheKeyInputs[i]` in tryCachedExtract doesn't point at a stale slot
after the splice.

All 480 engine tests pass.
@jrusso1020 jrusso1020 force-pushed the perf/v2/extraction-cache branch from a0f49f2 to b231f61 Compare April 24, 2026 03:55
jrusso1020 added a commit that referenced this pull request Apr 24, 2026
The CLI already ships `render_complete` events to PostHog but only carries
a subset of `RenderPerfSummary` (duration_ms, capture_avg_ms, etc.). With
#444 adding per-phase breakdown and #446 adding cache counters, the data
is sitting in `job.perfSummary.videoExtractBreakdown` and
`job.perfSummary.tmpPeakBytes` but never reaching PostHog.

This forwards the new fields as flat properties (flat is easier to query
with PostHog insights than nested objects):

- tmp_peak_bytes
- stage_compile_ms, stage_video_extract_ms, stage_audio_process_ms,
  stage_capture_ms, stage_encode_ms, stage_assemble_ms
- extract_resolve_ms, extract_hdr_probe_ms, extract_hdr_preflight_ms,
  extract_hdr_preflight_count, extract_vfr_probe_ms, extract_vfr_preflight_ms,
  extract_vfr_preflight_count, extract_phase3_ms, extract_cache_hits,
  extract_cache_misses

`extract_phase3_ms` disambiguates from `stage_video_extract_ms`: the former
is just the parallel ffmpeg extract inside Phase 3, the latter is the full
stage (resolve + probe + preflight + extract).

All fields optional — the Docker subprocess call site (no perfSummary
available) still compiles and ships events without them.
@jrusso1020 jrusso1020 changed the base branch from perf/v2/hdr-segment-scope to graphite-base/446 April 24, 2026 04:15
Keys extracted frame bundles on (path, mtime, size, mediaStart, duration,
fps, format) so iteration workflows (studio edit → re-render) skip the
ffmpeg call when none of those inputs have changed.

- New extractionCache service: SHA-256 key, `.hf-complete` sentinel,
  `hfcache-v2-` schema prefix, FRAME_FILENAME_PREFIX shared with the
  extractor. Schema bump invalidates older entries (callers gc them).
- `EngineConfig.extractCacheDir` (env: HYPERFRAMES_EXTRACT_CACHE_DIR)
  gates the feature — undefined disables caching, preserving the prior
  per-workDir behaviour exactly.
- Phase 3 per-video flow: compute key on the pre-preflight video
  (so keys are stable across renders that use workDir-local normalized
  files), look up; on hit rebuild ExtractedFrames from the cache dir
  plus Phase 2 metadata — no re-ffprobe. On miss, extract into the
  keyed dir then mark complete.
- `ExtractedFrames.ownedByLookup` prevents `FrameLookupTable.cleanup`
  from deleting shared cache entries when a render finishes.
- Extractor output-dir override lets cache-miss writes land directly in
  the keyed dir instead of the transient workDir.

Validation: /tmp/hf-fixtures/cfr-sdr-cache
  Cold  run: extractMs=69,  videoExtractMs=70,  totalElapsedMs=2052
  Warm  run: extractMs=1,   videoExtractMs=2,   totalElapsedMs=1964
  cacheHits 0→1, cacheMisses 1→0.

Adds 19 unit tests for key/sentinel/format semantics and 2 integration
tests (cache hit, fps change invalidates key) in videoFrameExtractor.test.ts.
Review feedback on PR #446:

- Move the readdir/filter/sort rebuild of ExtractedFrames on cache hit
  into a new `rehydrateCacheEntry` helper inside the cache module — the
  extractor no longer imports FRAME_FILENAME_PREFIX or assembles frame
  paths by hand, so cache layout stays owned by the cache.
- Extract `resolveSegmentDuration` helper to replace two copies of the
  "requested duration → fallback to source natural" logic in Phase 3.
- Collapse the nested Phase 3 cache logic into a `tryCachedExtract`
  helper that returns `ExtractedFrames | null`. Reads top-down now.
- Move source-file `statSync` out of `canonicalKeyBlob` into a `readKeyStat`
  helper the extractor calls once per video in Phase 1. Eliminates the
  redundant statSync on every `computeCacheKey` / `lookupCacheEntry`.
  `CacheKeyInput` now requires `mtimeMs` / `size` — cleaner contract.
- Drop the redundant `existsSync` in `ensureCacheEntryDir`: `mkdirSync`
  with `recursive:true` is already idempotent.
- Tighten `CacheKeyInput.format` from `string` to `CacheFrameFormat`
  (`"jpg" | "png"`). Catches typos at compile time.
- Add an explicit concurrency caveat on `markCacheEntryComplete` — the
  lookup→populate→mark sequence is non-atomic; document the tradeoff
  in code rather than only in the commit history.
- Stop re-exporting cache internals (SCHEMA_PREFIX, COMPLETE_SENTINEL,
  FRAME_FILENAME_PREFIX, cacheEntryDirName, computeCacheKey, ensureCacheEntryDir,
  lookupCacheEntry, markCacheEntryComplete, CacheEntry, CacheKeyInput,
  CacheLookup) from the engine's public index.ts — they're only used by
  the colocated extractor and its tests, which import relatively.

Behavior unchanged — same cold/warm validation on /tmp/hf-fixtures/cfr-sdr-cache:
  cold: extractMs=71, cacheMisses=1
  warm: extractMs=0,  cacheHits=1   (was 1 → 0 post-refactor)

All 32 cache+extractor tests pass.
…che caveats

Address Miguel's review on #446:

- P2: readKeyStat now returns null when the source file is missing instead
  of the {mtimeMs: 0, size: 0} zero-tuple sentinel. Two unrelated missing
  paths would otherwise share the same cache key and pollute the cache
  with orphaned entries. The cacheKeyInputs builder propagates the null
  forward; tryCachedExtract already skips the cache path for null entries
  so the extractor surfaces the real file-not-found error downstream.
- P2: document the single-writer caveat on EngineConfig.extractCacheDir.
  Concurrent renders targeting the same cache dir can race on partial
  frame reads — the .hf-complete sentinel prevents serving an incomplete
  entry but individual frames are written non-atomically. Also flagged
  the coarser-mtime-on-network-filesystems hazard for future ops.
- Updated the existing "tolerates a missing source" test to assert the
  new contract (readKeyStat returns null) rather than the old "zero-stat
  sentinel" behaviour.

All 480 engine tests pass.
jrusso1020 added a commit that referenced this pull request Apr 24, 2026
Downstack extension of the #445 past-EOF splice fix. The parallel-array
splice that removes HDR-preflight-skipped entries before Phase 3 was
introduced on #445 to splice (resolvedVideos, videoMetadata,
videoColorSpaces). This commit adds the fourth parallel array
(cacheKeyInputs) that #446 introduces, so the cache lookup at
`cacheKeyInputs[i]` in tryCachedExtract doesn't point at a stale slot
after the splice.

All 480 engine tests pass.
@jrusso1020 jrusso1020 force-pushed the perf/v2/extraction-cache branch from b231f61 to 1a669be Compare April 24, 2026 04:15
@graphite-app graphite-app Bot changed the base branch from graphite-base/446 to main April 24, 2026 04:15
Downstack extension of the #445 past-EOF splice fix. The parallel-array
splice that removes HDR-preflight-skipped entries before Phase 3 was
introduced on #445 to splice (resolvedVideos, videoMetadata,
videoColorSpaces). This commit adds the fourth parallel array
(cacheKeyInputs) that #446 introduces, so the cache lookup at
`cacheKeyInputs[i]` in tryCachedExtract doesn't point at a stale slot
after the splice.

All 480 engine tests pass.
@jrusso1020 jrusso1020 force-pushed the perf/v2/extraction-cache branch from 1a669be to ba05796 Compare April 24, 2026 04:15
jrusso1020 added a commit that referenced this pull request Apr 24, 2026
The CLI already ships `render_complete` events to PostHog but only carries
a subset of `RenderPerfSummary` (duration_ms, capture_avg_ms, etc.). With
#444 adding per-phase breakdown and #446 adding cache counters, the data
is sitting in `job.perfSummary.videoExtractBreakdown` and
`job.perfSummary.tmpPeakBytes` but never reaching PostHog.

This forwards the new fields as flat properties (flat is easier to query
with PostHog insights than nested objects):

- tmp_peak_bytes
- stage_compile_ms, stage_video_extract_ms, stage_audio_process_ms,
  stage_capture_ms, stage_encode_ms, stage_assemble_ms
- extract_resolve_ms, extract_hdr_probe_ms, extract_hdr_preflight_ms,
  extract_hdr_preflight_count, extract_vfr_probe_ms, extract_vfr_preflight_ms,
  extract_vfr_preflight_count, extract_phase3_ms, extract_cache_hits,
  extract_cache_misses

`extract_phase3_ms` disambiguates from `stage_video_extract_ms`: the former
is just the parallel ffmpeg extract inside Phase 3, the latter is the full
stage (resolve + probe + preflight + extract).

All fields optional — the Docker subprocess call site (no perfSummary
available) still compiles and ships events without them.
@jrusso1020 jrusso1020 merged commit 57cdf7d into main Apr 24, 2026
34 checks passed
Copy link
Copy Markdown
Collaborator Author

Merge activity

@jrusso1020 jrusso1020 deleted the perf/v2/extraction-cache branch April 24, 2026 04:35
jrusso1020 added a commit that referenced this pull request Apr 24, 2026
The CLI already ships `render_complete` events to PostHog but only carries
a subset of `RenderPerfSummary` (duration_ms, capture_avg_ms, etc.). With
#444 adding per-phase breakdown and #446 adding cache counters, the data
is sitting in `job.perfSummary.videoExtractBreakdown` and
`job.perfSummary.tmpPeakBytes` but never reaching PostHog.

This forwards the new fields as flat properties (flat is easier to query
with PostHog insights than nested objects):

- tmp_peak_bytes
- stage_compile_ms, stage_video_extract_ms, stage_audio_process_ms,
  stage_capture_ms, stage_encode_ms, stage_assemble_ms
- extract_resolve_ms, extract_hdr_probe_ms, extract_hdr_preflight_ms,
  extract_hdr_preflight_count, extract_vfr_probe_ms, extract_vfr_preflight_ms,
  extract_vfr_preflight_count, extract_phase3_ms, extract_cache_hits,
  extract_cache_misses

`extract_phase3_ms` disambiguates from `stage_video_extract_ms`: the former
is just the parallel ffmpeg extract inside Phase 3, the latter is the full
stage (resolve + probe + preflight + extract).

All fields optional — the Docker subprocess call site (no perfSummary
available) still compiles and ships events without them.
jrusso1020 added a commit that referenced this pull request Apr 24, 2026
The CLI already ships `render_complete` events to PostHog but only carries
a subset of `RenderPerfSummary` (duration_ms, capture_avg_ms, etc.). With
#444 adding per-phase breakdown and #446 adding cache counters, the data
is sitting in `job.perfSummary.videoExtractBreakdown` and
`job.perfSummary.tmpPeakBytes` but never reaching PostHog.

This forwards the new fields as flat properties (flat is easier to query
with PostHog insights than nested objects):

- tmp_peak_bytes
- stage_compile_ms, stage_video_extract_ms, stage_audio_process_ms,
  stage_capture_ms, stage_encode_ms, stage_assemble_ms
- extract_resolve_ms, extract_hdr_probe_ms, extract_hdr_preflight_ms,
  extract_hdr_preflight_count, extract_vfr_probe_ms, extract_vfr_preflight_ms,
  extract_vfr_preflight_count, extract_phase3_ms, extract_cache_hits,
  extract_cache_misses

`extract_phase3_ms` disambiguates from `stage_video_extract_ms`: the former
is just the parallel ffmpeg extract inside Phase 3, the latter is the full
stage (resolve + probe + preflight + extract).

All fields optional — the Docker subprocess call site (no perfSummary
available) still compiles and ships events without them.
jrusso1020 added a commit that referenced this pull request Apr 24, 2026
…#454)

## What

Forwards the new per-phase extraction breakdown and `tmpPeakBytes` fields from `RenderPerfSummary` (added in #444 and #446) to PostHog via the CLI's existing `render_complete` telemetry event.

## Why

The CLI already ships `render_complete` events to PostHog (`packages/cli/src/telemetry/client.ts`), but `events.ts:trackRenderComplete` only carried a subset of `RenderPerfSummary` — top-level timings, composition dims, and memory snapshots. After #444 added per-phase extraction breakdown (`videoExtractBreakdown`) and #446 added cache hit/miss counters, the data lives on `job.perfSummary` at render-complete but never reaches PostHog dashboards.

Without this, any PostHog insight built around "how often are we hitting the cache?", "what's the median HDR preflight cost?", or "where in the extract phase do compositions spend time?" has to be answered by Datadog log scraping instead.

## How

- **`packages/cli/src/telemetry/events.ts`** — extend `trackRenderComplete` props with 17 new optional fields: `tmpPeakBytes`, the six named stage timings, and the ten `videoExtractBreakdown` fields. All sent as flat properties (`extract_cache_hits`, `stage_capture_ms`, etc.) — PostHog insights query flat keys more ergonomically than nested objects.
- **`packages/cli/src/commands/render.ts`** — wire `job.perfSummary.videoExtractBreakdown` / `stages` / `tmpPeakBytes` into the `trackRenderMetrics` → `trackRenderComplete` hand-off.
- Naming: `extract_phase3_ms` deliberately disambiguates from `stage_video_extract_ms` — the former is just the parallel ffmpeg extract inside Phase 3; the latter is the full stage (resolve + probe + preflight + extract).
- All new fields are optional. The Docker-subprocess branch of `render.ts` that doesn't have a local `perfSummary` still compiles and ships events without them.

## Test plan

- [x] `bun run --cwd packages/cli test` — 161/161 pass
- [x] `bunx tsc -p packages/cli/tsconfig.json --noEmit` — no errors
- [x] `bunx oxlint` + `bunx oxfmt` — clean
- [ ] Once merged, verify PostHog receives the new properties on a real render event (run `hyperframes render` against a fixture and watch PostHog ingestion — telemetry auto-disables in CI, so this requires a local dev render with `HYPERFRAMES_NO_TELEMETRY` unset).

## Stack

Depends on #444 (adds the `videoExtractBreakdown` + `tmpPeakBytes` fields to `RenderPerfSummary`) and transitively on #445#446.

## Future work (not in this PR)

- The HeyGen internal producer server (`hyperframes-internal/packages/producer/src/server.ts`) logs `perfSummary` to Datadog via `log.info` but has no PostHog integration. Production renders are the bulk of the traffic — separate PR to either ship perfSummary to PostHog from the internal server, or materialize Datadog log-based metrics for per-phase timings.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants