Add Helios 14B streaming T2V integration by shy982 · Pull Request #347 · NVIDIA/flashdreams

shy982 · 2026-06-24T17:37:31Z

Summary

Adds FlashDreams integration for Helios 14B real-time streaming text-to-video, addressing #276.

- New plugin under integrations/helios/ wrapping HeliosPyramidPipeline from diffusers
- Three registered runner slugs:
  - helios-distilled-t2v-14b: BestWishYsh/Helios-Distilled, pyramid [2,2,2]
  - helios-base-t2v-14b: BestWishYsh/Helios-Base, pyramid [20,20,20], CFG 5.0
  - helios-distilled-t2v-14b-2gpu: distilled checkpoint with Ulysses context parallelism (torchrun)
- Each generate() call yields one native 33-frame chunk through the FlashDreams streaming interface
- CPU smoke tests for runner registration and entry-point discovery
- Model gallery docs at docs/source/models/helios.rst

Requirements note

HeliosPyramidPipeline requires a recent diffusers build. If the resolved PyPI release does not export it, install from source:

pip install git+https://github.com/huggingface/diffusers.git
uv sync --project integrations/helios
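
Whether the installed diffusers release already exports the pipeline can be checked before falling back to a source install. The following is a minimal sketch using only the standard library; the helper name `exports_symbol` is hypothetical and is not part of this PR.

```python
import importlib


def exports_symbol(module_name: str, symbol: str) -> bool:
    """Return True if `module_name` imports and exposes `symbol` at top level."""
    try:
        module = importlib.import_module(module_name)
        # getattr (rather than inspecting __dict__) also resolves lazily
        # loaded attributes, which some large packages use.
        getattr(module, symbol)
    except (ImportError, AttributeError):
        # Package missing, or the symbol is not exported by this release.
        return False
    return True
```

Calling `exports_symbol("diffusers", "HeliosPyramidPipeline")` would then indicate whether the source install above is needed. Other exception types raised during a lazy import are deliberately left to propagate.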

Register three flashdreams-run slugs wrapping HeliosPyramidPipeline, with smoke tests and model gallery docs. Closes NVIDIA#276.

copy-pr-bot · 2026-06-24T17:37:34Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

greptile-apps · 2026-06-24T17:42:21Z

Greptile Summary

This PR adds a first-party FlashDreams integration for the Helios 14B real-time streaming text-to-video model, wrapping HeliosPyramidPipeline from diffusers and exposing three runner slugs (helios-distilled-t2v-14b, helios-base-t2v-14b, helios-distilled-t2v-14b-2gpu) through the standard flashdreams-run entry point. The core AR loop, runner, and smoke tests follow existing integration conventions correctly.

- pipeline.py: implements HeliosStreamingPipeline with a 33-frame chunk AR loop, video-conditioning carry-over between steps, compile/FlashAttention toggles, and a silent fallback when video conditioning fails. The fallback path has a continuity bug: discontinuous frames are still appended to decoded_chunks.
- run_benchmark.py: the FlashDreams timed path crashes on the second call to step(), because step() passes the fixed STEP_TO_MEASURE index to generate() every time and generate() enforces strictly sequential AR indices.
- config.py: PIPELINE_HELIOS_DISTILLED_T2V_14B_OPTIMIZED is defined but never registered as an entry point or included in RUNNER_CONFIGS, leaving it as dead code.

Confidence Score: 3/5

The core runner and smoke tests work, but the benchmark tool crashes on its second step() call and the video-conditioning fallback silently corrupts the AR history for all remaining chunks.

The benchmark's step() function hard-codes the same AR index on every call; because generate() enforces strictly-sequential indices the benchmark crashes on the second call, producing no timing data. Separately, when video conditioning raises an exception, the fallback silently generates a frame without temporal context and still appends it to decoded_chunks, so every AR step thereafter is conditioned on a discontinuous frame — degrading quality for the rest of the generation. Both are defects in the changed code that manifest during normal use.

integrations/helios/tests/benchmark/run_benchmark.py (benchmark crashes on second call), integrations/helios/helios/pipeline.py (fallback path corrupts history, fragile pixel-range heuristic).

Important Files Changed

| Filename | Overview |
| --- | --- |
| integrations/helios/tests/benchmark/run_benchmark.py | FlashDreams benchmark is broken: `step()` passes the fixed `STEP_TO_MEASURE` index on every call, triggering an `AssertionError` on the second invocation because the pipeline enforces strictly-sequential AR indices. |
| integrations/helios/helios/pipeline.py | Core streaming pipeline; two issues found: the pixel-range heuristic in `_normalize_frames_to_flashdreams` can silently double-remap bright frames, and the video-conditioning fallback poisons `decoded_chunks` with discontinuous frames. |
| integrations/helios/helios/config.py | Three runner configs correctly registered; `PIPELINE_HELIOS_DISTILLED_T2V_14B_OPTIMIZED` is defined but not wired to any entry point or runner slug. |
| integrations/helios/helios/helios_loader.py | Model loading logic is sound; redundant try/except fallback in `get_helios_vae_class` can be removed. |
| integrations/helios/helios/runner.py | Runner drives AR loop correctly with `generate` + `finalize` pairing; output video writing and stats serialization look correct. |
| integrations/helios/helios/encoder.py | Thin T5 encoder wrapper; CFG gating logic is clear and correct. |
| integrations/helios/helios/cache.py | Simple dataclass for AR state; no issues. |
| integrations/helios/helios/compiler.py | FlashAttention and `torch.compile` helpers look correct; `dynamic=True` is appropriate for varying AR chunk sizes. |
| integrations/helios/tests/test_smoke.py | CPU smoke tests thoroughly validate runner registration, entry-point discovery, and the chunk-frame constant. |
| integrations/helios/pyproject.toml | Package metadata and entry points are consistent with the three registered runner slugs. |

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Runner as HeliosT2VRunner
    participant Pipeline as HeliosStreamingPipeline
    participant Encoder as HeliosEncoder
    participant Diffusers as HeliosPyramidPipeline
    participant Cache as HeliosPipelineCache

    Runner->>Pipeline: "initialize_cache(text=[prompt])"
    Pipeline->>Encoder: encode(prompt, negative_prompt, guidance_scale)
    Encoder->>Diffusers: encode_prompt(...)
    Diffusers-->>Encoder: prompt_embeds, negative_prompt_embeds
    Encoder-->>Pipeline: HeliosConditionings
    Pipeline-->>Runner: "HeliosPipelineCache(cond=...)"

    loop "AR step i = 0..total_blocks-1"
        Runner->>Pipeline: generate(i, cache, width, height)
        Pipeline->>Pipeline: _video_conditioning(cache)
        Note over Pipeline: Returns prior decoded frames as [B,T,C,H,W] if >= 33 frames exist
        Pipeline->>Diffusers: "pipe(prompt_embeds, video=video_cond, ...)"
        alt Diffusers call fails and video_cond present
            Pipeline->>Diffusers: "pipe(...) without video="
            Note over Pipeline,Cache: Discontinuous frames still appended to decoded_chunks
        end
        Diffusers-->>Pipeline: result.frames [T,C,H,W]
        Pipeline->>Pipeline: _normalize_frames_to_flashdreams(frames)
        Pipeline->>Cache: decoded_chunks.append(normalized)
        Pipeline->>Cache: "pending_history = normalized[-history_len:]"
        Pipeline-->>Runner: normalized [T,C,H,W]
        Runner->>Pipeline: finalize(i, cache)
        Pipeline->>Cache: "history_frames = pending_history"
    end

    Runner->>Runner: torch.cat(chunks) then write MP4

Comments Outside Diff (3)

integrations/helios/helios/pipeline.py, line 983-1003 (link)

Video-conditioning fallback silently poisons future AR steps

When self.pipe(**pipe_kwargs) fails and video conditioning is to blame, the code retries without "video" and proceeds. The resulting frames — generated without temporal context — are then appended to cache.decoded_chunks and stored in cache.pending_history (lines 1000–1001). Every subsequent AR step will call _video_conditioning(cache) and feed those discontinuous frames as context, compounding the discontinuity. At minimum, cache.decoded_chunks should not receive a frame generated without its expected history, or the history should be explicitly invalidated so future steps know the continuity chain is broken.
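
One way to read the "explicitly invalidated" option is sketched below. It is an illustration under stated assumptions, not the PR's code: `SketchCache`, its `continuity_broken` flag, and `generate_chunk` are hypothetical names, and only the field names `decoded_chunks` and `pending_history` come from the review text.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class SketchCache:
    """Hypothetical stand-in for the pipeline cache described in the review."""

    decoded_chunks: List[Any] = field(default_factory=list)
    pending_history: Optional[Any] = None
    continuity_broken: bool = False  # assumed flag, not present in the PR


def generate_chunk(
    pipe: Callable[..., Any], cache: SketchCache, video_cond: Optional[Any]
) -> Any:
    """Run one chunk; if conditioning fails, reset history before retrying."""
    kwargs = {} if video_cond is None else {"video": video_cond}
    try:
        frames = pipe(**kwargs)
    except Exception:
        if video_cond is None:
            raise  # no conditioning was involved, so there is no fallback
        # Drop stale history so later steps never condition across the break,
        # and record that the continuity chain was interrupted.
        cache.decoded_chunks.clear()
        cache.continuity_broken = True
        frames = pipe()
    cache.decoded_chunks.append(frames)
    cache.pending_history = frames
    return frames
```

With this shape, the unconditioned chunk starts a fresh history instead of being appended to the old one, and callers can inspect `continuity_broken` to warn or abort. This assumes the runner keeps its own list of returned chunks for the final video, as the sequence diagram suggests.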

integrations/helios/helios/pipeline.py, line 1006-1014 (link)

Pixel-range heuristic can silently double-remap frames

The condition frames.max() <= 1.0 and frames.min() >= 0.0 is used to detect [0, 1]-range output and remap to [-1, 1]. For a uniformly bright scene (e.g., snow, sky) every pixel in the native [-1, 1] representation can be positive, causing min() >= 0.0 to be True and incorrectly triggering the 2× scale-shift. The resulting tensor would have values outside the expected range after the _video_conditioning round-trip, producing colour artifacts on the first video-conditioned AR step. A more reliable check is to use a known output_type contract from the diffusers pipeline rather than inferring it from tensor values.
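
The difference between inferring the range and declaring it can be shown with plain Python lists standing in for tensors. This is a sketch under assumptions: `PixelRange`, `heuristic_looks_unit`, and `to_signed_range` are hypothetical names, and how the declared range would be derived from the diffusers output contract is left open.

```python
from typing import List, Literal

# "unit" means values in [0, 1]; "signed" means values in [-1, 1].
PixelRange = Literal["unit", "signed"]


def heuristic_looks_unit(values: List[float]) -> bool:
    """The value-based check from the review: misfires on bright signed frames."""
    return max(values) <= 1.0 and min(values) >= 0.0


def to_signed_range(values: List[float], source_range: PixelRange) -> List[float]:
    """Map pixels to [-1, 1] using a declared range, never the data itself."""
    if source_range == "signed":
        return list(values)  # already in the target range
    if source_range == "unit":
        return [2.0 * v - 1.0 for v in values]
    raise ValueError(f"unknown pixel range: {source_range!r}")
```

A bright frame such as `[0.2, 0.9]` that is already in the signed range passes `heuristic_looks_unit`, so the heuristic would remap it a second time, whereas `to_signed_range(frame, "signed")` leaves it untouched.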

integrations/helios/helios/config.py, line 492-502 (link)

PIPELINE_HELIOS_DISTILLED_T2V_14B_OPTIMIZED is unreachable dead code

This config object is defined and has a comment referencing "Panel C," but it is not included in RUNNER_CONFIGS, not listed as an entry point in pyproject.toml, and is not exported from __init__.py. It can never be loaded by flashdreams-run or discovered by the plugin system. If it's intentionally deferred, a # TODO: note would make that clear; otherwise it should be either wired up or removed.
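
A smoke test could catch this kind of drift automatically. The sketch below assumes that the registered configs can be reduced to an iterable of config objects and that config constants share a `PIPELINE_` prefix; `unregistered_configs` is a hypothetical helper, not something in the PR.

```python
from typing import Any, Iterable, List


def unregistered_configs(
    config_module: Any, registered: Iterable[Any], prefix: str = "PIPELINE_"
) -> List[str]:
    """List prefixed names in `config_module` whose objects are not registered."""
    # Compare by identity so configs do not need to be hashable or comparable.
    registered_ids = {id(obj) for obj in registered}
    return sorted(
        name
        for name, value in vars(config_module).items()
        if name.startswith(prefix) and id(value) not in registered_ids
    )
```

A test asserting that the result is empty (or equals an explicit allow-list of deferred configs) would make an intentionally unwired config visible in review.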

Reviews (1): Last reviewed commit: "Add Helios 14B streaming T2V integration"

greptile-apps · 2026-06-24T17:42:24Z

+    def step() -> None:
+        pipe.generate(STEP_TO_MEASURE, cache, width=640, height=384)


Benchmark crashes on the second call to step()

pipe.generate() asserts that each call increments autoregressive_index by exactly 1 (expected = (prev + 1)). The warmup loop at lines 87–89 drives the cache up to STEP_TO_MEASURE - 1 = 5. The first invocation of step() (passing STEP_TO_MEASURE = 6) succeeds and sets cache.autoregressive_index = 6. Every subsequent call in measure() again passes 6, so expected = 7 but the index passed is still 6, which raises AssertionError. The benchmark crashes on the second warmup iteration and no timing data is collected.

A minimal fix is to maintain an incrementing counter inside step() and call pipe.finalize() after each generate so the cache stays consistent across the N_WARMUP + N_MEASURE = 13 timed repetitions.
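
The counter-based fix can be sketched against a stub that reproduces only the sequential-index assertion. `FakeCache`, `FakePipeline`, and `make_step` are hypothetical names for illustration; the real generate() also takes width and height, and the initial value of autoregressive_index in the PR may differ from the `None` assumed here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FakeCache:
    autoregressive_index: Optional[int] = None  # assumed initial state


class FakePipeline:
    """Stub mimicking only the strictly-sequential index check."""

    def generate(self, index: int, cache: FakeCache) -> None:
        prev = cache.autoregressive_index
        expected = 0 if prev is None else prev + 1
        assert index == expected, f"expected AR index {expected}, got {index}"
        cache.autoregressive_index = index

    def finalize(self, index: int, cache: FakeCache) -> None:
        pass  # the real pipeline commits pending history here


def make_step(
    pipe: FakePipeline, cache: FakeCache, start_index: int
) -> Callable[[], None]:
    """Build a step() closure that advances the AR index on every call."""
    next_index = start_index

    def step() -> None:
        nonlocal next_index
        pipe.generate(next_index, cache)
        pipe.finalize(next_index, cache)
        next_index += 1

    return step
```

After warming the cache up through index 5, `make_step(pipe, cache, 6)` can be called 13 times without tripping the assertion. Note that the timed calls then cover indices 6 through 18 rather than a single fixed step, so the benchmark would report an average over consecutive AR steps.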

greptile-apps · 2026-06-24T17:42:28Z

+def get_helios_vae_class() -> Type[Any]:
+    try:
+        from diffusers import AutoencoderKLWan
+
+        return AutoencoderKLWan
+    except ImportError:
+        from diffusers.models import AutoencoderKLWan
+
+        return AutoencoderKLWan


Redundant fallback import that cannot actually help

If from diffusers import AutoencoderKLWan raises ImportError, it means AutoencoderKLWan is not in the top-level diffusers namespace. The fallback from diffusers.models import AutoencoderKLWan imports from the same underlying module; if the symbol is absent at the top level due to a version mismatch, the submodule import will still succeed, but the caller will silently use a different code path than expected. Conversely, if diffusers itself is not installed, both lines raise and the fallback provides no value. The redundant try/except should be collapsed.

Suggested change

-def get_helios_vae_class() -> Type[Any]:
-    try:
-        from diffusers import AutoencoderKLWan
-
-        return AutoencoderKLWan
-    except ImportError:
-        from diffusers.models import AutoencoderKLWan
-
-        return AutoencoderKLWan
+def get_helios_vae_class() -> Type[Any]:
+    from diffusers import AutoencoderKLWan
+
+    return AutoencoderKLWan

Add Helios 14B streaming T2V integration

f1fa6c1

shy982 wants to merge 1 commit into NVIDIA:main from shy982:add-helios-integration
