QVAC-19998 feat: Add LTX support with Metal kernels and unified GGML by aegioscy · Pull Request #12 · tetherto/qvac-ext-stable-diffusion.cpp

aegioscy · 2026-06-05T07:08:15Z

Summary

Adds full LTX-2.3 video generation support to the stable-diffusion.cpp fork with complete Metal GPU acceleration.

Key changes:

Unified GGML: leejet/ggml v0.12 + Metal IM2COL_3D/PAD + ROPE_FLUX kernels
LTX engine: All qvac vcpkg patches reconciled on upstream LTX base
ESRGAN backend preference API preserved
Addon API migrated to LTX's 5-arg generate_video

Tested:

Generated 10-second LTX-2.3 video (768×512, 241 frames, full Metal GPU, no --vae-on-cpu)
vcpkg overlay build validated from GitHub sources
Backwards compatible with qvac-ext-stable-diffusion

Implementation

6 commits implementing:

qvac vcpkg port patches reconciliation
ESRGAN upscaler device API
Backend preference defaults
Metal IM2COL_3D + PAD kernels for LTX video VAE
Unified ggml submodule integration
System ggml compatibility

All based on upstream/master and ready for team review.

Made with Cursor

Replays the downstream vcpkg overlay patches (originally b474457) on top of upstream master, adapting to upstream's refactored backend system:

- preferred_gpu_backend (sd_backend_preference_t) is preserved as public API. Upstream's richer backend/params_backend string mechanism stays primary; when no explicit --backend is set, init_backend() now derives the spec from preferred_gpu_backend and the SD_CPU_ONLY env var (auto/cpu/gpu/opencl).
- abort-callback: sd_set_abort_callback() / sd_abort_requested() restored; the denoise step now returns an empty GuiderOutput when an abort is requested, so sample_k_diffusion bails through the normal cleanup path.
- ggml->sd log bridge restored (surfaces backend init failures via the host log callback; e.g. Android Vulkan diagnostics).

Dropped as obsolete (already fixed/superseded upstream):

- generic #ifdef backend init (replaced by backend_manager).
- failure-path free_compute_buffer fix (upstream already frees work_diffusion_model on the failure path).

Builds clean (sd-cli, Release, Metal).

Co-authored-by: Cursor <cursoragent@cursor.com>
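The abort path described in this commit can be pictured with a small, self-contained sketch. It is not the fork's code: `set_abort_callback`, `abort_requested`, `StepOutput`, `denoise_step` and `run_sampler` are hypothetical stand-ins for `sd_set_abort_callback()`, `sd_abort_requested()`, `GuiderOutput`, the denoise step and `sample_k_diffusion`, and the real signatures may differ. The point is only that an empty step result lets the sampling loop exit through its ordinary return path.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the host-supplied abort callback.
using AbortCallback = std::function<bool()>;
static AbortCallback g_abort_cb;

void set_abort_callback(AbortCallback cb) { g_abort_cb = std::move(cb); }

// True only when a callback is installed and it asks for an abort.
bool abort_requested() { return g_abort_cb && g_abort_cb(); }

// Hypothetical stand-in for GuiderOutput: "empty" means "stop sampling".
struct StepOutput {
    std::vector<float> data;
    bool empty() const { return data.empty(); }
};

// One denoise step: returns an empty output when an abort is pending.
StepOutput denoise_step(int step) {
    StepOutput out;
    if (abort_requested()) {
        return out;  // empty result signals the caller to bail out
    }
    out.data.push_back(static_cast<float>(step));  // placeholder "result"
    return out;
}

// Sampling loop: leaves via the normal return path, so any cleanup that
// follows the loop in real code still runs. Returns completed step count.
int run_sampler(int steps) {
    int completed = 0;
    for (int i = 0; i < steps; ++i) {
        StepOutput out = denoise_step(i);
        if (out.empty()) {
            break;
        }
        ++completed;
    }
    return completed;
}
```

Under these assumptions, a host that installs a callback returning true on its fourth invocation would see `run_sampler(10)` stop after three completed steps.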

…nto upstream

Combines the net effect of the two downstream ESRGAN commits and adapts them to upstream's refactored upscaler (backend_manager / SDBackendModule::UPSCALER):

- Public API preserved for the vcpkg/RuntimeStats consumer:
  - sd_upscaler_device_t { CPU=0, GPU=1 }
  - new_upscaler_ctx_with_device(..., device, gpu_backend_pref)
  - get_upscaler_backend_device() -> 0=CPU, 1=GPU, -1=error
- Matches the final shipped qvac enum: sd_backend_preference_t drops AUTO (CPU=0, GPU, OPENCL) and defaults to GPU; init_backend()/sd_ctx_params_init updated accordingly.
- new_upscaler_ctx_with_device maps the high-level device/preference onto upstream's backend spec string; UpscalerGGML tracks actual_backend_device (resolved post-init via sd_backend_is_cpu) instead of the old custom device-enumeration init (superseded by backend_manager).

Builds clean (sd-cli, Release, Metal).

Co-authored-by: Cursor <cursoragent@cursor.com>
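The 0/1/-1 reporting convention above can be sketched as follows. This is an illustration, not the fork's implementation: `UpscalerSketch`, `init_upscaler` and `backend_device` are hypothetical names, and the CPU fallback when no GPU is available is an assumption made for the example. What it shows is the distinction the commit draws between the requested device and the device resolved after initialization.

```cpp
// Mirrors the values listed in the commit: CPU=0, GPU=1.
enum UpscalerDevice { UPSCALER_CPU = 0, UPSCALER_GPU = 1 };

// Hypothetical context: records what the backend resolved to after init.
struct UpscalerSketch {
    bool initialized = false;
    bool backend_is_cpu = true;
};

// Hypothetical init. Assumption: a GPU request falls back to CPU when no
// GPU backend is available; the real fork may fail instead.
UpscalerSketch init_upscaler(UpscalerDevice requested, bool gpu_available) {
    UpscalerSketch ctx;
    ctx.backend_is_cpu = !(requested == UPSCALER_GPU && gpu_available);
    ctx.initialized = true;
    return ctx;
}

// Reports the device actually in use: 0=CPU, 1=GPU, -1=error.
int backend_device(const UpscalerSketch* ctx) {
    if (ctx == nullptr || !ctx->initialized) {
        return -1;
    }
    return ctx->backend_is_cpu ? 0 : 1;
}
```

A consumer such as the RuntimeStats reporter would then read the resolved value rather than echoing back what it asked for.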

…n_cpu)

Follow-up to the b474457/ESRGAN reconcile:

- init_backend(): map SD_BACKEND_PREF_GPU (and the default) to an EMPTY backend spec rather than "gpu". Upstream auto-selection is already GPU-first, and an empty spec (unlike an explicit "gpu") leaves the keep_clip/vae/control_net_on_cpu overrides effective -- an explicit spec makes runtime_assignment_ non-empty and silently disables --vae-on-cpu.
- common.cpp: the CLI builds sd_ctx_params via aggregate init, which left the new preferred_gpu_backend field at 0 (== SD_BACKEND_PREF_CPU in the shipped qvac enum), forcing the whole pipeline onto CPU. Set it to SD_BACKEND_PREF_GPU explicitly.

Co-authored-by: Cursor <cursoragent@cursor.com>
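The common.cpp issue is a general C++ aggregate-initialization pitfall: members omitted from a braced initializer are value-initialized, so a newly appended enum field silently becomes 0. The sketch below reproduces that with a hypothetical `CtxParams` struct and `BackendPref` enum. The enum ordering follows the commit text (CPU=0, GPU, OPENCL), but the other fields and both helper functions are invented for illustration.

```cpp
// Ordering taken from the commit message: CPU=0, GPU, OPENCL.
enum BackendPref { PREF_CPU = 0, PREF_GPU, PREF_OPENCL };

// Hypothetical parameter struct; only the last field matters here.
struct CtxParams {
    const char* model_path;
    int n_threads;
    BackendPref preferred_gpu_backend;  // field appended later
};

// Call site written before the field existed: the trailing member is
// omitted, so it is value-initialized to 0, which is PREF_CPU.
CtxParams make_params_old_style(const char* path) {
    CtxParams p = {path, 4};
    return p;
}

// Same call site with the preference set explicitly, as the commit does.
CtxParams make_params_fixed(const char* path) {
    CtxParams p = {path, 4};
    p.preferred_gpu_backend = PREF_GPU;
    return p;
}
```

With an enum whose zero value is CPU, the old-style call site compiles cleanly yet selects the CPU path, which is why the fix has to be explicit.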

Cherry-picks qvac-ext-ggml@bc053644 ("metal: add IM2COL_3D op and PAD left-padding support for Wan video") onto leejet/ggml@0ce7ad3 (v0.12.0, which the fork builds against to match upstream LTX).

leejet's Metal backend implements IM2COL_3D/PAD on CPU/CUDA only, so the LTX video VAE (IM2COL_3D) and audio VAE (PAD) aborted on Metal and required --vae-on-cpu. With this kernel port the entire LTX-2.3 pipeline (diffusion + video VAE + audio VAE) runs on Metal: verified 512x320x25 T2V with audio on M3 Ultra, ~49s, no --vae-on-cpu.

NOTE: the ggml submodule now points at a local branch commit. To make the fork cloneable, this commit must be pushed to a ggml fork (e.g. tetherto/qvac-ext-ggml) and .gitmodules updated to that URL.

Co-authored-by: Cursor <cursoragent@cursor.com>
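For readers unfamiliar with what "PAD left-padding" means, here is a minimal CPU reference for the 1-D case. It is only a sketch of the semantics, assuming zero fill; it is not the Metal kernel from the cherry-picked commit, and `pad_1d` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <vector>

// Zero-pad a 1-D signal with `left` leading and `right` trailing elements.
// Plain right-padding is the special case left == 0.
std::vector<float> pad_1d(const std::vector<float>& src,
                          std::size_t left, std::size_t right) {
    std::vector<float> dst(left + src.size() + right, 0.0f);
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[left + i] = src[i];  // source is shifted right by `left`
    }
    return dst;
}
```

A kernel that supports only trailing padding can write the source at offset 0; left-padding requires the destination index to be offset, which is the piece the commit describes adding for Metal.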

…/ltx-metal-unified)

Pins the ggml submodule to the unified branch = leejet/ggml v0.12 + Metal IM2COL_3D/PAD + GGML_OP_ROPE_FLUX + qvac packaging deltas. This single ggml supports LTX on Metal and stays backwards-compatible with qvac-ext-stable-diffusion (feature/ltx-support, which calls ggml_rope_flux) and the qvac monorepo DL integration.

Co-authored-by: Cursor <cursoragent@cursor.com>

…l pin

- ggml_graph_cut.cpp: include "ggml-impl.h" by name instead of the submodule path "../ggml/src/ggml-impl.h" so it resolves under both the bundled-submodule build and SD_USE_SYSTEM_GGML=ON (where the ggml port installs ggml-impl.h)
- CMakeLists: add ggml/src to the include path when building the submodule
- bump ggml submodule to the unified-ggml commit that exports GGML_MAX_NAME and ggml-impl.h via the package config

Co-authored-by: Cursor <cursoragent@cursor.com>

aegioscy and others added 6 commits June 3, 2026 14:12

aegioscy changed the title ~~feat: Add LTX support with Metal kernels and unified GGML~~ QVAC-19998 feat: Add LTX support with Metal kernels and unified GGML Jun 5, 2026

aegioscy wants to merge 6 commits into 2026-06-04 from feature/ltx

aegioscy commented Jun 5, 2026
