A video player engine for Apple platforms.
FFmpeg demuxes. VideoToolbox decodes. AVPlayer handles Dolby Atmos.
You ship the UI.
A player engine that gets the hard parts right (HDR, Dolby Vision, Dolby Atmos, container coverage, codec coverage) and exposes a single AetherPlayerView (UIKit / AppKit) or AetherPlayerSurface (SwiftUI) plus a handful of async methods. No AVPlayerViewController. No opinionated controls. No analytics. Bind the view, call play(), read the published properties for state.
The view is polymorphic: under the hood the engine swaps the hosted CALayer (AVPlayerLayer for the native AVPlayer path, AVSampleBufferDisplayLayer for the SW dav1d fallback path) per session without the host having to know.
You provide the transport bar. You provide the dropdowns. You provide the pretty.
| Area | Details |
|---|---|
| Containers | MKV, MP4, WebM, MPEG-TS, AVI, OGG, FLV (demux side) |
| HW decode | H.264, HEVC, HEVC Main10 via VideoToolbox in AVPlayer's HLS-fMP4 path. AV1 on devices with HW AV1 (M3+ Mac, iPhone 15 Pro+, future Apple TV chips) also routes natively |
| SW decode | AV1 (libavcodec/dav1d) on devices without HW AV1 — currently all Apple TVs, M1/M2 Macs, pre-A17-Pro iPhones. VP9 (libavcodec native) unconditionally, since AVPlayer's HLS pipeline rejects the vp09 CODECS attribute even where VT can HW-decode it. Both render through SoftwareVideoDecoder + AVSampleBufferDisplayLayer. Dispatch decision lives in AetherEngine.load, gated per source on VTCapabilityProbe |
| HDR10 | BT.2020 + PQ signaled via the HLS-fMP4 wrapper; AVPlayer hands the bitstream to the system HDR pipeline |
| HDR10+ | Per-frame ST 2094-40 dynamic metadata preserved through stream-copy into the HLS-fMP4 wrapper |
| Dolby Vision | HEVC P5 / P8.1 / P8.4 with dvh1 / hvc1 track type + dvcC box. AV1 P10.0 / P10.1 / P10.4 with dav1 / av01 track type + dvvC box (per Apple HLS Authoring Spec + Dolby ETSI TS 103 572). Both engage the tvOS HDMI DV handshake on DV-capable displays |
| HLG | Transfer function detected and signaled |
| HDR to SDR | Handled by AVPlayer / system compositor based on the connected display; no host-side tonemap |
| Audio | AAC, AC3, EAC3, FLAC, MP3, Opus, Vorbis, TrueHD, DTS, DTS-HD MA, ALAC, PCM |
| Dolby Atmos | EAC3+JOC stream-copied through the HLS-fMP4 wrapper, played back by AVPlayer with Dolby MAT 2.0 unwrap downstream |
| Surround | 5.1 / 7.1 with correct AudioChannelLayout preserved through the wrapper |
| Subtitles | SubRip / ASS / SSA / WebVTT / mov_text streamed inline; PGS / HDMV PGS / DVB / DVD rendered as CGImage with normalised position; sidecar .srt / .ass / .vtt URLs decoded via short-lived context |
| Seek | Producer teardown + restart for backward / far-forward scrubs; short-range forward scrubs ride the cached segment window |
| Streaming | HTTP Range + chunked delegate reads via URLSession |
| Resilience | Exponential backoff on transient network errors, background pause, display-link aware lifecycle |
import AetherEngine
import SwiftUI
let player = try AetherEngine()
// SwiftUI: drop AetherPlayerSurface anywhere in the view tree
var body: some View {
AetherPlayerSurface(engine: player)
}
// UIKit / AppKit: bind an AetherPlayerView directly
let surface = AetherPlayerView()
player.bind(view: surface)
try await player.load(url: videoURL) // or
try await player.load(url: videoURL, startPosition: 347.5) // resume
try await player.load(url: videoURL, options: .init(httpHeaders: headers)) // auth
player.play()
player.pause()
player.setRate(1.5)
await player.seek(to: 120)
player.stop()
// Observe (Combine @Published)
player.$state // .idle, .loading, .playing, .paused, .seeking, .error
player.$currentTime
player.$duration
player.$videoFormat // .sdr, .hdr10, .hdr10Plus, .dolbyVision, .hlg
player.audioTracks // [TrackInfo]
player.selectAudioTrack(index: trackID)
// Subtitles, text and bitmap, one published list
player.subtitleTracks // [TrackInfo] for the loaded source
player.selectSubtitleTrack(index: streamID) // embedded, text or bitmap
player.selectSidecarSubtitle(url: srtURL) // .srt / .ass / .vtt next to the media
player.clearSubtitle()
player.$subtitleCues // [SubtitleCue], body is .text(String) or .image(SubtitleImage)
player.$isSubtitleActive // host mirror gate
player.$isLoadingSubtitles    // sidecar fetch + decode in progress

Install via Swift Package Manager:

.package(url: "https://github.com/superuser404notfound/AetherEngine", branch: "main")

AetherEngine has two playback pipelines, picked once at load(url:) based on the source's video codec:
Native AVPlayer pipeline (default). Demux the source with libavformat, re-mux the elementary streams on the fly into HLS-fMP4, serve them from a local HTTP server on 127.0.0.1:<port>, point AVPlayer at the playlist. Apple's stack does all decode, all HDR / Dolby Vision signaling over HDMI, all audio routing. This is the path for HEVC and H.264, which is what AVPlayer's HLS-fMP4 pipeline reliably accepts. Atmos passthrough, DV HDMI handshake, HDR10 / HDR10+ system-side tone-mapping all live on this path.
Source URL ──► Demuxer ──► HLSSegmentProducer ──► SegmentCache ──► HLSLocalServer
│
▼
AVPlayer
│
├─► VideoToolbox (HW decode)
└─► AVR / speakers (Atmos via MAT 2.0)
Software decoder pipeline (AV1 + VP9 fallback). Demux the source, run video packets through libavcodec (dav1d for AV1, FFmpeg's native VP9 decoder for VP9) into CVPixelBuffers, run audio through libavcodec into CMSampleBuffers, render via AVSampleBufferDisplayLayer + AVSampleBufferAudioRenderer with AVSampleBufferRenderSynchronizer as the master clock. Used for codecs AVPlayer's HLS-fMP4 pipeline doesn't accept: AV1 (no AV1 decoder on tvOS at all; Apple ships dav1d on iOS / macOS only, no Apple TV chip has HW AV1) and VP9 (AVPlayer parses the HLS manifest, sees vp09 in the CODECS attribute, then silently stops fetching — item.status never leaves .unknown. VideoToolbox HW-decodes VP9 fine, but only outside the HLS pipeline).
Source URL ──► Demuxer ──┬─► SoftwareVideoDecoder (dav1d) ──► SampleBufferRenderer
│ │
│ ▼
│ AVSampleBufferDisplayLayer
│ ▲
└─► AudioDecoder ──► AudioOutput ────────────┘
│ (synchronizer drives the layer's
▼ control timebase → A/V sync)
AVR / speakers
AV1 sources in the wild almost never carry Dolby Vision (DV is HEVC-profile-driven) or Atmos (Atmos masters overwhelmingly ship alongside HEVC video), so the SW pipeline's lack of those capabilities is a theoretical limitation rather than a practical one. The dispatch happens once at load time; hosts see a unified @Published state surface either way.
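The once-per-load dispatch described above can be sketched roughly as follows. All names here are illustrative, not AetherEngine's actual API; the real gate is VTCapabilityProbe, queried per source inside AetherEngine.load:

```swift
// Hypothetical sketch of the once-per-load backend dispatch.
// `hasHardwareAV1` stands in for what a probe like VTCapabilityProbe reports.

enum Backend { case nativeAVPlayer, softwareDecoder }

func pickBackend(videoCodec: String, hasHardwareAV1: Bool) -> Backend {
    switch videoCodec {
    case "h264", "hevc":
        return .nativeAVPlayer            // AVPlayer's HLS-fMP4 path accepts these
    case "av1":
        // HW AV1 (M3+ Mac, iPhone 15 Pro+) routes natively; everything else → dav1d
        return hasHardwareAV1 ? .nativeAVPlayer : .softwareDecoder
    case "vp9":
        // AVPlayer rejects vp09 in the HLS CODECS attribute even where VT could decode it
        return .softwareDecoder
    default:
        return .softwareDecoder
    }
}
```

The point of the sketch: the decision is a pure function of (codec, device capability), made exactly once per load, which is why the host never sees the pipelines diverge.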
Why HLS-fMP4 for the native path instead of feeding AVPlayer the source URL directly: AVPlayer's progressive-download path won't accept arbitrary MKV containers, and even for MP4 sources it's brittle around Dolby Vision sample-description quirks and EAC3 dec3 box variants. The HLS-fMP4 wrapper is the most permissive surface AVPlayer exposes; libavformat's hls muxer produces bytes byte-identical to ffmpeg -f hls -hls_segment_type fmp4, which is what Apple's HLS spec is defined against.
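For orientation, here is a hypothetical sketch of the kind of fMP4 media playlist the loopback server ends up exposing. Segment names, count, and durations are invented for illustration; the real playlist bytes come from libavformat's hls muxer, not hand-built strings:

```swift
// Illustrative only: the shape of an fMP4 HLS media playlist.
// EXT-X-MAP points AVPlayer at the init segment that carries the
// sample descriptions (dvcC, dec3, colour tags, ...).
func mediaPlaylist(segmentCount: Int, targetDuration: Int) -> String {
    var lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:7",                          // fMP4 segments require version >= 7
        "#EXT-X-TARGETDURATION:\(targetDuration)",
        "#EXT-X-MAP:URI=\"init.mp4\"",
    ]
    for i in 0..<segmentCount {
        lines.append("#EXTINF:\(targetDuration).0,")
        lines.append("seg\(i).mp4")
    }
    lines.append("#EXT-X-ENDLIST")                   // VOD-style: full playlist known up front
    return lines.joined(separator: "\n")
}
```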
EAC3+JOC packets are stream-copied through the muxer with the original dec3 extradata preserved. AVPlayer reads the segment, recognises JOC from the dec3 box (numDepSub=1, depChanLoc=0x0100), and hands the bitstream to the HDMI output as Dolby MAT 2.0. The AVR lights up the Atmos indicator. The engine emits an explicit [HLSVideoEngine] EAC3+JOC Atmos: stream-copy engaged, MAT 2.0 passthrough intact diagnostic on every Atmos session so the path is unambiguous in the log.
For codecs that fMP4 doesn't accept directly (TrueHD, DTS, DTS-HD MA), AudioBridge decodes to PCM and re-encodes losslessly as FLAC. This preserves bit-exact channel data for 5.1 / 7.1 surround but, by definition, loses spatial Atmos / TrueHD-MA object metadata (it's a PCM derivative). The trade-off is per-source: keep the spatial mix when the wrapper can carry it, fall back to lossless 5.1 / 7.1 when it can't. If a JOC source ever falls through to the bridge for whatever reason the engine logs a loud WARNING: Atmos downgrade — ... so the silent quality regression doesn't go unnoticed.
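The per-source trade-off above reduces to a small routing decision. A minimal sketch, with hypothetical names (AudioBridge's real logic keys on FFmpeg codec IDs and per-track extradata):

```swift
// Hypothetical sketch of AudioBridge's per-track routing decision.
enum AudioRoute {
    case streamCopy        // wrapper carries the codec as-is (spatial mix preserved)
    case flacTranscode     // decode to PCM, re-encode lossless FLAC (object metadata lost)
}

func routeAudio(codec: String) -> AudioRoute {
    switch codec {
    case "aac", "ac3", "eac3", "flac", "mp3", "alac":
        return .streamCopy      // fMP4-friendly: keep the original bitstream (incl. EAC3+JOC Atmos)
    case "truehd", "dts", "dtshd":
        return .flacTranscode   // fMP4 can't carry these; lossless 5.1/7.1 PCM → FLAC fallback
    default:
        return .flacTranscode   // conservative default for anything unrecognised
    }
}
```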
| Source | Wrapper signaling |
|---|---|
| H.264, HEVC (SDR) | BT.709 |
| HEVC Main10 (HDR10) | BT.2020 / PQ |
| HEVC Main10 (HDR10+) | BT.2020 / PQ + per-frame ST 2094-40 SEI stream-copied |
| HEVC Main10 (DV P5 / P8.1 / P8.4) | dvh1 / dvhe track type with the source's dvcC box preserved |
| HEVC Main10 (HLG) | BT.2020 / HLG |
| AV1 HDR | BT.2020 / PQ |
HDR-to-SDR mapping is handled by AVPlayer and the system compositor according to the connected display. AetherEngine doesn't tonemap on the host; it tells the system "this is BT.2020 PQ" (or DV, or HLG) via the HLS-fMP4 sample description and lets tvOS / iOS pick the right path.
DisplayCriteriaController issues the HDMI content-frame-rate and dynamic-range hint via AVDisplayManager before the first segment is fetched, so the receiver-side handshake is in flight by the time AVPlayer is ready to render.
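The content-rate half of that hint needs the measured source frame rate snapped to a standard display rate first. A minimal sketch of that snapping step (function and constant names are hypothetical, not FrameRateSnap's actual API):

```swift
// Illustrative: map a measured average fps to the nearest standard
// display rate before issuing the HDMI content-rate hint.
let standardRates: [Double] = [23.976, 24, 25, 29.97, 30, 50, 59.94, 60]

func snapRate(_ measured: Double) -> Double {
    // Nearest-neighbour over the standard-rate table
    standardRates.min { abs($0 - measured) < abs($1 - measured) }!
}
```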
For DV streams the demuxer surfaces the source's AVDOVIDecoderConfigurationRecord. HLSVideoEngine writes the matching ISO BMFF dvcC box into the HLS-fMP4 sample description and promotes the track type from hvc1 to dvh1 (Profile 5, no HDR10 base) or dvhe (Profile 8.1 / 8.4 with HDR10 / HLG backward-compatible base layer). Profile 5 plays only on DV-capable displays; profiles 8.1 / 8.4 fall back to their base layer when the TV doesn't advertise DV.
ST 2094-40 metadata stays attached to the HEVC bitstream as user-data-registered ITU-T T.35 SEI NALs. The HLS-fMP4 stream-copy preserves the SEI through to AVPlayer, which forwards it to the system compositor. HDR10+-capable TVs apply the per-scene tone-mapping curves; HDR10-only TVs fall back to the static HDR10 base.
Subtitle packets are routed through the same demux loop as audio and video. No second AVIO connection, no full-file scan. Each packet decodes inline through avcodec_decode_subtitle2, the result lands in a single [SubtitleCue] published list:
- Text codecs (SubRip / ASS / SSA / WebVTT / mov_text) → SubtitleCue.body = .text(String). ASS dialogue headers and override blocks ({\an8}, {\b1}, ...) are stripped; \N becomes a real newline so the host can render with regular text layout.
- Bitmap codecs (PGS / HDMV PGS / DVB / DVD) → SubtitleCue.body = .image(SubtitleImage). The indexed pixel plane is walked through its palette, premultiplied against alpha, and wrapped as a CGImage. Position is normalised in [0..1] against the source video frame so the host scales to any on-screen rect.
- Sidecar files (a separate .srt / .ass / .vtt URL) → selectSidecarSubtitle(url:) opens its own short-lived AVFormatContext, decodes the whole file once, atomically swaps the result into subtitleCues.
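The ASS cleanup step for the text path can be sketched as below. This is an illustrative simplification, not the engine's exact implementation (in particular, ASS soft breaks \n are treated the same as hard breaks \N here):

```swift
import Foundation

// Sketch: drop {\...} override blocks and turn \N into a real newline,
// so the host can lay the line out as plain text.
func stripASSOverrides(_ dialogue: String) -> String {
    let noOverrides = dialogue.replacingOccurrences(
        of: #"\{[^}]*\}"#, with: "", options: .regularExpression)
    return noOverrides
        .replacingOccurrences(of: "\\N", with: "\n")
        .replacingOccurrences(of: "\\n", with: "\n")   // soft break: simplified to a hard break
}
```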
A single packet that carries multiple rects (PGS often emits signs/songs at the top alongside dialogue at the bottom) becomes multiple cues at the same time range, and the host renders all of them. Cues are inserted in sorted order; backward seeks dedupe by start|end so the list doesn't grow on rewind.
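That maintenance rule (sorted insert, rewind-safe dedupe) can be sketched as follows. `Cue` is a stand-in for the engine's published SubtitleCue type; note the sketch dedupes on whole-cue equality so that distinct rects sharing a time range (the PGS signs-plus-dialogue case) all survive:

```swift
// Illustrative sketch of cue-list maintenance: insert in start-time order,
// skip cues we've already seen so backward seeks don't regrow the list.
struct Cue: Equatable {
    let start: Double
    let end: Double
    let text: String
}

func insertCue(_ cue: Cue, into cues: inout [Cue]) {
    // Rewind re-decodes packets already in the list; drop exact duplicates.
    guard !cues.contains(cue) else { return }
    // Keep the list sorted by start time.
    let idx = cues.firstIndex { $0.start > cue.start } ?? cues.endIndex
    cues.insert(cue, at: idx)
}
```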
The host stays in charge of the actual paint: text styling, overlay layout, fade transitions, position scaling against the on-screen video rect.
Sources/AetherEngine/
├── AetherEngine.swift Public API + codec dispatch + subtitle stream decode
├── PlayerState.swift PlaybackState, VideoFormat, PlaybackBackend, TrackInfo, SubtitleCue, SubtitleImage
├── Audio/
│ ├── AudioBridge.swift Native path: stream-copy or lossless FLAC transcode per source audio codec
│ ├── AudioDecoder.swift SW path: libavcodec → PCM → CMSampleBuffer with channel-layout tagging
│ └── AudioOutput.swift SW path: AVSampleBufferAudioRenderer + Synchronizer (master clock)
├── Decoder/
│ ├── EmbeddedSubtitleDecoder.swift Inline subtitle decode from demuxed packets
│ ├── SoftwareVideoDecoder.swift SW path: libavcodec/dav1d → CVPixelBuffer (NV12 / P010), HDR10+ side data
│ ├── SubtitleDecoder.swift Sidecar URL one-shot decode (text only)
│ └── VideoDecoderTypes.swift DecodedFrameHandler typealias + VideoDecoderError
├── Demuxer/
│ ├── AVIOReader.swift URLSession → avio_alloc_context
│ └── Demuxer.swift libavformat wrapper
├── Diagnostics/
│ └── EngineLog.swift Gated OSLog emission
├── Display/
│ ├── DisplayCriteriaController.swift AVDisplayManager content-rate / dynamic-range hints (native path)
│ └── FrameRateSnap.swift Snap to standard rates (23.976, 24, 25, 29.97, 30, 50, 59.94, 60)
├── Native/
│ ├── NativeAVPlayerHost.swift Native path: AVPlayer host bound to the loopback HLS-fMP4 URL
│ └── SoftwarePlaybackHost.swift SW path: demux loop + decoders + renderer + synchronizer orchestration
├── Network/
│ └── HLSLocalServer.swift Native path: local HTTP server (127.0.0.1) serving playlist + segments
├── Renderer/
│ └── SampleBufferRenderer.swift SW path: AVSampleBufferDisplayLayer + B-frame reorder, HDR10+ attachments
├── Video/
│ ├── HLSVideoEngine.swift Native path: session orchestrator (muxer wiring, DV signaling, scrub teardown)
│ ├── HLSSegmentProducer.swift Native path: drives libavformat's hls-fmp4 muxer; custom io_open hooks segment writes
│ ├── SegmentCache.swift Native path: producer/consumer segment store with backpressure + scrub-aware eviction
│ └── VTCapabilityProbe.swift VP9 / AV1 system-decode probe (gates codec routing)
└── View/
└── AetherPlayerView.swift Polymorphic surface: hosts either AVPlayerLayer (native) or AVSampleBufferDisplayLayer (SW)
| Package | License | Purpose |
|---|---|---|
| FFmpegBuild | LGPL-3.0 | Slim FFmpeg 7.1 (avcodec / avformat / avutil / swresample / swscale) for demux + HLS-fMP4 mux + AudioBridge FLAC encode + SW-path dav1d decode + sws_scale YUV → NV12 / P010 |
| VideoToolbox | System | Native path video decode (HW where available, Apple's bundled SW dav1d on iOS / macOS) |
| AVFoundation | System | AVPlayer + AVDisplayManager (native path); AVSampleBufferDisplayLayer + AVSampleBufferRenderSynchronizer (SW path) |
| CoreMedia | System | Sample descriptions, format-description tagging, CMTimebase |
A standalone macOS CLI is shipped alongside the library for repro
work without going through TestFlight + Apple TV. Three subcommands,
all operating on a media source URL (file:// or http(s)://):
swift run aetherctl probe <url> # dump container + streams + duration, exit
swift run aetherctl serve <url> # park the engine's loopback HLS-fMP4 server
swift run aetherctl validate <url> # serve + run mediastreamvalidator, exit
swift run aetherctl <url>            # alias for serve (backwards compat)

probe opens the demuxer, prints the codec / resolution / frame rate
of the video track, the audio track list (codec, channels, language,
Atmos flag), the subtitle track list, then exits. No HLS server is
started.
serve is the original behavior. The CLI prints the loopback URL and
parks until Ctrl-C; from another terminal you can:
curl -i http://127.0.0.1:<port>/master.m3u8
curl -o /tmp/init.mp4 http://127.0.0.1:<port>/init.mp4
mediastreamvalidator http://127.0.0.1:<port>/master.m3u8
mp4dump --verbosity 1 /tmp/init.mp4
ffprobe -v debug /tmp/seg0.mp4
open 'http://127.0.0.1:<port>/master.m3u8'   # macOS QuickTime

validate is the same plus an inline xcrun mediastreamvalidator
run against the loopback manifest, with the report printed and the
engine torn down on completion.
For repeatable runs, Scripts/fetch-fixtures.sh generates a small
set of synthetic FFmpeg test clips in ./Fixtures/ (H.264 SDR,
HEVC HDR10, AV1, VP9) covering both the native AVPlayer path and
the software fallback. Real-world DV / Atmos / multichannel sources
go in ./Fixtures/user/ (gitignored).
Things AetherEngine deliberately doesn't do, so you don't have to read the source to find out:
- No built-in UI. No controls, no transport bar, no pretty HUD.
- No analytics, telemetry, or session reporting. Wire your own to the @Published state.
- No playlist / queue management. Call load(url:) when you want the next one.
- No subtitle overlay. The engine decodes packets and emits SubtitleCue (text or CGImage with normalised position); your UI paints them with whatever style and animation you want.
- No Metal shaders. Everything renders through Apple's native display stack.
- No third-party networking. URLSession handles bytes; TLS / HTTP-3 / proxies / MDM rules ride for free.
| | Min |
|---|---|
| iOS | 16.0 |
| tvOS | 16.0 |
| macOS | 14.0 |
| Swift | 6.0 |
| Xcode | 16.0 |
- Sodalite: native Jellyfin client for Apple TV.
AetherEngine is vibe-coded, designed and shipped by Vincent Herbst in close pair-programming with Claude (Anthropic). The commit log is the receipt: nearly every commit carries a Co-Authored-By: Claude trailer.
LGPL-3.0 with Apple Store / DRM Exception. The exception clause grants explicit permission to distribute through application stores (Apple App Store, TestFlight, etc.) whose terms otherwise conflict with LGPL §4–6. Modifications to the engine itself still have to be released under LGPL.