OpenIVaC — Instructional Videos as Code

One script, two outputs. A Python module drives a real Chromium browser through a user flow while recording everything. The same script that proves your documentation is correct produces the training video.

The video is the byproduct. The real output is verified documentation — if the script can't click "Submit" because the button is actually labeled "Save & Continue," your docs are wrong, and you find out before your users do.

What you get

Every run produces, per video:

Artifact	What it is	Who it's for
`*.webm`	Raw screen recording of the walkthrough	Archival
`*.srt`	Subtitles timed to each narration beat	Accessibility, voice-over
`*_subtitled.mp4`	Recording with burned-in subtitles	Humans
`*_voiced.mp4`	Subtitled video + synthesized narration	Humans (final cut)
`step_*.webp`	Auto-captured screenshot at every interaction	Agents, doc frames

If the script exits 0, the flow works and the docs are current. Same action, both guarantees.
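For readers who have not looked inside a `*.srt` file, the sketch below renders a list of timed cues in the standard SubRip layout (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line). The `Cue` type and both function names are illustrative assumptions, not OpenIVaC's own API.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One narration beat. Hypothetical type, for illustration only."""
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SubRip HH:MM:SS,mmm timestamp."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Render cues as numbered SubRip blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        timing = f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}"
        blocks.append(f"{index}\n{timing}\n{cue.text}\n")
    return "\n".join(blocks)
```

Because each cue carries its own start and end time, the same list can drive both the subtitle file and the narration timing.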

Quickstart

Docker (recommended)

docker compose up -d tts          # start the TTS sidecar
docker compose run openivac       # record every video_*.py script
docker compose run openivac python run.py 01      # just one
docker compose run openivac python run.py --headed # watch it drive

Local

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
playwright install chromium

export DEMO_BASE_URL=http://localhost:8000
export DEMO_USERNAME=admin
export DEMO_PASSWORD=admin
export TTS_ENABLED=true

# Narration backend (default is bark-local; point it at your Bark daemon).
# export TTS_BACKEND=bark-local && export BARK_ENDPOINT=http://localhost:8202
# Or swap to an OpenAI-compatible server with no GPU of your own:
# export TTS_BACKEND=openai && export TTS_ENDPOINT=http://localhost:8100/v1

python run.py

Requires Python 3.10+, ffmpeg (with libass for subtitle burning), and — if you want narration — a running TTS endpoint.
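The libass requirement comes from ffmpeg's `subtitles` video filter, which is what hard-codes an SRT file into the picture. The sketch below shows one way to build that command from Python. The function names are assumptions for illustration, and the exact flags OpenIVaC passes may differ.

```python
import shutil
import subprocess
from pathlib import Path


def burn_subtitles_command(video: Path, srt: Path, output: Path) -> list[str]:
    """Build an ffmpeg argv that burns SRT subtitles into the video."""
    return [
        "ffmpeg", "-y",                        # -y: overwrite the output file
        "-i", str(video),
        "-vf", f"subtitles={srt.as_posix()}",  # this filter needs libass
        str(output),
    ]


def burn_subtitles(video: Path, srt: Path, output: Path) -> None:
    """Run the command, failing early if ffmpeg is not installed."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(burn_subtitles_command(video, srt, output), check=True)
```

Subtitle paths that contain colons, quotes, or backslashes need extra escaping inside the filter string, which this sketch does not attempt.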

Writing a script

A script is a Python module in scripts/ named video_*.py. It exports a script (a VideoScript with metadata) and a run(headless) function that drives the browser through DemoRunner:

def run(headless: bool = True):
    with DemoRunner(script.id, headless=headless) as demo:
        demo.login()
        demo.subtitle("Welcome. Let's take a look around.")
        demo.navigate("/dashboard")
        demo.subtitle("The dashboard shows your recent activity.")
        demo.screenshot("dashboard")
    demo.merge_subtitles()
    demo.narrate()

The full DemoRunner API, the doc-to-script generation loop, selector strategy, and customization points (custom auth, viewport, pacing) live in agents.md — that's the deep guide, and it doubles as the reference you feed an LLM when generating scripts.
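The naming convention (`video_*.py`, exporting `script` and `run`) is what makes auto-discovery possible. A minimal sketch of how a runner could find and load such modules with the standard library follows. It is an assumption about the mechanism, not a copy of what run.py does, and the function names are hypothetical.

```python
import importlib.util
from pathlib import Path
from types import ModuleType


def discover_scripts(scripts_dir: Path) -> list[Path]:
    """Return video_*.py files in a stable, sorted order."""
    return sorted(scripts_dir.glob("video_*.py"))


def load_script(path: Path) -> ModuleType:
    """Import a script file and check that it exports `script` and `run`."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    for name in ("script", "run"):
        if not hasattr(module, name):
            raise AttributeError(f"{path.name} does not export {name!r}")
    return module
```

Checking the two exports at load time surfaces a malformed script before any browser is launched.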

Narration (TTS)

Narration is optional (TTS_ENABLED=false skips it) and backend-pluggable. Every cue routes through the same synth interface; you pick the engine with TTS_BACKEND:

bark-local (default) — Suno Bark behind a small local HTTP daemon. Fully offline, MIT-licensed, and the backend we run in production. Natural, expressive narration with no hosted dependency. Each sentence is synthesized separately and stitched with a short silence gap (Bark degrades on long inputs), and a deterministic seed keeps the voice stable across a video. Picks a voice from Bark's built-in presets (v2/en_speaker_0 … v2/en_speaker_9); supports the same per-script voice cast as fish. See The Bark backend for the daemon contract.
openai — any OpenAI-compatible TTS server, e.g. openedai-speech. Fast and consistent; selects a voice by name (shimmer, nova, …). No seeding, no voice cloning.
fish — Fish Speech 1.5 via its Gradio interface. Preloaded reference voices and deterministic seeds, so a cue renders identically every time and timbre stays consistent across a video. Slower; self-hosted. Supports a per-script voice cast for call-and-answer narration (a "narrator" slot and an "asker" slot). Heads up: check Fish Speech's licence terms before shipping output — that friction is what pushed our production stack to Bark.
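The per-sentence synthesis described for bark-local can be pictured as split, synthesize, stitch. The sketch below covers the two ends of that pipeline on raw PCM bytes. The 24 kHz rate is the one shown in the Bark daemon's example response. The 16-bit mono sample width, the regex-based splitter, and all function names are assumptions made for illustration.

```python
import re

SAMPLE_RATE = 24_000  # Hz, as in the Bark daemon's example response
SAMPLE_WIDTH = 2      # bytes per sample; 16-bit mono PCM is an assumption


def split_sentences(text: str) -> list[str]:
    """Split narration on sentence-ending punctuation followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def silence(ms: int) -> bytes:
    """Return `ms` milliseconds of digital silence as raw PCM."""
    return b"\x00" * (SAMPLE_RATE * ms // 1000 * SAMPLE_WIDTH)


def stitch(clips: list[bytes], gap_ms: int = 250) -> bytes:
    """Join per-sentence PCM clips with a silence gap between neighbours."""
    return silence(gap_ms).join(clips)
```

A naive splitter like this one will also break on abbreviations such as "e.g."; a production splitter would need a list of exceptions.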

A pronunciation dictionary and unit-expansion table (config.py) keep TTS from reading "10TB" as "ten tee bee" or "SQL" letter-by-letter. Subtitle text is never altered — only the string sent to the synth. (On bark-local the ALL-CAPS spacing pass is skipped: caps are Bark's own emphasis convention.)
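As a rough illustration of that text-preparation step, the sketch below expands units and swaps in pronunciations on a copy of the cue text, leaving the subtitle string untouched. The table entries and the `for_synth` name are hypothetical; the real tables live in config.py.

```python
import re

# Hypothetical entries for illustration; the real tables live in config.py.
PRONUNCIATIONS = {"SQL": "sequel", "nginx": "engine x"}
UNITS = {"TB": "terabytes", "GB": "gigabytes", "ms": "milliseconds"}

# A number, an optional space, then one of the known unit suffixes.
_UNIT_RE = re.compile(
    r"\b(\d+(?:\.\d+)?)\s?(" + "|".join(map(re.escape, UNITS)) + r")\b"
)
# Whole-word matches for dictionary entries.
_WORD_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, PRONUNCIATIONS)) + r")\b"
)


def for_synth(subtitle_text: str) -> str:
    """Return the string to send to the synth; the input is not modified."""
    spoken = _UNIT_RE.sub(
        lambda m: f"{m.group(1)} {UNITS[m.group(2)]}", subtitle_text
    )
    return _WORD_RE.sub(lambda m: PRONUNCIATIONS[m.group(1)], spoken)
```

This sketch always uses the plural unit name, so "1TB" would come out as "1 terabytes"; handling singulars is left out for brevity.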

The Bark backend

bark-local talks to a tiny HTTP daemon that wraps Bark — OpenIVaC ships the client, you run the daemon (any host with a GPU and the bark package). The contract is one endpoint:

POST {BARK_ENDPOINT}/generate
{ "text": "One sentence of narration.",
  "voice_preset": "v2/en_speaker_9", "do_sample": true, "seed": 43,
  "semantic_temperature": 0.6, "coarse_temperature": 0.7,
  "fine_temperature": 0.35, "return_base64_wav": true }
→ 200 { "wav_base64": "<base64 WAV>", "sample_rate": 24000 }
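
A client for the contract above fits in a few lines of standard-library Python. The sketch below builds the request body, posts it, and decodes the base64 WAV. The function names and the timeout value are assumptions; the field names and sample values are copied from the contract.

```python
import base64
import json
import urllib.request


def build_payload(text: str, voice_preset: str = "v2/en_speaker_9",
                  seed: int = 43) -> dict:
    """Mirror the request body from the contract for one sentence."""
    return {
        "text": text,
        "voice_preset": voice_preset,
        "do_sample": True,
        "seed": seed,
        "semantic_temperature": 0.6,
        "coarse_temperature": 0.7,
        "fine_temperature": 0.35,
        "return_base64_wav": True,
    }


def decode_response(body: dict) -> tuple[bytes, int]:
    """Return (wav_bytes, sample_rate) from the daemon's JSON reply."""
    return base64.b64decode(body["wav_base64"]), int(body["sample_rate"])


def generate(endpoint: str, text: str, timeout: float = 120.0) -> tuple[bytes, int]:
    """POST one sentence to {endpoint}/generate and decode the reply."""
    request = urllib.request.Request(
        f"{endpoint.rstrip('/')}/generate",
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return decode_response(json.load(response))
```

Keeping the payload builder and the decoder separate from the network call makes both easy to test without a running daemon.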

Tuning knobs (all env vars, production defaults shown):

Var	Default	What it does
`BARK_ENDPOINT`	`http://localhost:8202`	Daemon base URL
`BARK_SEMANTIC_TEMP`	`0.6`	Semantic-stage sampling temperature
`BARK_COARSE_TEMP`	`0.7`	Coarse-stage temperature
`BARK_FINE_TEMP`	`0.35`	Fine-stage temperature
`BARK_SILENCE_MS`	`250`	Silence inserted between stitched sentences
`BARK_SPEED`	`1.08`	Pitch-preserving atempo lift (Bark reads slow); folded into the cache key
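
One way to read these knobs and fold them into a cache key is sketched below. The defaults are the ones in the table. The `BarkSettings` class, the choice of SHA-256, and the exact set of fields hashed are assumptions rather than a description of config.py.

```python
import hashlib
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class BarkSettings:
    """Hypothetical container for the tuning knobs, with table defaults."""
    endpoint: str = "http://localhost:8202"
    semantic_temp: float = 0.6
    coarse_temp: float = 0.7
    fine_temp: float = 0.35
    silence_ms: int = 250
    speed: float = 1.08

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "BarkSettings":
        """Read each knob from the environment, falling back to defaults."""
        return cls(
            endpoint=env.get("BARK_ENDPOINT", cls.endpoint),
            semantic_temp=float(env.get("BARK_SEMANTIC_TEMP", cls.semantic_temp)),
            coarse_temp=float(env.get("BARK_COARSE_TEMP", cls.coarse_temp)),
            fine_temp=float(env.get("BARK_FINE_TEMP", cls.fine_temp)),
            silence_ms=int(env.get("BARK_SILENCE_MS", cls.silence_ms)),
            speed=float(env.get("BARK_SPEED", cls.speed)),
        )


def atempo_filter(speed: float) -> str:
    """ffmpeg audio filter that changes tempo without shifting pitch."""
    return f"atempo={speed:g}"


def cache_key(text: str, voice: str, seed: int, settings: BarkSettings) -> str:
    """Hash every input that changes the rendered audio, speed included."""
    parts = [
        text, voice, str(seed),
        repr(settings.semantic_temp), repr(settings.coarse_temp),
        repr(settings.fine_temp), repr(settings.speed),
    ]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()
```

Including the speed in the key means a changed `BARK_SPEED` forces a fresh render instead of reusing audio produced at the old tempo.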

Running

python run.py                 # all videos
python run.py 01              # by number
python run.py myapp           # by project prefix
python run.py --headed        # visible browser (debugging)
python run.py --narrate-only  # re-synth audio without re-recording
TTS_ENABLED=false python run.py   # quick silent pass
DEMO_PACE=1.5 python run.py       # slower, for longer narration gaps

Project structure

OpenIVaC/
  config.py          # DemoRunner framework, timing, TTS, subtitles
  run.py             # CLI runner, auto-discovers video_*.py scripts
  agents.md          # the deep guide + LLM reference
  requirements.txt   # Python dependencies
  Dockerfile         # Playwright + ffmpeg container
  docker-compose.yml # app + TTS sidecar
  scripts/           # your video scripts
  docs/              # source docs to generate scripts from
  output/            # generated artifacts (gitignored)

Origin

OpenIVaC is the standalone extraction of the IVaC framework that grew up inside the mydoulapage project and became fleet infrastructure: any app with a frontend can adopt it, override login(), write scripts, and get tested documentation as video.
