Echoform

Echoform is a macOS app that turns whatever audio is playing on your Mac into a calm, beautiful visualization. It taps your system audio output and renders it as ambient, audio-reactive visuals, so that whether you are listening to music, a podcast, an audiobook, or a call, you have something gentle to look at instead of reaching for a feed, a game, or another screen.

It is built to hold your eyes without taking your attention: ambient, not interactive; glanceable, not readable; audio-reactive, not content-competing. The point is visual interest while you listen, not a second task.

Bars mode with the Cyberpunk theme and on-device captions.

What it does

  • Captures whatever is playing through the system audio output, locally.
  • Renders it as one of six calm visual modes (bars, waveform, spectral heat, pulse field, flow field, or a combined view).
  • Optionally shows a low-latency caption layer, with optional on-device translation between a dozen major languages.
  • Themeable, full-screen capable, and adjustable from a right-click menu.

Nothing is recorded or saved. The visualizer itself makes no network calls. When captions are enabled, Apple Speech may use online recognition for languages without a local model unless On-device Only is turned on.

Requirements

  • macOS 15 or later (built and tested on macOS 26).
  • To build from source: Xcode 26 with the Swift 6.2 toolchain.

Build and install

Building from source is the recommended way to run Echoform. You can read every line first, and a build you compiled yourself is not quarantined, so it opens with no Gatekeeper prompt (see "Permissions and trust" below).

git clone https://github.com/bryanlabs/echoform.git
cd echoform
./Scripts/install.sh

This builds Echoform.app, installs it to /Applications, and installs an echoform launcher into ~/bin. The build is ad-hoc signed by default. If you have an Apple Development certificate, it is used automatically instead, which keeps macOS from re-asking for system audio access on every rebuild.

  • ./Scripts/package-app.sh builds the app into dist/ without installing.
  • ./Scripts/make-icon.sh regenerates the app icon.
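
The launcher is installed into ~/bin, which is not on every shell's PATH by default. A minimal sketch of a check, assuming a POSIX shell; the helper name path_contains is hypothetical and not part of this repository:

```shell
# Hypothetical helper: succeed if directory $1 is an entry in the
# colon-separated list $2 (for example, the value of PATH).
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$HOME/bin" "$PATH"; then
    echo "$HOME/bin is on PATH; the echoform launcher should resolve"
else
    echo "$HOME/bin is not on PATH; consider adding this to your shell profile:"
    echo '  export PATH="$HOME/bin:$PATH"'
fi
```

Wrapping both strings in colons lets the pattern match whole entries only, so a directory such as ~/binaries does not count as ~/bin.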

First run: grant system audio access

Echoform uses Core Audio process taps to listen to the system audio already playing on your Mac. macOS calls this System Audio Recording Only in System Settings.

  1. Launch Echoform.
  2. Approve the macOS system audio prompt if it appears.
  3. If the prompt does not appear, open System Settings › Privacy & Security › Screen & System Audio Recording.
  4. Enable Echoform under System Audio Recording Only, and authenticate when macOS asks.
  5. Quit and reopen Echoform.

macOS remembers the grant, so you only do this once.

Controls

Right-click anywhere on the visualizer to open the main controls menu. It includes visual mode, theme, brightness, intensity, captions, spoken language, translation target, recognition mode, low-latency mode, and caption sync offset.

Keyboard shortcuts remain available as optional accelerators:

Key      Action
1-6      Switch visual mode
Space    Pause / resume
F        Toggle full screen
Esc      Leave full screen
[ ]      Decrease / increase intensity
B        Cycle brightness
         Cycle theme
C        Open the theme / color panel
T        Toggle captions
L        Open the captions panel
, .      Decrease / increase sync offset
Cmd+Q    Quit

Visual modes

  1. Bars. Symmetric loudness and frequency bars.
  2. Wave Ribbon. A smooth, glowing waveform ribbon.
  3. Spectral Heat. A slow-scrolling spectrogram.
  4. Pulse Field. Breathing shapes driven by loudness and bass.
  5. Flow Field. A slowly flowing vector field shaped by mids and treble.
  6. Combined. Heat, pulse, and bars layered into one ambient view.

Captions and the delay

Right-click and enable Captions to show a calm, low-contrast text layer in the lower part of the window. Pick Spoken Language and, if useful, enable Translate and choose Translate To.

Low Latency Captions is on by default. In this mode Echoform shows Apple's current partial recognition hypothesis immediately, replacing it as Apple revises the text. This is tuned for listening above 1x speed, where waiting for fully stable words is more distracting than occasional text correction.

Turning Low Latency Captions off switches back to the steadier delayed caption history. That mode hides the newest partial word and can feel calmer, but it usually trails fast podcast or audiobook playback.

Recognition runs behind the audio, so Caption Sync Offset in the right-click menu lets you tune the relationship between the captions and the visualizer from -2 to +10 seconds. Negative offsets are useful for small timing nudges, such as -0.33s. Positive offsets hold back the visualizer so captions and bars can line up, but they do not delay what you hear.
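
As a rough illustration of the offset range only, the sketch below keeps a requested value inside -2 to +10 seconds. It assumes out-of-range values are clamped rather than rejected, which this README does not state; clamp_offset is a hypothetical helper, not Echoform's implementation.

```shell
# Hypothetical helper: clamp a caption sync offset (in seconds) to the
# documented -2..+10 range. Clamping is an assumption made for this sketch.
clamp_offset() {
    awk -v v="$1" 'BEGIN {
        v = v + 0                # force numeric interpretation
        if (v < -2) v = -2
        if (v > 10) v = 10
        print v
    }'
}

clamp_offset -0.33   # inside the range, printed unchanged
clamp_offset 12      # above the range, limited to 10
clamp_offset -5      # below the range, limited to -2
```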

Echoform currently observes system audio through a Core Audio process tap. It does not act as an output device and does not route audio onward to your speakers, so it cannot delay what you hear, and a negative caption offset cannot show a word before the speech recognizer has emitted it. To make heard audio itself wait for captions, Echoform would need a route-through audio mode built around a virtual output device or audio driver.

With translation on in low-latency mode, Echoform keeps the source-language partial visible immediately and updates the translated line as translation catches up. Translation uses Apple's Translation framework; the first time you use a language pair, macOS downloads that pair once.

On-device Only is off by default so languages without installed local speech models still work through Apple's online speech recognition. Turn it on if you want to force local recognition only.

Themes

Use the right-click Theme menu to pick Classic, Cyberpunk (pink and purple), Aurora (greens), Ember (warm reds), or Custom. The Custom Colors menu item opens the color panel with preset swatches and three color wells.

Preview mode

To see the visuals without playing audio and without granting any permission:

echoform --demo

Add --text to start with the caption layer on. Demo mode feeds a synthetic signal through the renderer, useful for trying modes, themes, and brightness.

Permissions and trust

Echoform asks for two macOS permissions, and only those two.

  • System Audio Recording Only. Echoform uses a Core Audio process tap to read the audio that is already playing. It never captures, shows, or saves the screen or any video. Granting it is a one-time step (see "First run").
  • Speech Recognition. Requested only when you turn captions on. By default Echoform allows Apple's online recognition for languages without an installed local model, which makes Spanish, Korean, and other languages work without extra model setup. Turn on On-device Only in the right-click menu if you want to force local recognition.

Echoform itself makes no network calls and never records, saves, or uploads audio, transcripts, or anything else. Audio is analyzed in memory in real time and then discarded. The exceptions are Apple-provided system services: macOS may download a translation language pack the first time you pick a new pair, and Apple Speech may use online recognition when On-device Only is off.

Why there is no notarized download

An app that opens with no warning on any Mac has to be notarized by Apple, which requires a paid Apple Developer Program membership. Echoform is a free, zero-budget project and is not enrolled, so it is not notarized.

That is why building from source is the recommended path: the whole app is in this repository, you can audit it, and a build you compiled locally is not quarantined, so it opens with no Gatekeeper prompt.

If a release attaches a pre-built Echoform.app, macOS blocks it on first launch because it is not notarized. To open it anyway, launch it once, then open System Settings › Privacy & Security, find the note about Echoform being blocked, and click Open Anyway. Alternatively, clear the download quarantine attribute before the first launch:

xattr -dr com.apple.quarantine /path/to/Echoform.app
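
To avoid running the removal blindly, the attribute can be checked first. This is a minimal sketch assuming the macOS xattr tool; is_quarantined is a hypothetical helper, and the path is the same placeholder as in the command above.

```shell
# Hypothetical helper: exit 0 if the quarantine attribute is present on $1,
# 2 if the path does not exist, and non-zero otherwise.
is_quarantined() {
    [ -e "$1" ] || return 2
    xattr -p com.apple.quarantine "$1" >/dev/null 2>&1
}

app="/path/to/Echoform.app"
if is_quarantined "$app"; then
    # Same command as above, run only when the attribute is actually set.
    xattr -dr com.apple.quarantine "$app"
fi
```

A build you compiled locally should not carry the attribute, so the check is mainly useful for downloaded copies.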

Project layout

  • Sources/EchoformKit/ is the engine library: capture, analysis, observable state, speech, and the SwiftUI renderers.
  • Sources/Echoform/ is the app entry point.
  • Tests/EchoformKitTests/ holds unit tests for the analysis layer.
  • Scripts/ holds the build, sign, install, icon, and transcription benchmark scripts.

Transcription benchmarks

Use Scripts/benchmark-transcribers.py to compare candidate caption engines on the same audio file. It runs Parakeet locally and can run xAI Grok STT or Groq Whisper when XAI_API_KEY or GROQ_API_KEY is present in the environment.

Scripts/benchmark-transcribers.py sample.wav --language es --repeat 3
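
Before a run, it can help to confirm which engines the current environment would enable. The sketch below only mirrors the description above (Parakeet always, the hosted engines when their keys are set). list_engines and the engine labels are hypothetical, and the script's own selection logic may differ.

```shell
# Hypothetical helper: list the caption engines the environment would enable,
# based on the README's description rather than the script's source.
list_engines() {
    echo "parakeet"                              # runs locally, no key needed
    [ -n "${XAI_API_KEY:-}" ] && echo "grok"     # xAI Grok STT
    [ -n "${GROQ_API_KEY:-}" ] && echo "groq"    # Groq Whisper
    return 0
}

list_engines
```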

See docs/transcription-benchmarks.md for the latest local benchmark notes.

License

MIT. See LICENSE.
