LiveTranslator

Real-time AI-powered speech translator for face-to-face conversations.

Talk in your language. Hear the translation in your earpiece. Instantly.

LiveTranslator uses Gemini Live API WebSocket streaming to translate speech in real time with sub-second latency. Audio needs no extra processing roundtrips: the AI model listens and speaks simultaneously.

✨ Features

🎧 Solo Mode

One-way translation: your conversation partner speaks, you hear the translation in both earbuds. Perfect for listening to lectures, meetings, or one-on-one conversations.

🎧🎧 Duo Mode

Two-way simultaneous translation with stereo channel separation:

Left earphone → translation into your language
Right earphone → translation into partner's language

Share one pair of earbuds — each person gets their own translation channel.
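As a rough illustration of the channel split, the sketch below interleaves two mono PCM streams into one stereo stream, left sample first. The function name `interleaveStereo` and the 16-bit sample format are assumptions for illustration, not the app's actual code.

```typescript
// Interleave two mono 16-bit PCM buffers into one stereo buffer: [L, R, L, R, ...].
// If one channel is shorter, its missing samples are filled with silence (0).
function interleaveStereo(left: Int16Array, right: Int16Array): Int16Array {
  const frames = Math.max(left.length, right.length);
  const out = new Int16Array(frames * 2);
  for (let i = 0; i < frames; i++) {
    out[2 * i] = left[i] ?? 0;      // left earphone: translation into your language
    out[2 * i + 1] = right[i] ?? 0; // right earphone: translation into partner's language
  }
  return out;
}

const intoYourLanguage = new Int16Array([100, 200, 300]);
const intoPartnerLanguage = new Int16Array([-1, -2]);
console.log(Array.from(interleaveStereo(intoYourLanguage, intoPartnerLanguage)));
// [ 100, -1, 200, -2, 300, 0 ]
```

Padding the shorter channel with zeros keeps both channels aligned in time when only one person's translation is playing.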

Duo audio has three launch modes:

Headphones · normal — default mode with local self-translation filtering.
Headphones · continuous — experimental mode with both translation channels playing continuously.
Speaker · anti-echo — half-duplex mode for quick tests without headphones.
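One way to read the half-duplex speaker mode is as a gate on the microphone: captured audio is dropped while translated audio is playing, so the loudspeaker output is not fed back into the translator. The sketch below is a guess at that logic; the mode identifiers and the `micGateOpen` helper are hypothetical names, not taken from the codebase.

```typescript
// Hypothetical identifiers for the three Duo launch modes listed above.
type DuoAudioMode = "headphones-normal" | "headphones-continuous" | "speaker-anti-echo";

// Decide whether a captured microphone chunk should be forwarded for translation.
// Only the speaker mode is half-duplex: the mic is gated while playback is active.
function micGateOpen(mode: DuoAudioMode, playbackActive: boolean): boolean {
  return mode !== "speaker-anti-echo" || !playbackActive;
}

console.log(micGateOpen("speaker-anti-echo", true));  // false
console.log(micGateOpen("speaker-anti-echo", false)); // true
console.log(micGateOpen("headphones-normal", true));  // true
```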

🔄 Seamless Sessions

GoAway handling — transparent WebSocket migration when Gemini closes the connection (~10 min intervals)
Session Resumption — conversation context preserved across reconnections
Context Window Compression — unlimited session duration (no 15-minute cap)
Heartbeat monitoring — dead connections detected within 60 seconds
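The four mechanisms above can be pictured as one small piece of per-session state. The sketch below is a minimal illustration, not the project's implementation: the `SessionTracker` class is hypothetical, the message and setup field names (`goAway`, `sessionResumptionUpdate`, `sessionResumption`, `contextWindowCompression`) follow a reading of the public Live API documentation and should be verified against the current reference, and treating 60 seconds of silence as a dead connection is an assumption about how the heartbeat works.

```typescript
// Assumed subset of server messages; verify field names against the API reference.
interface ServerMessage {
  goAway?: { timeLeft?: string };
  sessionResumptionUpdate?: { newHandle?: string; resumable?: boolean };
}

const HEARTBEAT_TIMEOUT_MS = 60000; // assumption: no traffic for 60 s means dead

class SessionTracker {
  private handle: string | null = null;
  private lastMessageAt: number;

  constructor(now: number) {
    this.lastMessageAt = now;
  }

  // Record traffic, remember the latest resumable handle, and report whether
  // the server announced that it is about to close this socket.
  onMessage(msg: ServerMessage, now: number): "migrate" | "continue" {
    this.lastMessageAt = now;
    const update = msg.sessionResumptionUpdate;
    if (update?.resumable && update.newHandle) this.handle = update.newHandle;
    return msg.goAway ? "migrate" : "continue";
  }

  // Heartbeat check: true once nothing has arrived for longer than the timeout.
  isDead(now: number): boolean {
    return now - this.lastMessageAt > HEARTBEAT_TIMEOUT_MS;
  }

  // Setup payload for the replacement socket: reuse the handle if one exists
  // and request sliding-window compression of the context.
  nextSetup(model: string) {
    const sessionResumption: { handle?: string } = this.handle ? { handle: this.handle } : {};
    return { setup: { model, sessionResumption, contextWindowCompression: { slidingWindow: {} } } };
  }
}

const tracker = new SessionTracker(0);
tracker.onMessage({ sessionResumptionUpdate: { newHandle: "abc", resumable: true } }, 1000);
console.log(tracker.onMessage({ goAway: { timeLeft: "10s" } }, 2000)); // migrate
console.log(tracker.nextSetup("example-model").setup.sessionResumption); // { handle: 'abc' }
```

Opening the replacement socket before closing the old one is what would make the migration transparent to the listener; that ordering is left out of the sketch.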

📱 Background Operation

Foreground Service keeps the microphone alive when the screen is off
AppState monitoring auto-recovers WebSocket connections when returning from background
Translation continues while you use other apps
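The auto-recovery rule can be reduced to a pure decision that a hook such as useAppStateReconnect might call from its AppState listener. This is a sketch under assumptions: the `shouldReconnect` helper and its parameters are hypothetical, and the state type is a local stand-in for the subset of React Native's AppState values that matters here.

```typescript
// Local stand-in for the app states relevant to this decision.
type AppStatus = "active" | "background" | "inactive";

// Reconnect only when the app returns to the foreground, a translation
// session is supposed to be running, and the socket is no longer open.
function shouldReconnect(
  previous: AppStatus,
  next: AppStatus,
  socketOpen: boolean,
  sessionRunning: boolean,
): boolean {
  const returnedToForeground = previous !== "active" && next === "active";
  return returnedToForeground && sessionRunning && !socketOpen;
}

console.log(shouldReconnect("background", "active", false, true)); // true
console.log(shouldReconnect("background", "active", true, true));  // false
console.log(shouldReconnect("active", "background", false, true)); // false
```

Keeping the decision separate from the listener makes it easy to test without a device.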

🛡️ Smart Language Filtering

In Solo mode, the translator ignores your native language and only translates the partner's speech — no echo loops.
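One plausible way to get this behaviour is through the instruction sent to the model when the session starts. The sketch below assumes that approach; the `buildSoloInstruction` helper and its wording are hypothetical and not taken from the project.

```typescript
// Build a hypothetical Solo-mode instruction: translate the partner's language
// and stay silent when the speaker uses the listener's own language.
function buildSoloInstruction(nativeLanguage: string, partnerLanguage: string): string {
  if (nativeLanguage === partnerLanguage) {
    throw new Error("Native and partner languages must differ");
  }
  return [
    "You are a live interpreter.",
    `Translate everything spoken in ${partnerLanguage} into ${nativeLanguage}.`,
    `If the speech is already in ${nativeLanguage}, produce no output.`,
  ].join(" ");
}

console.log(buildSoloInstruction("English", "Serbian"));
```

Because the translated audio is in the listener's own language, the last rule is also what would stop the model from re-translating its own output if the microphone picks it up.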

🏗️ Architecture

┌─────────────────────────────────────────────────┐
│                    App.tsx                       │
│              (BT guard, API key)                 │
├─────────────────────────────────────────────────┤
│              useTranslator hook                  │
│         (lifecycle orchestration)                │
├─────────────────────────────────────────────────┤
│            TranslationEngine                     │
│    ┌─────────────┐    ┌─────────────┐           │
│    │  Session A   │    │  Session B   │  (duo)   │
│    │ partner→my   │    │ my→partner   │          │
│    └──────┬───────┘    └──────┬───────┘          │
│           │                   │                  │
│     ┌─────▼─────┐      ┌─────▼─────┐           │
│     │  L channel │      │  R channel │           │
│     └─────┬─────┘      └─────┬─────┘           │
├───────────┼─────────────────┼───────────────────┤
│     PcmPlayer (native stereo AudioTrack)         │
│     interleaved [L,R,L,R,...] @ 24kHz            │
└─────────────────────────────────────────────────┘
         ▲
         │ base64 PCM chunks (50ms)
         │
┌────────┴──────────┐
│    AudioCapture    │
│  (Foreground Svc)  │
│    16kHz mono      │
└───────────────────┘
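The figures in the diagram translate into concrete buffer sizes. The sketch below works through the arithmetic, assuming 16-bit PCM samples and standard padded base64; the diagram states only the sample rates, channel counts, and the 50 ms capture chunk, so the sample width is an assumption.

```typescript
// Assumption: 16-bit PCM, i.e. 2 bytes per sample.
const BYTES_PER_SAMPLE = 2;

// Frames, raw bytes, and base64 length for a chunk of the given duration.
function chunkStats(sampleRateHz: number, channels: number, chunkMs: number) {
  const frames = (sampleRateHz * chunkMs) / 1000;
  const bytes = frames * channels * BYTES_PER_SAMPLE;
  const base64Chars = Math.ceil(bytes / 3) * 4; // padded base64: 4 chars per 3 bytes
  return { frames, bytes, base64Chars };
}

// Capture side: one 50 ms chunk of 16 kHz mono audio.
console.log(chunkStats(16000, 1, 50));
// { frames: 800, bytes: 1600, base64Chars: 2136 }

// Playback side: one second of 24 kHz interleaved stereo.
console.log(chunkStats(24000, 2, 1000));
// { frames: 24000, bytes: 96000, base64Chars: 128000 }
```

At 20 chunks per second, the capture stream under these assumptions is roughly 43 KB of base64 text per second.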

🚀 Quick Start

Prerequisites

Node.js 18+
Android SDK (API 33+)
Gemini API key with Live API access

Install & Run

# Clone
git clone https://github.com/deprav1/LiveTranslator.git
cd LiveTranslator

# Install dependencies
npm install

# Generate native project only for a fresh checkout without android/
npx expo prebuild

# Run on Android
npx expo run:android

For the current Windows workspace with an existing native Android folder, prefer the checked build wrapper below. Re-running expo prebuild can overwrite manual native fixes unless the same change is already represented in app.json or a config plugin.

Build APK

# Windows / PowerShell wrapper
npm run build:apk

# Result
LiveTranslator-release.apk

build:apk runs tsc --noEmit, forces a fresh Metro release bundle by removing android/app/build/generated/assets/react, builds assembleRelease, and copies the APK to the repository root.

GitHub Actions also has a manual/tagged APK workflow. See docs/GITHUB_WORKFLOWS.md.

Configure

Open the app → tap ⚙️ Settings
Enter your Gemini API key. Current development uses a test key with quota limits; production ephemeral-token security is intentionally out of scope for this test build.
Select languages
Choose Solo or Duo mode
In Duo, choose the audio mode on the main screen
Tap START TRANSLATION

🛠️ Tech Stack

| Layer | Technology |
| --- | --- |
| Framework | Expo SDK 56 + React Native 0.85 |
| AI Model | Gemini Live API (WebSocket streaming) |
| Audio Capture | @siteed/audio-studio (Foreground Service) |
| Audio Playback | Custom native `PcmPlayer` module (Kotlin, `AudioTrack MODE_STREAM`) |
| Storage | AsyncStorage (API key persistence) |
| Language | TypeScript 6.0 |

📂 Project Structure

src/
├── components/       # UI: AudioWaveform, LanguagePicker, ModeSwitch, SubtitlesPanel
├── constants/        # Supported languages (sr, en, ru)
├── hooks/            # useTranslator, useApiKey, useSettings, useAppStateReconnect
├── screens/          # HomeScreen, SettingsScreen
├── services/         # Core: GeminiLiveTranslate, TranslationEngine, AudioCapture
└── utils/            # toneTest (stereo channel verification)

modules/
└── pcm-player/       # Native Expo module — stereo PCM streaming (Kotlin + Swift stub)

docs/
└── solutions/        # Compound engineering: documented bugs & fixes

🌍 Supported Languages

Currently configured with 15 languages in src/constants/languages.ts.

⚠️ Known Limitations

iOS: Audio playback module is a stub (AVAudioEngine implementation pending)
Bluetooth mic + Duo: BT SCO profile forces mono audio, breaking stereo separation. The app falls back to the phone microphone while keeping A2DP headphones for stereo playback
Duo speaker mode: available for experiments, but headphones remain the intended setup for clean two-person translation
Background on some devices: Aggressive OEMs (Xiaomi, Huawei) may kill the foreground service. Disable battery optimization for the app
Manual verification pending: the installed APK starts cleanly, but live Gemini translation via x-goog-api-key, the L/R tone test, the input-device list, and long screen-off behavior still need to be checked by hand on a physical phone.
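The Bluetooth fallback described above can be sketched as an input-device selection rule. Everything in this example is hypothetical: the `InputDevice` shape, the device kinds, and the `pickInputDevice` helper are illustrative names, not the app's or any library's API.

```typescript
type Mode = "solo" | "duo";
type InputKind = "builtin_mic" | "bluetooth_sco" | "wired_headset";

interface InputDevice {
  id: string;
  kind: InputKind;
}

// In Duo mode, Bluetooth SCO inputs are excluded because SCO forces mono audio.
// Otherwise honour the preferred device, then the phone mic, then anything left.
function pickInputDevice(
  mode: Mode,
  devices: InputDevice[],
  preferredId?: string,
): InputDevice | undefined {
  const usable = mode === "duo" ? devices.filter((d) => d.kind !== "bluetooth_sco") : devices;
  return (
    usable.find((d) => d.id === preferredId) ??
    usable.find((d) => d.kind === "builtin_mic") ??
    usable[0]
  );
}

const devices: InputDevice[] = [
  { id: "bt-1", kind: "bluetooth_sco" },
  { id: "phone", kind: "builtin_mic" },
];
console.log(pickInputDevice("duo", devices, "bt-1")?.id);  // phone
console.log(pickInputDevice("solo", devices, "bt-1")?.id); // bt-1
```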

📄 License

This project is licensed under the MIT License — see the LICENSE file for details.

🤝 Contributing

Contributions are welcome! Feel free to open issues and pull requests.

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'feat: add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Built with ❤️ and Gemini Live API
