Unity is the cognitive architecture behind Unify's persistent AI colleagues. It implements steerable nested execution, code-first planning, dual-brain voice, and distributed state managers, and runs in production as the agent runtime inside every assistant Unify hosts.
Demo: Launch video · Longer-form screenshares
Technical overview: ARCHITECTURE.md
Unity is the open core of the Unify platform. This repository contains the full agent runtime: the managers, tool loops, CodeAct, dual-brain voice coordination, event backbone, and memory consolidation. All MIT-licensed.
The persistence backend is also open-source: Orchestra (FastAPI + Postgres + pgvector) runs as a Docker container on your machine by default. The Quick Start curl | bash installer spins it up for you. No Unify account required to run the full open core locally.
Not open-sourced: the managed platform layer. External communication routing, the hosted communication edge (telephony, WhatsApp Business Solution Provider, Microsoft 365 tenant integration, SIP trunking), the assistant session control plane, the billing layer, and the identity layer run as part of the hosted service at unify.ai. You can point the runtime at Unify's hosted Orchestra instead of a local one, but features that depend on the managed platform layer only work against the hosted backend.
If you're here to study the runtime, start with ARCHITECTURE.md. If you're here to run it, the Quick Start gets you a full local install (runtime + Orchestra) in under 5 minutes.
| Capability | What it means |
|---|---|
| Steerable nested execution | Every operation returns a live handle – pause, resume, interject, or query at any depth without restarting. Handles nest: steering at one level propagates through the whole tree. |
| Code plans, not tool menus | The Actor writes Python programs over typed primitives with variables, loops, and real control flow – not one JSON tool call at a time. |
| Dual-brain voice | A real-time voice process (sub-second latency) runs alongside a slower orchestration layer that continues tool use and planning in the background. They coordinate over IPC. |
| Distributed state managers | Contacts, knowledge, tasks, transcripts, guidance, files, and more – each owned by a specialized manager running its own async LLM tool loop, composed via English-language APIs. |
| Structured memory consolidation | Documents, screenshares, calls, tasks, and follow-up corrections get consolidated into typed, queryable state. |
| Concurrent steerable actions | Multiple tasks run at once. Each gets its own steering surface for inspection, interruption, and redirection. |
| Persistent identity across channels | Messages, SMS, email, phone calls, and meetings all update the same identity, memory, and task state. |
Get a fully local sandbox running in under 5 minutes. The runtime, the LLM client, and the persistence backend (Orchestra, via Docker) all run on your machine. Hosted backend at unify.ai is an opt-in alternative.
- Python 3.12+ (the installer will fetch it via `uv` if you don't have it)
- Docker (runs the local Orchestra backend – Postgres + pgvector in a container)
- PortAudio (audio support)
  - macOS: `brew install portaudio`
  - Ubuntu/Debian: `sudo apt-get install portaudio19-dev python3-dev`
- An LLM provider key – OpenAI or Anthropic are the simplest paths from this README
If you don't want a local Orchestra and would rather point at Unify's hosted backend, install with `--skip-setup` and fill in `UNIFY_KEY`/`ORCHESTRA_URL` manually.
One command:
```shell
curl -fsSL https://raw.githubusercontent.com/unifyai/unity/main/scripts/install.sh | bash
```

This clones unity, unify, unillm, and orchestra as siblings under `~/.unity/`, installs uv and poetry if you don't have them, syncs dependencies, drops a `unity` command into `~/.local/bin/`, then spins up a local Orchestra (Docker Postgres + FastAPI) and wires your `.env` with the local `UNIFY_KEY` and `ORCHESTRA_URL`. Works on macOS, Linux, and WSL2.
Flags: `--skip-setup` (don't spin up Orchestra, just install the code), `--no-cli` (don't install the `unity` shim), `--dir PATH` (install somewhere other than `~/.unity`), `--branch NAME`.
Manual install
```shell
git clone https://github.com/unifyai/unity.git ~/.unity/unity
git clone https://github.com/unifyai/unify.git ~/.unity/unify
git clone https://github.com/unifyai/unillm.git ~/.unity/unillm
git clone https://github.com/unifyai/orchestra.git ~/.unity/orchestra
cd ~/.unity/unity
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
cd ~/.unity/orchestra
poetry install
ORCHESTRA_INACTIVITY_TIMEOUT_SECONDS=0 scripts/local.sh start
# Copy the UNIFY_BASE_URL and UNIFY_KEY it prints into ~/.unity/unity/.env
```

After `install.sh`, `~/.unity/unity/.env` already has `ORCHESTRA_URL` and `UNIFY_KEY` set to your local Orchestra. All you need to add is an LLM provider key:
```shell
OPENAI_API_KEY=sk-...    # or ANTHROPIC_API_KEY=...
```

unillm can also be pointed at other supported providers and compatible local endpoints.
```shell
unity --project_name Sandbox --overwrite
```

Other `unity` subcommands:
- `unity setup` – re-bootstrap the local Orchestra (useful if Docker wasn't running the first time)
- `unity status` – show local Orchestra status
- `unity stop` – stop the local Orchestra
- `unity restart` – stop + start (wipes the DB)
- `unity help`
Without the CLI shim
```shell
cd ~/.unity/unity
source .venv/bin/activate
python -m sandboxes.conversation_manager.sandbox --project_name Sandbox --overwrite
```

At the configuration prompt, select option 2 (CodeAct + Simulated Managers). This runs the full architecture: ConversationManager orchestrates CodeActActor, which writes and executes Python plans against the manager APIs, with simulated backends for the managers themselves.
Option 1 is a simpler view that shows ConversationManager's orchestration without CodeAct – useful to focus on the brain and steering layer in isolation.
```
> msg Hey, can you help me organize my upcoming week?
> sms I need to reschedule my meeting with Sarah to Thursday
> email Project Update | Here are the Q3 numbers you asked for...
```
Commands: `msg` (Unify message), `sms`, `email`, `call`, `meet`. Type `help` for the full list.
Option 3 at the configuration prompt adds a real computer interface (virtual desktop + browser via agent-service) on top of the CodeAct architecture. See `sandboxes/conversation_manager/README.md` for the full matrix – voice mode, live voice calls, local comms, hosted comms, GUI mode.
Every operation in Unity returns a live handle you can steer. These handles nest: the user steers the ConversationManager, the ConversationManager steers the Actor, the Actor steers the managers. Corrections, pauses, and queries propagate through the full depth.
In practice:
- "Also include Q2 numbers" mid-way through a report β the agent adjusts without restarting
- "Pause that, something urgent" β work freezes and resumes exactly where it left off
- "How's the flight search going?" β you get a status update without disrupting the work
- Three tasks running at once, each independently steerable
`SteerableToolHandle` is the universal return type: every manager's `ask`, `update`, and `execute` methods return one.
```python
handle = await actor.act("Research flights to Tokyo and draft an itinerary")

# Twenty seconds later, while it's still working:
await handle.interject("Also check train options from Tokyo to Osaka")

# Or if something urgent comes up:
await handle.pause()
# ... deal with the urgent thing ...
await handle.resume()
```

When the Actor calls `primitives.contacts.ask(...)`, the ContactManager starts its own tool loop and returns its own handle – nested inside the Actor's handle, which is nested inside the ConversationManager's. Steering at any level propagates.
```python
contacts = await primitives.contacts.ask(
    "Who was involved in the Henderson project?"
)
for contact in contacts:
    history = await primitives.knowledge.ask(
        f"What was {contact} last working on?"
    )
    await primitives.contacts.update(
        f"Send {contact} a catch-up email referencing {history}"
    )
```

This runs in a sandboxed execution session with the full `primitives.*` API available – the same typed interfaces the rest of the system uses. One program per turn, with variables, loops, and real control flow. Contact lookup → knowledge retrieval → outbound communication becomes one plan, not three separate tool-selection turns.
Slow brain – the ConversationManager. Sees the full picture: all conversations, notifications, in-flight actions. Makes deliberate decisions. Runs in the main process.
Fast brain – a real-time voice agent on LiveKit, running as a separate subprocess. Sub-second latency. Handles the conversation autonomously.
They talk over IPC. When the slow brain wants to guide the conversation, it sends:
- SPEAK – "say exactly this" (bypasses the fast brain's LLM entirely)
- NOTIFY – "here's some context, decide what to do with it"
- BLOCK – nothing; the fast brain keeps going on its own
A speech urgency evaluator can preempt the slow brain when the user says something that needs immediate attention.
Every 50 messages, the MemoryManager runs a background extraction pass. It pulls out:
- Contact profiles – who people are, their roles, relationships
- Per-contact summaries – what you've been discussing, sentiment, themes
- Response policies – how each person prefers to communicate
- Domain knowledge – project details, preferences, long-term facts
- Tasks – things you committed to, deadlines, follow-ups
Structured, queryable state in typed tables rather than freeform transcript summaries.
```
┌─ In-Flight Actions ─────────────────────────────────┐
│                                                     │
│  [0] research_flights ············· In progress     │
│      └ ask, interject, stop, pause                  │
│                                                     │
│  [1] draft_summary ················ In progress     │
│      └ ask, interject, stop, pause                  │
│                                                     │
│  [2] find_restaurants ············· Starting        │
│      └ ask, interject, stop, pause                  │
│                                                     │
└─────────────────────────────────────────────────────┘
```
Each action gets its own dynamically generated steering tools. You can inspect, interject into, pause, resume, or stop one action without affecting the others.
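That independence looks like this in use. `StubHandle` is a toy stand-in, included only to make the snippet self-contained; the method names (`pause`, `resume`, `ask`, `interject`) match the steering surface described above:

```python
import asyncio

class StubHandle:
    """Minimal stand-in for a steerable handle (illustrative only)."""
    def __init__(self, name):
        self.name = name
        self.paused = False
    async def pause(self):
        self.paused = True
    async def resume(self):
        self.paused = False
    async def ask(self, question):
        return f"{self.name}: in progress"
    async def interject(self, note):
        pass  # a real handle would fold the note into the running plan

async def main():
    # Two concurrent actions, each with its own steering surface.
    flights = StubHandle("research_flights")
    summary = StubHandle("draft_summary")
    await flights.pause()                  # freeze one action...
    status = await summary.ask("Status?")  # ...query another, undisturbed...
    await flights.resume()                 # ...then resume the first
    return status

print(asyncio.run(main()))
```

The key property is that pausing `research_flights` never touches `draft_summary`'s loop: each handle owns its own execution state.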
```
ConversationManager (dual-brain orchestration, event-driven scheduling)
│
│   Slow Brain ◄── IPC ──► Fast Brain (real-time voice, LiveKit)
│
▼
CodeActActor (generates Python plans, calls primitives.* APIs)
│
▼
State Managers (each runs its own async LLM tool loop)
│
├── ContactManager – people and relationships
├── KnowledgeManager – domain facts, structured knowledge
├── TaskScheduler – durable tasks, execution with live handles
├── TranscriptManager – conversation history and search
├── GuidanceManager – procedures, SOPs, how-to knowledge
├── FileManager – file parsing and registry
├── ImageManager – image storage, vision queries
├── FunctionManager – user-defined functions, primitives registry
├── WebSearcher – web research orchestration
├── SecretManager – encrypted secret storage
├── BlacklistManager – blocked contact details
├── DataManager – low-level data operations
│
├── EventBus – typed pub/sub backbone (Pydantic events)
└── MemoryManager – offline consolidation every 50 messages
```
- User message arrives. The slow brain renders a full state snapshot and makes a single-shot tool decision.
- It starts an action via `actor.act(...)` – gets back a `SteerableToolHandle`, registered in `in_flight_actions`.
- The Actor generates a Python plan calling typed primitives. Each primitive dispatches to a manager running its own LLM tool loop, returning its own steerable handle.
- Meanwhile, the slow brain can start more work, steer existing work, or guide the fast brain during voice calls.
- The MemoryManager observes message events and periodically distills conversations into structured knowledge.
- The EventBus carries typed events with hierarchy labels aligned to tool-loop lineage, making everything observable.
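A toy version of that pub/sub shape, with a stdlib dataclass standing in for the real Pydantic event models (class and field names here are assumptions, not the actual event schema):

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class MessageReceived:
    hierarchy: str  # lineage label, e.g. "conversation_manager/actor"
    sender: str
    text: str

class ToyEventBus:
    """Minimal typed pub/sub: handlers register per event class."""
    def __init__(self):
        self._subs = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subs[event_type].append(handler)

    async def publish(self, event):
        # Dispatch on the event's concrete type, so subscribers
        # only ever see the event classes they asked for.
        for handler in self._subs[type(event)]:
            await handler(event)

async def main():
    bus = ToyEventBus()
    seen = []

    async def on_message(ev):
        seen.append(ev.hierarchy)

    bus.subscribe(MessageReceived, on_message)
    await bus.publish(MessageReceived("conversation_manager/actor", "sarah", "hi"))
    return seen
```

The hierarchy label is what makes the system observable: because it mirrors tool-loop lineage, a subscriber can filter events to one subtree of nested work.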
| Repo | Role |
|---|---|
| `unity` (this) | The agent runtime – managers, tool loops, CodeAct, voice, orchestration |
| `orchestra` | Persistence backend – FastAPI + Postgres + pgvector. Installer spins it up locally in Docker |
| `unify` | Python SDK – the client Unity uses to talk to Orchestra |
| `unillm` | LLM access layer – OpenAI, Anthropic, or any compatible endpoint |
All MIT-licensed. The managed product layer – communication routing, telephony, the assistant session control plane, the web dashboard, billing, identity – runs on Unify's platform and is not part of this open core. You can point Unity at Unify's hosted Orchestra instead of a local one, but managed-service features only work against the hosted backend.
Tests exercise the real system (steerable handles, CodeAct, manager composition, nested tool loops) against simulated backends with cached LLM responses:
```shell
uv sync --all-groups
source .venv/bin/activate
tests/parallel_run.sh tests/                  # everything
tests/parallel_run.sh tests/actor/            # one module
tests/parallel_run.sh tests/contact_manager/  # another
```

See `tests/README.md` for the full philosophy – responses are cached, not mocked.
| File | What's there |
|---|---|
| `unity/common/async_tool_loop.py` | `SteerableToolHandle` – the protocol everything returns |
| `unity/common/_async_tool/loop.py` | The async tool loop engine – nesting, steering, context propagation |
| `unity/actor/code_act_actor.py` | CodeAct – plan generation, sandbox, primitives |
| `unity/conversation_manager/conversation_manager.py` | Dual-brain orchestration, debouncing, in-flight actions |
| `unity/conversation_manager/domains/brain_action_tools.py` | How the brain starts, steers, and tracks concurrent work |
| `unity/function_manager/primitives/registry.py` | How primitives are assembled into the typed API surface |
| `unity/events/event_bus.py` | Typed event backbone |
| `unity/memory_manager/memory_manager.py` | Offline consolidation pipeline |
```
unity/
├── unity/
│   ├── actor/                    # CodeActActor
│   ├── conversation_manager/     # Dual-brain orchestration
│   │   └── domains/              # Brain tools, action tracking, rendering
│   ├── common/
│   │   ├── async_tool_loop.py    # SteerableToolHandle
│   │   └── _async_tool/          # Tool loop internals
│   ├── contact_manager/
│   ├── knowledge_manager/
│   ├── task_scheduler/
│   ├── transcript_manager/
│   ├── guidance_manager/
│   ├── memory_manager/
│   ├── function_manager/
│   ├── file_manager/
│   ├── image_manager/
│   ├── web_searcher/
│   ├── secret_manager/
│   ├── events/
│   └── manager_registry.py
├── sandboxes/                    # Interactive playgrounds
│   └── conversation_manager/     # Full ConversationManager sandbox (start here)
├── tests/
├── agent-service/                # Node.js desktop/browser automation
└── deploy/                       # Dockerfile, Cloud Build, virtual desktop
```
No regex or substring matching for routing user intent. Everything goes through LLM reasoning, guided by prompts and tool docstrings. If the system handles something wrong, we fix the prompt, not add a hardcoded rule.
No mocked LLMs in tests. Every test uses real inference, cached for speed. Delete the cache and you're re-evaluating against live models.
No defensive coding. No try/except around things that shouldn't fail. No null checks for things that shouldn't be null. The system fails loud when assumptions break.
English as an API. Managers communicate through natural-language interfaces. The Actor orchestrates through English-language primitives. The whole system stays inspectable without reading implementation code.
MIT β see LICENSE.
Built by the team at Unify.
