Git for AI agents. One memory line. Every agent — Claude Code, Codex, Cursor, browser bots — reads from it, writes to it, coordinates through it. In real time.
Quick Start · Why Dhee · How It Works · The Channel · Integrations
Every AI agent shows up like a new employee.
Claude Code, Codex, Cursor, Gemini CLI, browser agents — they all work on the same codebase. None of them share what they learn.
- 9:00 AM — "Here's the codebase…" you explain it to Claude Code.
- 11:30 AM — "Here's the codebase…" you explain it to Codex. Again.
- 2:15 PM — "Here's the codebase…" you explain it to Cursor. Still again.
Claude Code hits its session limit? You copy-paste context into Codex, re-upload the spec PDF, re-explain the decisions you already made. Every developer using multiple agents is paying this tax every day.
Dhee is a shared information layer. One workspace → many projects → every agent session plugs into the same live memory.
- An agent reads a file, processes an asset, runs a tool — the digest lands on the shared line in real time.
- Another agent, different tool, different project in the same workspace — sees it instantly. No re-upload, no re-explain.
- Backend agent broadcasts an API contract change → a task auto-spawns in the frontend project. Claude Code picks it up and ships the UI.
- Every workspace is a repo for the agents' knowledge, the way git is a repo for code.
The philosophy: share a single information layer; agents connect and collaborate through it.
curl -fsSL https://raw.githubusercontent.com/Sankhya-AI/Dhee/main/install.sh | shThe installer:
- Creates
~/.dheewith a self-contained venv. - Installs
dhee[app]from PyPI. - Prompts you for a provider (OpenAI default, Gemini, NVIDIA, or Ollama).
- Prompts for your API key (stored encrypted under
~/.dhee/secret_store.enc.json). - Wires Claude Code hooks + MCP server + context router.
- Builds the web UI.
Then:
dhee ui # opens http://127.0.0.1:8080/ in your browser
dhee update # pull the latest release + rebuild UINon-interactive flow for CI / scripted installs:
DHEE_PROVIDER=openai DHEE_API_KEY=sk-... \
curl -fsSL https://raw.githubusercontent.com/Sankhya-AI/Dhee/main/install.sh | shOther install options
Via pip:
pip install dhee
dhee install --harness all # configure Claude Code + CodexFrom source:
git clone https://github.com/Sankhya-AI/Dhee.git
cd Dhee
./scripts/bootstrap_dev_env.sh
source .venv-dhee/bin/activate
dhee install --harness allVia Docker:
docker compose up -d # uses OPENAI_API_KEY from envAfter install:
- Claude Code uses native hooks for routing, memory updates, and shared-task context injection.
- Codex uses native
config.toml+ Dhee-managed instructions + incremental event-stream sync, so post-tool results and uploaded artifacts become shared reusable context without manual re-sync. dhee harness statusshows the live state anddhee harness disable --harness codexturns a harness off cleanly.
Running gstack? dhee install gstack
wires its siloed ~/.gstack/projects/* memory into the same Dhee pipeline
as everything else — semantic search, consolidation, correction, episodic
recall — without touching any gstack files. See
docs/adapters/gstack.md.
Project docs (CLAUDE.md, AGENTS.md, SKILL.md, etc.) still auto-ingest on first use. Run dhee ingest manually any time to re-chunk.
#1 on LongMemEval recall. R@1 94.8%, R@5 99.4%, R@10 99.8% — full 500 questions, no held-out split, no cherry-picking.
| System | R@1 | R@3 | R@5 | R@10 |
|---|---|---|---|---|
| Dhee v3.4.0 | 94.8% | 99.0% | 99.4% | 99.8% |
| MemPalace (raw) | — | — | 96.6% | — |
| MemPalace (hybrid v4, held-out 450q) | — | — | 98.4% | — |
| agentmemory | — | — | 95.2% | 98.6% |
Stack: NVIDIA llama-nemotron-embed-vl-1b-v2 embedder + llama-3.2-nv-rerankqa-1b-v2 reranker, top-k 10.
Proof is in-tree, not screenshots. Exact command, metrics, and per-question output are committed under benchmarks/longmemeval/. Recompute R@k yourself — any mismatch is a bug you can open.
┌──────────────────────────┐
│ Your fat context │
│ CLAUDE.md, AGENTS.md, │
│ SKILL.md, GBrain, docs │
│ session history, tools │
└──────────┬───────────────┘
│
Dhee ingests once
(chunk + embed + index)
│
▼
┌───────────────────────────────────────────────────────────────┐
│ Dhee memory + cognition │
│ │
│ Doc chunks · Short-term · Long-term · Insights · Beliefs │
│ Policies · Intentions · Performance · Episodes · Edits │
└────────────────────────────┬──────────────────────────────────┘
│
┌────────────────────┴────────────────────┐
│ │
Session start Each user prompt
(full assembly) (doc chunks only)
│ │
▼ ▼
┌───────────────┐ ┌───────────────┐
│ Relevant docs │ │ Matching rules │
│ + insights │ │ above threshold│
│ + performance │ │ or nothing │
│ + warnings │ │ │
└───────┬───────┘ └───────┬───────┘
│ │
└────────────────────┬────────────────────┘
│
▼
┌──────────────────────────────┐
│ Token-budgeted XML │
│ <dhee v="1"> │
│ <doc src="CLAUDE.md"...> │
│ <i>What worked last time</i>│
│ </dhee> │
└──────────────────────────────┘
│
LLM sees only what it
needs, when it needs it.
And on the tool-use side, the router digests raw output at source — a 10 MB git log becomes a 40-token summary + a pointer. The model expands the pointer only when the digest isn't enough.
| Dhee | CLAUDE.md | Mem0 | Letta | MemPalace | agentmemory | |
|---|---|---|---|---|---|---|
| Token cost per turn | router-replay projected¹ | 2,000+ | varies | ~1K+ | varies | ~1,900 |
| LongMemEval R@5 | 99.4% | N/A | N/A | N/A | 96.6% | 95.2% |
| Self-evolving retrieval policy | Yes | No | No | No | No | No |
| Auto-digest tool output | Yes (router) | No | No | No | No | No |
| Works across every MCP agent | Yes | No | Partial | No | Yes | Yes |
| Typed cognition (insights/beliefs/policies) | Yes | No | No | Partial | No | No |
| Proof in-tree (reproducible) | Yes | — | — | — | Yes | Partial |
| External DB required | No (SQLite) | No | Qdrant/pgvector | Postgres+vector | No | No |
| License | MIT | — | Apache-2 | Apache-2 | MIT | MIT |
Dhee is the only one that routes tool output at source and self-evolves its retrieval policy from an expansion ledger and leads on LongMemEval recall.
¹ Run dhee router report to see the actual per-turn token curve on your sessions. The replay corpus is already built from your real activity — dhee replay-corpus export derives it from the durable samskara log, no synthetic data. A redacted public corpus with reproducible numbers lands in the next release.
Most memory layers are static: you write rules, they retrieve. Dhee watches what happens and tunes itself.
- Intent classification: Every
Read/Bash/Agentcall is bucketed by intent (source code, test, config, doc, data, build). Each intent gets its own retrieval depth. - Expansion tracking: When the model calls
dhee_expand_result(ptr), Dhee logs which tool / intent / depth required expansion. A digest that's always expanded is too shallow. A digest that's never expanded might be too deep. - Policy tuning:
dhee router tunereads the expansion ledger, applies thresholds (>30% expansion → deepen; <5% expansion → make shallower), and persists the new policy atomically to~/.dhee/router_policy.json. - No config file to maintain. The policy is the behavior. The behavior is learned.
This means: the longer a team uses Dhee, the better it gets at serving that specific team's workflow. Frontend-heavy teams get deeper JS/TS digests. Data teams get richer CSV/JSONL summaries. You don't have to pick — Dhee picks for you, based on what you actually expand.
Doc bloat is every AI team's silent tax. Your CLAUDE.md grew from 50 lines to 500 to 2,000 in 18 months. Your skills directory has 30 files. Your prompt library has a thousand entries. Every new agent session loads all of it, every turn, at full token cost.
Dhee turns that library into searchable, decay-aware, self-promoting memory:
| Phase | What Dhee does |
|---|---|
| Ingest | Heading-scoped chunking with SHA tracking. Re-ingest is a no-op if unchanged. |
| Hot path | Vector search + heading-breadcrumb matching. Filters by threshold + token budget. |
| Promotion | Frequently-referenced memories auto-promote from short-term to long-term. |
| Decay | Unused memories fade on an Ebbinghaus curve. Your 50th memory costs the same as your 50,000th. |
| Insight synthesis | What-worked / what-failed from session checkpoints becomes transferable learnings. |
| Prospective | "Remember to run auth tests after login.py changes" fires when the trigger matches — days, weeks, or months later. |
Target shape: after a year of accumulation the per-turn injection stays bounded by token budget — only the matching slice of docs, insights, and policies above threshold reaches the model. The full canonical-tier retention guarantee (100% canonical survival across supersede chains on a decades-replay corpus) lands with movements 2–3 of the public plan. The substrate (propositional facts, supersede-ready schema, decay/promotion) ships today.
Got a GBrain? An AGENTS.md with 15 sections? A Skills library that your team reluctantly prunes every sprint because it "got too big for context"? Stop pruning. Dhee was built for this.
| What you feel today | What Dhee replaces it with |
|---|---|
| Every new agent session starts from scratch | Every agent joins a live memory line |
| Copy-pasting context between Claude Code and Codex | Cross-runtime broadcasts with auto-created tasks |
| Re-uploading the same PDF / design export / schema | Project-scoped asset drawer; upload once, all agents benefit |
Fat CLAUDE.md burned on every call |
Heading-scoped retrieval — the agent sees the 200 tokens that matter, not the 5,000 that don't |
| Tool output dumps bloating context | Router digests + pointers; raw stays out of the conversation |
| "Session limit reached" forces a painful restart | Line survives; resume in another runtime with full context |
Measured savings: 428k tokens → 197k tokens on a real dev session (−54%). Your own numbers are visible in dhee router stats anytime.
dhee ui opens a three-column workspace:
Left rail. Workspace picker + project tree. "+ new / manage" button opens a full CRUD dialog — create Office, Personal, Sankhya AI Labs; each with many projects (frontend, backend, design, …). Rename, re-root, delete. "Connected agents" block shows every runtime with a live dot: claude-code · live, codex · live, browser agent · live.
Centre. The live information line. Every tool call an agent makes lands here, newest first, attributed:
10:02 · claude-code · backend "read schema.sql (R-abc)"
10:04 · codex · frontend "broadcast · api contract changed"
10:09 · browser agent "verified fix on staging checkout"
10:14 · claude-code "merged PR #412 · closing thread"
Kind filters: all / broadcasts / tool events / notes.
Right rail. Broadcast composer — title + target-project picker. Broadcasting into another project auto-creates a task there. Asset drawer — drag a spec PDF, design export, or schema into the project; every agent in the workspace sees it, with a "processed by codex · 2m ago" feed under each asset.
Canvas view. Openswarm-inspired infinite canvas. DOM cards per entity (workspace → project → session → task → result), smooth pan / momentum / pinch zoom, minimap, direction hints, skeleton loader. One view of the whole brain.
Three substrates, one product:
Every write path — the MCP router, the Claude Code PostToolUse hook, project-asset uploads, human broadcasts — fans out through an in-process bus after the DB write. SSE subscribers get the message in the same event-loop tick. No 1-second polling. Dedup on (workspace_id, dedup_key) so retries from emitters are silent no-ops at the database level.
Workspaces own projects. Projects own sessions and project assets. Agent sessions tag every tool call with workspace_id + project_id so context is routable, not a global soup. Drop a PDF into design; the backend-project codex session doesn't see it unless it reads it — at which point the "processed by" linkage shows up in the design project's asset drawer.
Heavy tool output (Bash, Grep, Agent, huge Read) goes through the router. A digest goes to the LLM; the raw is stored behind a pointer. Fat CLAUDE.md gets heading-chunked and only the slices relevant to this turn get injected. When you hit Claude Code's session limit, dhee handoff hands the whole context — including the shared line — to Codex intact.
All three compose: the router produces digests → digests land on the line → other agents' sessions read them back in the next turn.
Dhee is a layer, not a runtime. It plugs into whatever agent you already use:
- Claude Code — hooks + MCP server (
dhee install, auto-wired by the installer). - Codex CLI — MCP server wired via
~/.codex/config.toml; session mirror reads rollouts in real time. - Cursor — MCP server.
- Gemini CLI, Aider, Cline — MCP-compatible; drop
dhee-mcpinto the config. - Browser agents — localhost REST API; register an agent session, publish to the line, subscribe via SSE.
When any of them process the screen — read a file, open a URL, execute a tool — the output is workspace-scoped, digested, and broadcast. That's what makes every runtime screen-aware through Dhee: they stop being isolated chat tabs and start behaving like teammates who remember what the others just did.
# Setup / maintenance
dhee onboard # interactive provider + key wizard (also runs inside install.sh)
dhee update # pull latest release + rebuild UI
dhee install # wire hooks + MCP into Claude Code / Codex / Cursor
dhee doctor # diagnose installation
# Daily use
dhee ui # open the channel view in your browser
dhee ui --no-open # same, don't auto-launch the browser
dhee router stats # what was digested today, how many tokens saved
dhee handoff # hand the current session context to another runtime- PyPI:
pip install dhee— ships the prebuilt web UI, no Node required. - Source tree:
pip install -e .in a clone;dhee updatewill then rungit pull+pip install -e .+ rebuild the UI.
- Workspaces & projects — full CRUD in the UI, REST API, and cascading deletes.
- Information line — SSE stream with project/channel filters, dedup-on-write, optional
?backfill=Nfor reconnects. - Asset drawer — SHA-256-deduped project/workspace assets with per-asset "processed by" feeds.
- Canvas — deterministic hierarchical layout, infinite-pan DOM canvas, minimap, direction hints, loading skeleton.
- Router — digest-and-pointer for Read/Bash/Grep/Agent; heading-scoped
CLAUDE.mdinjection. - Secret store — Fernet-encrypted local vault for provider keys, env vars still win for back-compat.
- 1,200+ tests —
pytest tests/runs green (handful of pre-existing non-blocking failures in adjacent subsystems).
Can I use Dhee without Claude Code? Yes. Any MCP-compatible agent works. Claude Code just gets the tightest integration (native PostToolUse hook for per-turn digest capture).
Does the line share across machines? Today: single-host, in-process pub/sub. Hosted Dhee (Redis pub/sub or Postgres LISTEN/NOTIFY, end-to-end encrypted) is in active development — it's the piece that turns the local tool into a team product.
What happens to my data? Everything lives under ~/.dhee/. SQLite for structured data (history, workspaces, projects, assets, line messages), Fernet-encrypted JSON for secrets, content-addressed pointer store for raw tool outputs. Nothing leaves your machine unless you point a hosted MCP at it.
Production-ready? The local CLI + UI are shipping. Multi-host hosted Dhee is the next milestone — that's where your memory lives online and connects natively to every agent on the market.
MIT. Make it yours.
Dhee is built by Sankhya AI Labs. Questions, bugs, feature requests: GitHub Issues.
