Working memory for AI reasoning.
workmem is a local knowledge graph that keeps the thread between sessions. It stores facts, decays what stops mattering, and surfaces what still does. One binary, one SQLite file, any MCP client.
RAG preserves knowledge. workmem preserves the thread.
This is not an archive. It's the context that helps a model think now: recent decisions, open problems, corrections, preferences, relationship patterns. Things that keep coming back get reinforced. Things that don't, fade. That's the feature.
LLMs forget everything between sessions. System prompts can't hold your project's history. RAG is great for reference material but bad for recency, continuity, and working context. workmem fills the gap: lightweight enough to call on every session start, smart enough to surface what's relevant without being asked.
brew tap marlian/tap
brew install workmem
workmem version
Source: marlian/homebrew-tap. Update with brew update && brew upgrade workmem.
Download the archive for your platform from releases and extract. Each archive contains the workmem binary plus LICENSE and README.md:
# pick the archive that matches your OS/arch, e.g. darwin-arm64 / linux-amd64
VER=v0.1.0
curl -LO "https://github.com/marlian/workmem/releases/download/${VER}/workmem-darwin-arm64-${VER}.tar.gz"
tar -xzf workmem-darwin-arm64-${VER}.tar.gz
sudo install workmem-darwin-arm64-${VER}/workmem /usr/local/bin/workmem
workmem version
For integrity, download SHA256SUMS from the release page and verify only the archive you actually fetched, using the checksum tool that ships with your platform:
curl -LO "https://github.com/marlian/workmem/releases/download/${VER}/SHA256SUMS"
# macOS
grep "workmem-darwin-arm64-${VER}" SHA256SUMS | shasum -a 256 -c
# Linux
grep "workmem-linux-amd64-${VER}" SHA256SUMS | sha256sum -c
On macOS, Gatekeeper will warn on first launch of an unsigned binary downloaded this way. Remove the quarantine attribute with xattr -d com.apple.quarantine /usr/local/bin/workmem, or install via Homebrew (which does not trigger the warning).
git clone https://github.com/marlian/workmem.git
cd workmem
go test ./...
go run ./cmd/workmem sqlite-canary
go build -o workmem ./cmd/workmem
./workmem version # prints "workmem dev" without -ldflags
Or let Go fetch and install directly:
go install github.com/marlian/workmem/cmd/workmem@latest
Go 1.26+ is required (see go.mod for the exact minimum). No CGO and no external runtime dependencies, so the result is a single self-contained binary. Source builds report workmem dev for version; tagged release binaries carry the real vX.Y.Z plus commit SHA and build timestamp.
workmem speaks MCP over stdio. Add it to your client's config:
{
  "mcpServers": {
    "memory": {
      "command": "/path/to/workmem"
    }
  }
}
.vscode/mcp.json:
{
  "servers": {
    "memory": {
      "command": "/path/to/workmem"
    }
  }
}
In each case the command is simply /path/to/workmem. No arguments are required, and configuration is optional via environment variables.
Facts don't live forever. workmem implements cognitive decay inspired by how human memory works:
effective_confidence = confidence * 0.5 ^ (age_weeks / stability)
stability = half_life * (1 + log2(access_count + 1))
- A fact recalled 0 times has stability equal to the half-life (12 weeks default)
- A fact recalled 3 times has stability of 36 weeks (12 * (1 + log2(4)))
- A fact recalled 7 times has stability of 48 weeks (12 * (1 + log2(8)))
- Frequently recalled facts resist decay. Forgotten facts fade naturally.
Decay is computed at read time. No background jobs.
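The two formulas above can be sketched directly in Python. This is a minimal illustration of the arithmetic, not workmem's actual code; the function and parameter names are chosen here for readability.

```python
import math


def stability(half_life_weeks: float, access_count: int) -> float:
    """Stability grows logarithmically with the number of recalls."""
    return half_life_weeks * (1 + math.log2(access_count + 1))


def effective_confidence(
    confidence: float,
    age_weeks: float,
    half_life_weeks: float = 12.0,  # documented default for global memory
    access_count: int = 0,
) -> float:
    """Confidence halves every `stability` weeks, computed at read time."""
    return confidence * 0.5 ** (age_weeks / stability(half_life_weeks, access_count))
```

With the default half-life, a never-recalled fact that is 12 weeks old keeps half of its confidence, while a fact recalled once needs 24 weeks to decay by the same amount.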
Seven search channels feed a composite relevance score:
| Channel | Weight | What it matches |
|---|---|---|
| fts_phrase | 1.15 | Adjacent terms in FTS |
| fts | 1.0 | Any term match in FTS |
| entity_exact | 0.9 | Exact entity name |
| entity_like | 0.7 | Fuzzy entity name |
| content_like | 0.5 | Substring in content |
| type_like | 0.45 | Entity type match |
| event_label | 0.4 | Event label match |
Final score blends relevance (70%) with decayed memory strength (30%), plus bonuses for FTS position and multi-channel hits.
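As a rough illustration of the 70/30 blend, the sketch below uses the channel weights from the table. How workmem combines several matching channels into one relevance value is not documented here, so taking the best matching weight (capped at 1.0) is an assumption, and the FTS-position and multi-channel bonuses are left out entirely.

```python
# Channel weights copied from the table above.
CHANNEL_WEIGHTS = {
    "fts_phrase": 1.15,
    "fts": 1.0,
    "entity_exact": 0.9,
    "entity_like": 0.7,
    "content_like": 0.5,
    "type_like": 0.45,
    "event_label": 0.4,
}


def composite_score(matched_channels: list[str], memory_strength: float) -> float:
    """Blend relevance (70%) with decayed memory strength (30%).

    ASSUMPTION: relevance is the best matching channel weight, capped at 1.0.
    The documented bonuses are intentionally not modelled.
    """
    if not matched_channels:
        return 0.0
    relevance = min(1.0, max(CHANNEL_WEIGHTS[name] for name in matched_channels))
    return 0.7 * relevance + 0.3 * memory_strength
```

Under these assumptions, a weak match on a fresh fact can still outrank a stronger match on a fact that has mostly decayed, which is the behaviour the blend is meant to produce.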
remember({ entity: "API", observation: "rate limit 100/min", project: "~/my-app" })
Each project gets its own isolated SQLite database at <project>/.memory/memory.db, created lazily. Global memory (no project param) lives next to the binary.
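The path rule and the lazy creation can be pictured with a small Python sketch. It only illustrates the documented layout; the helper names are hypothetical, and expanding ~ to the home directory is an assumption based on the example call above.

```python
import sqlite3
from pathlib import Path


def project_db_path(project: str) -> Path:
    """Map a project directory to <project>/.memory/memory.db."""
    # ASSUMPTION: "~" in the project parameter refers to the home directory.
    return Path(project).expanduser() / ".memory" / "memory.db"


def open_project_db(project: str) -> sqlite3.Connection:
    """Create the .memory directory and database file on first use only."""
    path = project_db_path(project)
    path.parent.mkdir(parents=True, exist_ok=True)
    return sqlite3.connect(path)
```

Nothing is written to disk until open_project_db is called, which mirrors the "created lazily" behaviour described above.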
12 MCP tools. No more, no less.
| Tool | Purpose |
|---|---|
| remember | Store a fact about an entity |
| remember_batch | Store multiple facts at once |
| recall | Search by free text (composite ranked) |
| recall_entity | Everything about one entity |
| relate | Link two entities |
| forget | Soft-delete a fact or entity |
| list_entities | Browse what's stored |
| remember_event | Group observations under a session/meeting/decision |
| recall_events | Search events by label, type, date |
| recall_event | Full event with all observations |
| get_observations | Fetch by ID (provenance) |
| get_event_observations | Fetch raw observations for an event |
recall accepts compact: true to return truncated snippets instead of full content. Use get_observations to expand specific items. This keeps context windows lean.
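The idea behind compact mode can be sketched as a plain truncation that keeps the observation ID for later expansion. The result shape below (id, snippet, truncated) is hypothetical and not workmem's actual response format; only the 120-character default comes from the configuration table.

```python
def compact_observation(obs_id: int, content: str, limit: int = 120) -> dict:
    """Return a truncated snippet plus the ID needed to fetch the full text."""
    return {
        "id": obs_id,                      # pass this to get_observations to expand
        "snippet": content[:limit],        # at most `limit` characters
        "truncated": len(content) > limit,
    }
```

A client would show the snippet first and call get_observations only for the IDs whose full content it actually needs.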
| Variable | Default | Description |
|---|---|---|
| MEMORY_DB_PATH | Next to binary | Path to the global SQLite database |
| MEMORY_HALF_LIFE_WEEKS | 12 | Decay half-life for global memory |
| PROJECT_MEMORY_HALF_LIFE_WEEKS | 52 | Decay half-life for project memory |
| COMPACT_SNIPPET_LENGTH | 120 | Max chars per observation in compact mode |
| PROJECT_DB_CACHE_MAX | 16 | Target max cached project-scoped SQLite handles; active leases may temporarily exceed it |
| WORKMEM_EMBEDDING_PROVIDER | none | Semantic reconcile provider config: none, openai-compatible, ollama, or openai |
| WORKMEM_EMBEDDING_BASE_URL | unset | Embedding provider base URL for non-none providers |
| WORKMEM_EMBEDDING_MODEL | unset | Embedding model identifier for non-none providers |
| WORKMEM_EMBEDDING_DIMENSIONS | unset | Embedding vector dimensions for non-none providers |
Remote embedding opt-in is intentionally not an environment variable. Use the
workmem reconcile semantic --allow-remote-embeddings CLI flag for openai or
non-loopback endpoints.
Some MCP clients (e.g. Kilo, opencode-derivatives) ignore the env block in their server config — only command and args are portable. Use the -env-file flag to load variables from a file:
workmem -env-file /path/to/.env
The parser implements the documented workmem .env grammar: KEY=value, single/double quotes, # comments, export KEY=value, BOM, CRLF. No variable interpolation, no multi-line, no escape sequences. Missing file is not an error (silent fallback to defaults).
Precedence: explicit process env > -env-file values > built-in defaults. A key already present in the environment — even set to an empty string — is never overwritten by the file.
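The grammar and precedence rules above are small enough to sketch in Python. This is an illustration of the documented behaviour, not the real parser: the function names are hypothetical, and details the text does not specify (such as inline comments after a value) are simply not handled.

```python
import os


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=value lines; no interpolation, multi-line values, or escapes."""
    values: dict[str, str] = {}
    text = text.removeprefix("\ufeff")          # tolerate a leading BOM
    for raw in text.splitlines():               # splitlines() also handles CRLF
        line = raw.strip()
        if not line or line.startswith("#"):    # blank line or full-line comment
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key:
            continue
        # Strip one pair of matching single or double quotes.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key] = value
    return values


def apply_env_values(values: dict[str, str], environ) -> None:
    """File values never overwrite a key already present, even if it is empty."""
    for key, value in values.items():
        if key not in environ:
            environ[key] = value


def load_env_file(path: str, environ=None) -> None:
    """A missing file is not an error: defaults simply stay in effect."""
    environ = os.environ if environ is None else environ
    try:
        with open(path, encoding="utf-8") as handle:
            text = handle.read()
    except FileNotFoundError:
        return
    apply_env_values(parse_env_text(text), environ)
```

The membership test in apply_env_values (rather than a truthiness check) is what makes an empty string in the process environment win over the file.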
A common pattern is to run two instances: one for general knowledge, one for private notes. The client sees them as separate tool namespaces:
{
  "mcpServers": {
    "memory": {
      "command": "/path/to/workmem",
      "args": ["-env-file", "/path/to/memory/.env"]
    },
    "private_memory": {
      "command": "/path/to/workmem",
      "args": ["-env-file", "/path/to/private-memory/.env"]
    }
  }
}
Each .env holds that instance's MEMORY_DB_PATH, MEMORY_HALF_LIFE_WEEKS, and any other overrides, so nothing is duplicated in the client config. For clients that support it, the env block still works and takes precedence over the file.
Add to your system prompt or CLAUDE.md:
## Persistent Memory
You have access to a persistent memory store. Use it proactively:
- **`remember`** when you learn something worth retaining across sessions
- **`recall`** at session start or when you need context (it's free — local SQLite)
- **`remember_event`** to group related facts under a session or decision
- **`forget`** to remove stale or incorrect facts
- **`relate`** to link entities with named relationships
If `remember` returns `possible_conflicts`, review those observations before
storing more related facts. Use `forget(obs_id)` only when the old fact should
be deleted/erased. `workmem reconcile --mode propose` can report exact duplicate
candidates; `workmem reconcile --mode apply` and `workmem reconcile rollback
<run_id>` provide audited reversible exact-duplicate supersession.
Remember: preferences, corrections, names, decisions, conventions.
Don't remember: transient tasks, code snippets, things already in docs/git.
SQLite with WAL mode. Tables include entities, observations, relations, events, reconcile audit tables, and memory_fts (FTS5). Schema created automatically. Soft-delete uses deleted_at tombstones: forgotten facts are excluded from retrieval but remain in the database. Superseded observations are also excluded from active-memory reads while preserving auditability.
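The tombstone pattern can be shown with a toy table in Python's sqlite3 module. Only deleted_at comes from the description above; the other column names (including superseded_by) and the helper functions are hypothetical stand-ins, not the real schema.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """In-memory toy schema; the real tables are created by workmem itself."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE observations (
               id INTEGER PRIMARY KEY,
               entity TEXT NOT NULL,
               content TEXT NOT NULL,
               deleted_at TEXT,          -- tombstone: NULL means active
               superseded_by INTEGER     -- hypothetical supersession pointer
           )"""
    )
    return conn


def remember(conn: sqlite3.Connection, entity: str, content: str) -> int:
    cur = conn.execute(
        "INSERT INTO observations (entity, content) VALUES (?, ?)", (entity, content)
    )
    return cur.lastrowid


def forget(conn: sqlite3.Connection, obs_id: int) -> None:
    """Soft-delete: stamp the row instead of removing it."""
    conn.execute(
        "UPDATE observations SET deleted_at = datetime('now') WHERE id = ?", (obs_id,)
    )


def active_observations(conn: sqlite3.Connection, entity: str) -> list[tuple]:
    """Reads skip tombstoned and superseded rows."""
    return conn.execute(
        "SELECT id, content FROM observations "
        "WHERE entity = ? AND deleted_at IS NULL AND superseded_by IS NULL "
        "ORDER BY id",
        (entity,),
    ).fetchall()
```

Because forget only sets a timestamp, the row stays available for audit while every normal read path filters it out.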
Produce an end-to-end encrypted snapshot with the backup subcommand. The snapshot is taken via VACUUM INTO (consistent, no lock on the live DB) and encrypted with age. The plaintext intermediate never leaves the temp directory; the output is written with 0600 permissions.
# single recipient
workmem backup --to backup.age --age-recipient age1yourpubkey...
# multiple recipients and/or a recipients file
workmem backup --to backup.age \
--age-recipient age1alpha... \
--age-recipient /path/to/recipients.txt
Restore with the standard age CLI:
age -d -i my-identity.txt backup.age > memory.db
Only the global memory DB is included. Project-scoped DBs live in their own workspaces and are out of scope. Telemetry data (if enabled) is operational and not included; rebuild it freely.
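The snapshot step can be approximated with SQLite's VACUUM INTO from Python. This is a sketch of the technique only: the function name is hypothetical, the age encryption step is omitted, and the caller is assumed to delete the temporary directory once the encrypted output has been written.

```python
import os
import sqlite3
import tempfile


def snapshot_to_temp(src_path: str) -> str:
    """Write a consistent copy of the database into a private temp directory."""
    tmp_dir = tempfile.mkdtemp(prefix="workmem-backup-")   # created with mode 0700
    dest = os.path.join(tmp_dir, "snapshot.db")
    # isolation_level=None keeps the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(src_path, isolation_level=None)
    try:
        conn.execute("VACUUM INTO ?", (dest,))              # requires SQLite 3.27+
    finally:
        conn.close()
    os.chmod(dest, 0o600)
    return dest
```

VACUUM INTO reads from a single consistent view of the source database, so the live file can keep serving requests while the copy is produced.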
workmem reconcile --mode propose runs a read-only hygiene scan and writes a
local markdown report under review/ by default. --mode apply reruns the same
deterministic exact-duplicate scan, validates each source/target pair in a short
transaction, supersedes duplicate sources, and records audit rows. Rollback uses
the recorded run ID:
workmem reconcile --mode propose
workmem reconcile --mode propose --scope project=/path/to/repo
workmem reconcile --mode propose --since 90d
workmem reconcile --mode propose --output /tmp/reconcile.md
workmem reconcile --mode apply
workmem reconcile rollback <run_id>
workmem reconcile semantic
workmem reconcile semantic --mode report \
--embedding-provider openai-compatible \
--embedding-base-url http://localhost:1234/v1 \
--embedding-model local-embedding-model \
--embedding-dimensions 768 \
--max-embeddings-per-request 64 \
--max-observations-per-entity 200 \
--max-candidates-per-entity 100
The v0 runner detects exact duplicate observations within the same entity. It
does not perform semantic matching, embedding lookup, or summarization. Propose
opens the memory database read-only and does not create missing global/project
DBs, apply supersession, mutate observations, or write audit rows. Apply and
rollback require an existing DB, write reconcile_runs / reconcile_decisions,
snapshot the duplicated content in the audit row, and fail rather than mutate if
audited source/target state no longer matches.
Because they are write commands, apply/rollback may run schema migrations on an
existing DB before the reconcile transaction begins; use propose for a strictly
read-only inspection.
Rollback must be run against the same scope as the original apply run; use
--scope project=/path/to/repo for project-scoped apply runs.
The --since window selects entities with recent observations; once an entity
is selected, older active source rows can still be reported when they duplicate a
newer active observation.
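The exact-duplicate rule can be sketched as a grouping step in Python. The tuple layout and the function name are hypothetical, and treating "exact" as plain string equality of the content is an assumption; the older row is reported as the source and the newest row as the target, following the description above.

```python
from collections import defaultdict


def exact_duplicate_pairs(observations):
    """Return (source_id, target_id) pairs for same-entity exact duplicates.

    `observations` is an iterable of (obs_id, entity, content, created_at)
    tuples describing active rows only.
    """
    groups = defaultdict(list)
    for obs_id, entity, content, created_at in observations:
        groups[(entity, content)].append((created_at, obs_id))

    pairs = []
    for rows in groups.values():
        if len(rows) < 2:
            continue
        rows.sort()                       # oldest first, ties broken by ID
        _, target_id = rows[-1]           # the newest observation is kept
        pairs.extend((obs_id, target_id) for _, obs_id in rows[:-1])
    return sorted(pairs)
```

Identical text under two different entities does not form a pair, because the grouping key includes the entity.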
workmem reconcile semantic --mode validate validates embedding provider
configuration and exits without generating semantic candidates, making network
calls, opening a memory database, or mutating memory. Validate mode ignores
report-only flag values, so stale --db, --output, threshold, or scan-window
flags cannot accidentally make validation touch a DB.
workmem reconcile semantic --mode report opens an existing global or project DB,
embeds same-entity active observations through openai-compatible or ollama,
populates/reuses observation_embeddings, and writes a markdown report under
review/ by default. Report mode excludes deleted, expired-event, and superseded
observations. It does not mutate observations, supersession fields, reconcile
audit rows, access counts, FTS state, or schema migrations; embedding-cache
writes are the only allowed persistence. Embedding requests are chunked by
--max-embeddings-per-request; per-entity comparison/output work is bounded by
--max-observations-per-entity and --max-candidates-per-entity, with limit
signals written to the report. Reports include bounded candidate snippets for
human review, candidate clusters, and manual decision checkboxes; they are
local/private markdown files. Semantic apply does not exist.
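The two bounding mechanisms in report mode, request chunking and per-entity caps, can be illustrated with a short Python sketch. The helper names and the (score, source_id, target_id) tuple layout are hypothetical, and keeping the highest-scoring candidates when the cap is hit is an assumption about how the limit is applied.

```python
def chunked(items: list, size: int) -> list[list]:
    """Split embedding inputs into requests of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def cap_candidates(candidates: list[tuple], max_candidates: int):
    """Keep the top candidates and report whether the limit was hit.

    Each candidate is a (score, source_id, target_id) tuple.
    ASSUMPTION: the highest scores are kept when the cap applies.
    """
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    return ranked[:max_candidates], len(ranked) > max_candidates
```

The boolean returned by cap_candidates stands in for the limit signal that the report records, so a reviewer knows the candidate list for that entity is incomplete.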
The default provider is none. Non-none providers require
--embedding-base-url, --embedding-model, and --embedding-dimensions.
openai-compatible and ollama are supported for report mode; openai config
can be validated but report mode rejects it. openai and endpoints whose host is
not literal localhost or a loopback IP require the explicit
--allow-remote-embeddings flag. Host aliases are not DNS-resolved for this
trust decision. Environment variables can set provider details, but remote opt-in
is intentionally CLI-only.
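The trust decision described above (literal localhost or a loopback IP, with no DNS resolution) can be sketched with Python's standard library. The function names are hypothetical and this is not workmem's actual check, just one way to express the same rule.

```python
import ipaddress
from urllib.parse import urlsplit


def is_local_endpoint(base_url: str) -> bool:
    """True only for literal 'localhost' or a loopback IP; names are never resolved."""
    host = urlsplit(base_url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname, including aliases that might resolve to loopback.
        return False


def requires_remote_opt_in(provider: str, base_url: str) -> bool:
    """openai always needs the explicit flag; others need it for non-local hosts."""
    return provider == "openai" or not is_local_endpoint(base_url)
```

Refusing to resolve hostnames means an alias pointing at 127.0.0.1 is still treated as remote, which errs on the side of requiring the explicit flag.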
Semantic reports are evidence, not executable plans. You may paste a report into
an LLM you trust to draft a human cleanup proposal, but workmem does not call
that model or apply its suggestions. Reports contain memory snippets; only send
them to providers you are comfortable sharing that local/private content with.
Use a provider and model of your choice. A useful prompt shape is:
You are a conservative memory hygiene reviewer. Analyze this semantic reconcile
report as a human cleanup spec. Do not execute actions. Do not output tool calls.
Core rules:
1. First classify relationship_type, then suggest action.
2. Semantic similarity means relatedness, not duplication.
3. Preserve timeline facts: opened vs merged, draft vs implemented, phase N vs
phase N+1, and different commits/dates are not duplicates by default.
4. A source may be forgotten only if a proposed new observation fully preserves
every distinct fact visible in the snippets.
5. Large heterogeneous clusters must be split by subtheme, not consolidated.
Allowed relationship_type values:
- lifecycle_pair
- same_topic_distinct_facts
- possible_duplicate
- broad_topic_blob
- scope_mismatch
- insufficient_evidence
Allowed action values:
- keep_all
- draft_synthetic_keep_sources
- draft_synthetic_then_human_may_forget
- split_into_subthemes
- move_scope_review
- inspect_only
Return:
- threshold_assessment
- stable_prompt_invariants
- cluster_decisions as JSON
- self_critique
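Since the model's answer is only a draft for human review, it can help to check it against the allowed values in the prompt above before reading it. The Python sketch below assumes cluster_decisions arrives as a JSON array of objects with relationship_type and action keys; that shape and the function name are assumptions, not something workmem defines or consumes.

```python
import json

ALLOWED_RELATIONSHIP_TYPES = {
    "lifecycle_pair", "same_topic_distinct_facts", "possible_duplicate",
    "broad_topic_blob", "scope_mismatch", "insufficient_evidence",
}
ALLOWED_ACTIONS = {
    "keep_all", "draft_synthetic_keep_sources",
    "draft_synthetic_then_human_may_forget", "split_into_subthemes",
    "move_scope_review", "inspect_only",
}


def validate_cluster_decisions(raw_json: str) -> list[str]:
    """Return human-readable problems; an empty list means the draft is well-formed."""
    try:
        decisions = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(decisions, list):
        return ["cluster_decisions must be a JSON array"]

    errors = []
    for index, decision in enumerate(decisions):
        if not isinstance(decision, dict):
            errors.append(f"decision {index}: not an object")
            continue
        if decision.get("relationship_type") not in ALLOWED_RELATIONSHIP_TYPES:
            errors.append(f"decision {index}: unknown relationship_type")
        if decision.get("action") not in ALLOWED_ACTIONS:
            errors.append(f"decision {index}: unknown action")
    return errors
```

A check like this only confirms the vocabulary; whether a proposed action is sensible remains a human decision.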
- Stupidity of use, solidity of backend. The model doesn't think about memory. It just calls tools. The ranking, decay, and retrieval happen behind the curtain.
- 12 tools is the ceiling, not the floor. Every tool costs context tokens on every model invocation. Adding tool 13 requires strong evidence.
- Decay is the feature. What matters keeps surfacing. What doesn't, fades. This isn't a compromise — it's the mechanism.
- Evidence over intuition. The next feature ships when data says it should, not when it sounds interesting.
MIT — see LICENSE for the full text.
