workmem

Working memory for AI reasoning.

workmem is a local knowledge graph that keeps the thread between sessions. It stores facts, decays what stops mattering, and surfaces what still does. One binary, one SQLite file, any MCP client.

RAG preserves knowledge. workmem preserves the thread.

This is not an archive. It's the context that helps a model think now: recent decisions, open problems, corrections, preferences, relationship patterns. Things that keep coming back get reinforced. Things that don't, fade. That's the feature.

Why this exists

LLMs forget everything between sessions. System prompts can't hold your project's history. RAG is great for reference material but bad for recency, continuity, and working context. workmem fills the gap: lightweight enough to call on every session start, smart enough to surface what's relevant without being asked.

Install

Homebrew (macOS / Linux)

brew tap marlian/tap
brew install workmem
workmem version

Source: marlian/homebrew-tap. Update with brew update && brew upgrade workmem.

Direct download

Download the archive for your platform from releases and extract. Each archive contains the workmem binary plus LICENSE and README.md:

# pick the archive that matches your OS/arch, e.g. darwin-arm64 / linux-amd64
VER=v0.1.0
curl -LO "https://github.com/marlian/workmem/releases/download/${VER}/workmem-darwin-arm64-${VER}.tar.gz"
tar -xzf workmem-darwin-arm64-${VER}.tar.gz
sudo install workmem-darwin-arm64-${VER}/workmem /usr/local/bin/workmem
workmem version

For integrity, download SHA256SUMS from the release page and verify only the archive you actually fetched, using the checksum tool that ships with your platform:

curl -LO "https://github.com/marlian/workmem/releases/download/${VER}/SHA256SUMS"

# macOS
grep "workmem-darwin-arm64-${VER}" SHA256SUMS | shasum -a 256 -c

# Linux
grep "workmem-linux-amd64-${VER}" SHA256SUMS | sha256sum -c

On macOS, Gatekeeper will warn on first launch of an unsigned binary downloaded this way. Remove the quarantine attribute with xattr -d com.apple.quarantine /usr/local/bin/workmem, or install via Homebrew (which does not trigger the warning).

Build from source

git clone https://github.com/marlian/workmem.git
cd workmem
go test ./...
go run ./cmd/workmem sqlite-canary
go build -o workmem ./cmd/workmem
./workmem version   # prints "workmem dev" without -ldflags

Or let Go fetch and install directly:

go install github.com/marlian/workmem/cmd/workmem@latest

Go 1.26+ is required (see go.mod for the exact minimum). No CGO and no external runtime dependencies — the result is a single self-contained binary. Source builds report workmem dev for version; tagged release binaries carry the real vX.Y.Z plus commit SHA and build timestamp.

Client configuration

workmem speaks MCP over stdio. Add it to your client's config:

Claude Code

{
  "mcpServers": {
    "memory": {
      "command": "/path/to/workmem"
    }
  }
}

Claude Desktop

{
  "mcpServers": {
    "memory": {
      "command": "/path/to/workmem"
    }
  }
}

VS Code / Cursor

.vscode/mcp.json:

{
  "servers": {
    "memory": {
      "command": "/path/to/workmem"
    }
  }
}

Any MCP client

/path/to/workmem

No arguments required. Configuration is optional via environment variables.

How it works

The decay model

Facts don't live forever. workmem implements cognitive decay inspired by how human memory works:

effective_confidence = confidence * 0.5 ^ (age_weeks / stability)
stability = half_life * (1 + log2(access_count + 1))
  • A fact recalled 0 times has stability equal to the half-life (12 weeks default)
  • A fact recalled 3 times has stability of 36 weeks (12 × (1 + log2(4)) = 12 × 3)
  • A fact recalled 7 times has stability of 48 weeks (12 × (1 + log2(8)) = 12 × 4)
  • Frequently recalled facts resist decay. Forgotten facts fade naturally.

Decay is computed at read time. No background jobs.
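
As a rough illustration, the two formulas above can be written in Go as follows. This is a minimal sketch, not workmem's actual code: the function names `stability` and `effectiveConfidence` are hypothetical, and only the formulas themselves come from this README.

```go
package main

import (
	"fmt"
	"math"
)

// stability returns the decay time constant in weeks:
// half_life * (1 + log2(access_count + 1)).
func stability(halfLifeWeeks float64, accessCount int) float64 {
	return halfLifeWeeks * (1 + math.Log2(float64(accessCount)+1))
}

// effectiveConfidence applies confidence * 0.5^(age_weeks / stability).
// It is a pure function of the stored fields, so it can run at read time.
func effectiveConfidence(confidence, ageWeeks, halfLifeWeeks float64, accessCount int) float64 {
	return confidence * math.Pow(0.5, ageWeeks/stability(halfLifeWeeks, accessCount))
}

func main() {
	// Compare facts of the same age (12 weeks) with different recall counts.
	for _, recalls := range []int{0, 3, 7} {
		fmt.Printf("recalls=%d stability=%.0f weeks confidence=%.3f\n",
			recalls, stability(12, recalls), effectiveConfidence(1.0, 12, 12, recalls))
	}
}
```

Because nothing here depends on stored state beyond age and access count, no background job is needed to keep scores current.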

Composite ranking

Seven search channels feed a composite relevance score:

| Channel | Weight | What it matches |
| --- | --- | --- |
| fts_phrase | 1.15 | Adjacent terms in FTS |
| fts | 1.0 | Any term match in FTS |
| entity_exact | 0.9 | Exact entity name |
| entity_like | 0.7 | Fuzzy entity name |
| content_like | 0.5 | Substring in content |
| type_like | 0.45 | Entity type match |
| event_label | 0.4 | Event label match |

Final score blends relevance (70%) with decayed memory strength (30%), plus bonuses for FTS position and multi-channel hits.
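
The 70/30 blend can be sketched in Go as below. The channel weights come from the table above; everything else is an assumption made for illustration. In particular, this README does not say how per-channel matches combine into a single relevance value, so the sketch simply takes the strongest matching channel and normalizes it to the 0..1 range, and it leaves out the FTS-position and multi-channel bonuses entirely.

```go
package main

import "fmt"

// channelWeights mirrors the channel table in this README.
var channelWeights = map[string]float64{
	"fts_phrase":   1.15,
	"fts":          1.0,
	"entity_exact": 0.9,
	"entity_like":  0.7,
	"content_like": 0.5,
	"type_like":    0.45,
	"event_label":  0.4,
}

// maxWeight is the largest weight in the table, used for normalization.
const maxWeight = 1.15

// relevance is an ASSUMED combination rule (not taken from workmem):
// the strongest matching channel, normalized to [0, 1].
func relevance(matched []string) float64 {
	best := 0.0
	for _, channel := range matched {
		if w := channelWeights[channel]; w > best {
			best = w
		}
	}
	return best / maxWeight
}

// compositeScore blends relevance (70%) with decayed memory strength (30%).
// Bonuses for FTS position and multi-channel hits are omitted.
func compositeScore(matched []string, decayedStrength float64) float64 {
	return 0.7*relevance(matched) + 0.3*decayedStrength
}

func main() {
	fmt.Printf("phrase hit, fresh fact:    %.3f\n", compositeScore([]string{"fts_phrase"}, 1.0))
	fmt.Printf("substring hit, faded fact: %.3f\n", compositeScore([]string{"content_like"}, 0.2))
}
```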

Project-scoped memory

remember({ entity: "API", observation: "rate limit 100/min", project: "~/my-app" })

Each project gets its own isolated SQLite database at <project>/.memory/memory.db, created lazily. Global memory (no project param) lives next to the binary.
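
A minimal sketch of the path layout described above. The helper name `projectDBPath` is hypothetical, and expanding a leading `~` to the home directory is an assumption suggested by the `~/my-app` example rather than documented behaviour. The sketch only computes the path; it does not create the directory or the database.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// projectDBPath returns <project>/.memory/memory.db.
// A leading "~" is expanded to home (ASSUMED behaviour, for illustration).
func projectDBPath(project, home string) string {
	if project == "~" {
		project = home
	} else if strings.HasPrefix(project, "~/") {
		project = filepath.Join(home, project[2:])
	}
	return filepath.Join(project, ".memory", "memory.db")
}

func main() {
	fmt.Println(projectDBPath("~/my-app", "/home/dev"))
	fmt.Println(projectDBPath("/srv/repo", "/home/dev"))
}
```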

Tools

12 MCP tools. No more, no less.

| Tool | Purpose |
| --- | --- |
| remember | Store a fact about an entity |
| remember_batch | Store multiple facts at once |
| recall | Search by free text (composite ranked) |
| recall_entity | Everything about one entity |
| relate | Link two entities |
| forget | Soft-delete a fact or entity |
| list_entities | Browse what's stored |
| remember_event | Group observations under a session/meeting/decision |
| recall_events | Search events by label, type, date |
| recall_event | Full event with all observations |
| get_observations | Fetch by ID (provenance) |
| get_event_observations | Fetch raw observations for an event |

Compact recall

recall accepts compact: true to return truncated snippets instead of full content. Use get_observations to expand specific items. This keeps context windows lean.
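
Snippet truncation along these lines might look like the sketch below. The helper name `compactSnippet` is hypothetical, and two details are assumptions: that the length limit counts Unicode characters rather than bytes, and that a cut snippet is marked with a trailing ellipsis.

```go
package main

import "fmt"

// compactSnippet returns content unchanged when it fits within maxChars,
// otherwise the first maxChars characters followed by an ellipsis.
// Counting runes (not bytes) keeps multi-byte characters intact.
func compactSnippet(content string, maxChars int) string {
	if maxChars < 0 {
		maxChars = 0
	}
	runes := []rune(content)
	if len(runes) <= maxChars {
		return content
	}
	return string(runes[:maxChars]) + "…"
}

func main() {
	long := "rate limit is 100 requests per minute per API key, enforced at the gateway"
	fmt.Println(compactSnippet(long, 20))
	fmt.Println(compactSnippet("short fact", 120))
}
```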

Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| MEMORY_DB_PATH | Next to binary | Path to the global SQLite database |
| MEMORY_HALF_LIFE_WEEKS | 12 | Decay half-life for global memory |
| PROJECT_MEMORY_HALF_LIFE_WEEKS | 52 | Decay half-life for project memory |
| COMPACT_SNIPPET_LENGTH | 120 | Max chars per observation in compact mode |
| PROJECT_DB_CACHE_MAX | 16 | Target max cached project-scoped SQLite handles; active leases may temporarily exceed it |
| WORKMEM_EMBEDDING_PROVIDER | none | Semantic reconcile provider config: none, openai-compatible, ollama, or openai |
| WORKMEM_EMBEDDING_BASE_URL | unset | Embedding provider base URL for non-none providers |
| WORKMEM_EMBEDDING_MODEL | unset | Embedding model identifier for non-none providers |
| WORKMEM_EMBEDDING_DIMENSIONS | unset | Embedding vector dimensions for non-none providers |

Remote embedding opt-in is intentionally not an environment variable. Use the workmem reconcile semantic --allow-remote-embeddings CLI flag for openai or non-loopback endpoints.

Loading config from a .env file

Some MCP clients (e.g. Kilo, opencode-derivatives) ignore the env block in their server config — only command and args are portable. Use the -env-file flag to load variables from a file:

workmem -env-file /path/to/.env

The parser implements the documented workmem .env grammar: KEY=value, single/double quotes, # comments, export KEY=value, BOM, CRLF. No variable interpolation, no multi-line, no escape sequences. Missing file is not an error (silent fallback to defaults).

Precedence: explicit process env > -env-file values > built-in defaults. A key already present in the environment — even set to an empty string — is never overwritten by the file.
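
To make the grammar and precedence rules concrete, here is a small Go sketch. It is not workmem's parser: `parseEnv` and `loadEnvFile` are hypothetical names, and the handling of cases this README does not spell out (lines without `=` are skipped, `#` starts a comment only at the beginning of a line) is an assumption.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"strings"
)

// parseEnv handles KEY=value, optional "export ", single/double quotes,
// full-line # comments, a leading BOM, and CRLF line endings.
// No interpolation, no multi-line values, no escape sequences.
func parseEnv(text string) map[string]string {
	out := map[string]string{}
	text = strings.TrimPrefix(text, "\uFEFF") // strip UTF-8 BOM
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text()) // also drops a trailing "\r"
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		line = strings.TrimSpace(strings.TrimPrefix(line, "export "))
		key, val, ok := strings.Cut(line, "=")
		if !ok {
			continue // ASSUMPTION: lines without "=" are ignored
		}
		key, val = strings.TrimSpace(key), strings.TrimSpace(val)
		// Remove one pair of matching surrounding quotes, if present.
		if n := len(val); n >= 2 && (val[0] == '"' || val[0] == '\'') && val[n-1] == val[0] {
			val = val[1 : n-1]
		}
		out[key] = val
	}
	return out
}

// loadEnvFile applies file values only for keys absent from the process
// environment, so explicit env wins even when set to an empty string.
func loadEnvFile(path string) error {
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return nil // missing file: silent fallback to defaults
	}
	if err != nil {
		return err
	}
	for key, val := range parseEnv(string(data)) {
		if _, present := os.LookupEnv(key); present {
			continue
		}
		if err := os.Setenv(key, val); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	sample := "\uFEFFexport MEMORY_HALF_LIFE_WEEKS=\"24\"\r\n# comment\r\nCOMPACT_SNIPPET_LENGTH='80'\n"
	fmt.Println(parseEnv(sample))
	if err := loadEnvFile(".env"); err != nil {
		fmt.Println("env file error:", err)
	}
}
```

Using `os.LookupEnv` rather than `os.Getenv` is what distinguishes "set to an empty string" from "not set at all".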

Running multiple instances

A common pattern: one for general knowledge, one for private notes. The client sees them as separate tool namespaces:

{
  "mcpServers": {
    "memory": {
      "command": "/path/to/workmem",
      "args": ["-env-file", "/path/to/memory/.env"]
    },
    "private_memory": {
      "command": "/path/to/workmem",
      "args": ["-env-file", "/path/to/private-memory/.env"]
    }
  }
}

Each .env holds that instance's MEMORY_DB_PATH, MEMORY_HALF_LIFE_WEEKS, and any other overrides — no duplication in the client config. For clients that support it, the env block still works and takes precedence over the file.

Recommended LLM instructions

Add to your system prompt or CLAUDE.md:

## Persistent Memory

You have access to a persistent memory store. Use it proactively:

- **`remember`** when you learn something worth retaining across sessions
- **`recall`** at session start or when you need context (it's free — local SQLite)
- **`remember_event`** to group related facts under a session or decision
- **`forget`** to remove stale or incorrect facts
- **`relate`** to link entities with named relationships

If `remember` returns `possible_conflicts`, review those observations before
storing more related facts. Use `forget(obs_id)` only when the old fact should
be deleted/erased. `workmem reconcile --mode propose` can report exact duplicate
candidates; `workmem reconcile --mode apply` and `workmem reconcile rollback
<run_id>` provide audited reversible exact-duplicate supersession.

Remember: preferences, corrections, names, decisions, conventions.
Don't remember: transient tasks, code snippets, things already in docs/git.

Database

SQLite with WAL mode. Tables include entities, observations, relations, events, reconcile audit tables, and memory_fts (FTS5). Schema created automatically. Soft-delete via deleted_at tombstones — forgotten facts are excluded from retrieval but remain in the database. Superseded observations are also excluded from active-memory reads while preserving auditability.

Backup

Produce an end-to-end encrypted snapshot with the backup subcommand. The snapshot is taken via VACUUM INTO (consistent, no lock on the live DB) and encrypted with age. The plaintext intermediate never leaves the temp directory; the output is written with 0600 permissions.

# single recipient
workmem backup --to backup.age --age-recipient age1yourpubkey...

# multiple recipients and/or a recipients file
workmem backup --to backup.age \
  --age-recipient age1alpha... \
  --age-recipient /path/to/recipients.txt

Restore with the standard age CLI:

age -d -i my-identity.txt backup.age > memory.db

Only the global memory DB is included. Project-scoped DBs live in their own workspaces and are out of scope. Telemetry data (if enabled) is operational rather than memory, so it is not included; it can be rebuilt freely.

Reconcile runner

workmem reconcile --mode propose runs a read-only hygiene scan and writes a local markdown report under review/ by default. --mode apply reruns the same deterministic exact-duplicate scan, validates each source/target pair in a short transaction, supersedes duplicate sources, and records audit rows. Rollback uses the recorded run ID:

workmem reconcile --mode propose
workmem reconcile --mode propose --scope project=/path/to/repo
workmem reconcile --mode propose --since 90d
workmem reconcile --mode propose --output /tmp/reconcile.md
workmem reconcile --mode apply
workmem reconcile rollback <run_id>
workmem reconcile semantic
workmem reconcile semantic --mode report \
  --embedding-provider openai-compatible \
  --embedding-base-url http://localhost:1234/v1 \
  --embedding-model local-embedding-model \
  --embedding-dimensions 768 \
  --max-embeddings-per-request 64 \
  --max-observations-per-entity 200 \
  --max-candidates-per-entity 100

The v0 runner detects exact duplicate observations within the same entity. It does not perform semantic matching, embedding lookup, or summarization. Propose opens the memory database read-only and does not create missing global/project DBs, apply supersession, mutate observations, or write audit rows. Apply and rollback require an existing DB, write reconcile_runs / reconcile_decisions, snapshot the duplicated content in the audit row, and fail rather than mutate if audited source/target state no longer matches. Because they are write commands, apply/rollback may run schema migrations on an existing DB before the reconcile transaction begins; use propose for a strictly read-only inspection. Rollback must be run against the same scope as the original apply run; use --scope project=/path/to/repo for project-scoped apply runs. The --since window selects entities with recent observations; once an entity is selected, older active source rows can still be reported when they duplicate a newer active observation.
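
The exact-duplicate scan can be pictured with the following Go sketch. It is an illustration under stated assumptions, not the runner's implementation: the `Observation` and `DuplicatePair` types are hypothetical, "exact" is taken to mean byte-identical content, and a higher ID is assumed to mean a newer observation. Older rows become sources and the newest row is the target, matching the description above of older sources duplicating a newer active observation.

```go
package main

import (
	"fmt"
	"sort"
)

// Observation is a hypothetical, trimmed-down view of an active row.
type Observation struct {
	ID       int64
	EntityID int64
	Content  string
}

// DuplicatePair links an older duplicate (source) to the newest copy (target).
type DuplicatePair struct {
	SourceID int64
	TargetID int64
}

// exactDuplicates groups observations by (entity, content) and pairs every
// older member of a group with the newest one. Output order is deterministic.
func exactDuplicates(obs []Observation) []DuplicatePair {
	type key struct {
		entity  int64
		content string
	}
	groups := map[key][]int64{}
	for _, o := range obs {
		k := key{o.EntityID, o.Content}
		groups[k] = append(groups[k], o.ID)
	}
	var pairs []DuplicatePair
	for _, ids := range groups {
		if len(ids) < 2 {
			continue
		}
		sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
		target := ids[len(ids)-1] // ASSUMPTION: highest ID is the newest
		for _, src := range ids[:len(ids)-1] {
			pairs = append(pairs, DuplicatePair{SourceID: src, TargetID: target})
		}
	}
	sort.Slice(pairs, func(i, j int) bool { return pairs[i].SourceID < pairs[j].SourceID })
	return pairs
}

func main() {
	obs := []Observation{
		{1, 10, "rate limit 100/min"},
		{2, 10, "rate limit 100/min"},
		{3, 11, "rate limit 100/min"}, // different entity: not a duplicate
		{4, 10, "uses OAuth2"},
	}
	for _, p := range exactDuplicates(obs) {
		fmt.Printf("supersede %d with %d\n", p.SourceID, p.TargetID)
	}
}
```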

workmem reconcile semantic --mode validate validates embedding provider configuration and exits without generating semantic candidates, making network calls, opening a memory database, or mutating memory. Validate mode ignores report-only flag values, so stale --db, --output, threshold, or scan-window flags cannot accidentally make validation touch a DB.

workmem reconcile semantic --mode report opens an existing global or project DB, embeds same-entity active observations through openai-compatible or ollama, populates/reuses observation_embeddings, and writes a markdown report under review/ by default. Report mode excludes deleted, expired-event, and superseded observations. It does not mutate observations, supersession fields, reconcile audit rows, access counts, or FTS state, and it does not run schema migrations; embedding-cache writes are the only allowed persistence. Embedding requests are chunked by --max-embeddings-per-request; per-entity comparison/output work is bounded by --max-observations-per-entity and --max-candidates-per-entity, with limit signals written to the report. Reports include bounded candidate snippets for human review, candidate clusters, and manual decision checkboxes; they are local/private markdown files. Semantic apply does not exist.
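
Request chunking of the kind controlled by --max-embeddings-per-request can be sketched as follows. The function name `chunkTexts` is hypothetical, and treating a non-positive limit as 1 is an assumption made so the sketch always terminates; how workmem validates that flag is not described here.

```go
package main

import "fmt"

// chunkTexts splits texts into consecutive batches of at most maxPerRequest
// items, preserving order. Each batch would become one embedding request.
func chunkTexts(texts []string, maxPerRequest int) [][]string {
	if maxPerRequest <= 0 {
		maxPerRequest = 1 // ASSUMPTION: guard against a zero or negative limit
	}
	var batches [][]string
	for start := 0; start < len(texts); start += maxPerRequest {
		end := start + maxPerRequest
		if end > len(texts) {
			end = len(texts)
		}
		batches = append(batches, texts[start:end])
	}
	return batches
}

func main() {
	texts := []string{"obs 1", "obs 2", "obs 3", "obs 4", "obs 5"}
	for i, batch := range chunkTexts(texts, 2) {
		fmt.Printf("request %d: %v\n", i+1, batch)
	}
}
```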

The default provider is none. Non-none providers require --embedding-base-url, --embedding-model, and --embedding-dimensions. openai-compatible and ollama are supported for report mode; openai config can be validated but report mode rejects it. openai and endpoints whose host is not literal localhost or a loopback IP require the explicit --allow-remote-embeddings flag. Host aliases are not DNS-resolved for this trust decision. Environment variables can set provider details, but remote opt-in is intentionally CLI-only.
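
The trust decision described above can be sketched with the Go standard library. This is an illustration, not workmem's code: the function name `requiresRemoteOptIn` is hypothetical, and matching "localhost" case-insensitively is an assumption. The key property is that the host string is inspected as written and never resolved through DNS.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strings"
)

// requiresRemoteOptIn reports whether a provider/base-URL pair would need an
// explicit remote opt-in: always for "openai", otherwise whenever the host is
// neither literal "localhost" nor a loopback IP. No DNS lookup is performed.
func requiresRemoteOptIn(provider, baseURL string) (bool, error) {
	if provider == "openai" {
		return true, nil
	}
	u, err := url.Parse(baseURL)
	if err != nil {
		return true, err
	}
	host := u.Hostname() // strips the port and IPv6 brackets
	if host == "" {
		return true, fmt.Errorf("no host in %q", baseURL)
	}
	if strings.EqualFold(host, "localhost") {
		return false, nil
	}
	if ip := net.ParseIP(host); ip != nil && ip.IsLoopback() {
		return false, nil
	}
	return true, nil // host aliases are treated as remote
}

func main() {
	for _, base := range []string{
		"http://localhost:1234/v1",
		"http://127.0.0.1:11434",
		"http://[::1]:11434",
		"http://my-laptop.local:1234/v1",
	} {
		remote, err := requiresRemoteOptIn("openai-compatible", base)
		fmt.Println(base, "needs opt-in:", remote, "err:", err)
	}
}
```

Refusing to resolve aliases keeps the decision independent of local DNS or hosts-file state, at the cost of requiring the flag for names that happen to point at the local machine.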

Optional model-assisted cleanup proposal

Semantic reports are evidence, not executable plans. You may paste a report into an LLM you trust to draft a human cleanup proposal, but workmem does not call that model or apply its suggestions. Reports contain memory snippets; only send them to providers you are comfortable sharing that local/private content with.

Use a provider and model of your choice. A useful prompt shape is:

You are a conservative memory hygiene reviewer. Analyze this semantic reconcile
report as a human cleanup spec. Do not execute actions. Do not output tool calls.

Core rules:
1. First classify relationship_type, then suggest action.
2. Semantic similarity means relatedness, not duplication.
3. Preserve timeline facts: opened vs merged, draft vs implemented, phase N vs
   phase N+1, and different commits/dates are not duplicates by default.
4. A source may be forgotten only if a proposed new observation fully preserves
   every distinct fact visible in the snippets.
5. Large heterogeneous clusters must be split by subtheme, not consolidated.

Allowed relationship_type values:
- lifecycle_pair
- same_topic_distinct_facts
- possible_duplicate
- broad_topic_blob
- scope_mismatch
- insufficient_evidence

Allowed action values:
- keep_all
- draft_synthetic_keep_sources
- draft_synthetic_then_human_may_forget
- split_into_subthemes
- move_scope_review
- inspect_only

Return:
- threshold_assessment
- stable_prompt_invariants
- cluster_decisions as JSON
- self_critique
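
If you want to sanity-check a model's reply before reading it, a small validator along these lines may help. It is a reader-side sketch only; workmem itself does not parse or act on such output. The allowed values are copied from the prompt above, while the `ClusterDecision` shape (an array of objects with `cluster`, `relationship_type`, and `action` fields) is hypothetical, since the prompt does not fix a JSON schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Allowed values, copied from the prompt.
var allowedRelationships = map[string]bool{
	"lifecycle_pair": true, "same_topic_distinct_facts": true,
	"possible_duplicate": true, "broad_topic_blob": true,
	"scope_mismatch": true, "insufficient_evidence": true,
}

var allowedActions = map[string]bool{
	"keep_all": true, "draft_synthetic_keep_sources": true,
	"draft_synthetic_then_human_may_forget": true,
	"split_into_subthemes": true, "move_scope_review": true,
	"inspect_only": true,
}

// ClusterDecision is a HYPOTHETICAL shape for one cluster_decisions entry.
type ClusterDecision struct {
	Cluster          string `json:"cluster"`
	RelationshipType string `json:"relationship_type"`
	Action           string `json:"action"`
}

// validateDecisions rejects replies that use values outside the allowed sets.
func validateDecisions(raw []byte) ([]ClusterDecision, error) {
	var decisions []ClusterDecision
	if err := json.Unmarshal(raw, &decisions); err != nil {
		return nil, fmt.Errorf("cluster_decisions is not a JSON array of objects: %w", err)
	}
	for i, d := range decisions {
		if !allowedRelationships[d.RelationshipType] {
			return nil, fmt.Errorf("decision %d: unknown relationship_type %q", i, d.RelationshipType)
		}
		if !allowedActions[d.Action] {
			return nil, fmt.Errorf("decision %d: unknown action %q", i, d.Action)
		}
	}
	return decisions, nil
}

func main() {
	reply := []byte(`[{"cluster":"c1","relationship_type":"lifecycle_pair","action":"keep_all"}]`)
	decisions, err := validateDecisions(reply)
	fmt.Println(decisions, err)
}
```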

Design principles

  • Stupidity of use, solidity of backend. The model doesn't think about memory. It just calls tools. The ranking, decay, and retrieval happen behind the curtain.
  • 12 tools is the ceiling, not the floor. Every tool costs context tokens on every model invocation. Adding tool 13 requires strong evidence.
  • Decay is the feature. What matters keeps surfacing. What doesn't, fades. This isn't a compromise — it's the mechanism.
  • Evidence over intuition. The next feature ships when data says it should, not when it sounds interesting.

License

MIT — see LICENSE for the full text.
