Cognitive Harness for Language Models
Cortex is a local-first runtime surface for long-running AI model work. It gives replaceable models a user-owned operating layer for durable memory, retrieval evidence, tools, permissions, channels, journal/replay, evaluation, plugin governance, and operator control.
Cortex is a cognitive harness substrate for language-model systems. In practice, that means it is infrastructure for driving, observing, evaluating, and hardening model behavior across real interfaces instead of treating one model call as the product.
Use Cortex when you want a local coding, research, or tool-using model workflow whose state stays with you: memory, journals, policies, plugin trust, retrieval corpora, traces, and operator decisions survive model/provider changes.
Cortex does not claim biological consciousness, biological wisdom, complete prompt-injection defense, hostile multi-tenant hardening, or mature sandbox containment. Policy and risk gates improve review and control, but they are not a replacement for OS/container isolation.
- Long-running sessions across CLI, HTTP, socket, Telegram, QQ, WhatsApp, MCP, and ACP bridge clients.
- Actor-scoped identity for sessions, memory, tasks, audit data, transport bindings, and channel subscriptions.
- Event-sourced runtime state with SQLite WAL, externalized blobs, replay checkpoints, compaction boundaries, side-effect substitution, and replay digests.
- Durable memory with provenance, trust, owner actor, contradiction links, validity windows, usage outcomes, and graph relationships.
- RAG evidence that is cited, scoped, taint-aware, reranked, compressed, support-checked, and kept separate from durable memory.
- Tool execution with declared effects, risk policy, confirmation, preview, verification, commit records, receipts, and rollback posture.
- Plugin governance for process-isolated JSON tools and trusted native ABI extensions.
- ACP client support through configured external processes exposed by the acp_agent tool.
- Operator status, journal timelines, token usage and provider cache read/write tokens, policy simulation, replay, release gates, and dashboard surfaces.
- Protected runtime-home governance so prompt, config, and state evolution use checked runtime paths rather than ordinary file or script tools.
Cortex is not a hosted multi-tenant service. The current distribution is a daemon and Rust workspace for controlled operation of language-model behavior.
Cortex is intended for a trusted local machine, reviewed plugins, and explicit operator control.
| Use | Current guidance |
|---|---|
| Personal local coding or research | Recommended, with balanced or strict permissions. |
| Reviewed process plugins | Recommended when the manifest, signature, capabilities, and effects have been inspected. |
| Trusted native plugins | Treat as trusted in-process code, not as a sandboxed extension. |
| Unreviewed plugins, shared machines, or external side effects | Use conservative policies, confirmation, and narrow tool allowlists. |
| Hostile multi-tenant deployment | Not a current target. |
See Safe Use and Maturity and Production Notes before enabling broad tools, native plugins, messaging channels, or open permissions.
Prerequisites:
- Linux x86_64
- systemd
- one LLM provider key
curl -sSf https://raw.githubusercontent.com/by-scott/cortex/main/scripts/cortex.sh | \
CORTEX_API_KEY="your-key" \
CORTEX_PERMISSION_LEVEL="balanced" bash -s -- install

Manage the daemon:
cortex demo
cortex start
cortex status
cortex doctor
cortex restart
cortex stop

Use Cortex:
cortex # REPL
cortex "summarize this project" # one-shot turn
echo "data" | cortex "summarize" # pipe input
cortex --acp # ACP bridge for a running daemon
cortex --mcp-server # MCP server

See Quick Start for the full first-run path, or Local Coding Agent for the generated demo fixture.
From the outside, Cortex is one daemon-backed instance. Internally, the harness keeps authority boundaries strict.
| Responsibility | What it owns |
|---|---|
| Substrate | Durable state, journal, replay, memory, retrieval, policy, risk, scheduling, channels, provider adapters, and tool schemas. |
| Executive | The operating discipline that turns real runtime capability into model input: soul, identity, behavioral protocol, collaborator profile, runtime permission context, bootstrap/resume context, evidence, recalled memory, skills, hints, and tool-result wrappers. |
| Repertoire | Skills, learned procedures, execution traces, utility tracking, and hot-reloaded behavior libraries. |
The instance has a soul, but the soul is not a capability grant. It is the durable seed of autonomy, truth discipline, continuity, memory, metacognition, and collaboration. Runtime schemas still define what tools exist, what permissions apply, and what state is authoritative.
First use enters bootstrap. Bootstrap establishes the instance name or explicit unnamed state, collaborator profile, working posture, communication style, environment, autonomy boundaries, privacy constraints, and approval expectations. That evidence initializes prompt state so the next turn has real continuity.
Every turn is assembled with a provider-cache-friendly boundary. Durable prompt files (soul.md, identity.md, behavioral.md, user.md) and stable skill summaries form the prefix; runtime permission context closes the provider system prompt. Volatile material - bootstrap or resume context, active goals, retrieved evidence, recalled memory, reasoning state, metacognitive hints, message history, and tool results - stays in request-local context outside the system prompt. Tool schemas remain authoritative request metadata.
This keeps the stable prefix useful for provider caches without weakening authority. Prompt files guide posture, control, and continuity; they do not grant capabilities. Runtime schemas and policy state still decide what can run. Retrieved text, tool output, and recalled memory are evidence, not commands.
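The split between the stable prefix and request-local context can be sketched as follows. This is a minimal illustration, not Cortex's actual assembly code: `AssembledRequest`, `assemble`, and the section markers are hypothetical names chosen for the example.

```rust
/// Hypothetical request shape; the names are illustrative, not Cortex types.
#[derive(Debug, Default)]
pub struct AssembledRequest {
    /// Stable, provider-cache-friendly system prompt.
    pub system_prompt: String,
    /// Volatile, request-local context kept outside the system prompt.
    pub request_context: Vec<String>,
}

/// Build a request whose system prompt contains only stable material.
///
/// `durable_files` are (name, content) pairs such as soul.md, identity.md,
/// behavioral.md, and user.md. `permission_context` closes the prefix.
/// `volatile` holds evidence, recalled memory, history, and tool results.
pub fn assemble(
    durable_files: &[(&str, &str)],
    skill_summaries: &[&str],
    permission_context: &str,
    volatile: &[&str],
) -> AssembledRequest {
    let mut system_prompt = String::new();
    for (name, content) in durable_files {
        system_prompt.push_str(&format!("## {}\n{}\n", name, content));
    }
    for summary in skill_summaries {
        system_prompt.push_str(&format!("## skill\n{}\n", summary));
    }
    // Runtime permission context is the last section of the system prompt.
    system_prompt.push_str(&format!("## permissions\n{}\n", permission_context));

    AssembledRequest {
        system_prompt,
        // Volatile material never enters the cached prefix.
        request_context: volatile.iter().map(|s| s.to_string()).collect(),
    }
}
```

Two turns that differ only in retrieved evidence produce byte-identical system prompts in this sketch, which is the property a provider cache relies on.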
Self-evolution is evidence-bound. user.md may absorb stable collaborator facts; behavioral.md needs reusable workflow evidence; identity.md needs confirmed continuity or capability-boundary evidence; soul.md should change rarely. Runtime policy, temporary session state, tool inventories, and transient plans do not belong in durable prompts. Direct file or script edits to runtime-home prompt/config/state files are blocked from ordinary tool execution.
Cortex implements cognitive ideas as explicit software contracts:
- Global workspace: bounded foreground context with evidence admission and journaled broadcast.
- Working memory: typed entries with lane, utility, risk, volatility, taint, budget impact, admission decisions, and evictions.
- Complementary learning systems: fast capture through the journal, slower materialization, stabilization, contradiction handling, and consolidation.
- A ten-state turn machine governs idle, processing, tool wait, permission wait, human-input wait, compaction, consolidation, completion, interruption, and suspension.
- Three attention channels (Foreground, Maintenance, Emergency) schedule work with anti-starvation behavior.
- Five metacognitive detectors (DoomLoop, Duration, Fatigue, FrameAnchoring, HealthDegraded) monitor runtime health and trigger interventions.
- Decision under uncertainty records confidence, risk, reversibility, required evidence, rejected alternatives, and fallback plans.
- Agentic RAG is selected, scoped, reranked, cited, support-checked, taint-aware, and kept separate from durable memory.
These mechanisms are engineering models. Their value is that they are connected to runtime behavior and can be verified.
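As one illustration of an "explicit software contract", the ten turn states listed above could be encoded as a closed enum. The variant names follow the list; the type itself and the `is_waiting` grouping are assumptions made for this sketch, not the definitions in cortex-types.

```rust
/// Hypothetical encoding of the ten turn states named in the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TurnState {
    Idle,
    Processing,
    ToolWait,
    PermissionWait,
    HumanInputWait,
    Compaction,
    Consolidation,
    Completion,
    Interruption,
    Suspension,
}

impl TurnState {
    /// Every state, in the order the list gives them.
    pub const ALL: [TurnState; 10] = [
        TurnState::Idle,
        TurnState::Processing,
        TurnState::ToolWait,
        TurnState::PermissionWait,
        TurnState::HumanInputWait,
        TurnState::Compaction,
        TurnState::Consolidation,
        TurnState::Completion,
        TurnState::Interruption,
        TurnState::Suspension,
    ];

    /// Assumption: the three "wait" states block on something outside the model
    /// (a tool result, a permission decision, or human input).
    pub fn is_waiting(self) -> bool {
        matches!(
            self,
            TurnState::ToolWait | TurnState::PermissionWait | TurnState::HumanInputWait
        )
    }
}
```

A closed enum makes every `match` over turn states exhaustive, so adding an eleventh state becomes a compile-time change rather than a silent runtime gap.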
- The event journal currently records 84 event variants, including messages, turns, tools, permissions, replay checkpoints, externalized payloads, retrieval, workspace, guardrails, and scheduler events.
- Journaled turns and replay include compaction boundaries, side-effect substitution, and replay digests.
- Memory recall ranks candidates across six weighted dimensions (BM25, cosine similarity, recency, status, access frequency, graph connectivity).
- Goal state is actor-owned, SQLite-backed, exposed through checked goal/* JSON-RPC methods, and injected into active turn context as open goal lines.
- Model routing uses capability profiles for coding, long context, vision, tool use, JSON reliability, latency, cost, safety, and reasoning depth.
- Operator status reports daemon health, transports, sessions, bindings, tools, last-call context usage, provider cache read/write tokens, cumulative global/session token spend, backlog, memory activity, and tool success rates.
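The six-dimension memory recall ranking mentioned above can be read as a weighted sum. The sketch below assumes each signal is already normalized to the range 0.0 to 1.0 and that the weights are supplied by the caller; `RecallSignals`, `recall_score`, and `rank` are hypothetical names, and the real weighting in Cortex may differ.

```rust
/// Per-candidate signals, each assumed to be normalized to 0.0..=1.0.
#[derive(Debug, Clone, Copy)]
pub struct RecallSignals {
    pub bm25: f64,
    pub cosine: f64,
    pub recency: f64,
    pub status: f64,
    pub access_frequency: f64,
    pub graph_connectivity: f64,
}

impl RecallSignals {
    /// Signals in a fixed order that matches the weight array.
    fn as_array(&self) -> [f64; 6] {
        [
            self.bm25,
            self.cosine,
            self.recency,
            self.status,
            self.access_frequency,
            self.graph_connectivity,
        ]
    }
}

/// Weighted sum of the six signals. The weights are hypothetical inputs.
pub fn recall_score(signals: &RecallSignals, weights: &[f64; 6]) -> f64 {
    signals
        .as_array()
        .iter()
        .zip(weights.iter())
        .map(|(s, w)| s * w)
        .sum()
}

/// Return candidate indices sorted by descending score.
pub fn rank(candidates: &[RecallSignals], weights: &[f64; 6]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..candidates.len()).collect();
    order.sort_by(|&a, &b| {
        recall_score(&candidates[b], weights).total_cmp(&recall_score(&candidates[a], weights))
    });
    order
}
```

Keeping the weights as an explicit argument makes the ranking easy to replay and compare under different weightings.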
The default permission mode is balanced.
| Mode | Behavior |
|---|---|
| strict | Only Allow decisions run without confirmation. |
| balanced | Allow runs directly; Review and above require confirmation. |
| open | Non-blocking tools run without confirmation. Use only on a trusted single-user machine. |
cortex permission strict
cortex permission balanced
cortex permission open
cortex policy lint
cortex policy simulate deploy --effect deploy:production --actor user:alice

Unknown plugin and MCP tools are risk-scored conservatively and require confirmation by default. LLM-triggered plugin calls use the same registry, effect preview, permission gate, and approval path as built-in tools.
Process and script execution are broad escape surfaces, but paired channels are first-class operating surfaces, not reduced-capability shells. With protected runtime roots enabled, ordinary tools may read, write, build, test, and run scripts through the normal permission gate unless the invocation directly targets Cortex instance state such as prompts, config, sessions, journal, memory, or channel runtime files. Native plugin manifests describe package-level trust bounds; LLM permission checks use each tool descriptor's declared effects, so a broad native package does not make every read-only tool look like a process escape. Process-isolated plugin tools are still forced to declare RunProcess:plugin subprocess at load time even if a manifest underreports capabilities.
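The protected-root rule can be pictured as a path check that runs before an ordinary tool invocation is allowed. This is a minimal sketch under stated assumptions: the runtime-home location and the protected directory names are hypothetical, normalization is purely lexical, and symlinks are not resolved.

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically normalize a path: drop `.` and resolve `..` without touching
/// the filesystem. Symlinks are NOT resolved in this sketch.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Return true when `target` falls under one of the protected directories
/// inside `runtime_home`. The directory names are supplied by the caller
/// and are hypothetical here.
pub fn targets_protected_state(target: &Path, runtime_home: &Path, protected: &[&str]) -> bool {
    let target = normalize(target);
    let home = normalize(runtime_home);
    // `Path::starts_with` compares whole components, so `prompts-backup`
    // does not match the `prompts` prefix.
    protected
        .iter()
        .any(|dir| target.starts_with(home.join(dir)))
}
```

A lexical check like this only narrows the ordinary tool path; it does not replace OS or container isolation.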
Cortex separates retrieved evidence from durable memory.
Retrieval material enters corpora, becomes chunks, receives sparse and dense scores, passes actor and access filters, is reranked, compressed, cited, classified by evidence role, and inserted as inert evidence. Retrieved instructions cannot become runtime instructions. The dedicated retrieval crate is cortex-retrieval.
Memory is long-lived runtime state. It records owner actor, evidence, trust, status, contradiction links, validity windows, usage outcomes, and graph relationships. Memory can move from captured facts to stabilized beliefs only when evidence and contradiction rules allow it.
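The promotion from captured fact to stabilized belief can be sketched as a guarded state change. The rule used here (a minimum evidence count and no open contradictions) is an assumption for illustration; `MemoryRecord`, `MemoryStatus`, and `try_stabilize` are hypothetical names, and the actual rules live in the Cortex runtime.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemoryStatus {
    Captured,
    Stabilized,
}

/// Hypothetical, reduced view of a memory record.
#[derive(Debug, Clone)]
pub struct MemoryRecord {
    pub owner_actor: String,
    pub status: MemoryStatus,
    /// Number of independent evidence items supporting the record.
    pub evidence_count: usize,
    /// Contradiction links that have not been resolved.
    pub open_contradictions: usize,
}

/// Promote a captured record only when the (assumed) rules allow it.
/// Returns true when the status actually changed.
pub fn try_stabilize(record: &mut MemoryRecord, min_evidence: usize) -> bool {
    let allowed = record.status == MemoryStatus::Captured
        && record.evidence_count >= min_evidence
        && record.open_contradictions == 0;
    if allowed {
        record.status = MemoryStatus::Stabilized;
    }
    allowed
}
```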
| Interface | Surface |
|---|---|
| CLI | cortex, cortex demo, cortex start, cortex status, cortex doctor, cortex restart, cortex stop |
| HTTP | POST /api/turn/stream, operator status, health, metrics, and dashboard routes |
| JSON-RPC | Unix socket, WebSocket, stdio, HTTP, and actor-scoped session/memory/task/goal methods |
| Channels | Telegram, QQ, WhatsApp |
| MCP | cortex --mcp-server |
| ACP bridge | cortex --acp |
| ACP client | [acp].clients + acp_agent tool |
Actor identity is canonicalized across transports. A paired Telegram or QQ user can share the same actor without subscribing to unrelated sessions. Pairing does not create a session by itself; the first real message after approval reuses a visible session for the same actor or creates one when none exists.
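The pairing and session-reuse behavior described above can be sketched with two maps. Everything here is illustrative: `ActorDirectory`, its methods, and the integer session ids are hypothetical, and the sketch ignores session visibility and approval flow.

```rust
use std::collections::HashMap;

/// Hypothetical directory mapping transport identities to canonical actors.
#[derive(Debug, Default)]
pub struct ActorDirectory {
    /// (transport, external id) -> canonical actor
    bindings: HashMap<(String, String), String>,
    /// canonical actor -> session id
    sessions: HashMap<String, u64>,
    next_session_id: u64,
}

impl ActorDirectory {
    /// Pair a transport identity with an actor. Pairing creates no session.
    pub fn pair(&mut self, transport: &str, external_id: &str, actor: &str) {
        self.bindings.insert(
            (transport.to_string(), external_id.to_string()),
            actor.to_string(),
        );
    }

    /// Resolve a transport identity to its canonical actor, if paired.
    pub fn resolve(&self, transport: &str, external_id: &str) -> Option<&str> {
        self.bindings
            .get(&(transport.to_string(), external_id.to_string()))
            .map(String::as_str)
    }

    /// Session for an incoming message: reuse the actor's session when one
    /// exists, otherwise create it. Unpaired identities get `None`.
    pub fn session_for_message(&mut self, transport: &str, external_id: &str) -> Option<u64> {
        let actor = self.resolve(transport, external_id)?.to_string();
        if let Some(id) = self.sessions.get(&actor).copied() {
            return Some(id);
        }
        self.next_session_id += 1;
        let id = self.next_session_id;
        self.sessions.insert(actor, id);
        Some(id)
    }

    pub fn session_count(&self) -> usize {
        self.sessions.len()
    }
}
```

In this sketch, pairing `("telegram", "1001")` and `("qq", "77")` with `user:alice` and then sending a first message from each transport yields the same session id, and no session exists until that first message arrives.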
Cortex supports two plugin boundaries:
- Process JSON: the default external boundary. Tools are declared in manifest.toml and invoked as child processes over stdin/stdout JSON.
- Trusted native ABI: low-latency in-process extensions built with cortex-sdk and exported through cortex_plugin_init.
Process-isolated command implementation changes apply on the next tool invocation. Shared-library code changes still require a daemon restart.
Plugin manifests declare trust tier, requested capabilities, sandbox profile, package metadata, signatures, SBOM/risk-profile references, conformance state, and tool effects. Operators can inspect and test a plugin before install:
cortex plugin review <dir>
cortex plugin test <dir>
cortex plugin install <dir-or-package>

Packaged installs (.cpx, URL, or GitHub release name) require an Ed25519 package signature. The first verified package from a publisher key prompts the operator to trust that key locally; non-interactive installs can use --yes only after the source and fingerprint have been reviewed.
The companion development plugin is by-scott/cortex-plugin-dev. It is the official reference plugin for coding and project-maintenance workflows: file and search operations, code-symbol indexing, diagnostics, git/worktree tools, task coordination, Docker and process inspection, and release-oriented quality checks.
cortex plugin install by-scott/cortex-plugin-dev --yes

The Rust SDK is independent of Cortex internals. It does not depend on cortex-types, cortex-kernel, or any other workspace crate. The daemon converts SDK DTOs to internal runtime types at the boundary.
See Plugin Development Guide for process and native plugin workflows.
cortex-app CLI, installation, service commands, plugins, channels
cortex-runtime daemon, HTTP/socket/stdio RPC, sessions, channels, dashboard
cortex-turn turn orchestration, tools, skills, metacognition, context assembly
cortex-kernel journal, replay, memory, graph, prompts, config, audit
cortex-retrieval RAG corpora, chunking, hybrid retrieval, support verification
cortex-types events, state machine, config, trust, policy, security DTOs
cortex-sdk independent trusted native plugin SDK
The repository Docker environment is the release authority.
./scripts/gate.sh --docker

The gate uses this repository's docker-compose.yml dev service and Dockerfile, whose release toolchain base is rust:latest. Host cargo commands are useful for diagnosis, but they are not release proof.
Release validation requires:
- cargo fmt --all --check has no diff.
- cargo clippy runs for the workspace with -D warnings -W clippy::pedantic -W clippy::nursery and reports zero warnings.
- cargo test passes for the full workspace.
- Rust warning suppression attributes and compiler warning-suppression flags are forbidden.
- Documentation, package surface, secret/path, and release-asset checks pass.
- Quick Start
- Safe Use
- Policy Profiles
- Local Coding Agent
- Local Models
- Usage
- Configuration
- Executive
- Operations
- Agent Maintenance
- Release Evidence Template
- Plugin Conformance Template
- Prompt-Injection Corpus
- Actor Leakage Corpus
- Replay Migration Corpus
- Plugin Development
- Retrieval
- Maturity and Production Notes
- Testing
- Roadmap
Cortex is runtime infrastructure. Process JSON plugins are the recommended external extension boundary. Trusted native ABI plugins execute inside the daemon process and must be treated as trusted code.
Tool outputs are recorded as external untrusted input before they enter model history. Guardrails classify common prompt-injection, system-prompt leakage, role-override, and exfiltration patterns. Policy linting rejects unsafe combinations such as open permissions with unreviewed plugins, native plugins without explicit risk profiles, and automatic memory extraction from hostile evidence.
The project is designed to make these boundaries visible. It does not claim complete containment for hostile tenants, untrusted native code, or tools that mutate external systems.