Agentbox


A manifest-driven, reproducible runtime for sovereign software agents.


One TOML manifest. One Nix flake. One runtime contract.

Maintainer: John O'Hare · Upstream IP: Melvin Carvalho (JSS, DID:Nostr) · MAINTAINERS.md

Quickstart · Why Agentbox · Capabilities · Sovereign Architecture · Docs


What is Agentbox?

Agentbox is a hardened, fully reproducible Linux container environment built specifically to host, orchestrate, and trace autonomous AI agents.

Instead of juggling custom Dockerfiles, scattered API keys, and brittle dependency scripts, everything in Agentbox is driven by a single agentbox.toml manifest. You declare the agents you want, the tools they need — from browser automation to 3D rendering — and the storage backends they use. Agentbox builds a byte-for-byte reproducible image using Nix, spins up the environment, and automatically routes all agent actions through local privacy filters and cryptographic audit trails.

Why Agentbox?

Most agent runtimes are just a collection of tools with no provenance, privacy, or reproducible state. Agentbox is built differently:

  • 🚀 Batteries Included (via MCP): Out-of-the-box support for Claude Code, Codex, Gemini, DeepSeek, and ruflo. Instantly equip them with 90+ skills including Playwright, ComfyUI, QGIS, Blender, LaTeX, and Jupyter via the Model Context Protocol (MCP).
  • 🔒 Privacy by Default: An embedded openai/privacy-filter sidecar intercepts every agent action, ensuring PII and secrets are redacted before any data hits your memory or logs.
  • 🛡️ Hardened & Reproducible: Built with Nix flakes — zero mutable npm install steps at runtime. Runs as non-root with a read-only filesystem and all capabilities dropped by default.
  • 🔗 Sovereign Data & Auditability: Agents own their data cryptographically. Every generated file, memory, and action is stamped with a did:nostr identity and stored in an embedded Solid Pod (solid-pod-rs). See The Sovereign Data Stack.
  • 🔌 Pluggable Adapters: Run entirely standalone on a laptop (SQLite + local JSONL), or effortlessly federate into a cloud mesh (Postgres pgvector + HTTP event sinks) by flipping a TOML switch.

Quickstart

Interactive onboarding (recommended)

Use the browser-based setup wizard to configure your manifest, select your tools, and boot the container:

```sh
git clone https://github.com/DreamLab-AI/agentbox.git
cd agentbox
./scripts/start-agentbox.sh
```

The wizard opens in your default browser — no dependencies beyond Python 3 (for the local HTTP server). It renders all agentbox.toml sections with schema-validated form controls and the DreamLab glassmorphism design system. Pass --tui to use the legacy terminal wizard instead.

Setup Wizard
Browser-based configuration wizard with schema-driven form controls

Fast path (pre-built image)

```sh
export AGENTBOX_IMAGE_REF=ghcr.io/dreamlab-ai/agentbox:latest
docker pull "$AGENTBOX_IMAGE_REF"
./agentbox.sh up --registry
./agentbox.sh health
./agentbox.sh shell
```

Build from source

```sh
git clone https://github.com/DreamLab-AI/agentbox.git
cd agentbox
./scripts/agentbox-config-validate.sh
./agentbox.sh up --build
./agentbox.sh health
```

Next steps:


Included Capabilities

Your agentbox.toml manifest toggles capabilities on or off. Disabled features add zero bloat to your final image.

| Category | Highlights |
|---|---|
| Agent toolchains | claude-code, ruflo, antigravity (agy), agentic-qe, openai-codex |
| Consultants | Meta-router for named external consultations: DeepSeek, Perplexity, Z.AI, Antigravity |
| Browser and web | External browsercontainer sidecar (chrome-devtools-mcp, Chrome Beta 149+, GPU-accelerated) |
| Media and design | Local ComfyUI (or external URL), ImageMagick, FFmpeg |
| Spatial and 3D | QGIS geospatial analysis, Blender modelling, 3D Gaussian Splatting |
| Data science and docs | PyTorch, Jupyter Lab, LaTeX, Mermaid rendering |
| Code-as-Harness | Persistent Python kernel MCP, ExpeL post-task lesson distillation, Voyager verified-skill library, SWE-agent ACI MCP, execution-gated tree-search (PRD-008) |
| Governance | Agent Control Surface Protocol (kinds 31400-31405): cross-repo human-in-the-loop integration with the DreamLab forum and VisionClaw's BrokerActor via the embedded relay |
| Operations | OTLP tracing, Prometheus metrics (:9191/metrics), Tailscale VPN integration |

Code-as-Harness (PRD-008)

A persistent IPython kernel MCP exposes six tools (kernel.exec, kernel.list_vars, kernel.inspect, kernel.reset, kernel.interrupt, kernel.install_pkg) so that variable state, imported modules, and computed DataFrames survive across tool calls within a session. The remaining surfaces are:

  • ExpeL lesson distillation: a post-task hook distils completed trajectories into reusable DistilledLesson records in RuVector.
  • Voyager verified-skill library: accumulates assertion-passing Python functions for retrieval and injection at future task start.
  • SWE-agent-style ACI MCP: provides bounded file viewing, compact-diff editing, budget-capped search, structured test execution, and task submission for autonomous repo-level bug-fixing.
  • Execution-gated tree search: a skill generates N candidates, executes each in a fresh kernel session, and scores by assertion-pass rate.
  • Multi-tier memory: uses OWL2-typed RuVector namespaces (semantic / procedural / episodic) with no schema changes.

All records carry did:nostr identity and PROV-O action receipts. Phase 1 surfaces (code_interpreter, codeact, expel_lesson_extraction) are opt-in; Phase 2 surfaces (voyager_skill_library, aci_shell, tree_search_coder) are scaffolded and default off. See docs/developer/code-as-harness.md for the operator guide.
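The execution-gated selection step (generate N candidates, run each in isolation, score by assertion-pass rate) can be illustrated with a minimal Python sketch. It is not the Agentbox skill itself: `pass_rate` and `select_best` are hypothetical names, and a fresh `exec` namespace stands in for a fresh kernel session. Plain `exec` is not a sandbox, so this form is only suitable for trusted code.

```python
from typing import List, Tuple


def pass_rate(candidate_src: str, checks: List[str]) -> float:
    """Run one candidate in a fresh namespace; return the fraction of checks that pass."""
    namespace: dict = {}  # fresh state per candidate (stand-in for a fresh kernel session)
    try:
        exec(candidate_src, namespace)
    except Exception:
        return 0.0  # the candidate does not even load
    if not checks:
        return 0.0
    passed = 0
    for check in checks:
        try:
            exec(check, namespace)  # each check is an `assert ...` statement
            passed += 1
        except Exception:
            pass  # a failing or crashing check simply does not count
    return passed / len(checks)


def select_best(candidates: List[str], checks: List[str]) -> Tuple[str, float]:
    """Score every candidate and return the best one (ties go to the earliest)."""
    if not candidates:
        raise ValueError("no candidates to score")
    scored = [(pass_rate(src, checks), i, src) for i, src in enumerate(candidates)]
    best_score, _, best_src = max(scored, key=lambda t: (t[0], -t[1]))
    return best_src, best_score


if __name__ == "__main__":
    candidates = [
        "def add(a, b):\n    return a - b\n",  # buggy candidate
        "def add(a, b):\n    return a + b\n",  # correct candidate
    ]
    checks = ["assert add(1, 2) == 3", "assert add(0, 0) == 0"]
    print(select_best(candidates, checks))
```

Scoring by pass rate rather than all-or-nothing keeps partially correct candidates comparable, which is what makes the result usable as a search signal.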


The Sovereign Data Stack

The core differentiator of Agentbox is the Identity and Tracing Mesh.

When an agent acts, how do you know which agent did it? How do you prove it later? Without an identity root, audit logs are meaningless.

Agentbox solves this by generating a BIP-340 secp256k1 keypair at bootstrap. The agent's public key becomes a did:nostr:<hex-pubkey> identity. Every resource, action, and event in the system is rooted in this cryptographic identity.
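As a rough illustration of the mapping from public key to DID, the sketch below formats and validates a `did:nostr` identifier. Key generation is left out because the Python standard library has no secp256k1 support. The helper names are hypothetical, and the lowercase-hex requirement is an assumption based on the usual Nostr convention rather than something this README specifies.

```python
import re

# Assumption: the x-only public key is written as 64 lowercase hex characters.
_HEX_PUBKEY = re.compile(r"[0-9a-f]{64}")
_PREFIX = "did:nostr:"


def did_from_pubkey(hex_pubkey: str) -> str:
    """Return the did:nostr identifier for a 64-character hex public key."""
    if not _HEX_PUBKEY.fullmatch(hex_pubkey):
        raise ValueError("expected a 64-character lowercase hex public key")
    return _PREFIX + hex_pubkey


def pubkey_from_did(did: str) -> str:
    """Extract the hex public key from a did:nostr identifier."""
    if not did.startswith(_PREFIX):
        raise ValueError("not a did:nostr identifier")
    hex_pubkey = did[len(_PREFIX):]
    did_from_pubkey(hex_pubkey)  # reuse the validation above
    return hex_pubkey
```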

From that single root, 18 kinds of urn:agentbox:<kind>:[<scope>:]<local> identifiers name every entity: pods, credentials, receipts, activities, events, memories, skills, architecture docs, and more. Owner-scoped kinds embed the hex pubkey — urn:agentbox:credential:<hex-pubkey>:<sha256-12-…> means that credential was issued by that agent and no other. Content-addressed kinds are deterministic: the same payload always produces the same URN, so re-emitting never double-counts and signed credentials keep a stable @id across JCS canonicalisation.
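The determinism property can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Agentbox minting code: `mint` and `content_tag` are hypothetical names, `sha256-12` is assumed to mean the first 12 hex characters of a SHA-256 digest, and sorted-key compact JSON stands in for full JCS canonicalisation.

```python
import hashlib
import json
from typing import Optional


def canonical_bytes(payload: dict) -> bytes:
    # Approximation of JCS: sorted keys and no insignificant whitespace.
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def content_tag(payload: dict) -> str:
    # Assumption: "sha256-12" is the first 12 hex characters of the SHA-256 digest.
    digest = hashlib.sha256(canonical_bytes(payload)).hexdigest()
    return "sha256-12-" + digest[:12]


def mint(kind: str, payload: dict, pubkey: Optional[str] = None) -> str:
    """Build urn:agentbox:<kind>:[<scope>:]<local> for a content-addressed kind."""
    parts = ["urn", "agentbox", kind]
    if pubkey is not None:
        parts.append(pubkey)  # owner-scoped kinds embed the hex pubkey
    parts.append(content_tag(payload))
    return ":".join(parts)


if __name__ == "__main__":
    owner = "ab" * 32  # placeholder pubkey, for illustration only
    first = mint("credential", {"role": "reviewer", "level": 2}, owner)
    second = mint("credential", {"level": 2, "role": "reviewer"}, owner)
    print(first == second)  # key order does not change the URN
```

Because the tag depends only on the canonical bytes, emitting the same payload twice yields the same identifier, which is the basis for the "never double-counts" behaviour described above.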

Identity root diagram
```mermaid
flowchart TB
    KP[secp256k1 keypair\nBIP-340 x-only]
    HEX[64-char hex pubkey]
    DID[did:nostr:hex-pubkey\nPrimary agent DID]
    KP --> HEX
    HEX --> DID

    subgraph identity["Identity surfaces"]
        POD_ID[Solid pod identity\nWAC agent field]
        RELAY_ID[Nostr relay NIP-42\nNIP-98 HTTP auth]
        DID_DOC[DID Document\nGET /.well-known/did.json]
    end

    DID --> POD_ID
    DID --> RELAY_ID
    DID --> DID_DOC

    subgraph owned["Owner-scoped URNs — hex pubkey in scope"]
        CRED[urn:agentbox:credential\nhex-pubkey:sha256-12-...]
        RECEIPT[urn:agentbox:receipt\nhex-pubkey:sha256-12-...]
        ACTIVITY[urn:agentbox:activity\nhex-pubkey:sha256-12-...]
        BEAD[urn:agentbox:bead\nhex-pubkey:local-id]
        EVENT[urn:agentbox:event\nhex-pubkey:sha256-12-...]
        MANDATE[urn:agentbox:mandate\nhex-pubkey:sha256-12-...]
        AGENT[urn:agentbox:agent\nhex-pubkey:sha256-12-...]
        ENVELOP[urn:agentbox:envelope\nhex-pubkey:sha256-12-...]
    end

    DID --> CRED
    DID --> RECEIPT
    DID --> ACTIVITY
    DID --> BEAD
    DID --> EVENT
    DID --> MANDATE
    DID --> AGENT
    DID --> ENVELOP
```
Request lifecycle and adapter dispatch pipeline

Every request through the management API follows a rigorous lifecycle: identity verification → adapter routing → privacy redaction → JSON-LD encoding → OTLP tracing.
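The middle of that lifecycle (redact, mint, store, record an activity) can be sketched in Python with in-memory stand-ins. Everything here is illustrative: `write_resource`, the `activity` and `body` response keys, and the email regex are assumptions, a regex is used in place of the real privacy-filter sidecar, and NIP-98 verification and OTLP spans are omitted.

```python
import hashlib
import json
import re

# Stand-in for the privacy filter: masks email-like strings only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(payload: dict) -> dict:
    """Mask email-like text in top-level string values."""
    return {
        key: EMAIL.sub("[REDACTED]", value) if isinstance(value, str) else value
        for key, value in payload.items()
    }


def mint(kind: str, pubkey: str, payload: dict) -> str:
    """Hypothetical URN minting: first 12 hex chars of SHA-256 over sorted-key JSON."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return f"urn:agentbox:{kind}:{pubkey}:sha256-12-{hashlib.sha256(body).hexdigest()[:12]}"


def write_resource(store: dict, pubkey: str, payload: dict) -> dict:
    """Redact first, then mint and persist, so raw data never reaches storage."""
    clean = redact(payload)                    # 1. privacy redaction
    pod_urn = mint("pod", pubkey, clean)       # 2. URN of the redacted payload
    store[pod_urn] = clean                     # 3. persist (a dict stands in for the pod)
    activity_urn = mint(                       # 4. record what was done
        "activity", pubkey, {"action": "write", "object": pod_urn}
    )
    return {"@id": pod_urn, "activity": activity_urn, "body": clean}
```

Minting after redaction means the identifier is derived from what is actually stored, so repeating the same write produces the same `@id` rather than a second record.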

```mermaid
sequenceDiagram
    participant AG as Agent did:nostr:hex
    participant MA as management-api
    participant AR as adapter resolver
    participant PF as privacy filter
    participant UM as uris.mint
    participant PO as solid-pod-rs
    participant OT as OTLP exporter

    AG->>MA: POST /v1/pods/:id/resources NIP-98 signed
    MA->>OT: span open agentbox.adapter.pods.write
    MA->>AR: resolve slot=pods
    AR->>PF: write(slot=pods payload=data)
    PF->>PF: policy=strict redact via opf-router
    PF-->>AR: redacted payload
    AR->>UM: mint kind=pod pubkey=hex payload=redacted
    UM-->>AR: urn:agentbox:pod:hex:sha256-12-abc
    AR->>PO: PUT resource atomic rename
    PO-->>AR: 201 ETag
    AR->>UM: mint kind=activity pubkey=hex action=write
    UM-->>AR: urn:agentbox:activity:hex:sha256-12-def
    MA->>OT: span close resource-urn=urn:agentbox:pod:...
    MA-->>AG: 201 JSON-LD @id=urn:agentbox:pod:hex:sha256-12-abc
```
Full URN kind taxonomy (18 kinds)
```mermaid
flowchart LR
    subgraph identity_k["Identity"]
        POD_K[pod]
        AGENT_K[agent]
    end

    subgraph comms["Communications"]
        ENVELOPE_K[envelope]
        EVENT_K[event]
        RECEIPT_K[receipt]
    end

    subgraph state["Durable state"]
        BEAD_K[bead]
        MEMORY_K[memory]
        DATASET_K[dataset]
        THING_K[thing]
    end

    subgraph auth["Auth and trust"]
        CRED_K[credential]
        MANDATE_K[mandate]
        MCP_K[mcp]
    end

    subgraph trace["Tracing"]
        ACTIVITY_K[activity]
        SKILL_K[skill]
    end

    subgraph docs["Governance docs"]
        ADR_K[adr]
        PRD_K[prd]
        DDD_K[ddd]
        META_K[meta]
    end
```
| Kind | Owner-scoped | Content-addressed | Example URN |
|---|---|---|---|
| pod | yes | yes | urn:agentbox:pod:hex:sha256-12-abc |
| envelope | yes | yes | urn:agentbox:envelope:hex:sha256-12-abc |
| credential | yes | yes | urn:agentbox:credential:hex:sha256-12-abc |
| mandate | yes | yes | urn:agentbox:mandate:hex:sha256-12-abc |
| receipt | yes | yes | urn:agentbox:receipt:hex:sha256-12-abc |
| activity | yes | yes | urn:agentbox:activity:hex:sha256-12-abc |
| event | yes | yes | urn:agentbox:event:hex:sha256-12-abc |
| bead | yes | no | urn:agentbox:bead:hex:local-id |
| agent | yes | yes | urn:agentbox:agent:hex:sha256-12-abc |
| mcp | no | no | urn:agentbox:mcp:server-slug |
| memory | no | no | urn:agentbox:memory:name |
| skill | no | no | urn:agentbox:skill:slug |
| dataset | no | yes | urn:agentbox:dataset:sha256-12-abc |
| thing | no | yes | urn:agentbox:thing:sha256-12-abc |
| adr | no | no | urn:agentbox:adr:ADR-013 |
| prd | no | no | urn:agentbox:prd:PRD-006 |
| ddd | no | no | urn:agentbox:ddd:DDD-004 |
| meta | no | no | urn:agentbox:meta:slug |

Because Agentbox uses canonical URIs and Linked Data (JSON-LD), you can spin up the built-in Linked-Data browser at /lo/* to navigate the graph of your agent's memories, architectural decisions, and credentials. The /v1/uri/<urn> resolver maps any URN to its current HTTP representation.
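To make the grammar concrete, here is a small Python sketch that splits a URN into kind, scope, and local part using the owner-scoped column of the table, then builds a resolver URL. The function names are hypothetical, the local part is assumed to contain no colons, and both the base URL and the percent-encoding of the URN in the path are assumptions, since the resolver's exact expectations are not stated here.

```python
from typing import NamedTuple, Optional
from urllib.parse import quote

# Taken from the taxonomy table: which kinds carry the owner's hex pubkey as scope.
OWNER_SCOPED = {"pod", "envelope", "credential", "mandate", "receipt",
                "activity", "event", "bead", "agent"}
UNSCOPED = {"mcp", "memory", "skill", "dataset", "thing",
            "adr", "prd", "ddd", "meta"}


class AgentboxUrn(NamedTuple):
    kind: str
    scope: Optional[str]  # hex pubkey for owner-scoped kinds, otherwise None
    local: str


def parse_urn(urn: str) -> AgentboxUrn:
    """Parse urn:agentbox:<kind>:[<scope>:]<local>."""
    parts = urn.split(":")
    if len(parts) < 4 or parts[0] != "urn" or parts[1] != "agentbox":
        raise ValueError(f"not an agentbox URN: {urn!r}")
    kind, rest = parts[2], parts[3:]
    if not all(rest):
        raise ValueError(f"empty component in URN: {urn!r}")
    if kind in OWNER_SCOPED and len(rest) == 2:
        return AgentboxUrn(kind, rest[0], rest[1])
    if kind in UNSCOPED and len(rest) == 1:
        return AgentboxUrn(kind, None, rest[0])
    raise ValueError(f"unknown kind or wrong number of components: {urn!r}")


def resolver_url(base_url: str, urn: str) -> str:
    """Build a /v1/uri/<urn> URL; percent-encoding the URN is an assumption."""
    parse_urn(urn)  # reject malformed URNs before building the URL
    return f"{base_url.rstrip('/')}/v1/uri/{quote(urn, safe='')}"
```

For example, `parse_urn("urn:agentbox:bead:hex:local-id")` yields kind `bead`, scope `hex`, and local `local-id`, while an `adr` URN has no scope.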

Deeper reading:


Federation Transports

Agentbox participates in all three DreamLab federation transport strata. Each stratum is independently enabled via agentbox.toml and .env configuration.

```mermaid
graph LR
    subgraph "This Agentbox"
        TS["Tailscale\nuserspace-networking"]
        NR["nostr-rs-relay\n:7777"]
        MA["management-api\n:8080"]
    end

    TS <-->|"WireGuard\nMagicDNS"| OTHER["Other Agentboxes\nsolid-pod-rs hosts"]
    NR <-->|"NIP-01 WS"| RELAY["Private/Public\nNostr Relays"]
    MA -->|"CF Tunnel\nHTTPS"| CF["Cloudflare Edge"]
```

Stratum 1 — Tailscale (Private Mesh)

Each agentbox container joins the tailnet with its own identity using --tun=userspace-networking (no /dev/net/tun needed). The container's MagicDNS hostname (configured via [networking].hostname in agentbox.toml) becomes the service discovery address for other mesh participants.

```toml
# agentbox.toml
[networking]
tailscale = true
hostname = "agentbox-london"
```

```sh
# .env
TAILSCALE_AUTHKEY=tskey-auth-...
```

Security: Tailscale runs inside the container, isolated from host networking. Tailscale ACLs control access — did:nostr signatures are not evaluated at this layer.

Stratum 2 — Nostr Relays (All Components)

The embedded nostr-rs-relay (:7777) serves as both a local event store and a mesh relay. Peer relays are configured in agentbox.toml:

```toml
[sovereign_mesh.mesh]
peer_relays = [
    "ws://agentbox-paris.tailnet-name.ts.net:7777",   # Tailscale peer
    "wss://relay.damus.io",                             # Public relay
]
```

Relay connections are authenticated with NIP-42 and HTTP requests with NIP-98; both rely on did:nostr Schnorr signatures. Private relays keep governance events (kinds 31400-31405) within the organisation. Public relays provide censorship-resistant message passing when private infrastructure is unavailable.

Stratum 3 — Cloudflare Tunnels (Edge ↔ Local)

A Cloudflare tunnel exposes the management API and solid-pod-rs to CF Workers services (nostr-rust-forum, dreamlab-ai-website) without opening ports to the public internet. Configure via:

```sh
# .env
CLOUDFLARE_TUNNEL_TOKEN=eyJ...
AGENTBOX_PUBLIC_URL=https://pods-native.dreamlab-ai.com
```

CF Workers reach the local agentbox through the tunnel for pod provisioning, resource access, and NIP-05 federated resolution.

See Tailscale guide · Mesh deployment · Identity mesh


Documentation

For operators

For sovereign data and linked data

For developers

Canonical specs


Platforms

| Target | Build | Run | Notes |
|---|---|---|---|
| Linux x86_64 | Native | Native | Full support, richest local feature set |
| Linux aarch64 | Native | Native | Supported, subject to feature-specific gates |
| macOS | Compose/dev tooling | Docker Desktop/OrbStack/Colima | CPU or remote-GPU paths |
| Windows | Compose/dev tooling | Docker Desktop + WSL2 | WSL2 is the practical path |
| Remote Linux | Native or registry | Native | OCI/Fly/Hetzner/bare workflows supported |

Contributing

  1. Read docs/developer/architecture.md.
  2. Validate the manifest before changing build or runtime behavior.
  3. Prefer manifest-gated additions over ad hoc runtime mutation.
  4. Treat hardening, probe semantics, URI grammar, and linked-data surfaces as architectural changes — propose them via an ADR.

Part of VisionFlow

Agentbox is the harness engineering substrate of the VisionFlow coordination platform — a federated architecture for human–AI intelligence built on did:nostr identity, OWL 2 EL reasoning, and Nostr message passing.

| Substrate | Repository | Role |
|---|---|---|
| VisionFlow | DreamLab-AI/VisionFlow | Ecosystem guide and coordination architecture |
| VisionClaw | DreamLab-AI/VisionClaw | Knowledge engineering: OWL 2 EL, 92 CUDA kernels, XR |
| Agentbox | DreamLab-AI/agentbox | Harness engineering: Nix, 90+ skills, sovereign pods |
| solid-pod-rs | DreamLab-AI/solid-pod-rs | Cryptographic foundation: JSS Rust port, DID:Nostr |
| nostr-rust-forum | DreamLab-AI/nostr-rust-forum | Forum kit: passkey auth, governance events |
| dreamlab-ai-website | DreamLab-AI/dreamlab-ai-website | Branded deployment: React, WASM, Cloudflare Workers |

Deeper reading: Ecosystem integration guide


License

Core project: AGPL-3.0.

Using agentbox as a hosted service — including running it on behalf of other users — requires you to make the full source (including any modifications) available to those users. Self-hosted and internal use carry no additional obligations beyond the standard copyleft terms.

Optional components (linkedobjects/browser, solid-pod-rs) are also AGPL-3.0 and therefore consistent with the project license. Other bundled components are MIT or Apache-2.0. See Licensing details for the full matrix.

