Describe what you want to build. A single ReAct agent plans it, writes the code, runs commands, and iterates — live, in Docker, in your browser. Supports OpenAI, Anthropic, Claude, Ollama, and any OpenAI-compatible local model.
If you find this project useful, please consider giving it a star! It helps others discover the project and motivates continued development.
- Why Obsidian WebDev?
- Features
- Single ReAct Agent
- Multi-Provider AI Support
- Tool Set
- Human-in-the-Loop (HITL)
- Agent-Initiated Clarification
- Parallel Tool Execution
- Automatic Context Management
- File Summarization
- Session Memory
- Live Workspace
- Project Isolation via Docker
- Project Templates
- Project Import
- Git Integration
- Secrets Vault
- User Preferences
- Security & Authentication
- Framework-Aware Agent Skills
- Architecture
- Quick Start
- Tech Stack
- Project Structure
- Roadmap
- Contributing
- License
Most AI coding tools are plugins that edit files in your local IDE. Obsidian WebDev goes further — it gives each project a completely isolated Docker container with a full Node.js and Python runtime, and a single autonomous agent that can write code, install packages, run build commands, fix errors, search the web, and iterate until the task is done.
- No environment setup — Every project runs in a fresh Docker container. Node 22, Python 3.12, uv, bun, git, and tmux are pre-installed. No conflicting versions, no polluted global state.
- Truly autonomous — The agent doesn't just autocomplete. It reads files, runs bash commands, inspects the output, fixes errors, and loops — just like a developer working in a terminal.
- Full visibility — Every tool call is shown inline in the chat. You see exactly which files the agent read, which commands it ran, and what they returned. Nothing happens behind the scenes.
- You stay in control — Destructive tool calls (writes, bash, installs) require your approval by default. Flip to "auto" mode when you trust the agent to run freely, or require approval for specific patterns always.
- No vendor lock-in — Switch between OpenAI, Anthropic, Ollama, and LM Studio per-project without changing any configuration. Your API keys are encrypted in the vault.
- Self-hosted & open-source — Run entirely on your own infrastructure. Your code and conversation history never leave your servers.
Every project is powered by a single ReAct (Reason + Act) agent loop. On each turn the agent receives the full conversation history plus a system prompt containing the project name, framework, and build standards. It reasons, decides which tools to call, executes them in parallel, appends the results to history, and loops — up to 50 iterations — until the task is done or there are no more tool calls to make.
- Reasoning before action — The agent produces a text response explaining its thinking before every tool call. You see the reasoning streamed live in the chat before tools execute.
- Streaming responses — Text tokens are streamed token-by-token via WebSocket as the LLM generates them. Tool call events, results, and status changes are pushed as separate typed events.
- Multi-turn context — Conversation history is preserved in MongoDB and restored on reconnect. The agent remembers all prior work across browser refreshes and WebSocket reconnections.
- Up to 50 iterations — A configurable maximum loop count prevents runaway tasks. The agent signals
donewhen it decides no further action is needed. - Stop at any time — Send a
stopmessage from the frontend. The agent'sasyncio.Eventis set, the current tool finishes (never cut mid-execution), and astoppedevent is returned.
Connect to any major LLM provider. Switch models per-project without changing any agent code. API keys are stored encrypted in the vault.
| Provider | Models | Type |
|---|---|---|
| OpenAI | GPT-4.1, GPT-4.1-mini, o4-mini | Cloud |
| Anthropic | Claude Opus 4.6, Claude Sonnet 4.6, Claude Haiku 4.5 | Cloud |
| Ollama | Any local model (Llama, Qwen, Mistral, DeepSeek…) | Local |
| LM Studio | Any local model via OpenAI-compatible endpoint | Local |
- Unified streaming interface — Anthropic and OpenAI use different streaming APIs (content blocks vs. delta chunks). The agent normalises both into the same internal event stream.
- Reasoning model support — OpenAI o-series models (o4-mini) are handled specially:
stream_options={"include_usage": True}is passed to capture actual token counts from the streaming response. - Accurate token tracking — Token counts come from the actual API response (
usage.input_tokensfor Anthropic,usage.prompt_tokensfor OpenAI) rather than character estimates. This drives the context management thresholds precisely. - Model-aware context limits — Each model has a known context window (200K for Claude, 1M for GPT-4.1, 8K default for local models). Pruning and compaction thresholds are calculated against the actual limit for each model.
- Per-project model selection — Set the provider and model when creating a project. Change it on any subsequent chat message without restarting the agent session.
The agent has ten built-in tools. Each tool has a permission tier that controls when it requires human approval.
| Tool | Tier | Description |
|---|---|---|
read_file(path) |
auto | Read a file from the project workspace |
write_file(path, content) |
ask | Create or overwrite a file |
edit_file(path, old_string, new_string) |
ask | Surgically replace a unique string in a file |
bash(command) |
ask | Run a shell command inside the Docker container |
glob(pattern) |
auto | Find files matching a glob pattern |
grep(pattern, path) |
auto | Search file contents by regex |
web_fetch(url) |
auto | Fetch and strip a web page |
web_search(query) |
auto | Search the web (Tavily primary, DuckDuckGo fallback) |
list_files_brief() |
auto | List all project files with AI-generated one-line summaries |
ask_user(question) |
auto | Ask the user a clarification question and suspend until answered |
Permission tiers explained:
auto— Executes immediately without user interaction. Used for read-only and non-destructive tools.ask— Requires explicit user approval when the session is inaskmode (the default). Inautomode, executes immediately.always— Always requires approval regardless of mode. Applied automatically to bash commands containing destructive patterns (rm -rf,DROP TABLE,git push --force,mkfs.*, etc.).
Tool output limits — All tool outputs are truncated to prevent context overflow. Bash output is truncated to the first + last N lines. File reads are similarly head+tail truncated. Web fetches are limited by character count. All limits are configurable via user preferences.
The agent suspends before executing ask-tier tools and sends a tool_approval_request event to the frontend. A ToolApprovalCard appears inline in the chat with the tool name, full parameters, and Approve / Deny buttons. The agent's asyncio.Future is resolved when the user responds.
- Inline approval cards — Each approval request is rendered as an interactive card in the chat thread. The card shows the tool name, formatted parameters (collapsible), and Approve / Deny buttons.
- Approved / denied state — After the user responds, the card updates to a non-interactive read-only state showing the outcome (green for approved, red for denied).
- Approve All — A "Approve all" button sends
set_permission_mode: "auto"and approves the current request in one click. All subsequent tool calls in the session run automatically. - Denied tool handling — If the user denies a tool call, the agent receives a denial message as the tool result and can adapt its approach — try a different strategy, ask for clarification, or stop.
- 300-second timeout — Approvals that receive no response within 5 minutes are automatically denied. The agent receives a timeout message and continues.
- Permission mode persistence — The
ask/automode set by "Approve All" persists for the entire session. Sending a new chat message no longer resets it to the default.
The agent can pause mid-task to ask you a clarification question using the ask_user tool. The loop suspends, a ClarificationCard appears inline in the chat, and execution resumes the moment you submit an answer.
- Suspend-and-resume — The agent's
asyncio.Futureis held until theclarification_responseWebSocket message arrives. No polling; zero CPU while waiting. - Inline question card — The
ClarificationCardshows the agent's question with a text input and a submit button. Press Enter or click the button to respond. - Answered state — After submission, the card transitions to a read-only "Agent asked / your answer" view that persists in the conversation history.
- Opt-in usage — The agent is instructed to make reasonable assumptions and only use
ask_userwhen genuinely blocked (e.g. "Should I use PostgreSQL or SQLite?" when neither is specified). It does not ask for things it can decide itself. - 300-second timeout — If no answer arrives within 5 minutes, the agent continues with a best-assumption response rather than hanging indefinitely.
- Stop signal respected — Pressing Stop while a clarification is pending immediately resolves the future with
"(agent stopped)"and terminates the loop cleanly.
When the LLM decides to call multiple tools in a single turn, they execute concurrently via asyncio.gather rather than sequentially. Reading three files or running two grep searches happens in parallel, cutting multi-tool turns from O(n × tool_latency) to O(max(tool_latency)).
- Full concurrency — All tool calls in a single LLM response are dispatched simultaneously. The results are collected and appended to history in the original declaration order.
- Pre-filled result map — Before
asyncio.gatheris called, everytool_useID is pre-filled with"Interrupted.". If aCancelledErrorfires mid-gather, every tool block already has a matching result, preventing Anthropic API 400 errors from mismatched tool_use/tool_result pairs. - Per-tool error isolation — If one tool raises an exception, its result is set to the error message. Other tools in the same batch continue to completion unaffected.
- HITL approval in parallel — Multiple approval requests fire concurrently. The user sees all pending cards at once and can approve or deny them in any order.
Large codebases and long build sessions generate histories that exceed any model's context window. Obsidian WebDev uses a two-tier approach to keep the agent running without losing important context.
Tier 1 — Lite prune (60% threshold):
When the conversation history reaches 60% of the model's context limit, old tool results are truncated in-place to 500 characters. The tool calls, file writes, and agent reasoning are preserved; only the verbose output is shortened. The agent continues without interruption and without discarding any history.
Tier 2 — Full compaction (80% threshold, configurable):
When the history reaches 80% of the context limit, the agent triggers a full compaction:
- A separate non-streaming LLM call (
_internal_llm_call) summarises the conversation up to that point. - The history is replaced with a
[Conversation Summary]block containing the summary, followed by the last 8 messages verbatim. - A
compactingevent is sent to the frontend, which renders a brief indicator in the chat. - The loop continues with the compressed history.
- Accurate token counts — Both thresholds are calculated against actual
usage.input_tokensfrom the last API response, not character estimates. This prevents premature compaction on small models and late compaction on large ones. - Configurable thresholds — The compaction trigger percentage is a per-user preference (Settings → Agent). Reduce it to compact more aggressively on small models; increase it to keep more history on large-context models.
- Preserved recent context — The last 8 messages are always kept verbatim after compaction, ensuring the agent retains the most recent instructions and file edits regardless of history length.
After every write_file or edit_file call, a background task generates a one-line AI summary of the modified file and stores it in MongoDB. The list_files_brief tool returns all summaries for the project, giving the agent a map of the entire codebase in a single tool call.
- Background generation — Summaries are generated as
asyncio.create_task(fire-and-forget). The agent loop is never blocked waiting for a summary to be written. - Persistent storage — Summaries are stored in
ProjectFileSummaryCollectionkeyed by(project_id, path). They survive browser refreshes and agent session restarts. - Self-healing — Files written before the feature was enabled don't have summaries.
list_files_brieffalls back gracefully for unsummarised files, showing just the path. - Codebase navigation — On complex existing projects, the agent calls
list_files_brieffirst to understand the structure before deciding which files to open in full. This reduces unnecessaryread_filecalls and keeps the context smaller.
Conversation history is persisted to MongoDB and restored when the WebSocket reconnects. The agent picks up exactly where it left off — no context loss on browser refresh, no need to repeat yourself.
- Full history persistence — Both the LLM message history (for the agent) and the display message history (for the chat UI) are saved to
ProjectConversationCollectionafter every message. - Reconnect restoration — When a WebSocket connection is established, the server sends a
historyevent with all saved display messages. The frontend rebuilds the chat UI instantly without any additional API calls. - Agent context restoration — On the first chat message of a new session, the agent's message list is populated from MongoDB. The LLM receives the full prior context as if the conversation never ended.
- Clear history — A
clear_historyWebSocket message wipes both the LLM history and the display history for the project. Useful for starting a fresh task without creating a new project.
The workspace is a four-panel environment built around the project's Docker container.
- Monaco editor — Full VS Code editor with syntax highlighting, bracket matching, and auto-indent. The active file reloads automatically whenever the agent writes to it.
- File tree — Shows the live file system of the project workspace. Refreshes automatically on every
file_changedWebSocket event from the agent. Files appear in the tree the moment the agent creates them — no manual refresh needed. - Integrated terminal — A full xterm.js terminal connected to a bash session inside the Docker container. Run commands directly, inspect logs, or test the built app without leaving the browser.
- Preview panel — Embedded iframe that loads the app's dev server URL (port 3000 for Next.js/Vite, 8000 for FastAPI). Refresh the preview while the app is running to see your changes live.
- Agent chat — Full-height right panel with streaming chat, tool call cards, approval cards, clarification cards, file chips, compaction indicators, and a done state. The chat input supports Shift+Enter for newlines, file/image attachments (PDF, PNG, JPG, GIF, WebP, plain text, Markdown, JSON), and drag-and-drop paste of clipboard images.
- Model picker — A compact dropdown in the chat toolbar lets you switch model provider and model without leaving the workspace. The selection is persisted per-project to localStorage.
Every project runs in its own Docker container. The container's /workspace directory is bind-mounted to backend/data/projects/{project_id}/ on the host, so file writes from the agent (via the host Python process) and file writes from bash commands (inside the container) both land in the same directory.
- Pre-built base image —
obsidian-webdev-base:latest(Ubuntu 24.04) ships with Node.js 22, npm, bun, Python 3.12, uv, git, curl, and tmux. No internet access is required at container start time for any of these tools. - Three exposed ports — Each container exposes ports 3000, 5173, and 8000. The workspace preview panel connects to whichever port the framework's dev server uses.
- Automatic cleanup — Containers idle beyond
CONTAINER_IDLE_TIMEOUT_MINUTES(default: 60) are stopped automatically. Hard removal happens afterCONTAINER_HARD_REMOVE_HOURS(default: 24). - Template injection — When a new project is created with a non-blank framework, the container runs the framework's scaffold command (
npx create-next-app@latest,uv init, etc.) as a background task. The workspace shows a "Preparing…" state until scaffolding completes. - Bind-mount sync — MongoDB is the authoritative file store.
sync_from_volume()scans the bind-mounted directory and upserts any files found, skippingnode_modules,.git,.next,.venv, and__pycache__. This runs automatically if MongoDB is empty when the workspace loads.
Choose a starting template when creating a project. The agent receives framework-specific context in its system prompt and the container is pre-scaffolded with the framework's standard boilerplate.
| Template | Scaffold command | Dev server port |
|---|---|---|
| Next.js | npx create-next-app@latest |
3000 |
| Vite + React | npm create vite@latest |
5173 |
| FastAPI | uv init + uv add fastapi uvicorn[standard] |
8000 |
| Express | npm init -y + minimal index.js |
8000 |
| Full-Stack | git clone https://github.com/sup3rus3r/nextapi.git |
3000 + 8000 |
| Blank | No scaffold | — |
- Framework context in system prompt — The agent's system prompt includes the project name and framework. This guides the agent to use the right package manager, file conventions, and port numbers without being explicitly told.
- Background scaffolding — Template injection runs as an
asyncio.create_taskso the API response returns immediately. The project status transitions frompreparingtorunningonce scaffolding completes. - File sync before ready —
sync_from_volume()is called after scaffolding and before the status is set torunning, ensuring the file tree is populated before the user can interact with the workspace.
Bring existing code into Obsidian WebDev without starting from scratch. The "New project" dialog has a Build new / Import existing tab switcher. Two import modes are supported:
GitHub URL (clone)
- Paste any public GitHub repository URL.
- The project is created immediately and the repo is cloned (
git clone --depth 1) inside the container when you first click Run. - The project stays in "Preparing…" while the clone runs, then transitions to "Running" once complete.
- Project name is auto-filled from the repository slug (editable before import).
ZIP upload
- Upload a
.zipfile (up to 100 MB) directly from your machine. - Files are extracted to the project volume on the server immediately — no container needed.
- The following are automatically excluded:
node_modules/,.git/,.env/*.envfiles,.next/,dist/,build/,__pycache__/,.venv/, and other build artifacts. - Extracted files are synced to MongoDB so the file tree is available before the container starts.
- If the zip has a single top-level folder (e.g.
my-repo-main/), it is stripped automatically so files land at/workspace/root. - Project name is auto-filled from the zip filename (editable before import).
Both import modes still let you choose the AI provider and model. Imported projects use the Blank framework setting — the agent receives no framework-specific scaffolding context, which is appropriate for arbitrary codebases.
Full git workflow inside every project container — pull, push, branch management, commits, and SSH authentication — without leaving the platform.
- SSH keypair generation — Generate an ED25519 keypair per project via the workspace UI. The private key is encrypted with Fernet and stored in the vault. The public key is displayed for you to add to GitHub, GitLab, or Bitbucket once.
- Automatic key injection — When a project container starts, the SSH private key is written to
~/.ssh/id_ed25519inside the container with correct permissions. Anssh_configentry setsStrictHostKeyChecking nofor GitHub, GitLab, and Bitbucket so push/pull works without prompts. - PAT support — Alternatively store a Personal Access Token for HTTPS-based git auth.
- Full git operations — Status, log, diff, pull, push, commit, checkout, branch list, remote management, and init — all available from the workspace git panel and via the agent using
bash. - Agent-aware — The workspace agent can run git commands directly via the
bashtool. SSH keys are already in place when the agent runs. - Force push disabled — The API explicitly blocks
--forcepush for safety.
SSH setup (one-time per project):
- Open the project workspace → Git panel → Generate SSH Key
- Copy the displayed public key
- Add it to GitHub: Settings → SSH and GPG keys → New SSH key
- Set your remote:
git remote add origin git@github.com:user/repo.git - Push/pull works from that point forward — including from the agent
Store API keys and git credentials in an encrypted vault. Values are encrypted with AES-256 (Fernet) at rest — the raw value is never accessible after saving.
- AI provider keys — One key per provider per user. Supported: Anthropic, OpenAI, Ollama, LM Studio, Obsidian AI (self-hosted).
- SSH keypairs — Project-scoped ED25519 keypairs for git authentication. Private key encrypted; public key exposed for GitHub/GitLab setup.
- PATs — Project-scoped Personal Access Tokens for HTTPS git auth.
- Encrypted at rest — All values are encrypted with a Fernet key derived from the user ID and a server-side master key (
FERNET_MASTER_KEY). The encrypted blob is what's stored in MongoDB / SQLite. - Key validation — A "Test" button on each AI provider key calls the provider's API to verify validity before running an agent.
- Environment fallback — If a user has no vault key for an AI provider, the backend falls back to the server-level environment variable (
ANTHROPIC_API_KEY,OPENAI_API_KEY).
Per-user agent behaviour settings stored in MongoDB. Changes take effect on the next chat message — no restart required.
| Setting | Default | Range | Description |
|---|---|---|---|
| Permission mode | Ask | Ask / Auto | Whether the agent requires approval before write/bash tools |
| Compaction trigger | 80% | 50–95% | % of context window at which history is compacted |
| Bash output limit | 400 lines | 50–2000 | Max lines of bash output kept in context |
| File read limit | 500 lines | 50–2000 | Max lines when reading a file |
| Web fetch limit | 20,000 chars | 5,000–100,000 | Max characters from a web page fetch |
Preferences are loaded from MongoDB on every chat message and forwarded to the Agent constructor. This means a preference change in Settings takes effect immediately without requiring a page reload or new session.
The workspace agent's system prompt is dynamically extended at session start with a framework-specific skill block. This gives the agent precise, version-accurate rules for the project it's working in — without relying on training knowledge that may be outdated.
- Per-framework skill injection — When an agent session starts,
agent.pyloads the matching skill file fromdocs/agent-skills/and appends it to the system prompt. A Next.js project gets Next.js 16 + Tailwind v4 rules; a FastAPI project gets Python 3.12 + Pydantic v2 + SQLAlchemy v2 rules. - Context7 integration — Each skill file instructs the agent to fetch live documentation from Context7 via
web_fetchbefore writing code for any library. The exact API URLs and topic parameters are embedded in the skill — the agent never has to guess the endpoint. - Tailwind v4 rules — Injected into all JS/TS framework skills. Covers the breaking changes from v3 (no
tailwind.config.js,@import "tailwindcss",@themeblock,@variantfor dark mode). - Git-aware — All skill files include git instructions and remind the agent that SSH keys are pre-injected and push/pull works without additional setup.
| Framework | Skill file | Context7 libraries covered |
|---|---|---|
| Next.js | docs/agent-skills/nextjs.md |
Next.js, React 19, Tailwind v4 |
| React (Vite) | docs/agent-skills/react.md |
React, React Router, Vite, Tailwind v4 |
| FastAPI | docs/agent-skills/fastapi.md |
FastAPI, SQLAlchemy v2, Pydantic v2 |
| Full-Stack | docs/agent-skills/fullstack.md |
Next.js/React + FastAPI/Express, Tailwind v4, Prisma |
| Blank | docs/agent-skills/blank.md |
TypeScript, any tech the user chooses |
- JWT authentication — All API endpoints and WebSocket connections are protected by JWT bearer tokens. WebSocket auth is passed as a
?token=query parameter to avoid the browser WS API's lack of custom header support. - NextAuth v5 — The frontend uses NextAuth with a credentials provider. Access tokens are stored in the session and included in every API and WebSocket request.
- AES end-to-end encryption — Sensitive values (API keys) are additionally encrypted client-side before transmission using a shared
NEXT_PUBLIC_ENCRYPTION_KEY/ENCRYPTION_KEYpair. The server receives and stores only the encrypted form. - Fernet secrets vault — Server-side vault encryption uses Fernet (AES-128-CBC + HMAC-SHA256) with a master key + per-user key derivation. Raw key values are never returned by any API endpoint.
- Role-based access control — Users have a
guestoradminrole. Admin-only operations are protected byrequire_role("admin")on the backend. - Rate limiting —
slowapienforces per-user and per-API-client rate limits on all routes.
Browser
│ WebSocket /ws/agent/{session_id}?token=... (streaming events)
│ REST API /auth /projects /vault /settings (CRUD + auth)
▼
FastAPI (port 7412)
├── websocket/agent_ws.py WS handler — receives chat, dispatches to runner
├── services/agent_runner.py asyncio session registry — start/stop/approval/clarification
├── services/project_service.py project CRUD + container lifecycle
├── services/container_service.py Docker SDK — run, exec, ports, cleanup
└── agents/
├── agent.py ReAct loop — LLM call → parallel tool dispatch → history → repeat
└── tools.py Tool definitions + permission tier registry
MongoDB (always) + SQLite/PostgreSQL (user auth)
Docker container (one per project)
└── /workspace ←── bind-mount: backend/data/projects/{project_id}/
1. User sends {"type": "chat", "content": "Add a login page"}
2. agent_ws.py loads user preferences from MongoDB
3. agent_runner.start_agent() creates Agent + asyncio.Task
4. Agent prepends user message to history
5. Agent calls LLM (streaming) → tokens arrive as {"type": "token"} events
6. LLM response contains 3 tool_calls → asyncio.gather fires all 3 concurrently
├── read_file("src/app/layout.tsx") → returns file contents
├── glob("src/**/*.tsx") → returns file list
└── web_search("NextAuth login route") → returns search results
7. Tool results appended to history
8. Agent calls LLM again → streams a plan, then calls write_file()
9. write_file requires approval → {"type": "tool_approval_request"} sent
10. User clicks Approve → agent_runner.resolve_approval() sets Future
11. write_file executes → file written to bind-mount → {"type": "file_changed"} sent
12. Frontend file tree refreshes; Monaco reloads if file was open
13. Loop continues until no more tool calls → {"type": "done"} sent
| Requirement | Windows | Linux / macOS |
|---|---|---|
| Docker Desktop | Download — must be running | sudo usermod -aG docker $USER then reboot |
| Node.js 18+ | Download or winget install OpenJS.NodeJS |
Package manager of choice |
| Python 3.12+ | Download or winget install Python.Python.3.12 |
apt install python3.12 / brew install python@3.12 |
| uv | winget install astral-sh.uv or pip install uv |
curl -Lsf https://astral.sh/uv/install.sh | sh |
| MongoDB | Local via Docker (see Step 3) or MongoDB Atlas (free tier) | Same |
git clone https://github.com/sup3rus3r/obsidian-webdev.git
cd obsidian-webdevnpm install
cd frontend && npm install && cd ..
cd backend && uv sync && cd ..If you don't have a MongoDB Atlas connection string, run a local instance via Docker:
docker run -d --name mongodb --restart unless-stopped -p 27017:27017 mongo:latestAlready have Atlas? Skip this step and put your Atlas URI in
backend/.envasMONGO_URL.
Backend:
# Linux / macOS
cp backend/.env.example backend/.env
# Windows (PowerShell)
copy backend\.env.example backend\.envFrontend:
# Linux / macOS
cp frontend/.env.example frontend/.env.local
# Windows (PowerShell)
copy frontend\.env.example frontend\.env.localGenerate the required secret keys:
# Linux / macOS — JWT_SECRET_KEY and ENCRYPTION_KEY
openssl rand -hex 32
# Windows (PowerShell) — JWT_SECRET_KEY and ENCRYPTION_KEY
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
# Any platform — FERNET_MASTER_KEY
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"Fill in backend/.env with your generated values:
| Variable | Required | Description |
|---|---|---|
DATABASE_TYPE |
Yes | sqlite (default) or mongo |
SQLITE_URL |
If SQLite | e.g. sqlite:///./obsidian.db |
MONGO_URL |
Yes | MongoDB connection string |
MONGO_DB_NAME |
Yes | MongoDB database name |
JWT_SECRET_KEY |
Yes | Random secret for JWT signing |
ENCRYPTION_KEY |
Yes | 32-byte hex key for AES encryption |
FERNET_MASTER_KEY |
Yes | Fernet key for vault encryption |
CORS_ORIGINS |
No | Default: ["http://localhost:3100"] |
ANTHROPIC_API_KEY |
No | Fallback if user has no vault key |
OPENAI_API_KEY |
No | Fallback if user has no vault key |
TAVILY_API_KEY |
No | Web search — falls back to DuckDuckGo |
PROJECTS_DATA_DIR |
No | Default: ./data/projects |
DOCKER_SOCKET |
No | Leave empty on Windows (Docker Desktop handles this) |
Fill in frontend/.env.local:
| Variable | Required | Description |
|---|---|---|
AUTH_SECRET |
Yes | NextAuth secret — any random 32-byte hex string |
NEXT_PUBLIC_API_URL |
No | Default: http://localhost:7412 |
NEXT_PUBLIC_WS_URL |
No | Default: ws://localhost:7412 |
NEXT_PUBLIC_ENCRYPTION_KEY |
Yes | Must match ENCRYPTION_KEY in backend/.env |
All project containers are created from this image. Build it once; rebuild only if backend/Dockerfile.base changes.
docker build -f backend/Dockerfile.base -t obsidian-webdev-base:latest backend/The image includes: Ubuntu 24.04, Node.js 22, npm 10, bun 1.3, Python 3.12, uv, git, curl, tmux.
npm run devThis single command:
- Syncs backend Python dependencies (fast/idempotent)
- Starts the Qdrant vector DB container (
obsidian-qdranton port6333) - Starts the frontend and backend concurrently
| Service | URL |
|---|---|
| Frontend | http://localhost:3100 |
| Backend API | http://localhost:7412 |
| API docs (Swagger) | http://localhost:7412/docs |
npm run devuses a cross-platform Node script and works on Windows, Linux, and macOS without requiring bash or WSL.
| Service | Port | Started by |
|---|---|---|
| Frontend (Next.js) | 3100 | npm run dev |
| Backend (FastAPI) | 7412 | npm run dev |
| Qdrant (vector DB) | 6333 | Docker — auto-started by npm run dev |
| MongoDB | 27017 | Docker — started in Step 3 (or Atlas) |
| Project containers | dynamic | Created on-demand per project |
| Layer | Technology |
|---|---|
| Frontend framework | Next.js 16, React 19, TypeScript |
| Frontend styling | Tailwind CSS v4, shadcn/ui, Radix UI |
| Authentication | NextAuth v5 (credentials provider) |
| Code editor | Monaco Editor (VS Code engine) |
| Terminal | xterm.js |
| Backend framework | FastAPI 0.128+, Python 3.12 |
| Async runtime | asyncio, Motor (async MongoDB driver) |
| SQL ORM | SQLAlchemy (SQLite / PostgreSQL) |
| Document DB | MongoDB |
| Agent | Custom ReAct loop — no framework dependency |
| LLM providers | Anthropic SDK, OpenAI SDK |
| Containers | Docker SDK for Python |
| Web search | Tavily API, DuckDuckGo (fallback) |
| Rate limiting | slowapi |
| Dev runner | concurrently |
obsidian-webdev/
│
├── package.json # Root — `npm run dev` starts both servers
├── scripts/dev.sh # Dev launcher with uvicorn reload config
│
├── backend/
│ ├── main.py # FastAPI app + lifespan (DB init, cleanup task)
│ ├── config.py # Pydantic settings — all env vars with defaults
│ ├── Dockerfile.base # Base container image (Node 22, Python 3.12, uv, bun)
│ │
│ ├── agents/
│ │ ├── agent.py # Single ReAct Agent — Anthropic + OpenAI + Ollama streaming
│ │ └── tools.py # Tool definitions (TOOLS_ANTHROPIC, TOOLS_OPENAI) + TOOL_TIER
│ │
│ ├── services/
│ │ ├── agent_runner.py # AgentSession dataclass + start/stop/approval/clarification
│ │ ├── project_service.py # Project CRUD + run_container + template + SSH injection
│ │ ├── container_service.py # Docker SDK — create/start/exec/cleanup + inject_ssh_key
│ │ ├── git_service.py # Git operations via container exec_run
│ │ └── file_service.py # list/read/write + sync_from_volume + export_zip
│ │
│ ├── websocket/
│ │ ├── agent_ws.py # Agent WebSocket endpoint — chat/stop/approval/clarification
│ │ ├── terminal_ws.py # Container terminal WebSocket (xterm.js ↔ bash)
│ │ └── manager.py # Connection registry
│ │
│ ├── routers/
│ │ ├── auth.py # Register, login, profile, password, API clients
│ │ ├── projects.py # Project CRUD + file endpoints + attachment parsing
│ │ ├── containers.py # Container start/stop/status
│ │ ├── agent.py # Agent session CRUD
│ │ ├── git.py # Git operations (status/log/pull/push/commit/branch)
│ │ ├── vault.py # Secrets CRUD + SSH key generation + validation
│ │ └── settings.py # GET/PUT /settings/preferences
│ │
│ ├── models/
│ │ ├── sql_models.py # SQLAlchemy: User, APIClient, UserSecret, ProjectSecret
│ │ └── mongo_models.py # Motor: Project, File, Conversation, AgentSession,
│ │ # UserPreferences, FileSummary, Export, ProjectSecret
│ │
│ ├── core/
│ │ ├── security.py # JWT encode/decode, get_current_user dependency
│ │ ├── vault.py # Fernet encrypt/decrypt
│ │ └── rate_limiter.py # slowapi limiter + user_limit helper
│ │
│ ├── database/
│ │ ├── mongo.py # Motor client connect/disconnect/get_database
│ │ └── sql.py # SQLAlchemy engine + SessionLocal + get_db
│ │
│ └── docs/
│ ├── ROADMAP.md # Phase-by-phase feature roadmap
│ ├── AGENT_SYSTEM_PROMPT.md # Agent base system prompt (loaded at runtime)
│ ├── AGENT_KNOWLEDGE_BASE.md # Build standards and conventions
│ └── agent-skills/ # Per-framework skill files (injected at agent init)
│ ├── nextjs.md # Next.js 16 + React 19 + Tailwind v4 rules + Context7
│ ├── react.md # React + Vite rules + Context7
│ ├── fastapi.md # FastAPI + SQLAlchemy v2 + Pydantic v2 rules + Context7
│ ├── fullstack.md # Combined frontend+backend rules + Context7
│ ├── tailwind.md # Tailwind CSS v4 rules (injected into all JS frameworks)
│ └── blank.md # Blank project orientation rules
│
└── frontend/
├── package.json
├── next.config.ts # API proxy rewrites
├── auth.ts # NextAuth v5 config
│
├── app/
│ ├── layout.tsx # Root layout (Toaster, SessionProvider)
│ ├── page.tsx # Redirect → dashboard
│ ├── auth/ # Login + register pages
│ └── dashboard/
│ ├── page.tsx # Project list + create/import modal
│ ├── workspace/[id]/ # Main workspace (editor + chat + terminal + preview)
│ ├── settings/ # API keys vault + agent preferences
│ └── config/ # Profile + password
│
├── components/
│ ├── workspace/
│ │ ├── agent-chat.tsx # Chat panel — streaming, tools, approvals, clarifications
│ │ ├── file-tree.tsx # File browser sidebar
│ │ ├── terminal.tsx # xterm.js terminal
│ │ └── preview.tsx # App preview iframe
│ └── ui/ # shadcn/ui + Radix UI primitives
│
├── lib/
│ ├── api/
│ │ ├── client.ts # apiFetch + apiUrl + wsUrl helpers
│ │ ├── ws.ts # AgentWsClient class + useAgentWs hook
│ │ ├── projects.ts # Project + file API calls
│ │ ├── agent.ts # Agent session API calls
│ │ ├── vault.ts # Vault key API calls
│ │ └── settings.ts # Preferences API calls
│ └── utils.ts # cn() + misc utilities
│
└── types/
└── api.ts # All shared TypeScript types (ServerEvent, ClientEvent,
# UserPreferences, Project, FileNode, AgentSession, …)
See docs/ROADMAP.md for full detail and phase-by-phase breakdown.
- Docker base image — Ubuntu 24.04 with Node.js 22, Python 3.12, uv, bun, git, tmux — one image, all frameworks
- Project templates — Next.js, Vite + React, FastAPI, Express, Full-Stack (nextapi), Blank — scaffold via CLI on first run
- Project import — Import existing code from a public GitHub URL (git clone on first run) or a ZIP upload (extracted immediately, node_modules/.env excluded)
- Single ReAct agent — Custom asyncio agent loop; no framework dependency; supports Anthropic + OpenAI streaming and Ollama/LMStudio via OpenAI compat
- Full tool set —
read_file,write_file,edit_file,bash,glob,grep,web_fetch,web_search,list_files_brief,ask_user - WebSocket streaming — Typed event protocol; token streaming, tool call events, file change events, history replay on reconnect
- Tool approval system — Permission tiers (auto / ask / always); inline
ToolApprovalCard; approve, deny, approve-all; 300s auto-deny timeout - Agent-initiated clarification —
ask_usertool;ClarificationCardinline in chat; suspend-and-resume viaasyncio.Future; answered state in history - Permission mode persistence — "Approve all" now survives to the next chat turn (session-level, not overwritten on each message)
- Stop signal — Interrupts cleanly between tool calls; never cuts mid-execution; pending approvals and clarifications are resolved immediately
- Two-tier context management — Lite prune at 60%, full compaction at 80%; accurate token counts from API response; configurable per user
- Parallel tool execution — All tool calls in a turn execute via
asyncio.gather; pre-filled result map prevents API 400 on cancellation - File summarization — Background AI summaries stored per file in MongoDB;
list_files_briefgives the agent a codebase map in one call - Session memory — Full conversation history persisted to MongoDB; restored on reconnect for both the LLM and the chat UI
- Live workspace — Monaco editor, file tree, xterm.js terminal, app preview iframe — all in one panel layout
- Live file sync — File tree updates on every
file_changedevent; Monaco reloads if the file is open; no manual refresh needed - Preparing state — "Preparing workspace…" shown while template scaffolding runs; chat and editor disabled until container is ready
- User preferences — Per-user agent settings (permission mode, compaction %, output limits) stored in MongoDB; applied per chat message
- JWT + NextAuth v5 — Token-based auth on all endpoints and WebSocket connections
- AES + Fernet vault — Client-side AES encryption + server-side Fernet storage; raw key values never returned by any endpoint
- RBAC + rate limiting — Guest / admin roles; slowapi rate limits per user
- SSH keypair generation — ED25519 keypair generated per project; private key Fernet-encrypted in vault; public key returned for GitHub/GitLab setup
- Automatic SSH injection — Private key written to
~/.ssh/id_ed25519on container start;ssh_configpre-configured for GitHub, GitLab, Bitbucket - PAT support — Store Personal Access Tokens as an alternative to SSH for HTTPS git auth
- Full git API —
/git/{project_id}/endpoints for status, log, diff, pull, push, commit, checkout, branches, remotes, init - Project-scoped secrets —
ProjectSecretmodel (SQL + MongoDB) with(user_id, project_id, secret_type)uniqueness
- Per-framework skill injection — Skill files loaded from
docs/agent-skills/and appended to system prompt at agent init - Context7 integration — Every skill file includes exact
web_fetchURLs + topic parameters for live documentation retrieval - Tailwind v4 skill — Injected into all JS/TS frameworks; covers all v3→v4 breaking changes
- Frameworks covered — Next.js 16, React + Vite, FastAPI, Full-Stack, Blank
- Django (
django-admin startproject, port 8000) - SvelteKit (
npm create svelte@latest, port 5173) - Vanilla HTML/CSS/JS (static server, port 8080)
- Astro (
npm create astro@latest, port 4321) - Flutter (
flutter create, if mobile toolchain added to base image)
Contributions are welcome. Whether it's bug reports, feature suggestions, or pull requests — all input is valued.
-
Fork the repository
Click the Fork button at the top right of this page.
-
Clone your fork
git clone https://github.com/your-username/obsidian-webdev.git cd obsidian-webdev -
Create a feature branch
git checkout -b feature/your-feature-name
-
Build the Docker base image (required to run the app)
docker build -f backend/Dockerfile.base -t obsidian-webdev-base:latest backend/
-
Make your changes and commit
git commit -m "Add your feature description" -
Push and open a Pull Request
git push origin feature/your-feature-name
Open a pull request against the
mainbranch with a clear description of your changes and what problem they solve.
Found a bug or have a feature request? Open an issue with as much detail as possible — steps to reproduce, expected vs. actual behaviour, and your environment (OS, Docker version, browser).
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
- Free to use — Use, study, and modify for any purpose.
- Copyleft — If you distribute this software or run it as a network service, you must make the complete source code available under the same AGPL-3.0 terms.
- No additional restrictions — You cannot impose further restrictions on recipients' exercise of the rights granted by this license.
See the LICENSE file for the full terms.
