LLM Knowledge

A self-hosted knowledge management tool that ingests documents, extracts content via LLM, and provides intelligent Q&A — all in a single binary.

LLM Knowledge is a personal knowledge base that helps you collect, understand, and query your documents. It ingests PDFs, web pages, RSS feeds, and email newsletters, uses Claude to extract and summarize content, and lets you chat with your documents through an SSE-powered conversational interface.

Features

Multi-source ingestion — Upload PDFs (drag & drop or URL), clip web pages via Chrome extension, subscribe to RSS feeds with auto-sync, or import newsletters via IMAP
LLM-powered extraction — Claude CLI extracts structured content, generates summaries, and translates between Chinese and English
Document chat — Multi-turn SSE streaming Q&A with session management, powered by Claude CLI
Query chat — Free-form conversational AI with stop/interrupt support, conversation history, and multimodal image input
PDF translation — Layout-preserving PDF translation via pdf2zh (OpenAI-compatible API)
Dual PDF viewer — Scroll/scale-synced side-by-side comparison of original and translated PDFs
Bilingual UI — Full i18n support for English and Chinese
Single binary — Frontend embedded in Go binary, just download and run
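
To illustrate what "SSE streaming" means for the chat features above, here is a minimal sketch using only Go's standard library. It is not the project's actual handler (the backend uses Echo). The function names `streamAnswer` and `render` and the `done` event name are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// streamAnswer writes each chunk as one Server-Sent Event and flushes it
// immediately, so a client sees the answer grow as it is produced.
func streamAnswer(w http.ResponseWriter, chunks []string) error {
	flusher, ok := w.(http.Flusher)
	if !ok {
		return fmt.Errorf("response writer does not support streaming")
	}
	w.Header().Set("Content-Type", "text/event-stream")
	w.Header().Set("Cache-Control", "no-cache")
	for _, chunk := range chunks {
		// A multi-line payload needs one "data:" line per line of text.
		for _, line := range strings.Split(chunk, "\n") {
			fmt.Fprintf(w, "data: %s\n", line)
		}
		fmt.Fprint(w, "\n") // a blank line terminates the event
		flusher.Flush()
	}
	// "done" is a hypothetical event name marking the end of the answer.
	fmt.Fprint(w, "event: done\ndata: {}\n\n")
	flusher.Flush()
	return nil
}

// render runs streamAnswer against an in-memory recorder and returns the
// raw event stream, which is handy for demos and tests.
func render(chunks []string) (string, error) {
	rec := httptest.NewRecorder()
	if err := streamAnswer(rec, chunks); err != nil {
		return "", err
	}
	return rec.Body.String(), nil
}

func main() {
	body, err := render([]string{"Hello", "from the knowledge base"})
	if err != nil {
		panic(err)
	}
	fmt.Print(body)
}
```

In a real server the chunks would arrive over time from the LLM process rather than from a slice; the framing and per-event flush stay the same.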

Prerequisites

Go 1.25+
Node.js & npm (for building frontend)
Claude CLI — available in PATH
Python 3.12 (optional) — for PDF translation via pdf2zh (PEP 695 syntax required)
qpdf (optional) — pdf2zh dependency for pikepdf

Quick Start

# Clone and start (default port 9999)
git clone https://github.com/bruceding/llm_knowledge.git
cd llm_knowledge
./start.sh

The start.sh script automatically:

Checks and installs pdftotext (poppler) for PDF text extraction
Checks for Python 3.12 (if it is missing, prints a warning and disables PDF translation)
Checks and installs qpdf for pdf2zh dependency
Builds backend and frontend
Starts the server on port 9999
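
The dependency checks listed above can be approximated in Go for anyone who builds manually instead of running start.sh. This is only a sketch: the binary names (`claude`, `pdftotext`, `python3.12`, `qpdf`) and the required/optional split are assumptions inferred from the Prerequisites list, and unlike start.sh it installs nothing.

```go
package main

import (
	"fmt"
	"os/exec"
)

// tool describes an external program and whether the app can run without it.
type tool struct {
	Name     string
	Required bool
}

// tools lists assumed binary names; adjust them to match the local install.
var tools = []tool{
	{"claude", true},
	{"pdftotext", true},
	{"python3.12", false}, // only needed for PDF translation
	{"qpdf", false},       // only needed for PDF translation
}

// missingTools reports which tools the found callback cannot locate,
// split into required and optional. The callback is injected so the
// check can be exercised without touching the real PATH.
func missingTools(list []tool, found func(name string) bool) (required, optional []string) {
	for _, t := range list {
		if found(t.Name) {
			continue
		}
		if t.Required {
			required = append(required, t.Name)
		} else {
			optional = append(optional, t.Name)
		}
	}
	return required, optional
}

func main() {
	onPath := func(name string) bool {
		_, err := exec.LookPath(name)
		return err == nil
	}
	required, optional := missingTools(tools, onPath)
	fmt.Println("missing required tools:", required)
	fmt.Println("missing optional tools (PDF translation unavailable):", optional)
}
```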

# Custom port
PORT=8080 ./start.sh

# Or build and run manually
make build
./llm-knowledge -port 8080

# Development mode with hot reload
make dev                 # backend :3456, frontend :5173

Data is stored in ~/.llm-knowledge/ (configurable via DATA_DIR env var).

Configuration

Env Variable	Default	Description
`PORT`	`3456`	Server port (`start.sh` uses 9999 unless `PORT` is set)
`DATA_DIR`	`~/.llm-knowledge`	Data and database directory
`PDF2ZH_VENV_DIR`	`$DATA_DIR/.venv`	pdf2zh Python venv path
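
As a rough illustration of how the defaults in the table combine, the sketch below resolves the three variables with fallbacks. The `config` struct and `loadConfig` function are hypothetical names, not the project's code; the point it shows is that `PDF2ZH_VENV_DIR` follows `DATA_DIR` when only the latter is overridden.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// config holds the three documented settings (hypothetical struct).
type config struct {
	Port       string
	DataDir    string
	Pdf2zhVenv string
}

// loadConfig reads settings through getenv and falls back to the
// documented defaults. home stands in for the user's home directory.
func loadConfig(getenv func(string) string, home string) config {
	c := config{
		Port:       getenv("PORT"),
		DataDir:    getenv("DATA_DIR"),
		Pdf2zhVenv: getenv("PDF2ZH_VENV_DIR"),
	}
	if c.Port == "" {
		c.Port = "3456"
	}
	if c.DataDir == "" {
		c.DataDir = filepath.Join(home, ".llm-knowledge")
	}
	if c.Pdf2zhVenv == "" {
		// The venv default is derived from the resolved data directory.
		c.Pdf2zhVenv = filepath.Join(c.DataDir, ".venv")
	}
	return c
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", loadConfig(os.Getenv, home))
}
```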

Tech Stack

Backend: Go + Echo + GORM (SQLite) + Claude CLI
Frontend: React 19 + TypeScript + Vite + Tailwind CSS v4
PDF: pdfjs-dist (in-browser rendering) + pdf2zh (translation)

Chrome Extension — Wiki Web Clipper

One-click web page clipping to your knowledge base. Works with any modern Chromium-based browser (Chrome, Edge, Brave, etc.).

Installation

Open Chrome and navigate to chrome://extensions/
Enable Developer mode (toggle in top-right corner)
Click Load unpacked and select the extension/ folder from this project

Setup

After installation, click the extension icon to open the settings page:

Enter your Wiki URL (e.g., http://localhost:9999 or your deployed address)
Enter your username and password
Click Save and Connect to authenticate

The extension will store your credentials locally and auto-refresh tokens when needed.

Usage

Navigate to any web page you want to save
Click the extension icon in the toolbar
The page will be clipped and sent to your wiki's "Raw Documents" section
Success: green ✓ badge | Failure: red ✗ badge | Progress: gray "..." badge
Toast notifications appear on the page to confirm the result

Features

Full page capture — Extracts complete HTML content with preserved structure
Auto title detection — Uses page title as document name
WeChat article support — Special handling for WeChat public account articles
Authentication — Secure token-based auth with auto-expiry handling
Visual feedback — Badge and toast notifications for operation status

Supported Sites

Works on most public websites. Some sites with heavy JavaScript rendering may require the page to fully load before clipping.

Newsletter Import — IMAP Email Sync

Automatically import newsletters from your email inbox via IMAP. Perfect for subscribing to tech newsletters, industry updates, and curated content.

Setup

Go to Import → Newsletter tab in the UI
Configure your IMAP settings:
- Host: IMAP server address (e.g., imap.gmail.com for Gmail)
- Port: 993 (IMAPS, recommended) or 143 (IMAP)
- Username: Your email address
- Password: Email password or app-specific password
- Folder: Mailbox folder name (default: Newsletter)
Enable Auto Sync if you want hourly automatic syncing
Click Save and Connect
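
The settings above map naturally onto a small struct with defaults. The following is a sketch under assumptions: `imapSettings`, `normalize`, and `usesImplicitTLS` are illustrative names, and treating an empty port as 993 is a choice made here, not documented behavior. Port 993 means TLS from the first byte (IMAPS), while 143 starts as a plaintext connection.

```go
package main

import (
	"errors"
	"fmt"
)

// imapSettings mirrors the fields of the Newsletter setup form.
type imapSettings struct {
	Host     string
	Port     int
	Username string
	Password string
	Folder   string
	AutoSync bool
}

// normalize fills in defaults and rejects settings that cannot work.
func (s imapSettings) normalize() (imapSettings, error) {
	if s.Host == "" || s.Username == "" || s.Password == "" {
		return s, errors.New("host, username, and password are required")
	}
	if s.Port == 0 {
		s.Port = 993 // assumed default: the recommended IMAPS port
	}
	if s.Port < 1 || s.Port > 65535 {
		return s, fmt.Errorf("invalid port %d", s.Port)
	}
	if s.Folder == "" {
		s.Folder = "Newsletter" // documented default folder
	}
	return s, nil
}

// usesImplicitTLS reports whether the connection should start with TLS.
func (s imapSettings) usesImplicitTLS() bool {
	return s.Port == 993
}

func main() {
	s, err := imapSettings{
		Host:     "imap.gmail.com",
		Username: "you@example.com",
		Password: "app-specific-password",
		AutoSync: true,
	}.normalize()
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s:%d folder=%q tls=%v\n", s.Host, s.Port, s.Folder, s.usesImplicitTLS())
}
```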

Gmail Setup

For Gmail, you need an App Password instead of your regular password:

Go to Google Account Security
Enable 2-Step Verification (required for app passwords)
Go to App passwords → Generate new password
Select "Mail" and "Other (Custom name)" → name it "LLM Knowledge"
Use the generated 16-character password in the setup

Usage

Click Sync Now to manually fetch new newsletters
First sync imports at most 10 newsletters, to keep the initial import manageable
Subsequent syncs fetch only emails since the last sync
Auto-sync runs hourly if enabled
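
The sync rules above (a capped first sync, then incremental fetches) can be sketched as a pure selection function. This is an assumption-laden illustration: the `email` type, `selectForSync`, and `sampleInbox` are made-up names, and picking the newest messages on the first sync is a guess, since the README only says "up to 10".

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// email is a minimal stand-in for a fetched message.
type email struct {
	Subject  string
	Received time.Time
}

const firstSyncLimit = 10

// selectForSync returns the emails to import, newest first.
// A nil lastSync means this is the first sync, which is capped at
// firstSyncLimit. Otherwise only emails received after lastSync are kept.
func selectForSync(inbox []email, lastSync *time.Time) []email {
	sorted := append([]email(nil), inbox...) // copy so the caller's slice is untouched
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Received.After(sorted[j].Received)
	})
	if lastSync == nil {
		if len(sorted) > firstSyncLimit {
			sorted = sorted[:firstSyncLimit]
		}
		return sorted
	}
	var fresh []email
	for _, e := range sorted {
		if e.Received.After(*lastSync) {
			fresh = append(fresh, e)
		}
	}
	return fresh
}

// sampleInbox builds n emails, one hour apart, oldest first.
func sampleInbox(n int) []email {
	base := time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC)
	inbox := make([]email, n)
	for i := range inbox {
		inbox[i] = email{
			Subject:  fmt.Sprintf("issue-%d", i),
			Received: base.Add(time.Duration(i) * time.Hour),
		}
	}
	return inbox
}

func main() {
	inbox := sampleInbox(15)
	first := selectForSync(inbox, nil)
	fmt.Println("first sync imports:", len(first))

	last := inbox[11].Received
	next := selectForSync(inbox, &last)
	fmt.Println("next sync imports:", len(next))
}
```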

Features

HTML extraction — Extracts clean HTML content from multipart emails
Image handling — Downloads embedded images, filters tracking pixels
Smart cleanup — Removes duplicate titles, footer noise, unsubscribe links
Sender tagging — Auto-creates tags based on newsletter sender
Claude summary — Background summary generation for each newsletter
View-in-browser links — Extracts and preserves original newsletter links

Folder Organization

Newsletters are stored in ~/.llm-knowledge/raw/newsletter/<sender>/:

<slug>.md — Markdown version with metadata header
<slug>.html — Original HTML for rich rendering
assets/ — Downloaded images
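
For scripts that need to locate a stored newsletter, the layout above can be expressed as a small path helper. This is a sketch only: `newsletterPaths` and `pathsFor` are hypothetical names, and the sender and slug are assumed to be filesystem-safe already, since the README does not describe how they are derived.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// newsletterPaths groups the files stored for one newsletter.
type newsletterPaths struct {
	Dir      string // per-sender directory
	Markdown string // <slug>.md with metadata header
	HTML     string // <slug>.html for rich rendering
	Assets   string // downloaded images
}

// pathsFor builds the documented layout under dataDir. It performs no
// sanitizing: sender and slug are assumed to be safe path segments.
func pathsFor(dataDir, sender, slug string) newsletterPaths {
	dir := filepath.Join(dataDir, "raw", "newsletter", sender)
	return newsletterPaths{
		Dir:      dir,
		Markdown: filepath.Join(dir, slug+".md"),
		HTML:     filepath.Join(dir, slug+".html"),
		Assets:   filepath.Join(dir, "assets"),
	}
}

func main() {
	p := pathsFor("/home/you/.llm-knowledge", "example-weekly", "issue-42")
	fmt.Println(p.Markdown)
	fmt.Println(p.HTML)
	fmt.Println(p.Assets)
}
```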
