Sibling-repo home for converter plugins that extend dikw-core's
dikw client import command to non-markdown formats — PDF, EPUB, and
whatever comes next.
dikw-core ingests .md + assets only. Plugins in this repo turn other
formats (paper.pdf, book.epub, …) into that md+assets shape so they
can flow through the standard import pipeline. Conversion happens in the
client process; the server never loads converter dependencies.
This repo is structured as a uv workspace with one package per
plugin under packages/. Each is independently published to pypi and
versioned.
| Package | Engine name | Extensions | Status |
|---|---|---|---|
dikw-converter-example |
example |
.example |
reference stub — copy as template |
dikw-converter-epub |
epub |
.epub |
0.1.0 — pure-Python EPUB → markdown |
dikw-converter-mineru |
mineru |
.pdf, .docx, .doc, .pptx, .ppt, .xlsx, .xls |
0.1.0 — MinerU online API (PDF + Office); needs MinerUAPIKey env |
dikw-converter-mineru covers multiple input formats because they all
share one upstream tool. CLAUDE.md's "one format per package" guideline
calls this out as the explicit exception when formats share an engine.
# Install dikw-core first (the host).
pip install dikw-core
# Then install any plugin you want.
pip install dikw-converter-mineru # PDF + DOCX + PPTX + XLSX via MinerU online
pip install dikw-converter-epub # pure-Python EPUB
# MinerU is hosted — export your API token first.
# See packages/dikw-converter-mineru/README.md § Auth for the 3-tier
# resolution order, token rotation, and how to load .env per shell.
$env:MinerUAPIKey = "eyJ..." # PowerShell
# export MinerUAPIKey="eyJ..." # bash / zsh
# Use it.
dikw client import paper.pdf
dikw client import book.epubuv pip install works identically; substitute uv pip for pip if
you manage envs with uv.
Each dikw-converter-* is a normal PyPI package — upgrading, pinning,
and uninstalling follow the standard pip / uv semantics with no
special handling on dikw-core's side.
# Upgrade to the latest published version.
pip install --upgrade dikw-converter-epub
# Pin to a specific version (recommended for production environments —
# converter output is deterministic per pinned plugin version).
pip install 'dikw-converter-mineru==0.1.0'
# Inspect what's installed and which entry-points it registers.
pip show dikw-converter-epub
python -c "from importlib.metadata import entry_points; print([(e.name, e.value) for e in entry_points(group='dikw.client.converters')])"
# Uninstall — the entry-point disappears the next time dikw client
# does converter discovery, so there's nothing else to clean up
# inside the host. Files already imported into <base>/sources/ stay
# put (they're not the plugin's data).
pip uninstall dikw-converter-epubCI builds attach every release's wheel and sdist to its GitHub Release. For air-gapped environments, download the wheel from the release page and install from the local file:
# From https://github.com/opendikw/dikw-plugins/releases
pip install ./dikw_converter_epub-0.1.0-py3-none-any.whlThe wheel still declares its dikw-core dependency; if PyPI isn't
reachable, install dikw-core from its own GitHub Release first.
- PyPI:
https://pypi.org/project/dikw-converter-<format>/(e.g. dikw-converter-epub). - GitHub Releases:
https://github.com/opendikw/dikw-plugins/releases— taggeddikw-converter-<format>-vX.Y.Z, with release notes lifted verbatim from that package'sCHANGELOG.md. - Per-package changelog: each
packages/dikw-converter-<format>/CHANGELOG.mdis the authoritative history for that plugin.
Start with docs/plugin-author-guide.md
— end-to-end walkthrough from cargo-cult the dikw-converter-example
package to publishing on pypi.
The formal contract spec lives in
dikw-core's docs/converters.md; the rationale for why this
all lives in a sibling repo (rather than inside dikw-core) is in
dikw-core's ADR 0001.
dikw-plugins/
├── README.md (you are here)
├── CLAUDE.md agent guidance — start with docs/architecture.md
├── CONTEXT.md local terms (defers to dikw-core's glossary)
├── pyproject.toml uv workspace root
├── uv.lock committed for reproducible installs
├── .github/
│ ├── workflows/
│ │ ├── ci.yml ruff + mypy + pytest + packaging gate + build smoke
│ │ ├── codeql.yml Python static analysis on PR + weekly cron
│ │ └── release.yml tag → PyPI (trusted publisher) + GitHub Release
│ ├── dependabot.yml weekly uv + github-actions updates
│ ├── CODEOWNERS review routing
│ └── PULL_REQUEST_TEMPLATE.md
├── docs/
│ ├── architecture.md why client-side plugin, what the contract is
│ ├── plugin-author-guide.md tutorial: write a new converter
│ ├── release-process.md tag → CI → PyPI + GitHub Release pipeline
│ └── tool-survey.md PDF/EPUB tool ecosystem snapshot (2026-05)
├── scripts/
│ ├── check-package.py local pre-release gate (same checks as release.yml)
│ └── extract_changelog.py pulls one ## [X.Y.Z] block out of a CHANGELOG
├── tests/packaging/ artifact-level wheel/sdist/entry-point invariants
└── packages/
├── dikw-converter-example/ reference stub — copy as template for new plugins
├── dikw-converter-epub/ pure-Python EPUB → markdown (0.1.0, on PyPI)
└── dikw-converter-mineru/ MinerU online API for PDF + Office (0.1.0, on PyPI)
# From repo root:
uv sync # install workspace deps
uv run pytest # run all packages' tests
uv run ruff check .
uv run mypy packages/*/src
# Install a plugin into your dikw client env:
pip install -e packages/dikw-converter-examplePre-alpha along with dikw-core. The plugin contract (Protocol shape,
entry-points group name, selection order) is stable enough to build on
but may evolve through the deprecation cycle described in
dikw-core/docs/converters.md.