Turn your team's hard-won experience into knowledge your AI agents can act on. Institutional memory that doesn't die with the person who earned it — captured, governed, and made actionable for LLM agents.
By Raffaele Cipro — Principal Product Manager, OpenText · enterprise & agentic AI. A practitioner field report, not a maintained framework: no SDK to install, no support promised. No domain data here — just the architecture and the methodology. (LinkedIn)
Most "knowledge bases for agents" are a pile of documents wired to vector search. But the knowledge that actually wins the work usually isn't in the documents — it's in the heads of the people who know how the work is really done, and it walks out the door when they do. lore is a methodology for capturing that experience — codified facts and the practitioner's craft — and turning it into knowledge an agent can retrieve, navigate, and apply reliably. It comes from running such a system in production in a regulated professional domain.
Three ideas do the work: route by question type across cooperating layers, restructure long documents instead of chunking them (Book-to-Skill), and govern reliability so "is this still true, who validated it" is answerable — including for the tacit, experiential knowledge that has no external source to check against (how experience becomes agent-ready).
Take the ideas, adapt the schemas, ignore what doesn't fit.
In June 2026 Google Cloud published the Open Knowledge Format (OKF) — a vendor-neutral standard that represents knowledge as a directory of markdown files with YAML frontmatter, formalizing the "LLM wiki" pattern popularized by Andrej Karpathy. OKF standardizes the substrate: how a single knowledge file is shaped and how files link.
This methodology is complementary and sits above that substrate. OKF answers "what does one knowledge file look like, and how do files reference each other?" The patterns here answer the next questions:
- How do you route a query to the right kind of knowledge (structured lookup vs. thematic navigation vs. verbatim source)?
- How do you turn a long reference document into something an agent navigates deterministically instead of chunking it badly (Book-to-Skill)?
- How do you govern reliability — what's verified, what's contested, what's superseded — across the whole corpus (LLM Wiki governance)?
- How do you capture tacit expertise (the practitioner's craft), not just codified facts?
If you adopt OKF for your files, the patterns in methodology/ tell you what to build on top of them. They predate OKF in this system but map onto it cleanly.
Naive RAG over a pile of PDFs is easy to demo and expensive to operate: chunk roulette, non-determinism where you want a single right answer, and no governance layer for "is this still true?". The core move is to stop asking one vector index to be good at everything, and separate three jobs into three cooperating layers:
                  ┌──────────────────────────────────────────────┐
     agent asks   │  router: which kind of question is this?     │
    ────────────► │  structured?  thematic?  verbatim?           │
                  └───────┬───────────────┬───────────────┬──────┘
                          ▼               ▼               ▼
                  ┌─────────────┐ ┌───────────────┐ ┌──────────────┐
                  │ L1: DB +    │ │ L2: Skill     │ │ L3: Files    │
                  │ vector      │ │ files         │ │ (source of   │
                  │ (hybrid)    │ │ (book-to-     │ │ truth)       │
                  │             │ │ skill)        │ │              │
                  └─────────────┘ └───────────────┘ └──────────────┘
- L1 — Structured store + vectors (hybrid search). Relational DB + a vector extension, exposed to agents through a small set of typed tools (e.g. an MCP server), not raw SQL. Answers "how many of X?", "look up Y", ranked semantic search with a score you can threshold on.
- L2 — Skill files (navigable knowledge). Long documents are restructured, not chunked, into a small file tree the agent navigates deterministically. See Book-to-Skill.
- L3 — Filesystem (verbatim source of truth). Original documents as plain files with YAML frontmatter. Everything upstream is a rendering of this layer; this is what you cite.
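As a rough illustration of routing by question type and of the L1 score threshold described above, here is a minimal Python sketch. Everything named in it is an assumption made for the example: the cue phrases, the `Layer`, `Candidate`, `route`, and `hybrid_search` names, the linear blend of keyword and vector scores, and the 0.6 cut-off. None of it is taken from the production system, and a real router would more likely use a classifier or an LLM call than substring cues.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    STRUCTURED = "L1"  # DB + vectors behind typed tools
    THEMATIC = "L2"    # skill files navigated deterministically
    VERBATIM = "L3"    # original files, the text you cite


# Hypothetical cue phrases; substring matching is deliberately naive.
VERBATIM_CUES = ("verbatim", "exact wording", "quote")
STRUCTURED_CUES = ("how many", "look up", "list all")


def route(question: str) -> Layer:
    """Pick one layer per question instead of sending everything to one index."""
    q = question.lower()
    if any(cue in q for cue in VERBATIM_CUES):
        return Layer.VERBATIM
    if any(cue in q for cue in STRUCTURED_CUES):
        return Layer.STRUCTURED
    return Layer.THEMATIC  # default: navigate the skill files


@dataclass(frozen=True)
class Candidate:
    doc_id: str
    keyword_score: float  # assumed already normalised to 0..1
    vector_score: float   # assumed already normalised to 0..1


def hybrid_search(candidates: list[Candidate], alpha: float = 0.5,
                  min_score: float = 0.6) -> list[tuple[float, str]]:
    """Blend both scores, drop weak matches, return (score, doc_id) best first."""
    scored = [
        (alpha * c.vector_score + (1 - alpha) * c.keyword_score, c.doc_id)
        for c in candidates
    ]
    return sorted((hit for hit in scored if hit[0] >= min_score), reverse=True)
```

Keeping the threshold as an explicit parameter means the tool can return an empty list rather than a weak match, which gives the agent a clear signal to try another layer.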
The three layers handle documents and facts. But the knowledge that actually wins the work is often tacit: how an expert sequences an argument, what to concede, which move backfires. It has no external source to verify against — and it normally dies with the person who holds it.
lore treats that experiential knowledge as a first-class, governed stream: captured as claims that start unverified and become trusted only once the expert validates them, and kept honest by an adherence loop that surfaces when an agent's output drifts from the captured craft. This is what turns "we knew that" into "the system knows that" — and it's what the whole methodology is ultimately for.
→ Capturing the practitioner's craft as a governed, applicable knowledge stream
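One way to picture the claim lifecycle described above is as a small state machine. The sketch below is a hypothetical Python model, not the claims schema shipped in templates/: the field names, the `Claim` and `Status` types, the helper `claims_for_agent`, and the rule that a superseded claim cannot be re-validated are all assumptions for illustration. The four status values simply reuse the unverified / verified / contested / superseded vocabulary from this document.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    UNVERIFIED = "unverified"  # every captured claim starts here
    VERIFIED = "verified"
    CONTESTED = "contested"
    SUPERSEDED = "superseded"


@dataclass
class Claim:
    claim_id: str
    text: str
    source_expert: str                   # who the craft was captured from
    status: Status = Status.UNVERIFIED
    validated_by: Optional[str] = None   # provenance as a field
    superseded_by: Optional[str] = None  # supersession as a field

    @property
    def trusted(self) -> bool:
        return self.status is Status.VERIFIED

    def validate(self, expert: str) -> None:
        """Record the expert's sign-off; only then is the claim trusted."""
        if self.status is Status.SUPERSEDED:
            raise ValueError("superseded claims cannot be re-validated")
        self.status = Status.VERIFIED
        self.validated_by = expert

    def supersede(self, new_claim_id: str) -> None:
        """Retire this claim and point at its replacement."""
        self.status = Status.SUPERSEDED
        self.superseded_by = new_claim_id


def claims_for_agent(claims: list[Claim]) -> list[Claim]:
    """Only trusted claims are handed to agents as operational knowledge."""
    return [c for c in claims if c.trusted]
```

Because validation and supersession are recorded on the claim itself, "who validated it" and "is this still true" can be answered by reading fields rather than by asking around.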
Want to use this in your own system? → GETTING-STARTED.md — a 5-step adoption checklist, with:
- templates/: copy-paste skeletons for source frontmatter, SKILL.md, chapters, the claims schema, and Layer-1 typed-tool stubs (templates/mcp-tools/: hybrid-search schema.sql + tools.py).
- starter-example/: a complete, navigable mini-KB (synthetic domain) showing all three layers working together for one document.
| Doc | What it covers |
|---|---|
| methodology/01-three-layer-kb.md | The router and the three layers; how they cooperate at ingestion ("triple") |
| methodology/02-book-to-skill.md | Turning long documents into navigable, progressive-disclosure skill bundles |
| methodology/03-llm-wiki-governance.md | Claims, reliability, and deterministic dashboards: from human memory to institutional memory |
| methodology/04-tacit-expertise-as-knowledge.md | Capturing the practitioner's craft as a governed, applicable knowledge stream |
| reference/frontmatter-schema.md | Frontmatter conventions and a worked, synthetic example |
| GETTING-STARTED.md · templates/ · starter-example/ | Adoption checklist, copy-paste templates, and a runnable-shaped worked example |
- Route by question type. Three jobs, three layers — don't overload one index.
- Determinism where there is a right answer. Navigation inside a document is a table the model follows; reserve vectors for discovery, not traversal.
- Budget tokens explicitly. Master files are cheap to keep present; chapters are pay-per-use.
- Provenance and supersession are fields, not vibes.
- Separate application from governance. Operational knowledge for agents; explanatory, traceable knowledge for humans.
- Restructure, don't chunk the documents that matter.
- Humans in the loop at the seams — ingestion decisions and writes to the shared store sit behind explicit authorization.
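The "navigation is a table" and "budget tokens explicitly" principles above can be sketched together. This is a toy Python model under stated assumptions: the chapter paths, topics, token counts, the 400-token master-file cost, and the `chapters_for` / `load_plan` helpers are all invented for illustration and do not come from the starter example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chapter:
    path: str
    topics: tuple[str, ...]
    tokens: int  # approximate cost of loading the chapter


# Hypothetical navigation table, as a master skill file might list it.
NAV_TABLE = (
    Chapter("chapters/01-scope.md", ("scope", "definitions"), 1200),
    Chapter("chapters/02-deadlines.md", ("deadlines", "extensions"), 1800),
    Chapter("chapters/03-exceptions.md", ("exceptions",), 2500),
)

MASTER_TOKENS = 400  # assumed cost of the always-present master file


def chapters_for(topic: str) -> list[Chapter]:
    """Exact table lookup: the same topic always yields the same chapters."""
    key = topic.strip().lower()
    return [ch for ch in NAV_TABLE if key in ch.topics]


def load_plan(topics: list[str], budget: int) -> tuple[list[str], int]:
    """Choose chapters in table order, skipping any that would exceed the budget."""
    spent = MASTER_TOKENS
    plan: list[str] = []
    for topic in topics:
        for ch in chapters_for(topic):
            if ch.path in plan or spent + ch.tokens > budget:
                continue
            plan.append(ch.path)
            spent += ch.tokens
    return plan, spent
```

With a 3000-token budget, asking for "deadlines" and "exceptions" loads only the deadlines chapter (400 + 1800 = 2200 tokens); the 2500-token exceptions chapter is skipped because it would overshoot the budget. No similarity search is involved in the traversal, so the result is repeatable.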
This is a field report on assembling and operating existing ideas in production, not a claim to have invented the primitives. (Established standards and vocabulary — agent skills, MCP, RAG, hybrid search, embeddings — are used as-is and assumed familiar.) Two patterns this report builds on but did not originate:
- Book-to-Skill (turning a long document into a progressive-disclosure SKILL.md-style bundle) is an existing community pattern, not original to this report. The name and the conversion approach come from the book-to-skill project by @virgiliojr94 (which builds on the Agent Skills standard). Doc 02 documents how it's applied inside a multi-layer KB.
- The "LLM wiki" pattern (curated, linked, maintainable markdown over repeated document search) was popularized by Andrej Karpathy and standardized by Google Cloud's Open Knowledge Format (2026). The governance layer here (doc 03) builds on top of that pattern.
What this report actually contributes is the integration and the operational lessons — not the building blocks: routing by question type across three cooperating layers; a reliability/governance model (claims with status/confidence/evidence, deterministic dashboards, source↔synthesis separation); and a tacit-expertise / adherence-loop framing — all drawn from running a multi-agent KB in a regulated domain.
- Not a library or SDK. There is no code to import.
- Not a maintained product. Treat issues as discussion, not a support channel.
- Not a data release. Nothing here contains client, case, or domain-specific content — only architecture and method.
© 2026 Raffaele Cipro (Principal Product Manager, OpenText). Documentation is released under CC BY 4.0 — reuse freely with attribution. See DISCLAIMER.md for scope and confidentiality boundaries.