Customer due diligence risk checks powered by the Legal Entity Identifier (LEI), open data and open standards - including the Beneficial Ownership Data Standard (BODS).
Try the demo at https://opencheck.world/
You paste in a Legal Entity Identifier. OpenCheck queries GLEIF first, derives every cross-source identifier it can (UK Companies House number, Norwegian organisation number, Irish company registration number, Finnish Y-tunnus, Latvian registration number, Lithuanian entity code, Estonian registry code, Czech IČO, Polish KRS number, Austrian Firmenbuchnummer, Slovak IČO, French SIREN, Dutch KvK number, Swedish organisation number, Swiss UID, Canadian corporation number, Belgian enterprise number, Danish CVR number, Croatian MBS, Australian ACN/ABN, OpenCorporates ID, Wikidata Q-ID, and more), and uses those bridges to fan out across 29 national and international corporate data sources.
Everything maps into BODS v0.4. Cross-source links and risk signals are computed deterministically, and the whole bundle is one click away from a downloadable export (JSON / JSONL / XML / ZIP).
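Because every lookup starts from a pasted LEI, it can be worth rejecting malformed input before any source is queried. The following is a minimal sketch of the ISO 17442 check (20 characters, the last two being ISO 7064 MOD 97-10 check digits). The function names are illustrative and are not taken from OpenCheck's codebase.

```python
import re

# 18 alphanumeric characters followed by two numeric check digits.
_LEI_SHAPE = re.compile(r"[A-Z0-9]{18}[0-9]{2}")


def _to_digits(text: str) -> str:
    # ISO 7064 MOD 97-10: digits stay as they are, letters map A=10 ... Z=35.
    return "".join(str(int(ch, 36)) for ch in text)


def lei_check_digits(base18: str) -> str:
    """Return the two check digits for an 18-character LEI prefix."""
    remainder = int(_to_digits(base18.upper() + "00")) % 97
    return f"{98 - remainder:02d}"


def is_valid_lei(candidate: str) -> bool:
    """True if the string has the LEI shape and passes the mod-97 check."""
    lei = candidate.strip().upper()
    if not _LEI_SHAPE.fullmatch(lei):
        return False
    return int(_to_digits(lei)) % 97 == 1
```

A pre-flight check like this only proves the string is well formed. Whether the LEI is actually issued still has to come from GLEIF.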
The risk-signal layer mirrors the EU AMLA draft customer due diligence regulatory technical standards conditions for "complex corporate structures" — trust/arrangement, non-EU jurisdiction, nominee, ≥3 ownership layers, plus the composite threshold rule and an advisory mirror of the subjective obfuscation condition.
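To make the listed conditions concrete, here is a rough sketch of how they could be evaluated deterministically over an ownership graph. Everything in it is an assumption made for illustration: the `Party` shape, the signal names, the way layers are counted (ownership edges on the longest chain above the subject), and the composite rule (the layer threshold plus at least one other condition). OpenCheck's real rules and the wording of the draft RTS may differ.

```python
from dataclasses import dataclass

# EU-27 member states as ISO 3166-1 alpha-2 codes.
EU_MEMBER_STATES = frozenset(
    "AT BE BG HR CY CZ DK EE FI FR DE GR HU IE IT LV LT LU MT NL PL PT RO SK SI ES SE".split()
)


@dataclass(frozen=True)
class Party:
    jurisdiction: str             # ISO 3166-1 alpha-2 code
    is_arrangement: bool = False  # trust or similar legal arrangement
    is_nominee: bool = False


def ownership_layers(owners, subject, seen=frozenset()):
    """Longest chain of ownership edges above `subject`; cycles are cut."""
    seen = seen | {subject}
    return max(
        (1 + ownership_layers(owners, o, seen)
         for o in owners.get(subject, ()) if o not in seen),
        default=0,
    )


def upstream(owners, subject):
    """Every direct or indirect owner of `subject`."""
    found, stack = set(), [subject]
    while stack:
        for owner in owners.get(stack.pop(), ()):
            if owner != subject and owner not in found:
                found.add(owner)
                stack.append(owner)
    return found


def complex_structure_signals(parties, owners, subject):
    """Evaluate the four objective conditions plus an assumed composite rule."""
    chain = [parties[p] for p in sorted(upstream(owners, subject))]
    signals = {
        "TRUST_OR_ARRANGEMENT": any(p.is_arrangement for p in chain),
        "NON_EU_JURISDICTION": any(p.jurisdiction not in EU_MEMBER_STATES for p in chain),
        "NOMINEE": any(p.is_nominee for p in chain),
        "THREE_OR_MORE_LAYERS": ownership_layers(owners, subject) >= 3,
    }
    # Assumed composite rule: layer threshold AND at least one other condition.
    others = [v for k, v in signals.items() if k != "THREE_OR_MORE_LAYERS"]
    signals["COMPOSITE_COMPLEX_STRUCTURE"] = signals["THREE_OR_MORE_LAYERS"] and any(others)
    return signals
```

Here `owners` maps each entity id to the ids of its direct owners. The function has no randomness or I/O, so the same graph always yields the same signals.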
Latest: Phase 53 — AI summaries: a grounded, source-cited narrative of each entity
An on-demand, plain-English summary of what OpenCheck found about an entity, written for a customer-due-diligence / financial-crime audience — where every statement is grounded in OpenCheck's own data. The model only rephrases a pre-built evidence packet (it never retrieves or infers), and a citation validator drops any claim it can't tie to a source, so "no unprovable information" is enforced in code, not just in the prompt.
- Evidence packet, not raw data. `build_evidence_packet()` distils a lookup result into atomic, already-evidenced facts (each carrying its source, BODS statement ids and a confidence derived from source authority), structured risk items, sources consulted, and gaps. This packet is the only thing the model sees.
- Cited claims + mechanical validator. Claude (`claude-sonnet-4-6`, structured output, low temperature) returns one executive paragraph plus per-claim citations; `validate_narrative()` withholds anything ungrounded. Absence is evidence: clean results and gaps are themselves citable, so the model never fabricates a citation.
- `GET /narrative`. Reuses the cached lookup pipeline (so the summary can't diverge from the page), runs off the event loop, validates, and returns the packet for UI linking. Flag- and key-gated.
- On-demand UI. A summary panel at the top of the result page with per-claim citation chips; clicking a chip scrolls to and flashes the source card and highlights the cited BODS node.
- Offline eval harness. A versioned prompt, six synthetic golden packets, and a machine-checkable rubric for iterating wording before any UI ships: `scripts/eval_narrative.py`.
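The "mechanical validator" idea above can be illustrated with a small sketch: a claim survives only if it cites at least one fact and every cited id exists in the packet. The `Fact` and `Claim` shapes and the `split_grounded` helper are hypothetical stand-ins, not the actual signatures of `build_evidence_packet()` or `validate_narrative()`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    fact_id: str  # hypothetical id scheme, e.g. "gleif:legal-name"
    text: str
    source: str


@dataclass(frozen=True)
class Claim:
    text: str
    citations: tuple[str, ...]


def split_grounded(facts, claims):
    """Partition claims into (kept, withheld) by checking citations against the packet."""
    known = {fact.fact_id for fact in facts}
    kept, withheld = [], []
    for claim in claims:
        # A claim with no citations, or with any unknown citation, is withheld.
        grounded = bool(claim.citations) and all(c in known for c in claim.citations)
        (kept if grounded else withheld).append(claim)
    return kept, withheld
```

Modelling a gap or a clean result as an ordinary fact (for example a fact whose text is "no sanctions match found") is what lets a statement about absence carry a real citation instead of an invented one.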
Previous: Phase 52 — GEM GEOT project-level ESG data
The backend ships with cache-first dispatch: in stub mode (no API keys, no OPENCHECK_ALLOW_LIVE) every adapter returns deterministic placeholder data. Live mode is opt-in per source via env vars.
```bash
cp .env.example .env
docker compose up --build
```

- Frontend: http://localhost:5173
- Backend: http://localhost:8000 (OpenAPI docs at `/docs`)
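As an illustration of cache-first dispatch with a stub fallback, the sketch below checks a cache, then calls the live source only when live mode is enabled, and otherwise returns a deterministic placeholder. `OPENCHECK_ALLOW_LIVE` comes from the text above; the per-source `<SOURCE>_API_KEY` variable, the "any non-empty value enables live mode" rule, and all function names are assumptions, so check `.env.example` for the real variables.

```python
import hashlib
import os


def stub_record(source, identifier):
    """Deterministic placeholder: the same inputs always give the same record."""
    digest = hashlib.sha256(f"{source}:{identifier}".encode()).hexdigest()[:8]
    return {"source": source, "id": identifier, "name": f"Stub entity {digest}", "live": False}


def live_enabled(source, env):
    # Assumption: any non-empty OPENCHECK_ALLOW_LIVE enables live mode, and each
    # source needs its own key in a hypothetical <SOURCE>_API_KEY variable.
    return bool(env.get("OPENCHECK_ALLOW_LIVE")) and bool(env.get(f"{source.upper()}_API_KEY"))


def dispatch(source, identifier, fetch_live, cache, env=None):
    """Cache first, then the live source if enabled, otherwise the stub."""
    env = os.environ if env is None else env
    key = (source, identifier)
    if key in cache:
        return cache[key]
    if not live_enabled(source, env):
        # Stubs are cheap and deterministic, so they are not cached here.
        return stub_record(source, identifier)
    cache[key] = fetch_live(identifier)
    return cache[key]
```

Leaving stubs out of the cache means that turning live mode on later is not masked by an earlier placeholder result.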
Backend:

```bash
cd backend
uv sync
uv run uvicorn opencheck.app:app --reload --port 8000
```

Frontend:

```bash
cd frontend
npm install
npm run dev
```

The first frontend build copies bundled images for `@openownership/bods-dagre` into `public/bods-dagre-images/`. If they're missing, run `npm run build` once.
| Page | Contents |
|---|---|
| How it works | Step-by-step lookup flow, per-adapter detail, Open Ownership BODS bundles, API surface, project structure |
| Sources | Full adapter table — 26 active sources plus inactive bulk-only adapters, license, entry point, description |
| Risk signals | All 12 signal codes: source-derived, AMLA CDD RTS, FATF jurisdiction, cross-source name match, ICIJ Offshore Leaks |
| Configuration | Environment variables, Render deployment, running the test suite |
| Development history | All 53 phases |
OpenCheck's own code is MIT-licensed. Data retrieved from third-party sources is licensed under each source's own terms — see ATTRIBUTIONS.md. Downloaded exports include a LICENSES.md listing every source that contributed data, with re-use guidance for the most-restrictive licence in the bundle.
The frontend also uses the Beneficial Ownership Visualisation System design tokens and @openownership/bods-dagre, both © Open Ownership and re-used under CC BY 4.0 / Apache 2.0 respectively.
- Live opentender.eu integration: the adapter is wired but `live_available=False` for now.
- A "complex offshore" demo subject that fires every AMLA chip simultaneously.
- BODS RDF / SPARQL backbone via Oxigraph: load the assembled BODS bundle into a triple store, expose `/sparql` for the published Open Ownership red-flag queries.
Open issues and discussion live in the GitHub repo.
- Beneficial Ownership Data Standard (BODS)
- BODS RDF vocabulary 0.4: the `risk.py` rules are designed to be portable to a SPARQL/Oxigraph backbone.
- GODIN (Global Open Data Integration Network): the LEI-as-connector vision OpenCheck is built around.
- AMLA draft CDD RTS public consultation.
- Open Ownership red flags in BODS data and risk-detection across BO + procurement + sanctions.