From dc00a3b93554924cfaebfa84b27928a1685e7bac Mon Sep 17 00:00:00 2001
From: Raymond Yee
Date: Fri, 24 Apr 2026 07:46:24 -0700
Subject: [PATCH] docs(pubs): expand GitHub Repositories section
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The current listing has four entries and no framing of how they relate.
In practice iSamples is a four-tier pipeline:

    metadata + vocabularies → pqg → data.isamples.org/Zenodo → consumers

but the previous table didn't show this and was missing two of the five
core repos (examples/pqg).

Specifically:

- Added `examples` (the Python client + notebooks) and `pqg` (the
  property-graph parquet framework); both are core consumer/serialization
  repos the previous table omitted.
- Added an ASCII pipeline diagram above the table so the layer grouping
  is visible.
- Fixed the `vocabularies` link: it previously pointed at a subdir of
  `metadata`; the actual repo is `isamplesorg/vocabularies`.
- Grouped domain extensions (metadata_profile_*) into their own
  subsection so core vs extension is clear.
- Split isamples_inabox into a "Legacy / infrastructure" subsection with
  a note that the API was offline as of Aug 2025 + Solr schema as
  query-dimension precedent.
- Added cross-links to query-spec.qmd and SERIALIZATIONS.md as the
  companion docs that document the substrate itself.
- Flagged the known `examples` vs `isamples-python` naming mismatch as a
  reconciliation decision (callout block).

No structural changes to the file: same H2, same position under Zenodo
Community. Just replacing the inner table with layered listings and a
diagram.
---
 pubs.qmd | 57 ++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 49 insertions(+), 8 deletions(-)

diff --git a/pubs.qmd b/pubs.qmd
index 698d538..9be9452 100644
--- a/pubs.qmd
+++ b/pubs.qmd
@@ -34,11 +34,52 @@ The [iSamples Zenodo Community](https://zenodo.org/communities/isamples) archive
 
 ## GitHub Repositories {.unnumbered}
 
-All iSamples source code is available on [GitHub](https://github.com/isamplesorg/):
-
-| Repository | Description |
-|------------|-------------|
-| [isamplesorg.github.io](https://github.com/isamplesorg/isamplesorg.github.io) | This website — Quarto, Observable JS, DuckDB-WASM |
-| [metadata](https://github.com/isamplesorg/metadata) | iSamples metadata model and schema documentation |
-| [isamples_inabox](https://github.com/isamplesorg/isamples_inabox) | iSamples-in-a-Box server infrastructure |
-| [iSamples vocabularies](https://github.com/isamplesorg/metadata/tree/develop/vocabulary) | SKOS vocabulary RDF files |
+All iSamples source code is available at the [isamplesorg GitHub org](https://github.com/isamplesorg/). The repositories form a tight pipeline from **schema** through **serialization** to **consumers**:
+
+```
+metadata + vocabularies        ← canonical data model & SKOS terms
+        │
+        ▼
+       pqg                     ← property-graph parquet format + tooling
+        │
+        ▼
+ data.isamples.org + Zenodo    ← published parquet snapshots (narrow, wide, H3, lite, facet caches)
+        │
+ ┌──────┴──────┐
+ ▼             ▼
+examples   isamplesorg.github.io
+(Python)   (Web + DuckDB-WASM + Cesium)
+```
+
+### Core repositories {.unnumbered}
+
+| Repository | Role | Layer |
+|---|---|---|
+| [metadata](https://github.com/isamplesorg/metadata) | Canonical data model — the 8 entity types (MaterialSampleRecord, SamplingEvent, SamplingSite, GeospatialCoordLocation, …) and their relationships | schema |
+| [vocabularies](https://github.com/isamplesorg/vocabularies) | SKOS vocabularies for material type, context, and specimen categories | schema |
+| [pqg](https://github.com/isamplesorg/pqg) | Property-graph Parquet format spec + conversion tooling (narrow ↔ wide); H3 augmentation and facet caches | serialization |
+| [examples](https://github.com/isamplesorg/examples) | Python client and Jupyter notebooks — DuckDB + lonboard for interactive analysis. Also known as `isamples-python` (see below) | consumer |
+| [isamplesorg.github.io](https://github.com/isamplesorg/isamplesorg.github.io) | This documentation site — Quarto, Observable, browser-side DuckDB-WASM, Cesium globe | consumer |
+
+### Domain extensions {.unnumbered}
+
+Domain-specific vocabularies extend the core terms via `skos:broader`:
+
+- [metadata_profile_earth_science](https://github.com/isamplesorg/metadata_profile_earth_science) — mineral groups, rock/sediment types, sampled-feature roles
+- [metadata_profile_biology](https://github.com/isamplesorg/metadata_profile_biology) — sampled-feature extensions for biological specimens
+- [metadata_profile_archaeology](https://github.com/isamplesorg/metadata_profile_archaeology) — OpenContext-style material and object-type extensions
+
+### Legacy / infrastructure {.unnumbered}
+
+- [isamples_inabox](https://github.com/isamplesorg/isamples_inabox) — the original iSamples-in-a-Box server (Solr + FastAPI). The public [iSamples Central](https://central.isample.xyz/isamples_central/) API was offline as of August 2025; the Solr schema there remains the authoritative precedent for query-dimension names (see [Query Specification](query-spec.qmd))
+
+### Related documents {.unnumbered}
+
+- [Query Specification](query-spec.qmd) — substrate-neutral query contract (v0.1)
+- [Serialization catalog](SERIALIZATIONS.md) — every published parquet file with role, size, upstream, and consumer
+
+::: {.callout-note}
+### Naming note: `examples` vs `isamples-python`
+
+The Python client repo is called `examples` on GitHub but is referred to as `isamples-python` in its own README, the Zenodo deposition metadata, and most prose documentation. This mismatch is known and slated for reconciliation — likely a GitHub repo rename with automatic redirects handling prior links.
+:::