Context
Raymond endorsed the sidecar enrichment pattern on 2026-04-17: per-source parquet sidecars keyed by pid, LEFT-JOINed into the wide parquet at build time. This keeps source-specific enrichment out of the canonical PQG pipeline while making it queryable from a single wide table.
Unlike a thumbnail-only sidecar, the schema should carry richer fields that the Explorer and downstream consumers (e.g., Charismatic samples audit #130) actually need.
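The join semantics described above can be sketched as follows. This is only an illustration, not the actual build step: it uses in-memory SQLite tables from the Python standard library as a stand-in for the wide parquet and one sidecar parquet, and the `label` column, the `build_wide_with_sidecar` helper, and the sample pids and URLs are all hypothetical. The point it shows is that a LEFT JOIN on `pid` keeps every wide-table row and leaves sidecar fields NULL where a source has no enrichment yet.

```python
import sqlite3


def build_wide_with_sidecar(wide_rows, sidecar_rows):
    """LEFT JOIN sidecar fields onto wide rows by pid; returns a list of dicts.

    SQLite stands in for parquet here purely so the sketch is self-contained.
    """
    con = sqlite3.connect(":memory:")
    con.row_factory = sqlite3.Row
    # Hypothetical, heavily reduced versions of the two tables.
    con.execute("CREATE TABLE wide (pid TEXT PRIMARY KEY, label TEXT)")
    con.execute(
        "CREATE TABLE sidecar (pid TEXT PRIMARY KEY, thumbnail_url TEXT, license TEXT)"
    )
    con.executemany("INSERT INTO wide VALUES (?, ?)", wide_rows)
    con.executemany("INSERT INTO sidecar VALUES (?, ?, ?)", sidecar_rows)
    rows = con.execute(
        """
        SELECT w.pid, w.label, s.thumbnail_url, s.license
        FROM wide AS w
        LEFT JOIN sidecar AS s ON s.pid = w.pid
        ORDER BY w.pid
        """
    ).fetchall()
    con.close()
    return [dict(r) for r in rows]


# Two canonical records, only one of which has a sidecar entry.
wide = [("pid:a", "Sample A"), ("pid:b", "Sample B")]
sidecar = [("pid:a", "https://example.org/a.jpg", "CC-BY-4.0")]
joined = build_wide_with_sidecar(wide, sidecar)
```

Both records survive the join; the one without a sidecar row simply carries NULL (`None`) enrichment fields, which is what lets sources be added incrementally.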
Sidecar schema (minimum)
| Field | Type | Notes |
| --- | --- | --- |
| pid | string | Join key |
| thumbnail_url | string | Image URL for Explorer card & homepage showcase |
| license | string | SPDX or source-specific license identifier |
| is_public | bool | Whether record can be surfaced publicly (embargo handling) |
| media_url | string | Full-res media, if distinct from thumbnail |
| harvested_at | timestamp | When sidecar was generated (provenance) |
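A per-row check against this minimum schema might look like the sketch below. It is an assumption-laden illustration rather than an agreed validator: the `SIDECAR_SCHEMA` mapping and `validate_row` helper are hypothetical names, Python types stand in for the parquet types, and the nullability rule (only `pid` must be non-null, every other field may be null) is a guess that the issue does not state.

```python
from datetime import datetime, timezone

# Minimum sidecar fields from the table above, mapped to Python stand-in types.
SIDECAR_SCHEMA = {
    "pid": str,
    "thumbnail_url": str,
    "license": str,
    "is_public": bool,
    "media_url": str,
    "harvested_at": datetime,
}


def validate_row(row):
    """Return a list of human-readable problems; an empty list means the row passes."""
    problems = []
    for name, expected in SIDECAR_SCHEMA.items():
        if name not in row:
            problems.append(f"missing field: {name}")
            continue
        value = row[name]
        if value is None:
            # Assumption: only the join key is required to be non-null.
            if name == "pid":
                problems.append("pid must not be null")
            continue
        if not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


good_row = {
    "pid": "pid:a",
    "thumbnail_url": "https://example.org/a_thumb.jpg",
    "license": "CC-BY-4.0",
    "is_public": True,
    "media_url": None,  # same image as the thumbnail, so left empty
    "harvested_at": datetime(2026, 4, 17, tzinfo=timezone.utc),
}
bad_row = {"pid": None, "thumbnail_url": "https://example.org/b.jpg", "is_public": "yes"}

good_problems = validate_row(good_row)
bad_problems = validate_row(bad_row)
```

Running a check like this when each sidecar is generated would catch schema drift per source before the build-time join.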
Priority order (per Raymond)
- Smithsonian IPT (partner-first — relationship-driven priority)
- GEOME
- SESAR
- OpenContext: already done (reference implementation)
Acceptance
- thumbnail_url populated for all four charismatic samples (unblocks "Showcase samples: make the 4 front-page images locate themselves on the globe" #130)
Downstream consumers
Related