Cheap access-only post-consolidate recompute (bulletproof cross-WSG mapping_code parity, efficiently) #205

## Problem

The #175 tunnel-free per-segment **mapping_code parity** needs a **post-consolidate recompute**. A WSG's accessibility (and hence its `mapping_code` token1/token2) depends on whether a blocking barrier exists *downstream*, possibly in a different WSG (the provincial-accumulation property, `RUNBOOK.md` §5). When WSGs are distributed across machines, each machine holds only its own bucket's barriers while it runs, so a WSG's access is computed against an **incomplete** barrier set.
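A minimal Python sketch of the failure mode, not the pipeline's own R code. The segment ids, the WSG codes `AAA`/`BBB`, and the rule "blocked if a barrier sits on the segment or anywhere downstream" are hypothetical simplifications for illustration only.

```python
# Toy network: each segment points to its downstream neighbour (None = outlet).
# WSG AAA drains into WSG BBB. All ids and codes are hypothetical.
DOWNSTREAM = {"a3": "a2", "a2": "a1", "a1": "b2", "b2": "b1", "b1": None}
WSG = {"a1": "AAA", "a2": "AAA", "a3": "AAA", "b1": "BBB", "b2": "BBB"}


def is_accessible(segment, barriers):
    """True when no barrier sits on the segment or anywhere downstream of it."""
    current = segment
    while current is not None:
        if current in barriers:
            return False
        current = DOWNSTREAM[current]
    return True


full_barriers = {"b1"}  # the only blocking barrier lives in WSG BBB

# A host that only processes AAA holds only AAA's barriers while it runs.
host_barriers = {seg for seg in full_barriers if WSG[seg] == "AAA"}

aaa_segments = [seg for seg in WSG if WSG[seg] == "AAA"]
per_host = {seg: is_accessible(seg, host_barriers) for seg in aaa_segments}
recomputed = {seg: is_accessible(seg, full_barriers) for seg in aaa_segments}

print(per_host)    # every AAA segment looks accessible on the host
print(recomputed)  # every AAA segment is blocked once b1 is known
```

The per-host result is wrong for every AAA segment even though AAA's own barrier set was complete, which is why bucketing alone cannot guarantee parity.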

Caught 2026-05-25 (study-area run): per-host parity was FINA 75.5% / PARA 68.6%; both rose to **99%+** after re-modelling on the full consolidated barrier set. Drainage-closed + DS-first bucketing reduces this but does **not** eliminate it (downstream barriers can be cross-bucket or arrive late in DS-first order). So the correct methodology is: distribute (any bucketing) → consolidate → **recompute** → compare.
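For the final "compare" step, a per-segment parity percentage could be computed along these lines. This is a Python sketch under assumptions: `parity_pct`, the segment ids, and the code values `A`/`B` are hypothetical, and the reference is whatever baseline the comparison uses.

```python
def parity_pct(candidate, reference):
    """Percentage of reference segments whose code matches in candidate."""
    if not reference:
        raise ValueError("reference is empty")
    matches = sum(
        1 for seg, code in reference.items() if candidate.get(seg) == code
    )
    return 100.0 * matches / len(reference)


# Hypothetical per-segment codes for one WSG.
reference = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}
per_host = {"s1": "A", "s2": "A", "s3": "A", "s4": "B"}  # s3 missed a downstream barrier
recomputed = dict(reference)  # after recompute on the consolidated barrier set

print(parity_pct(per_host, reference))    # 75.0
print(parity_pct(recomputed, reference))  # 100.0
```

Comparing before the recompute measures the bucketing artefact rather than the model, so the comparison belongs after it.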

The orchestrator has no recompute today because the provincial run compares habitat-km *rollups*, not per-segment mapping_code; the per-segment comparison is new to #175.

## Current state (link#175, `study_area_run.sh`)

The driver recomputes diverged WSGs via the **full pipeline** (`wsg_run_one.R` = `lnk_pipeline_run(mapping_code=TRUE)`). That works but is slow and architecturally wasteful: re-running the full pipeline on the dispatcher re-derives `streams` + `streams_habitat` (which are already correct and persisted) just to redo `streams_access` + `streams_mapping_code`. Recomputing ALL WSGs via the full pipeline would defeat distribution: the dispatcher would redo everything, roughly equivalent to running the whole job single-host twice.

## Ask: a cheap access-only recompute

Add a recompute path that redoes **only `lnk_pipeline_access` + `lnk_mapping_code`** for a WSG, **reusing the consolidated persisted `streams`, `streams_habitat_<sp>`, and `barriers`** — no full working-schema rebuild. Then post-consolidate `recompute-ALL` becomes cheap (~seconds/WSG) → **bulletproof correctness regardless of machine count or WSG bucketing**, efficiently. That is the durable methodology #175 is after.
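The shape of the ask can be sketched as a toy staged pipeline in Python. Every function name, the `persist` dict, and the `habitat;ACCESS` code format are hypothetical stand-ins, not the real `lnk_*` signatures or the real token format; the access rule is also simplified to ignore downstream traversal.

```python
calls = []  # records which stages actually ran


def build_streams(wsg):
    calls.append("streams")
    return {f"{wsg}-1": None}  # segment -> downstream segment (toy)


def build_habitat(streams):
    calls.append("habitat")
    return {seg: "spawning" for seg in streams}


def compute_access(streams, barriers):
    calls.append("access")
    return {seg: seg not in barriers for seg in streams}


def compute_mapping_code(habitat, access):
    calls.append("mapping_code")
    return {
        seg: f"{habitat[seg]};{'ACCESS' if access[seg] else 'BLOCKED'}"
        for seg in habitat
    }


def full_pipeline(wsg, barriers, persist):
    """Pass 1: derive everything and persist the barrier-independent outputs."""
    streams = build_streams(wsg)
    habitat = build_habitat(streams)
    persist[wsg] = {"streams": streams, "habitat": habitat}
    return compute_mapping_code(habitat, compute_access(streams, barriers))


def access_only_recompute(wsg, barriers, persist):
    """Post-consolidate: reuse persisted outputs, redo only access + mapping code."""
    stored = persist[wsg]
    access = compute_access(stored["streams"], barriers)
    return compute_mapping_code(stored["habitat"], access)


persist = {}
first = full_pipeline("AAA", set(), persist)  # host saw no barriers
calls.clear()
second = access_only_recompute("AAA", {"AAA-1"}, persist)  # consolidated barriers

print(calls)   # ['access', 'mapping_code']
print(second)  # {'AAA-1': 'spawning;BLOCKED'}
```

Only the two barrier-dependent stages run in the second pass, which is what would make a recompute over every WSG affordable.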

Blocker (per the #175 Plan-agent review): `lnk_pipeline_access` currently needs `observations` + `crossings` working-schema artifacts (not persisted). Options:
- persist the access inputs (the per-species `barriers_<sp>_access` views may already be persist-backed; audit what is actually missing), OR
- a `lnk_pipeline_access` variant that reads its inputs from persist, OR
- pre-persist barriers only in pass 1 + a single full pass 2 with complete barriers (relates to #196 double-persist optimization).

`lnk_mapping_code()` (portable, schema-aware) already recomputes mapping_code from persist tables — the missing half is the persist-based access recompute.

## Refs
- link#175 (study-area mapping_code parity; `research/study_area_run.md`), #196 (double-persist / pre-persist barriers), #204 (persist shape-drift), RUNBOOK.md §5.
- SRED: NewGraphEnvironment/sred-2025-2026#24.

