Context
Post-#187 <persist_schema>.streams_access carries per-species access cols (has_barriers_<sp>_dnstr, access_<sp>) but drops the per-barrier-source flag cols that lnk_pipeline_mapping_code needs to classify the second mapping_code token:
- has_barriers_anthropogenic_dnstr
- has_barriers_pscis_dnstr
- has_barriers_dams_dnstr
- has_barriers_remediations_dnstr
- dam_dnstr_ind (sequence-aware dam-detection; resident flavor)
- remediated_dnstr_ind
These columns exist in the working schema's streams_access (lnk_pipeline_access writes them when barrier_sources is passed), but #187 Phase 2 dropped them from cols_streams_access_base on the incorrect reasoning that "they're conditional on remediations/observations". They are conditional only on the barrier_sources arg, which lnk_pipeline_run always passes.
Impact
lnk_mapping_code reads SELECT * FROM <persist_schema>.streams_access and finds the per-source columns absent → lnk_pipeline_mapping_code's has() helper returns FALSE → any_anth = any_pscis = any_dam = FALSE → second token defaults to NONE.
Empirically, PARS BT 2026-05-19:
- link: ACCESS;NONE 4412 rows, ACCESS;NONE;INTERMITTENT 7080 rows
- bcfp: ACCESS;DAM 4411, ACCESS;DAM;INTERMITTENT 5457, ACCESS;MODELLED 1775, ACCESS;MODELLED;INTERMITTENT 3113, ACCESS;ASSESSED 900, ACCESS;ASSESSED;INTERMITTENT 1100
Match_pct vs bcfp drops from historic 98%+ to ~30-50%.
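A minimal R sketch of the failure mode described above, under stated assumptions: flag() is a hypothetical stand-in for the has() probe in lnk_pipeline_mapping_code, and the source-to-token mapping and precedence (dams to DAM, then pscis to ASSESSED, then anthropogenic to MODELLED) are guesses for illustration only, not the package's actual chain. It shows that once the per-source columns are missing from the persisted table, every second token collapses to NONE.

```r
# Hypothetical stand-in for the has() probe: returns the column as a logical
# vector when present, or all-FALSE when the column is absent.
flag <- function(df, col) {
  if (col %in% names(df)) df[[col]] %in% TRUE else rep(FALSE, nrow(df))
}

# Illustrative second-token classifier. The precedence order and the
# source-to-token mapping are assumptions, not the real implementation.
second_token <- function(df) {
  any_dam   <- flag(df, "has_barriers_dams_dnstr")
  any_pscis <- flag(df, "has_barriers_pscis_dnstr")
  any_anth  <- flag(df, "has_barriers_anthropogenic_dnstr")
  ifelse(any_dam, "DAM",
    ifelse(any_pscis, "ASSESSED",
      ifelse(any_anth, "MODELLED", "NONE")))
}

# Toy working-schema table that still carries the per-source flags.
working <- data.frame(
  segment_id = 1:4,
  has_barriers_dams_dnstr          = c(TRUE,  FALSE, FALSE, FALSE),
  has_barriers_pscis_dnstr         = c(FALSE, TRUE,  FALSE, FALSE),
  has_barriers_anthropogenic_dnstr = c(FALSE, TRUE,  TRUE,  FALSE)
)

# Toy persisted table after the per-source columns were dropped.
persisted <- working["segment_id"]

tokens_working   <- second_token(working)
tokens_persisted <- second_token(persisted)
```

With the flags present the toy rows classify into four different tokens; against the stripped table the probe finds nothing and all four rows fall through to NONE, which is the pattern seen in the link counts.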
Design choices
Column-naming inconsistency (pre-existing; not introduced by #187):
- has_barriers_<source>_dnstr (boolean prefix has_, suffix _dnstr)
- dam_dnstr_ind / remediated_dnstr_ind (no has_ prefix, suffix _ind for "indicator")
Two patterns coexist. Could rename to one convention (e.g., all has_<thing>_dnstr or all <thing>_dnstr_ind), but that's a separate cleanup. Scope of this fix: add the columns with existing names so lnk_pipeline_mapping_code's column probes find them.
Where to put the persist DDL:
A. Extend cols_streams_access_base directly with all 6 columns. Mirrors the bundle-wide DDL approach from #194 (lnk_persist_init: wide tables sized to active_species break on heterogeneous WSG runs).
B. New .lnk_cols_streams_access_source_flags() generator (analogous to .lnk_cols_streams_access_per_sp(species) from #187). Keeps per-source separated from base.
Picking B: future per-source extension (new barrier classes) lives in one place and mirrors the per-species helper pattern. Less coupling between "scalar base" and "source-driven".
Defaults: source classes are hardcoded today (anthropogenic/pscis/dams/remediations). #189-style data-driving is deferred; it would mean a parameters_barrier_sources.csv declaring which sources exist per bundle. Not in this hotfix.
Acceptance
- R/lnk_persist_init.R: new .lnk_cols_streams_access_source_flags() helper returns a named vector for has_barriers_<source>_dnstr (boolean, 4 sources) + dam_dnstr_ind (boolean) + remediated_dnstr_ind (boolean), 6 cols total.
- lnk_persist_init includes those cols in the streams_access CREATE TABLE.
- lnk_pipeline_persist's INSERT projection picks them up automatically (iterates names(access_cols_v)).
- mapping_code_bt distribution shows ACCESS;DAM / ACCESS;MODELLED / ACCESS;ASSESSED tokens (not just ACCESS;NONE). PARS BT match_pct vs bcfp returns to ~98%.
References
- R/lnk_pipeline_mapping_code.R:174-194: column probes that consume the persisted flags.
- R/lnk_pipeline_mapping_code.R:196-210: hardcoded precedence chain (separate issue: rules engine).
- R/lnk_persist_init.R cols_streams_access_base: currently missing the per-source columns.
- R/lnk_pipeline_access.R lines ~250-260: where the per-source flags get emitted.
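
One possible shape for the helper named in the Acceptance list, as a hedged sketch only. The function name and the six column names come from this issue; the sources argument, the choice of column names as vector names with SQL types as values, and the DDL fragment built afterwards are assumptions about how lnk_persist_init might consume it.

```r
# Sketch of the per-source flag generator. Names are column names and values
# are SQL types (assumed layout). The four source classes are hardcoded,
# matching the "Defaults" note in this issue.
.lnk_cols_streams_access_source_flags <- function(
    sources = c("anthropogenic", "pscis", "dams", "remediations")) {
  cols <- c(
    paste0("has_barriers_", sources, "_dnstr"),
    "dam_dnstr_ind",
    "remediated_dnstr_ind"
  )
  stats::setNames(rep("boolean", length(cols)), cols)
}

source_flag_cols <- .lnk_cols_streams_access_source_flags()

# Hypothetical use: render the columns as a fragment for a CREATE TABLE body.
ddl_fragment <- paste(names(source_flag_cols), source_flag_cols,
                      collapse = ",\n  ")
cat(ddl_fragment, "\n")
```

Keeping the source list as an argument leaves room for the deferred #189-style data-driving without changing the callers.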