link-authored bcfp-style rules for KO, RB, GR in bcfishpass config (for link-vs-link comparison on the full species set) #212

Description

@NewGraphEnvironment

Problem

The session that produced v0.41.0–v0.41.3 set up two link-vs-bcfp mapping_code parity runs:

  • Comparison A — link/bcfp config vs bcfp upstream (the fresh.streams_vw_bcfp snapshot). Validates that link reproduces bcfp's methodology. v0.40.5 baseline: 99.66% median across 50 study-area WSGs × 7 species (WCT absent from this study area).
  • Comparison B — link/default config vs link/bcfp config. Isolates the parameter-tuning delta between bcfp-style tuning (faithful to bcfp upstream) and link's "default" tuning (newgraph methodology choices). First run (v0.41.3, default config only): 96.45% median vs bcfp snapshot.

Comparison B is the part this issue is about. To make it apples-to-apples on the full species set, link/bcfp config has to model the same 11 species as link/default (BT, CH, CM, CO, PK, SK, ST, WCT, KO, RB, GR). Today it models 8 (BT, CH, CM, CO, PK, SK, ST, WCT) — its rules.yaml has no top-level keys for KO/RB/GR because bcfishpass/dimensions.csv has no rows for them.

The catch: bcfp upstream doesn't model KO/RB/GR at all — those species aren't in bcfp's pipeline. So any rules we add to link's "bcfishpass" config for them are link-authored guesses, not bcfp upstream. We need a record of that so future readers don't conflate the two.

Why "guesses" can be reasonable

Looking at the diff between bcfishpass/dimensions.csv and default/dimensions.csv for the 7 shared species (excluding CM/PK which are byte-identical), default differs from bcfp in only 4 columns, plus 2 single-species edge cases:

| Column | bcfp value | default value | Hits |
|---|---|---|---|
| rear_lake | no | yes | BT, CH, CO, ST, WCT |
| rear_stream_in_waterbody | no | yes | BT, CH, WCT |
| rear_wetland | no | yes | BT, CH, ST, WCT |
| rear_wetland_polygon | no | yes | BT, CH, CO, ST, WCT |
| rear_all_edges (BT only) | yes | no | BT |
| spawn_connected_lake_adjacent (SK only) | yes | no | SK |

The pattern is tight: default expands rearing habitat (lakes, wetlands, in-waterbody streams); bcfp restricts it. Apply the inverse to default's KO/RB/GR rows and you get reasonable bcfp-style approximations.
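The column-level diff described above could be reproduced with a small script along these lines. This is a minimal sketch, not the project's tooling: the key column name `species_code` and the toy CSV contents are assumptions for illustration, and the real files would be read from disk instead.

```python
import csv
import io


def load_rows(text, key="species_code"):
    """Index CSV rows by species. The key column name is an assumption."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def diff_dimensions(bcfp_text, default_text, key="species_code"):
    """Return {column: {species: (bcfp_value, default_value)}} for shared species."""
    bcfp = load_rows(bcfp_text, key)
    default = load_rows(default_text, key)
    diffs = {}
    # Only species present in both configs can be compared.
    for sp in sorted(bcfp.keys() & default.keys()):
        for col, bcfp_val in bcfp[sp].items():
            if col == key:
                continue
            default_val = default[sp].get(col)
            if bcfp_val != default_val:
                diffs.setdefault(col, {})[sp] = (bcfp_val, default_val)
    return diffs


# Toy stand-ins for bcfishpass/dimensions.csv and default/dimensions.csv.
BCFP = "species_code,rear_lake,rear_wetland\nBT,no,no\nCM,no,no\n"
DEFAULT = "species_code,rear_lake,rear_wetland\nBT,yes,yes\nCM,no,no\nKO,yes,no\n"

result = diff_dimensions(BCFP, DEFAULT)
for col, hits in result.items():
    print(col, hits)
```

Species that exist in only one file (KO in the toy data) are skipped, which mirrors the "shared species" framing of the table.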

Proposed rows

bcfishpass/dimensions.csv — 3 new rows

KO = default's KO row, AS-IS. Kokanee's lake-obligate biology (rear_lake_only=yes) forces both philosophies to the same answer, so the 4 default-vs-bcfp expansion flips don't apply: default's KO already has rear_wetland=no, rear_wetland_polygon=no, rear_stream_in_waterbody=no because kokanee don't use those habitats. Keep rear_lake_ha_min=200; without it every lake would count as rearing habitat, which is wrong.

RB = default's RB with 4 inverse-flips:

  • rear_lake = no
  • rear_stream_in_waterbody = no
  • rear_wetland = no
  • rear_wetland_polygon = no
  • Drop rear_lake_ha_min, rear_wetland_ha_min (irrelevant once the above are no)
  • Everything else mirrors default's RB

Matches bcfp's ST/CO/CH pattern (no-rear-lake, no-rear-wetland, etc.) — RB ≈ ST with resident traits, per project domain knowledge.

GR = default's GR with 2 inverse-flips:

  • rear_lake = no
  • rear_stream_in_waterbody = no
  • (rear_wetland, rear_wetland_polygon already no in default's GR)
  • Drop rear_lake_ha_min
  • Everything else mirrors default's GR

Matches bcfp's CH pattern (no-rear-lake) — GR ≈ resident chinook (large stream, low gradient).
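The three derivations above (KO copied as-is, RB and GR with the expansion flags turned off and the dependent area minima dropped) can be sketched as a single transformation. This is an illustration only: the row values below are toy data, the RB and GR minima are hypothetical, and representing "drop" as an empty string is an assumption about how the CSV encodes an unset value.

```python
# Columns where the issue's table shows bcfp = no and default = yes.
EXPANSION_FLAGS = (
    "rear_lake",
    "rear_stream_in_waterbody",
    "rear_wetland",
    "rear_wetland_polygon",
)

# Area minima that are irrelevant once their controlling flag is "no".
DEPENDENT_MINIMA = {
    "rear_lake": "rear_lake_ha_min",
    "rear_wetland": "rear_wetland_ha_min",
}


def to_bcfp_style(default_row, keep_as_is=False):
    """Derive a bcfp-style row from a default-config row (copy, never mutate)."""
    row = dict(default_row)
    if keep_as_is:  # KO: both philosophies converge, so copy unchanged
        return row
    for flag in EXPANSION_FLAGS:
        if flag in row:
            row[flag] = "no"
    for flag, minimum in DEPENDENT_MINIMA.items():
        if row.get(flag) == "no" and minimum in row:
            row[minimum] = ""  # assumption: empty string means "unset"
    return row


# Toy rows; only rear_lake_ha_min=200 for KO comes from the issue text.
default_rows = {
    "KO": {"rear_lake": "yes", "rear_stream_in_waterbody": "no",
           "rear_wetland": "no", "rear_wetland_polygon": "no",
           "rear_lake_ha_min": "200", "rear_wetland_ha_min": ""},
    "RB": {"rear_lake": "yes", "rear_stream_in_waterbody": "yes",
           "rear_wetland": "yes", "rear_wetland_polygon": "yes",
           "rear_lake_ha_min": "1", "rear_wetland_ha_min": "1"},
    "GR": {"rear_lake": "yes", "rear_stream_in_waterbody": "yes",
           "rear_wetland": "no", "rear_wetland_polygon": "no",
           "rear_lake_ha_min": "1", "rear_wetland_ha_min": ""},
}

bcfp_rows = {
    sp: to_bcfp_style(row, keep_as_is=(sp == "KO"))
    for sp, row in default_rows.items()
}
```

Every column not named in `EXPANSION_FLAGS` or `DEPENDENT_MINIMA` passes through untouched, which corresponds to "everything else mirrors default's row".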

bcfishpass/parameters_fresh.csv — 2 new rows

bcfishpass/parameters_fresh.csv already has an RB row (identical to default's RB — link-authored at config-bundle time). KO and GR are missing.

  • KO row: clone default's KO row (no bcfp-vs-default delta for KO in default's params row)
  • GR row: clone default's GR row (same logic)

Both species are residents, so the cloned rows carry default's resident-style settings (observation_control_apply = TRUE, access_gradient_max = 0.15 for GR, etc.).

bcfishpass/rules.yaml — regenerate

Via data-raw/build_rules.R. Diff the generated rules.yaml to confirm only KO/RB/GR top-level keys appear (no changes to existing 8 species).
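One way to make the "only KO/RB/GR appear" check mechanical is to split the old and new rules.yaml into per-species blocks and compare them. The sketch below uses only the standard library and assumes that top-level keys are unindented `KEY:` lines; the toy YAML content is invented for illustration and is not the real rules format.

```python
import re

# Assumption: a top-level key is an unindented identifier followed by a colon.
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z0-9_]+):")


def split_blocks(yaml_text):
    """Map each top-level key to the raw text of its block."""
    blocks, current = {}, None
    for line in yaml_text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            current = match.group(1)
            blocks[current] = []
        if current is not None:
            blocks[current].append(line)
    return {key: "\n".join(lines).rstrip() for key, lines in blocks.items()}


def compare_rules(old_text, new_text):
    """Report added, removed, and changed top-level blocks."""
    old, new = split_blocks(old_text), split_blocks(new_text)
    return {
        "added": set(new) - set(old),
        "removed": set(old) - set(new),
        "changed": sorted(k for k in old if k in new and old[k] != new[k]),
    }


# Toy content standing in for the old and regenerated bcfishpass/rules.yaml.
OLD = "BT:\n  - rear_lake: no\nCH:\n  - rear_lake: no\n"
NEW = OLD + "KO:\n  - rear_lake: yes\nRB:\n  - rear_lake: no\nGR:\n  - rear_lake: no\n"

report = compare_rules(OLD, NEW)
print(report)
```

A clean regeneration would show `added` equal to the three new species and both `removed` and `changed` empty. A plain text diff gives the same information; this form is just easier to assert on in a script.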

Provenance markers

Every new row's notes column gets a marker like:

link-authored bcfp-style approximation: bcfp upstream does not model KO. Row derived from default config's KO row (kokanee biology forces both philosophies to converge). See link#<this issue>.
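To keep the markers consistent across the new rows, the text could be generated from a template instead of typed by hand. A minimal sketch, assuming the wording above is the template; the function names are hypothetical and the issue reference is left as the same placeholder used in the example marker.

```python
MARKER_PREFIX = "link-authored bcfp-style approximation"
ISSUE_REF = "link#<this issue>"  # placeholder kept as written in the issue


def provenance_marker(species, reason, issue_ref=ISSUE_REF):
    """Build the notes-column marker for one link-authored row."""
    return (
        f"{MARKER_PREFIX}: bcfp upstream does not model {species}. "
        f"Row derived from default config's {species} row ({reason}). "
        f"See {issue_ref}."
    )


def rows_missing_marker(rows, species=("KO", "RB", "GR")):
    """Return species whose notes do not start with the marker prefix."""
    return [sp for sp in species
            if not rows.get(sp, {}).get("notes", "").startswith(MARKER_PREFIX)]


ko_note = provenance_marker(
    "KO", "kokanee biology forces both philosophies to converge"
)
print(ko_note)
```

`rows_missing_marker` is the kind of check that could back the "with provenance markers in notes" acceptance items.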

What this unblocks

  • Comparison B on all 11 species. Currently it's compromised — link/bcfp doesn't model KO/RB/GR, so the run-vs-run diff has nothing to say about them.
  • A baseline for "if bcfp upstream did add KO/RB/GR, here's what link would expect that to look like" — useful if bcfp upstream ever moves in that direction.

Out of scope

  • CT, DV are present in bcfishpass/parameters_fresh.csv AND default/dimensions.csv (rows exist with full habitat rules), but not in bcfishpass/dimensions.csv and not in either config's rules.yaml. Different story — those species are "modelled in default but disabled at rules.yaml generation time." Adding them is a separate concern (and a separate biology / methodology call).
  • Byte-identity of the 8 existing species' mapping_code after the re-run. Per-species barrier computation is independent (each barriers_<sp>_unified only consults that species' access threshold, observation set, habitat overrides), so adding KO/RB/GR rows doesn't touch the barriers for BT, CH, or any other existing species. The 8 SHOULD land byte-identical to the v0.40.5 baseline. If they don't, that's a bug to investigate, not a methodology call.
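The byte-identity expectation for the 8 existing species could be checked by hashing each species' mapping_code rows in the baseline and in the re-run. The sketch below is illustrative only: the `(species, segment_id, mapping_code)` row shape and the sample values are assumptions, not the actual fresh.* schema.

```python
import hashlib
from collections import defaultdict

EXISTING = ("BT", "CH", "CM", "CO", "PK", "SK", "ST", "WCT")


def species_digests(rows):
    """Hash each species' rows; sorting makes the digest order-independent."""
    grouped = defaultdict(list)
    for species, segment_id, mapping_code in rows:
        grouped[species].append(f"{segment_id}\t{mapping_code}")
    return {
        sp: hashlib.sha256("\n".join(sorted(lines)).encode("utf-8")).hexdigest()
        for sp, lines in grouped.items()
    }


def changed_species(baseline_rows, rerun_rows, existing=EXISTING):
    """List existing species whose rows differ between baseline and re-run."""
    base, new = species_digests(baseline_rows), species_digests(rerun_rows)
    return sorted(sp for sp in existing if base.get(sp) != new.get(sp))


# Hypothetical rows: (species, segment_id, mapping_code).
baseline = [("BT", 1, "SPAWN"), ("BT", 2, "REAR"), ("CH", 1, "ACCESS")]
rerun = baseline + [("KO", 1, "REAR"), ("RB", 1, "SPAWN")]

print(changed_species(baseline, rerun))
```

New species in the re-run are ignored because only the names in `existing` are compared, so a non-empty result points at an unexpected change to one of the 8.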

Acceptance

  • 3 rows added to bcfishpass/dimensions.csv (KO/RB/GR) with provenance markers in notes
  • 2 rows added to bcfishpass/parameters_fresh.csv (KO/GR) with provenance markers in notes
  • bcfishpass/rules.yaml regenerated; diff confirms only KO/RB/GR top-level keys added
  • data-raw/study_area_run.sh --config=bcfishpass rerun on the 50 study-area WSGs → 11-species output lands in fresh.*
  • 8 existing species' rows byte-identical to v0.40.5 baseline (sanity check)
  • Linked: a follow-up issue to build the link-vs-link compare (reference = "link" arg in lnk_compare_mapping_code) so Comparison B can run on all 11 species
