Context
The link R package (NewGraphEnvironment/link, v0.38.0) runs across multiple hosts: M4 (driver / orchestrator), M1 (sibling MacBook), and N DigitalOcean cypher droplets. The autonomous CLI (data-raw/wsgs_run_pipeline.sh) dispatches work in parallel across these hosts. Per-host work runs Rscript wsgs_run_host.R, which depends on the link and fresh R packages being installed at compatible versions on every host.
The pre-flight version check at data-raw/wsgs_dispatch.sh:413-456 queries packageVersion("link") and packageVersion("fresh") on all hosts and fails loudly on mismatch. The operator's fix is always the same: Rscript -e 'pak::local_install(upgrade = FALSE, ask = FALSE)' on the lagging host(s).
Today, after each PR merge, the operator manually runs:
pak::local_install on M4 (driver).
ssh m1 'git fetch && git checkout main && git pull && pak::local_install' on M1.
(Cyphers self-install via cypher_prep.sh during Step 4, OK.)
We hit this exact case twice during v0.38.0 work (M4 missed install after merge of v0.38.0 release; M1 was on a feature branch when it should have been on main). Both surfaced via the version-mismatch pre-flight error — good catch, but operator-side fix is friction.
Problem
Frequent active development. Operator works on a branch on M4 → tests dispatch → expects M1 + cyphers to match. Active days have multiple branch checkouts. Each one needs M1 sync + reinstall before the next dispatch.
Version match ≠ code match. packageVersion("link") == "0.38.0" on both M4 and M1 doesn't guarantee the source code matches: both could be at v0.38.0 but on different commits (e.g. M4 on a feature branch with patches not in main; M1 on main). Today's check is version-only, not SHA.
Operator has to remember. The pre-flight error is informative but the fix lives outside the umbrella — operator types the same commands every time.
Goals
Parity hook at pre-flight. Driver (M4) is source of truth. Sibling hosts (M1 + cyphers) must match: same packageVersion + commit SHA for link AND fresh.
Auto-install option. When --auto-install is set (or stdin is non-TTY, i.e. background run), umbrella auto-installs on lagging hosts after detecting mismatch.
TTY prompt. When operator is interactive: pre-flight reports mismatch, prompts [y]es / [n]o / [s]kip for each lagging host. Yes → run the install; no → abort; skip → proceed with the warning (--skip-preflight-like behavior, today's escape hatch).
Approach
Driver's source of truth
RemoteSha is set by pak when installing from a remote. For local checkouts via pak::local_install, RemoteSha may be empty; in that case, fall back to comparing git rev-parse HEAD on the package's source repo.
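A minimal sketch of how the driver might resolve its own SHA with the fallback described above. The function names `resolve_sha` and `driver_pkg_sha` and the `repo_dir` argument are hypothetical, not part of the existing scripts. The wrapper assumes `Rscript` and `git` are on the PATH and reads the `RemoteSha` field through `utils::packageDescription()`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prefer the SHA recorded at install time (RemoteSha),
# otherwise fall back to the HEAD of the package's source checkout.
resolve_sha() {
  local remote_sha="$1" git_sha="$2"
  if [ -n "$remote_sha" ]; then
    printf '%s\n' "$remote_sha"
  else
    printf '%s\n' "$git_sha"
  fi
}

# Hypothetical wrapper: $1 = package name, $2 = path to its source repo.
# Prints an empty RemoteSha when the DESCRIPTION field is absent.
driver_pkg_sha() {
  local pkg="$1" repo_dir="$2" remote_sha git_sha
  remote_sha=$(Rscript -e "d <- utils::packageDescription('$pkg'); cat(if (is.null(d\$RemoteSha)) '' else d\$RemoteSha)")
  git_sha=$(git -C "$repo_dir" rev-parse HEAD)
  resolve_sha "$remote_sha" "$git_sha"
}
```

On the driver this could be called as `DRIVER_LINK_SHA=$(driver_pkg_sha link ~/Projects/repo/link)`, with the repo path taken from the sibling check.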
Sibling check
For each sibling host:
SIB_LINK_VER=$(ssh "$host" 'Rscript -e "cat(as.character(packageVersion(\"link\")))"')
SIB_LINK_SHA=$(ssh "$host" 'cd ~/Projects/repo/link && git rev-parse HEAD')
if [ "$SIB_LINK_VER" != "$DRIVER_LINK_VER" ] || [ "$SIB_LINK_SHA" != "$DRIVER_LINK_SHA" ]; then
  echo "  [parity] mismatch on $host: link ver=$SIB_LINK_VER sha=$SIB_LINK_SHA (driver: ver=$DRIVER_LINK_VER sha=$DRIVER_LINK_SHA)"
  parity_fail=1
fi
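The check above covers link only, while the goals call for link AND fresh. One way to generalize it is to split the comparison from the SSH probe so the same comparison serves both packages. This is a sketch: `parity_compare` and `sibling_pkg_state` are hypothetical names, and the assumption that fresh is checked out next to link under `~/Projects/repo/` is not confirmed by the source.

```shell
#!/usr/bin/env bash
# Compare one package's version + SHA between driver and sibling.
# Prints a [parity] line and returns 1 on mismatch, 0 on match.
parity_compare() {
  local host="$1" pkg="$2" drv_ver="$3" drv_sha="$4" sib_ver="$5" sib_sha="$6"
  if [ "$sib_ver" != "$drv_ver" ] || [ "$sib_sha" != "$drv_sha" ]; then
    echo "  [parity] mismatch on $host: $pkg ver=$sib_ver sha=$sib_sha (driver: ver=$drv_ver sha=$drv_sha)"
    return 1
  fi
  return 0
}

# Hypothetical probe: prints "<version> <sha>" for one package on one host.
# ASSUMPTION: each package's source repo lives at ~/Projects/repo/<pkg>.
sibling_pkg_state() {
  local host="$1" pkg="$2"
  ssh "$host" "Rscript -e 'cat(as.character(packageVersion(\"$pkg\")))' && printf ' ' && git -C ~/Projects/repo/$pkg rev-parse HEAD"
}

# Usage inside the per-host loop (not executed here):
#   read -r sib_ver sib_sha < <(sibling_pkg_state "$host" fresh)
#   parity_compare "$host" fresh "$DRIVER_FRESH_VER" "$DRIVER_FRESH_SHA" \
#     "$sib_ver" "$sib_sha" || parity_fail=1
```

Keeping the comparison free of SSH calls also makes it easy to exercise in a smoke test without any remote host.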
Auto-install decision
AUTO_INSTALL=0
[ ! -t 0 ] && AUTO_INSTALL=1   # non-TTY: auto-install by default
[ "$AUTO_INSTALL_FLAG" = "1" ] && AUTO_INSTALL=1
if [ "$parity_fail" = "1" ]; then
  if [ "$AUTO_INSTALL" = "1" ]; then
    echo "  [parity] auto-install enabled; running pak::local_install on lagging hosts"
    # for each mismatching host: ssh in, git fetch, checkout driver's branch + SHA, pak::local_install
  else
    # interactive prompt loop
    echo "  [parity] interactive mode; please choose action per host"
    # ...
  fi
fi
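The interactive branch is left as a placeholder above. A possible shape for it, together with the mode decision, is sketched below. The names `parity_mode`, `prompt_action`, and `prompt_host` are hypothetical, and treating end-of-input as "abort" is an assumption chosen as the conservative default.

```shell
#!/usr/bin/env bash
# Decide how to react to a mismatch.
# $1 = "1" when --auto-install was passed; $2 = "1" when stdin is a TTY.
parity_mode() {
  local auto_flag="$1" is_tty="$2"
  if [ "$auto_flag" = "1" ] || [ "$is_tty" != "1" ]; then
    echo auto
  else
    echo prompt
  fi
}

# Map an operator's answer to an action: install, abort, skip, or retry.
prompt_action() {
  case "$1" in
    y|Y|yes)  echo install ;;
    n|N|no)   echo abort ;;
    s|S|skip) echo skip ;;
    *)        echo retry ;;
  esac
}

# Ask about one host until a valid answer arrives; prints the chosen action.
# ASSUMPTION: end of input (read fails) is treated as "abort".
prompt_host() {
  local host="$1" answer action
  while true; do
    printf '  [parity] %s is out of sync. Install? [y]es / [n]o / [s]kip: ' "$host" >&2
    read -r answer || { echo abort; return; }
    action=$(prompt_action "$answer")
    if [ "$action" != "retry" ]; then
      echo "$action"
      return
    fi
  done
}
```

A caller could set `is_tty=0; [ -t 0 ] && is_tty=1` and then branch on `$(parity_mode "$AUTO_INSTALL_FLAG" "$is_tty")`. The prompt goes to stderr so that the action printed on stdout can be captured cleanly.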
Branch sync
For sibling hosts to install from the same commit, they need to be on the same branch. Two scenarios:
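Branch is on origin (already pushed): ssh "$host" "git fetch && git checkout $DRIVER_BRANCH && git pull --ff-only". The command is double-quoted so that $DRIVER_BRANCH expands on the driver; inside single quotes it would be expanded on the sibling, where it is not set.
Branch is local-only on driver (operator is mid-PR, not pushed): warn loudly and suggest git push -u origin $DRIVER_BRANCH, or set the --allow-unpushed flag (auto-pushes from driver before sibling sync).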
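The two scenarios can be told apart by comparing the driver's HEAD with the tip of the same branch on origin. The sketch below goes slightly beyond the text by also flagging a branch that exists on origin but whose tip differs from the driver's HEAD, since siblings could not reach the driver's SHA in that case either. `classify_branch` and `driver_branch_state` are hypothetical names.

```shell
#!/usr/bin/env bash
# Classify the driver's branch from two SHAs:
#   $1 = local HEAD, $2 = tip of the same branch on origin ("" if absent).
classify_branch() {
  local local_sha="$1" origin_sha="$2"
  if [ -z "$origin_sha" ]; then
    echo local-only      # branch was never pushed
  elif [ "$origin_sha" != "$local_sha" ]; then
    echo out-of-sync     # pushed, but origin's tip is not the driver's HEAD
  else
    echo pushed          # siblings can fetch exactly this commit
  fi
}

# Hypothetical wrapper: $1 = path to the package's source repo on the driver.
driver_branch_state() {
  local repo_dir="$1" branch local_sha origin_sha
  branch=$(git -C "$repo_dir" rev-parse --abbrev-ref HEAD)
  local_sha=$(git -C "$repo_dir" rev-parse HEAD)
  # ls-remote prints "<sha><TAB><ref>"; no output means the ref is absent.
  origin_sha=$(git -C "$repo_dir" ls-remote origin "refs/heads/$branch" | cut -f1)
  classify_branch "$local_sha" "$origin_sha"
}
```

Only the `pushed` state would let sibling sync proceed directly; the other two would trigger the warning or the --allow-unpushed path.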
Acceptance
data-raw/wsgs_run_pipeline.sh — new --auto-install flag; default-on when stdin is non-TTY.
Pre-flight parity check moves from data-raw/wsgs_dispatch.sh to umbrella's pre-flight block (or a shared helper script).
Compares packageVersion + git rev-parse HEAD for both link and fresh per sibling host.
On mismatch + auto-install: SSHes into the host, fetches and checks out the driver's branch, runs pak::local_install, then re-checks parity.
On mismatch + interactive: prompt per host with [y]es/[n]o/[s]kip.
On mismatch + no auto-install + no TTY (e.g. cron): abort with clear FATAL message + the operator's manual-fix commands.
Unpushed-branch safety: detect when driver's branch is local-only; abort with --allow-unpushed opt-out (auto-pushes from driver before sibling sync).
Smoke tests:
Driver on main + sibling on main, both v0.38.0, same SHA → preflight passes silently.
Driver on feature-branch + sibling on main (different SHA) → mismatch detected; auto-install fires (or prompt fires); after install, parity restored.
Driver on local-only branch + --auto-install set → warn; --allow-unpushed set → push first; without it → FATAL.
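For the abort case in the acceptance list, the FATAL message could print the manual fix for each lagging host. This is a sketch only: `parity_fatal` is a hypothetical name, the message wording is illustrative, and the repo path reuses `~/Projects/repo/link` from the sibling check (a fresh checkout would need the same treatment).

```shell
#!/usr/bin/env bash
# Print a FATAL message plus a copy-pasteable fix per lagging host.
# $1 = driver's branch; remaining arguments = lagging hosts. Returns 1.
parity_fatal() {
  local branch="$1" host
  shift
  echo "FATAL: [parity] hosts out of sync with driver and auto-install is off" >&2
  echo "Manual fix, per host:" >&2
  for host in "$@"; do
    echo "  ssh $host 'cd ~/Projects/repo/link && git fetch && git checkout $branch && git pull --ff-only && Rscript -e \"pak::local_install(upgrade = FALSE, ask = FALSE)\"'" >&2
  done
  return 1
}
```

Everything goes to stderr so the message still shows up when stdout is redirected to a log pipeline.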
References
data-raw/logs/wsgs_run_pipeline/20260515_035053_wsgs_run_pipeline.log: M4 link=0.37.0 (operator forgot install after merge).
data-raw/logs/wsgs_run_pipeline/20260515_035809_prep_job1.log: M1 v0.38.0 ok but on wrong branch.
data-raw/wsgs_dispatch.sh:413-456: current preflight version check.
data-raw/cypher_prep.sh: does the install for cyphers (the model to mirror for M4/M1).
data-raw/snapshot_bcfp.sh: IS baked into the umbrella's Step 1+2 (correct pattern; this issue extends the same approach to package installs).
Out of scope
pak::local_install itself (already works).
--force semantics: separate sibling issue.
state_clean.sh --tables=: separate sibling issue.
cypher_prep.sh pipefail masking: separate sibling issue.