fix: spot_train.sh stages predictor.yaml to experiment-package path + pins ALPHA_ENGINE_EXPERIMENT_ID (config#1066)#267

Merged: cipher813 merged 2 commits into main from fix/spot-train-experiment-path-staging on Jun 14, 2026

@cipher813

Collaborator

What

Companion to alpha-engine-data#427 (SF cp alignment) + the alpha-engine-predictor#265 fail-loud guard. Makes the model-zoo rotation child spot deterministically load a predictor.yaml WITH model_specs populated.

Diagnosed mismatch

The child spot is a bare predictor clone with NO alpha-engine-config tree, so config.py's experiment-first search path (~/alpha-engine-config/experiments/$ALPHA_ENGINE_EXPERIMENT_ID/predictor/predictor.yaml) is absent on the spot and resolution silently falls through to config/predictor.yaml. That coincidence is the 6/13 inert-rotation fragility (MODEL_SPECS empty → select_rotation_specs() → [] → 0 challengers trained; see config#1051).
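For readers without config.py open, the fall-through described above can be sketched roughly as follows. This is an illustration only, not the actual config.py code: the function names, the `alpha-engine-predictor` clone directory, and the two-entry candidate list are assumptions; only the two paths and the `reference` default come from this PR.

```python
import tempfile
from pathlib import Path
from typing import List, Mapping, Optional

DEFAULT_EXPERIMENT_ID = "reference"  # default named in this PR


def candidate_paths(home: Path, repo_root: Path, env: Mapping[str, str]) -> List[Path]:
    """Hypothetical search order: experiment package first, legacy file second."""
    experiment_id = env.get("ALPHA_ENGINE_EXPERIMENT_ID", DEFAULT_EXPERIMENT_ID)
    return [
        home / "alpha-engine-config" / "experiments" / experiment_id / "predictor" / "predictor.yaml",
        repo_root / "config" / "predictor.yaml",
    ]


def resolve_predictor_yaml(home: Path, repo_root: Path, env: Mapping[str, str]) -> Optional[Path]:
    """Return the first candidate that exists on disk, or None."""
    for candidate in candidate_paths(home, repo_root, env):
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "home"
    repo_root = home / "alpha-engine-predictor"  # assumed clone location
    legacy = repo_root / "config" / "predictor.yaml"
    legacy.parent.mkdir(parents=True)
    legacy.write_text("model_specs: []\n")

    # Bare clone: no alpha-engine-config tree, so resolution falls through.
    bare_choice = resolve_predictor_yaml(home, repo_root, {}).relative_to(home).as_posix()

    # Once the experiment-package path exists, it wins.
    experiment = candidate_paths(home, repo_root, {})[0]
    experiment.parent.mkdir(parents=True)
    experiment.write_text("model_specs: []\n")
    staged_choice = resolve_predictor_yaml(home, repo_root, {}).relative_to(home).as_posix()

print(bare_choice)
print(staged_choice)
```

On a bare clone the sketch picks the legacy file without any error, which is the silent fall-through this PR removes the dependence on.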

Fix (deterministic, no config.py edit)

  • Pin ALPHA_ENGINE_EXPERIMENT_ID (default "reference", matching config.py:_EXPERIMENT_ID) and EXPORT it into every spot heredoc that imports config (preflight / smoke / model-zoo / full-training), so resolution is explicit rather than relying on the os.environ default + dir-existence coincidence.
  • Stage the S3-fetched yaml to BOTH the experiment-package path config.py searches FIRST AND config/predictor.yaml (legacy fallback). Byte-identical copies from the same staged source → MODEL_SPECS populates deterministically, and the spot resolves via the SAME path config.py uses on the box.
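The actual staging lives in spot_train.sh (shell), which is not reproduced here. As a rough Python illustration of the same idea, the sketch below copies one staged source to both locations and checks the copies are byte-identical. The helper name, the temp-directory layout, and the sample yaml content are hypothetical.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path
from typing import List


def stage_predictor_yaml(staged_source: Path, home: Path, repo_root: Path, experiment_id: str) -> List[Path]:
    """Copy one staged yaml to the experiment-package path and the legacy path."""
    targets = [
        home / "alpha-engine-config" / "experiments" / experiment_id / "predictor" / "predictor.yaml",
        repo_root / "config" / "predictor.yaml",
    ]
    for target in targets:
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(staged_source, target)
    return targets


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    repo_root = home / "alpha-engine-predictor"  # assumed clone location
    source = home / "predictor.staged.yaml"      # stands in for the S3-fetched file
    source.write_text("model_specs:\n  - name: spec-example\n")  # made-up content

    experiment_id = "reference"  # pinned explicitly rather than inferred
    targets = stage_predictor_yaml(source, home, repo_root, experiment_id)

    staged_count = len(targets)
    # shallow=False forces a byte-by-byte comparison instead of a stat check.
    copies_identical = all(filecmp.cmp(source, t, shallow=False) for t in targets)

print(staged_count, copies_identical)
```

Because both targets come from the same source file, it no longer matters which path the loader reaches first.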

bash 3.2 note: the run_ssm heredocs sit inside "$(cat <<'X' ... X)" command substitution and bash 3.2 scans even a quoted heredoc body for the closing paren — so the in-heredoc comments are kept free of parens and apostrophes (caught by bash -n).

Tests

bash -n clean; full suite green (1481 passed) incl. test_model_zoo (31) + the test_spot_train_* battery (37). No test asserts the exact staging paths.

Companion PR

alpha-engine-data#427 (AUTO-DEPLOYS the SF on merge).

Refs config#1066, config#1051. Re-exam 2026-06-20 — next Saturday rotation must register ≥1 spec-* challenger (the #265 inert-rotation alert fires until then).

🤖 Generated with Claude Code

…th + pins ALPHA_ENGINE_EXPERIMENT_ID (config#1066)

Companion to alpha-engine-data SF cp alignment + the predictor#265 fail-loud
guard. The child spot is a BARE predictor clone with NO alpha-engine-config tree,
so config.py's experiment-first search path
(~/alpha-engine-config/experiments/$ALPHA_ENGINE_EXPERIMENT_ID/predictor/predictor.yaml)
is absent on the spot and resolution silently falls through to
config/predictor.yaml. That coincidence is the 6/13 inert-rotation fragility
(MODEL_SPECS empty -> select_rotation_specs() -> [] -> 0 challengers trained).

Fix (deterministic):
- Pin ALPHA_ENGINE_EXPERIMENT_ID (default "reference", matching config.py) and
  EXPORT it into every spot heredoc that imports config (preflight / smoke /
  model-zoo / full-training), so config.py resolution is explicit, not reliant on
  the os.environ default + dir-existence coincidence.
- Stage the S3-fetched yaml to BOTH the experiment-package path config.py
  searches FIRST AND config/predictor.yaml (legacy fallback). Both copies are
  byte-identical from the same staged source, so MODEL_SPECS populates
  deterministically and the spot now resolves via the SAME path config.py uses on
  the always-on box.

bash 3.2 note: the run_ssm heredocs sit inside "$(cat <<'X' ... X)" command
substitution and bash 3.2 scans even a quoted heredoc body for the closing paren,
so the in-heredoc comments are kept free of parens and apostrophes.

bash -n clean; full predictor suite green (1481 passed) incl. test_model_zoo +
the test_spot_train_* battery.

Refs config#1066, config#1051. Re-exam 2026-06-20 (next Saturday rotation must
register >=1 spec-* challenger; predictor#265 alert fires until then).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cipher813

Collaborator Author

Companion data PR (SF cp alignment, auto-deploys on merge): nousergon/nousergon-data#427. Both must land for the next Saturday model-zoo rotation to train challengers.

cipher813 merged commit 9a351e7 into main on Jun 14, 2026
1 check passed
cipher813 deleted the fix/spot-train-experiment-path-staging branch on June 14, 2026 22:14
cipher813 added a commit that referenced this pull request Jun 14, 2026
…lse-fails (config#1073) (#269)

The deploy.sh canary invokes the freshly-built image with dry_run=true,
which ran the full PredictorPreflight.run() -> check_deploy_drift, comparing
the image's baked GIT_SHA against live origin/main HEAD. During a rapid
merge burst, main can advance after a deploy's image is built but before its
canary runs, so the canary trips the drift RuntimeError on a perfectly good
image and the deploy false-fails (+ false flow-doctor ERROR page + false SNS
canary-fail). Observed 2026-06-14: 3 PRs merged in ~70 min; #266's canary
false-failed when main had just advanced to #267. No correctness/safety
problem (the canary gate correctly never promoted a bad image), but the
false pages erode alert-signal integrity in a fail-loud system.

A dry_run canary writes no predictions and emails nothing, so drift-vs-main
is the wrong invariant for it. Gate check_deploy_drift on `not dry_run`.
Production (dry_run=false) inference and the SF DeployDriftCheck gate
(action=check_deploy_drift -> run_for_drift_gate, runs daily pre-pipeline)
are unchanged — real-run drift protection is fully preserved.
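A minimal sketch of the `not dry_run` gate described above, assuming a simplified preflight. The real PredictorPreflight.run() is not shown in this thread, so `run_preflight`, the injected callable, and the step names are hypothetical stand-ins.

```python
from typing import Callable, List


def run_preflight(dry_run: bool, check_deploy_drift: Callable[[], None]) -> List[str]:
    """Run preflight steps; the drift check is skipped for dry-run canaries."""
    steps: List[str] = []
    if not dry_run:
        # Real runs still compare the baked GIT_SHA against origin/main HEAD.
        check_deploy_drift()
        steps.append("check_deploy_drift")
    steps.append("other_preflight_checks")  # placeholder for the remaining checks
    return steps


def drifted() -> None:
    """Simulate main having advanced after the image was built."""
    raise RuntimeError("image GIT_SHA differs from origin/main HEAD")


# Canary: drift is not checked, so a merge burst cannot false-fail it.
canary_steps = run_preflight(dry_run=True, check_deploy_drift=drifted)

# Production: the same drift still raises.
try:
    run_preflight(dry_run=False, check_deploy_drift=drifted)
    production_raised = False
except RuntimeError:
    production_raised = True

print(canary_steps, production_raised)
```

The gate only changes the dry-run path, so drift protection for real runs is left as it was.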

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>