issue #66: add no-LLM CI (ephemeral anvil tier-1 + scaffolded test-broker tier-2) #98
hanwencheng wants to merge 4 commits
Two-tier CI matching issue #66's "shared test broker for CI + dev" vision:

Tier 1: ephemeral (every push/PR, fully self-contained, ~10–15 min):
* .github/workflows/harness-ci.yml: cargo fmt + clippy + test + harness/ci-ephemeral-stack.sh. No LLM, no @claude invocation.
* harness/ci-ephemeral-stack.sh: spins up anvil (new chain), runs forge build + test, deploys fresh v2 stage-1 contracts via DeployAgentKeysV1.s.sol (new contracts, new anvil-prefunded deployer), verifies via scripts/verify-heima-contracts.sh, then stands up mock-server + agentkeys-broker-server with --skip-startup-check (StubSts path) and probes the OIDC discovery surface. EXIT trap tears everything down.

Tier 2: long-lived test broker (nightly + workflow_dispatch, scaffolded here, operator-activated via TEST_OIDC_AWS_ROLE_ARN secret):
* .github/workflows/harness-e2e.yml: gated workflow that targets test-broker.litentry.org with real test AWS resources, runs all three stage demos against the long-lived parallel infra. Includes nightly cleanup of stale ci/ S3 prefixes. Uses GitHub Actions OIDC (id-token: write) for AWS auth, never long-lived secrets.
* scripts/provision-test-environment.sh: operator-run one-shot provisioner that walks the 7 steps to stand up test-broker (separate OIDC provider, separate IAM roles, separate buckets, separate deployer wallet, fresh contracts on Heima-Paseo).
* scripts/test-environment.env.example: committed env template mirroring operator-workstation.env with -test suffixes.
* docs/test-environment.md: bring-up runbook, secret list, rotation, cleanup, and the two-tier design rationale.

WebAuthn: harness scripts default to WEBAUTHN_MODE=0 (stage-1 line 131, stage-2 --stub) so no Touch ID prompt is ever needed; --webauthn is opt-in and never passed by either workflow.

Validated locally: bash harness/ci-ephemeral-stack.sh --skip-broker passes all 8 steps (anvil up, 33 forge tests, 6 contracts deployed + verified, clean teardown). YAML + shell syntax checked.
Per operator feedback:
1. "do not create new files, only add the test file" — drop the
ephemeral-stack helper, provisioner, env template, e2e workflow,
and docs. Single deliverable: .github/workflows/harness-ci.yml.
2. "onchain solution should test on Heima mainnet with a new smart
contract address" — confirmed possible: Solidity compiles
deterministically and EVM contract addresses derive from
(deployer, nonce). Identical crates/agentkeys-chain/src/*.sol +
identical DeployAgentKeysV1.s.sol + a different deployer key on
Heima mainnet = isolated parallel contract set at new addresses on
the production chain.
3. "CI mirrors the production env" — the workflow now invokes the
PRODUCTION harness scripts (harness/v2-stage{1,2,3}-demo.sh)
unchanged. The only thing CI does differently from a prod operator
is materialize scripts/operator-workstation.env with TEST_*
resource names from GitHub secrets:
- TEST_OIDC_AWS_ROLE_ARN (gate; until set, harness job skips)
- TEST_ACCOUNT_ID / TEST_AWS_REGION / TEST_BROKER_HOST
- TEST_VAULT_BUCKET / TEST_MEMORY_BUCKET
- TEST_{VAULT,MEMORY,DATA}_ROLE_ARN
- TEST_HEIMA_DEPLOYER_KEY (raw 0x-prefixed mainnet key — test
wallet, distinct from prod deployer)
- TEST_{SCOPE,SIDECAR_REGISTRY,K3_EPOCH_COUNTER,
CREDENTIAL_AUDIT,P256_VERIFIER,K11_VERIFIER}_CONTRACT_ADDRESS_HEIMA
(pre-deployed once per test-env refresh; harness skips deploy
via --skip-deploy so CI doesn't burn HEI on every push)
AWS auth via GitHub Actions OIDC (id-token: write), no long-lived
secrets. Per-run S3 prefix isolation. The workflow gates itself on
TEST_OIDC_AWS_ROLE_ARN being set so it's inert until the operator
activates the test infra.
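The gate and the env-file materialization described in point 3 can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual logic: the env-file key names (`ACCOUNT_ID`, `VAULT_BUCKET`, and so on), the `CI_S3_PREFIX` key, and the `render_env` helper are all assumptions; only the `TEST_*` secret names and the `ci/` prefix come from the commit messages above.

```python
GATE = "TEST_OIDC_AWS_ROLE_ARN"

# Hypothetical env-file keys. Only the TEST_* secret names on the right-hand
# side are taken from the commit message; the real key names may differ.
ENV_FROM_SECRET = {
    "ACCOUNT_ID": "TEST_ACCOUNT_ID",
    "REGION": "TEST_AWS_REGION",
    "BROKER_HOST": "TEST_BROKER_HOST",
    "VAULT_BUCKET": "TEST_VAULT_BUCKET",
    "MEMORY_BUCKET": "TEST_MEMORY_BUCKET",
}


def render_env(secrets, run_id):
    """Return env-file text, or None when the gate secret is unset (job skips)."""
    if not secrets.get(GATE):
        return None
    missing = sorted(s for s in ENV_FROM_SECRET.values() if not secrets.get(s))
    if missing:
        raise KeyError("missing secrets: " + ", ".join(missing))
    lines = [f"{key}={secrets[src]}" for key, src in ENV_FROM_SECRET.items()]
    # Assumed shape of the per-run S3 prefix used for isolation.
    lines.append(f"CI_S3_PREFIX=ci/{run_id}/")
    return "\n".join(lines) + "\n"


sample = {
    GATE: "arn:aws:iam::000000000000:role/example-ci",  # placeholder ARN
    "TEST_ACCOUNT_ID": "000000000000",
    "TEST_AWS_REGION": "us-east-1",
    "TEST_BROKER_HOST": "test-broker.example.org",
    "TEST_VAULT_BUCKET": "example-vault-test",
    "TEST_MEMORY_BUCKET": "example-memory-test",
}
print(render_env({}, "1"))              # gate unset, so the job would skip
print(render_env(sample, "12345"), end="")
```

Failing loudly on a partially configured secret set, rather than writing an env file with empty values, keeps the production scripts from running against half-specified test resources.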
WebAuthn: never invoked — harness scripts default to WEBAUTHN_MODE=0
(stage-1 line 131) and stage-2's --stub flag is passed explicitly.
LLM: zero. Plain cargo/forge/aws-cli/curl orchestration. Distinct from
claude.yml + claude-code-review.yml which intentionally do call @claude.
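The address-isolation argument in point 2 can be checked with a small worked example. For a plain `CREATE` deployment the new address is the last 20 bytes of `keccak256(rlp([deployer, nonce]))`, so a different deployer wallet always yields different addresses for the same bytecode. The sketch below is a self-contained, pure-Python illustration (Keccak-256 is written out because `hashlib.sha3_256` uses different padding); the helper names are illustrative and the deployer addresses in the usage lines are arbitrary examples, not project wallets.

```python
MASK = (1 << 64) - 1
RC = [
    0x0000000000000001, 0x0000000000008082, 0x800000000000808A, 0x8000000080008000,
    0x000000000000808B, 0x0000000080000001, 0x8000000080008081, 0x8000000000008009,
    0x000000000000008A, 0x0000000000000088, 0x0000000080008009, 0x000000008000000A,
    0x000000008000808B, 0x800000000000008B, 0x8000000000008089, 0x8000000000008003,
    0x8000000000008002, 0x8000000000000080, 0x000000000000800A, 0x800000008000000A,
    0x8000000080008081, 0x8000000000008080, 0x0000000080000001, 0x8000000080008008,
]
# Rotation offsets indexed as ROT[x][y].
ROT = [[0, 36, 3, 41, 18], [1, 44, 10, 45, 2], [62, 6, 43, 15, 61],
       [28, 55, 25, 21, 56], [27, 20, 39, 8, 14]]


def rol(v, n):
    return ((v << n) | (v >> (64 - n))) & MASK


def keccak_f(a):
    """Keccak-f[1600] permutation over 25 lanes stored at index x + 5*y."""
    for rc in RC:
        c = [a[x] ^ a[x + 5] ^ a[x + 10] ^ a[x + 15] ^ a[x + 20] for x in range(5)]
        d = [c[(x - 1) % 5] ^ rol(c[(x + 1) % 5], 1) for x in range(5)]
        a = [a[i] ^ d[i % 5] for i in range(25)]                      # theta
        b = [0] * 25
        for x in range(5):
            for y in range(5):                                        # rho + pi
                b[y + 5 * ((2 * x + 3 * y) % 5)] = rol(a[x + 5 * y], ROT[x][y])
        a = [b[i] ^ (~b[(i % 5 + 1) % 5 + 5 * (i // 5)] & b[(i % 5 + 2) % 5 + 5 * (i // 5)])
             for i in range(25)]                                      # chi
        a[0] ^= rc                                                    # iota
    return a


def keccak256(data):
    rate = 136  # bytes absorbed per block for a 256-bit digest
    msg = bytearray(data) + b"\x01" + bytes(rate - 1 - len(data) % rate)
    msg[-1] |= 0x80  # original Keccak padding (0x01 ... 0x80)
    state = [0] * 25
    for off in range(0, len(msg), rate):
        for i in range(rate // 8):
            state[i] ^= int.from_bytes(msg[off + 8 * i: off + 8 * i + 8], "little")
        state = keccak_f(state)
    return b"".join(lane.to_bytes(8, "little") for lane in state[:4])


def rlp_bytes(b):
    """RLP-encode a byte string shorter than 56 bytes."""
    if len(b) == 1 and b[0] < 0x80:
        return b
    return bytes([0x80 + len(b)]) + b


def create_address(deployer_hex, nonce):
    deployer = int(deployer_hex, 16).to_bytes(20, "big")
    nonce_bytes = nonce.to_bytes((nonce.bit_length() + 7) // 8, "big")  # 0 -> b""
    payload = rlp_bytes(deployer) + rlp_bytes(nonce_bytes)
    return "0x" + keccak256(bytes([0xC0 + len(payload)]) + payload)[12:].hex()


# Two arbitrary example deployers at the same nonce give unrelated addresses.
print(create_address("0x6ac7ea33f8831ea9dcc53393aaa88b25a785dbf0", 0))
print(create_address("0x00000000000000000000000000000000000000aa", 0))
```

Because the contract bytecode does not enter the `CREATE` formula at all, redeploying the same sources from the same wallet at a later nonce also lands at new addresses, which is why the test addresses have to be re-pinned after each refresh.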
bbe72b8 to cd25bde
…ima}.sh
Per operator request: pivot cloud-setup.md from a verbose manual-bash
runbook to a concise prereq/script-pointer split, add new heima-setup.md
+ ci-setup.md for the chain + CI flows, and move troubleshooting into
the ./wiki/ folder.
What changed:
docs/cloud-setup.md — UPDATE, 970 → 314 lines
Add a TL;DR with the three-command operator flow (manual §1-§4
prereqs, then setup-broker-host.sh, then setup-heima.sh). Slim
§1-§4 to invariants + helper-script pointers + brief command
blocks (DKIM bulk-record / receipt rule / per-data-class role
provisioning all delegate to the existing scripts/*.sh). Replace
the verbose §5/§6/§7 (EC2 broker / signer / workers, each with
100+ lines of inline bash) with one §5 "Run setup-broker-host.sh"
section that names what the script does (build, systemd, nginx,
certbot, keypairs, env files) + what it doesn't (DNS, IAM, OIDC
provider — those stay in §1-§4). Keep §0 (identities table) and
§6 (cleanup recipe).
docs/heima-setup.md — NEW, 106 lines
The 15-step pipeline in scripts/setup-heima.sh, with idempotency
check + helper-script pointer per step. Mainnet vs Paseo vs Anvil
tradeoff table. Per-step re-run examples. Heima London EVM pin
explanation.
docs/ci-setup.md — NEW, 184 lines
The 7-step operator bring-up for the no-LLM
.github/workflows/harness-ci.yml workflow: provision test broker
via setup-broker-host.sh with -test suffix, provision parallel
AWS resources, register the test OIDC provider, generate + fund
the test deployer wallet, deploy fresh test contracts on Heima
mainnet with the same .sol source (different deployer →
different addresses → isolated parallel contract set), register
the GitHub Actions OIDC role, set the repo secrets. Includes
the full TEST_* secret list, manual-dispatch instructions, and
a secret-hygiene reminder.
wiki/cloud-setup-faq.md — NEW, 94 lines
wiki/heima-setup-faq.md — NEW, 111 lines
wiki/ci-setup-faq.md — NEW, 96 lines
Troubleshooting + edge cases for each setup doc. Lives under
./wiki/ per CLAUDE.md "Wiki-location policy" — auto-published
to the GitHub wiki on every push to main.
Constraints applied:
- Concise: every doc fits in a few screens.
- Idempotent: every flow reuses the existing idempotent helper
scripts (setup-broker-host.sh, setup-heima.sh, provision-*-role.sh,
apply-*-bucket-policy.sh).
- No project credentials exposed: account IDs, role ARNs, bucket
names, deployer keys, contract addresses all referenced via
${ACCOUNT_ID} / ${BROKER_HOST} / ${REGION} placeholders or via
"read from operator-workstation.env" / "from step N" pointers.
Real values live only in the operator's local env file + the
GitHub repo secrets store.
All internal links verified via a python url-walker (every relative
link resolves to an existing file).
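The url-walker itself is not shown in this PR. A minimal sketch of such a relative-link check, assuming inline Markdown links of the form `[text](target)` and a hypothetical `broken_links` helper, might look like this:

```python
import re
from pathlib import Path

# Inline Markdown links: [text](target). Reference-style links are not handled.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
SCHEME = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_links(root):
    """Return (file, target) pairs for relative links that do not resolve."""
    root = Path(root)
    broken = []
    for md in sorted(root.rglob("*.md")):
        for target in LINK.findall(md.read_text(encoding="utf-8")):
            if SCHEME.match(target) or target.startswith("#"):
                continue  # skip absolute URLs (https:, mailto:) and in-page anchors
            path = target.split("#", 1)[0]  # drop any trailing #fragment
            if not (md.parent / path).exists():
                broken.append((md.relative_to(root).as_posix(), target))
    return broken


if __name__ == "__main__":
    for source, target in broken_links("."):
        print(f"{source}: broken link -> {target}")
```

Resolving each target against the linking file's own directory matters here because the docs and wiki pages live in different folders and link to each other with `../` paths.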
Added concise setup docs aligned with the existing idempotent scripts (commit 5a66a85): Docs (concise, idempotent, no project credentials exposed):
FAQ wiki pages (independent, per the request):
Secret hygiene: every account ID / role ARN / bucket name / deployer key / contract address in the docs is a placeholder.
Net doc diff: -839 / +782 lines (cloud-setup.md alone shrank from 970 to 314 lines by collapsing inline bash into script pointers).
Per operator request: the very-beginning cloud-account provisioning
(IAM users + role, DNS, SES, S3 buckets, instance profile) needs to
live in a separate doc so it stays reachable when:
- Adding a second AWS account (test instance, regional shard)
- Migrating to AliCloud / GCP / Tencent Cloud
- Re-bootstrapping after a teardown
- Auditing the identity surface
The previous condense pass collapsed those sections into cloud-setup.md's
slim §1-§3 — convenient for day-to-day operators but stripped the depth
needed for the migration / second-account use cases.
What changed:
docs/cloud-bootstrap.md — NEW, 365 lines
First-time, per-account, cloud-provider-portable bootstrap doc:
§1 Identities — four IAM principals, cloud-agnostic
§2 Domain + DNS — subdomain map, parent-zone confirm
§3 Email backend — SES domain verify + receipt rule +
inbound S3 bucket creation
§4 IAM users + roles — agentkeys-daemon + agentkeys-data-role +
per-data-class vault/memory roles
§5 Initial bucket policy — static-IAM variant (pre-OIDC)
§6 Instance profile — agentkeys-broker-host (EC2 optional)
§7 Security audit — strip legacy over-broad attached policies
(`AmazonS3FullAccess` checklist from the
pre-condense §3.4a)
§8 Cloud-provider port — AWS / AliCloud / GCP / Tencent Cloud
1:1 mapping table + migration playbook
Restores the operational depth (DKIM bulk-record bash, daemon user
create, role trust shape, broker-host instance profile, security
audit) that the previous condense pass removed. Adds the portability
framing (concept first, AWS-specific commands as ONE implementation)
so the doc is the durable reference for non-AWS deployments.
docs/cloud-setup.md — UPDATE, 314 → 202 lines
Refocus on what comes AFTER bootstrap: OIDC federation activation
(§1, was §4) + the setup-broker-host.sh runtime entry point (§2,
was §5) + cleanup (§3, was §6). Drop the duplicate §1-§3 prereqs;
add a clear cross-ref to cloud-bootstrap.md at the top. Section
numbers renumbered.
wiki/cloud-setup-faq.md — minor header tweak
The FAQ now covers both cloud-bootstrap.md and cloud-setup.md
(operators hit the same gotchas across both phases).
Constraints applied:
- Concise: every doc still fits in a few screens (bootstrap is
longest at 365 lines because it carries the actual provisioning
commands; cloud-setup.md is now 202 lines, down from 970 originally).
- Idempotent: every flow uses the existing idempotent helper scripts.
- No project credentials exposed: same placeholder convention as the
prior pass (${ACCOUNT_ID}, ${ZONE}, etc.). Verified via grep.
All internal links verified (python url-walker).
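The grep used for the credential check is not included in the commit message. As a rough sketch of what such a scan could look for, the following flags literal 12-digit AWS account IDs, 32-byte hex keys, and 20-byte EVM addresses while letting `${ACCOUNT_ID}`-style placeholders through. The patterns and the `find_literals` helper are assumptions for illustration and will produce false positives on other 12-digit numbers.

```python
import re

# Assumed patterns; the actual grep used for the PR is not shown in the source.
PATTERNS = {
    "aws_account_id": re.compile(r"(?<!\d)\d{12}(?!\d)"),
    "private_key": re.compile(r"0x[0-9a-fA-F]{64}(?![0-9a-fA-F])"),
    "evm_address": re.compile(r"0x[0-9a-fA-F]{40}(?![0-9a-fA-F])"),
}


def find_literals(text):
    """Return (line_number, kind, match) for every literal that looks like a real value."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            for match in pattern.findall(line):
                hits.append((line_no, kind, match))
    return hits


doc = "\n".join([
    "aws iam get-role --role-name agentkeys-data-role",
    "arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-data-role",
    "arn:aws:iam::123456789012:role/agentkeys-data-role",
])
for hit in find_literals(doc):
    print(hit)
```

A scan like this is cheap enough to run over `docs/` and `wiki/` before each push, since the wiki pages are auto-published on every push to main.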
Closes #66.
Builds on top of #95 (ERC-7730 + EIP-712 typed-data signing). Will need to re-target main once #95 lands.
One file, no scaffolding

Per operator feedback: this PR adds exactly one file, `.github/workflows/harness-ci.yml`. It invokes the PRODUCTION harness scripts (`harness/v2-stage{1,2,3}-demo.sh`) unchanged. The only delta from a prod operator's invocation is that `scripts/operator-workstation.env` is materialized at CI time with TEST resource names from GitHub secrets.

Mirroring production on Heima mainnet: answer to "is this possible with identical .sol files?"

Yes. EVM contract addresses derive from `(deployer_address, nonce)` (or, for `CREATE2`, from the deployer address, salt, and init-code hash), and Solidity → bytecode compilation is deterministic for a fixed compiler version and settings. The identical `crates/agentkeys-chain/src/*.sol` files, compiled by the identical `DeployAgentKeysV1.s.sol` script and broadcast by a different deployer wallet on Heima mainnet, produce a parallel set of contracts at new addresses on the production chain. Same code, same chain, isolated storage: the test contracts can't see or write to prod contract state.

The deploy is one-shot per test-environment refresh (operator action), not per CI run: the test contract addresses are pinned in GitHub secrets so CI doesn't burn HEI on every push. To re-deploy (e.g., after a contract revision), the operator funds the test wallet, runs `AGENTKEYS_CHAIN=heima HEIMA_DEPLOYER_KEY_FILE=~/.agentkeys/heima-deployer-test.key bash scripts/heima-bring-up.sh`, then updates the `TEST_*_HEIMA` secrets.

What the workflow does

- `rust-checks`: `cargo fmt --check`, `cargo clippy --workspace -- -D warnings`, `cargo test --workspace -- --test-threads=1`. Covers ~600 tests including all the in-process broker integration tests that already mock STS + SES.
- `preflight`: gates the E2E job on `TEST_OIDC_AWS_ROLE_ARN` being set. Until the operator activates the test infra, the harness job is a clean `::warning::` skip.
- `harness-e2e`: assumes the test IAM role via GitHub Actions OIDC (no long-lived secrets), writes the test deployer key, overwrites `scripts/operator-workstation.env` with test resource names, then runs `harness/v2-stage1-demo.sh --skip-deploy --skip-email`, `harness/v2-stage2-demo.sh --stub --skip-build`, `harness/v2-stage3-demo.sh`, which are the unmodified production scripts.

Operator secrets (one-shot setup)

`TEST_OIDC_AWS_ROLE_ARN` (gate), `TEST_ACCOUNT_ID`, `TEST_AWS_REGION`, `TEST_BROKER_HOST`, `TEST_{VAULT,MEMORY}_BUCKET`, `TEST_{VAULT,MEMORY,DATA}_ROLE_ARN`, `TEST_HEIMA_DEPLOYER_KEY`, plus the six `TEST_*_CONTRACT_ADDRESS_HEIMA` for pre-deployed contracts. Full list documented in the workflow file header (no separate docs file per the "no new files" rule).

What did NOT land

- `docs/cloud-setup.md` + `scripts/setup-broker-host.sh --issuer-url https://test-broker.litentry.org`) but substitutes the test-suffixed identifiers everywhere. Same scripts, different inputs.
- `cargo test --workspace` from the `rust-checks` job (which exercises the same per-data-class isolation logic against in-process mocks).

Test plan

- `bash -n .github/workflows/harness-ci.yml` is not applicable to YAML; `python3 -c 'import yaml; yaml.safe_load(...)'` parses clean.
- `git diff origin/claude/gallant-ride-cec4d7 --stat` → 1 file changed, 260 insertions(+).
- `TEST_OIDC_AWS_ROLE_ARN` → harness-e2e job runs end-to-end against Heima mainnet test contracts.