From 09186c5b2ef7ea3a59494c28b885099077ef0aa0 Mon Sep 17 00:00:00 2001
From: wildmeta-agent <agent@wildmeta.ai>
Date: Thu, 21 May 2026 09:25:35 +0800
Subject: [PATCH 1/4] =?UTF-8?q?issue=20#66:=20add=20no-LLM=20CI=20?=
 =?UTF-8?q?=E2=80=94=20ephemeral=20anvil=20+=20scaffolded=20test-broker=20?=
 =?UTF-8?q?E2E?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two-tier CI matching issue #66's "shared test broker for CI + dev" vision:

  Tier 1 — ephemeral (every push/PR, fully self-contained, ~10–15 min):
    * .github/workflows/harness-ci.yml — cargo fmt + clippy + test +
      harness/ci-ephemeral-stack.sh. No LLM, no @claude invocation.
    * harness/ci-ephemeral-stack.sh — spins up anvil (new chain), runs
      forge build + test, deploys fresh v2 stage-1 contracts via
      DeployAgentKeysV1.s.sol (new contracts, new anvil-prefunded
      deployer), verifies via scripts/verify-heima-contracts.sh, then
      stands up mock-server + agentkeys-broker-server with
      --skip-startup-check (StubSts path) and probes OIDC discovery
      surface. EXIT trap tears everything down.

  Tier 2 — long-lived test broker (nightly + workflow_dispatch, scaffolded
  here, operator-activated via TEST_OIDC_AWS_ROLE_ARN secret):
    * .github/workflows/harness-e2e.yml — gated workflow that targets
      test-broker.litentry.org with real test AWS resources, runs all
      three stage demos against the long-lived parallel infra. Includes
      nightly cleanup of stale ci/ S3 prefixes. Uses GitHub Actions
      OIDC (id-token: write) for AWS auth, never long-lived secrets.
    * scripts/provision-test-environment.sh — operator-run one-shot
      provisioner that walks the 7 steps to stand up test-broker
      (separate OIDC provider, separate IAM roles, separate buckets,
      separate deployer wallet, fresh contracts on Heima-Paseo).
    * scripts/test-environment.env.example — committed env template
      mirroring operator-workstation.env with -test suffixes.
    * docs/test-environment.md — bring-up runbook, secret list,
      rotation, cleanup, and the two-tier design rationale.

WebAuthn: harness scripts default to WEBAUTHN_MODE=0 (stage-1 line 131,
stage-2 --stub) so no Touch ID prompt is ever needed; --webauthn is
opt-in and never passed by either workflow.

Validated locally: bash harness/ci-ephemeral-stack.sh --skip-broker
passes all 8 steps (anvil up, 33 forge tests, 6 contracts deployed +
verified, clean teardown). YAML + shell syntax checked.
---
 .github/workflows/harness-ci.yml      | 139 +++++++++
 .github/workflows/harness-e2e.yml     | 204 +++++++++++++
 docs/test-environment.md              | 166 +++++++++++
 harness/ci-ephemeral-stack.sh         | 401 ++++++++++++++++++++++++++
 scripts/provision-test-environment.sh | 276 ++++++++++++++++++
 scripts/test-environment.env.example  |  92 ++++++
 6 files changed, 1278 insertions(+)
 create mode 100644 .github/workflows/harness-ci.yml
 create mode 100644 .github/workflows/harness-e2e.yml
 create mode 100644 docs/test-environment.md
 create mode 100755 harness/ci-ephemeral-stack.sh
 create mode 100755 scripts/provision-test-environment.sh
 create mode 100644 scripts/test-environment.env.example
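
Reviewer note (sketch only, not part of this patch): the nightly sweep of
stale ci/ prefixes pipes `aws s3api list-objects-v2 ... --output text`
into a delete loop. In text mode the AWS CLI tab-separates multiple keys
and prints the literal string `None` when the query result is null. One
possible way to normalise that output is sketched below. The helper name
filter_sweep_keys is hypothetical and does not exist in the repo.

```shell
# Hypothetical helper (illustrative only): turn the text output of
# `aws s3api list-objects-v2 ... --output text` into one S3 key per line.
filter_sweep_keys() {
  tr '\t' '\n' | while IFS= read -r key; do
    # Skip blank lines and the literal "None" printed for a null result.
    # Caveat: an object whose key is exactly "None" would also be skipped.
    if [ -n "$key" ] && [ "$key" != "None" ]; then
      printf '%s\n' "$key"
    fi
  done
}
```

Usage would be along the lines of
`aws s3api list-objects-v2 ... --output text | filter_sweep_keys`,
with each emitted line then handed to `aws s3 rm`.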

diff --git a/.github/workflows/harness-ci.yml b/.github/workflows/harness-ci.yml
new file mode 100644
index 0000000..ec19e66
--- /dev/null
+++ b/.github/workflows/harness-ci.yml
@@ -0,0 +1,139 @@
+name: harness CI (no LLM)
+
+# Issue #66 tier-1: deterministic, no-LLM, no-WebAuthn CI that exercises
+# the same code paths the harness scripts run, but against an ephemeral
+# in-CI test environment (anvil + mock-server + stub-STS broker).
+#
+# Separate from the existing claude.yml / claude-code-review.yml workflows
+# (which invoke @claude on PR comments + reviews). This workflow never
+# spends LLM tokens — it's plain cargo/forge/curl orchestration.
+#
+# Coverage map (matches harness/v2-stage*.sh where ephemeral CI can):
+#
+#   * `cargo fmt --check`                 — formatting gate
+#   * `cargo clippy -D warnings`          — lint gate
+#   * `cargo test --workspace`            — unit + in-process integration
+#                                            tests. The broker tests
+#                                            already spawn a full
+#                                            in-process broker with
+#                                            StubSts + StubEmailSender,
+#                                            so SIWE / OIDC mint / cap
+#                                            verify / multi-master /
+#                                            recovery / per-data-class
+#                                            isolation Rust logic is all
+#                                            covered here. Per CLAUDE.md
+#                                            "all async / #[tokio::test]"
+#                                            convention.
+#   * `harness/ci-ephemeral-stack.sh`     — forge build + forge test +
+#                                            forge script deploy on a
+#                                            fresh anvil + read-only
+#                                            ABI/wiring verification.
+#                                            Plus broker boot smoke +
+#                                            OIDC discovery surface.
+#
+# Tier-2 (long-lived test-broker.litentry.org, full stage-3 PrincipalTag
+# isolation, real AWS STS) lives in .github/workflows/harness-e2e.yml
+# and is gated on operator-provisioned infra; see docs/test-environment.md.
+
+on:
+  push:
+    branches: [main, evm]
+  pull_request:
+    paths:
+      - "crates/**"
+      - "harness/**"
+      - "scripts/**"
+      - ".github/workflows/harness-ci.yml"
+      - "Cargo.toml"
+      - "Cargo.lock"
+
+# Allow only one concurrent run per ref so re-pushes cancel stale runs
+# (saves runner minutes; each ephemeral stack spins up anvil + builds the
+# workspace, so wall-clock matters).
+concurrency:
+  group: harness-ci-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  rust-checks:
+    name: cargo fmt + clippy + test
+    runs-on: ubuntu-latest
+    timeout-minutes: 30
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install Rust toolchain
+        uses: dtolnay/rust-toolchain@stable
+        with:
+          components: clippy, rustfmt
+
+      - name: Cache cargo registry + target
+        uses: Swatinem/rust-cache@v2
+        with:
+          shared-key: harness-ci
+
+      - name: cargo fmt --check
+        run: cargo fmt --all -- --check
+
+      - name: cargo clippy
+        # -D warnings: any clippy diagnostic blocks merge. Matches the
+        # project's "fix the warning, don't silence it" convention.
+        run: cargo clippy --workspace --all-targets -- -D warnings
+
+      - name: cargo test --workspace
+        # --test-threads=1: the broker tests mutate shared process env
+        # (HOME, AWS_*) and the keyring tests serialize on a per-process
+        # accounts map — same convention as the @claude review workflow.
+        run: cargo test --workspace -- --test-threads=1
+
+  ephemeral-stack:
+    name: ephemeral anvil + chain deploy
+    runs-on: ubuntu-latest
+    timeout-minutes: 45
+    needs: rust-checks  # don't burn runner minutes on chain checks if Rust is red
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          # forge install reads .gitmodules — need submodules for forge-std etc.
+          submodules: recursive
+
+      - name: Install Rust toolchain
+        uses: dtolnay/rust-toolchain@stable
+
+      - name: Cache cargo registry + target
+        uses: Swatinem/rust-cache@v2
+        with:
+          shared-key: harness-ci  # share with rust-checks job
+
+      - name: Install Foundry (anvil + forge + cast)
+        uses: foundry-rs/foundry-toolchain@v1
+        with:
+          version: stable
+
+      - name: Verify Foundry toolchain
+        run: |
+          anvil --version
+          forge --version
+          cast --version
+
+      - name: Run ephemeral stack (chain + broker smoke)
+        # The script handles its own anvil + broker bring-up/tear-down via
+        # an EXIT trap. Fails the job if any step (forge build/test/deploy,
+        # contract verification, broker boot, OIDC discovery) fails.
+        run: bash harness/ci-ephemeral-stack.sh
+        env:
+          # Pinned ports so the workflow log is reproducible.
+          ANVIL_PORT: "8545"
+          MOCK_PORT: "8090"
+          BROKER_PORT: "8091"
+          # Fail builds on rustc warnings as well (matches clippy job).
+          RUSTFLAGS: "-D warnings"
+
+      - name: Upload logs on failure
+        if: failure()
+        uses: actions/upload-artifact@v4
+        with:
+          name: ephemeral-stack-logs
+          path: /tmp/agentkeys-ci-ephemeral-*/
+          if-no-files-found: ignore
+          retention-days: 7
diff --git a/.github/workflows/harness-e2e.yml b/.github/workflows/harness-e2e.yml
new file mode 100644
index 0000000..071608b
--- /dev/null
+++ b/.github/workflows/harness-e2e.yml
@@ -0,0 +1,204 @@
+name: harness E2E (long-lived test broker)
+
+# Issue #66 tier-2: end-to-end harness exercise against the long-lived
+# test-broker.litentry.org infrastructure provisioned by
+# scripts/provision-test-environment.sh.
+#
+# Gated on TEST_OIDC_AWS_ROLE_ARN being set as a repo secret — until the
+# operator wires it (see docs/test-environment.md §3), the job is inert
+# and surfaces as a no-op rather than failing. This keeps the workflow
+# safe to merge before the parallel infra is up.
+#
+# Coverage delta vs. harness-ci.yml:
+#   * harness-ci.yml: ephemeral anvil + in-process broker + StubSts
+#                     (no public TLS, no real AWS, no real SES)
+#   * harness-e2e.yml: real test-broker.litentry.org + real AWS test
+#                      resources (test bucket, test role) + real Heima
+#                      Paseo chain. Runs the full stage-3 per-actor +
+#                      per-data-class PrincipalTag isolation suite
+#                      that ephemeral CI can't reach.
+#
+# No LLM. No WebAuthn (passes the harness scripts in default stub mode).
+# Schedule + workflow_dispatch only — never on every PR (this hits real
+# AWS API calls + real chain RPC, so it's nightly-cadence).
+
+on:
+  schedule:
+    # Nightly at 06:00 UTC — well after the prior day's PR activity
+    # quiesces but before the operator's morning standup.
+    - cron: "0 6 * * *"
+  workflow_dispatch:
+    inputs:
+      stage:
+        description: "Which stage to run (1, 2, 3, or all)"
+        required: false
+        default: "all"
+        type: choice
+        options: ["1", "2", "3", "all"]
+
+# Prevent overlapping runs (each one consumes test AWS resources + chain RPC).
+concurrency:
+  group: harness-e2e
+  cancel-in-progress: false  # let in-flight nightly finish; queue manual runs
+
+# OIDC-only AWS auth via GitHub Actions — never long-lived secrets.
+permissions:
+  id-token: write   # required for aws-actions/configure-aws-credentials
+  contents: read
+
+jobs:
+  preflight:
+    name: gate on test infra availability
+    runs-on: ubuntu-latest
+    outputs:
+      should_run: ${{ steps.gate.outputs.should_run }}
+    steps:
+      - id: gate
+        run: |
+          if [ -n "${{ secrets.TEST_OIDC_AWS_ROLE_ARN }}" ]; then
+            echo "should_run=true" >> "$GITHUB_OUTPUT"
+            echo "test infra credentials present; proceeding"
+          else
+            echo "should_run=false" >> "$GITHUB_OUTPUT"
+            echo "::warning::TEST_OIDC_AWS_ROLE_ARN unset — skipping. See docs/test-environment.md."
+          fi
+
+  harness-e2e:
+    name: harness/v2-stage*-demo.sh against test-broker
+    needs: preflight
+    if: needs.preflight.outputs.should_run == 'true'
+    runs-on: ubuntu-latest
+    timeout-minutes: 60
+
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          submodules: recursive
+
+      - name: Install Rust toolchain
+        uses: dtolnay/rust-toolchain@stable
+
+      - name: Cache cargo registry + target
+        uses: Swatinem/rust-cache@v2
+        with:
+          shared-key: harness-e2e
+
+      - name: Install Foundry
+        uses: foundry-rs/foundry-toolchain@v1
+        with:
+          version: stable
+
+      - name: Configure AWS credentials via OIDC (test role)
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.TEST_OIDC_AWS_ROLE_ARN }}
+          aws-region: ${{ secrets.TEST_AWS_REGION || 'us-east-1' }}
+          # Session name shows up in CloudTrail — keep traceable to the
+          # PR / run for forensic walking.
+          role-session-name: gh-actions-${{ github.repository_id }}-${{ github.run_id }}
+
+      - name: Build agentkeys CLI + workers
+        run: cargo build --release --workspace
+
+      - name: Source test-environment env
+        # The harness scripts source scripts/operator-workstation.env by
+        # default. For the e2e run, copy the committed
+        # scripts/test-environment.env.example template to that path and
+        # append the secret-backed values, so the harness flow runs
+        # unchanged. The live file exists only on the runner for this job.
+        run: |
+          cp scripts/test-environment.env.example scripts/operator-workstation.env
+          # Substitute repo secrets into the live env file.
+          {
+            echo "ACCOUNT_ID=${{ secrets.TEST_ACCOUNT_ID }}"
+            echo "REGION=${{ secrets.TEST_AWS_REGION || 'us-east-1' }}"
+            echo "BROKER_HOST=${{ secrets.TEST_BROKER_HOST || 'test-broker.litentry.org' }}"
+            echo "OIDC_ISSUER=https://${{ secrets.TEST_BROKER_HOST || 'test-broker.litentry.org' }}"
+            echo "VAULT_BUCKET=${{ secrets.TEST_VAULT_BUCKET }}"
+            echo "MEMORY_BUCKET=${{ secrets.TEST_MEMORY_BUCKET }}"
+            echo "VAULT_ROLE_ARN=${{ secrets.TEST_VAULT_ROLE_ARN }}"
+            echo "MEMORY_ROLE_ARN=${{ secrets.TEST_MEMORY_ROLE_ARN }}"
+            echo "DATA_ROLE_ARN=${{ secrets.TEST_DATA_ROLE_ARN }}"
+            # Per-run S3 prefix isolation — concurrent runs (manual +
+            # nightly) won't step on each other's writes; nightly
+            # cleanup s3 rm's keys older than 7d.
+            echo "CI_S3_PREFIX=ci/run-${{ github.run_id }}"
+          } >> scripts/operator-workstation.env
+
+      - name: Stage 1 — chain + identity bootstrap
+        if: ${{ github.event_name == 'schedule' || inputs.stage == 'all' || inputs.stage == '1' }}
+        # --skip-deploy: contracts are pre-deployed by
+        # scripts/provision-test-environment.sh on Heima-Paseo, and
+        # those addresses are baked into scripts/test-environment.env.
+        # --skip-email: e2e doesn't exercise the SES round-trip
+        # (separate workflow); identity bootstrap uses wallet_sig.
+        # No --webauthn: stub-mode (WEBAUTHN_MODE=0 default).
+        run: |
+          AGENTKEYS_CHAIN=heima-paseo \
+            bash harness/v2-stage1-demo.sh --skip-deploy --skip-email
+
+      - name: Stage 2 — multi-master + recovery (stub mode)
+        if: ${{ github.event_name == 'schedule' || inputs.stage == 'all' || inputs.stage == '2' }}
+        run: |
+          AGENTKEYS_CHAIN=heima-paseo \
+            bash harness/v2-stage2-demo.sh --stub --skip-build
+
+      - name: Stage 3 — per-actor + per-data-class PrincipalTag isolation
+        if: ${{ github.event_name == 'schedule' || inputs.stage == 'all' || inputs.stage == '3' }}
+        # The tier-2 capstone: stage-3 is the suite ephemeral CI can't
+        # run, since it requires AWS STS AssumeRoleWithWebIdentity, which
+        # in turn requires AWS to fetch the OIDC issuer's JWKS over
+        # public TLS. Now that we have test-broker.litentry.org with a
+        # real Let's Encrypt cert and real test IAM roles, all 11 steps
+        # of v2-stage3-demo.sh execute end-to-end.
+        run: |
+          AGENTKEYS_CHAIN=heima-paseo \
+            bash harness/v2-stage3-demo.sh
+
+      - name: Clean up per-run S3 prefix
+        if: always()
+        # Best-effort: tear down the per-run S3 prefix we wrote to.
+        # The nightly cleanup s3 rm catches any keys we missed.
+        run: |
+          PREFIX="ci/run-${{ github.run_id }}/"
+          for bucket in \
+            "${{ secrets.TEST_VAULT_BUCKET }}" \
+            "${{ secrets.TEST_MEMORY_BUCKET }}"; do
+            [ -n "$bucket" ] || continue
+            aws s3 rm "s3://$bucket/$PREFIX" --recursive || true
+          done
+
+  nightly-prefix-cleanup:
+    # Sweep any per-run S3 prefixes older than 7 days from the test
+    # buckets. Cheap insurance against forgotten prefixes from cancelled
+    # runs; complements the per-job cleanup above.
+    name: cleanup stale CI prefixes
+    needs: preflight
+    if: needs.preflight.outputs.should_run == 'true' && github.event_name == 'schedule'
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+    permissions:
+      id-token: write
+      contents: read
+    steps:
+      - name: Configure AWS credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.TEST_OIDC_AWS_ROLE_ARN }}
+          aws-region: ${{ secrets.TEST_AWS_REGION || 'us-east-1' }}
+          role-session-name: gh-actions-cleanup-${{ github.run_id }}
+
+      - name: Sweep prefixes older than 7d
+        run: |
+          cutoff=$(date -u -d "7 days ago" +%Y-%m-%dT%H:%M:%SZ 2>/dev/null \
+                   || date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)
+          for bucket in \
+            "${{ secrets.TEST_VAULT_BUCKET }}" \
+            "${{ secrets.TEST_MEMORY_BUCKET }}"; do
+            [ -n "$bucket" ] || continue
+            aws s3api list-objects-v2 --bucket "$bucket" --prefix "ci/" \
+              --query "Contents[?LastModified<\`$cutoff\`].Key" --output text \
+              | tr '\t' '\n' | while read -r key; do
+                if [ -n "$key" ] && [ "$key" != "None" ]; then aws s3 rm "s3://$bucket/$key"; fi
+              done
+          done
diff --git a/docs/test-environment.md b/docs/test-environment.md
new file mode 100644
index 0000000..ca3596b
--- /dev/null
+++ b/docs/test-environment.md
@@ -0,0 +1,166 @@
+# Test environment — AgentKeys (issue #66)
+
+**Audience:** the operator setting up CI for AgentKeys, plus contributors who need to debug a CI failure.
+**Scope:** the parallel test infrastructure (broker, IAM roles, S3 buckets, deployer wallet, smart contracts) that exists alongside prod so CI can exercise the full code path without touching real user data.
+
+This is the operator-facing companion to:
+- [`.github/workflows/harness-ci.yml`](../.github/workflows/harness-ci.yml) — the tier-1 ephemeral CI workflow (no external infra)
+- [`.github/workflows/harness-e2e.yml`](../.github/workflows/harness-e2e.yml) — the tier-2 nightly E2E workflow against the long-lived test broker
+- [`harness/ci-ephemeral-stack.sh`](../harness/ci-ephemeral-stack.sh) — the ephemeral stack driver tier-1 invokes
+- [`scripts/provision-test-environment.sh`](../scripts/provision-test-environment.sh) — operator-run, one-shot provisioner for the tier-2 long-lived infra
+- [`scripts/test-environment.env.example`](../scripts/test-environment.env.example) — env file template
+
+## Two-tier model
+
+Issue #66 calls for a CI that runs the harness scripts against a parallel test environment, never spends LLM tokens, and never invokes WebAuthn. There are two natural points to do that, and we ship both:
+
+| | Tier 1 — ephemeral | Tier 2 — long-lived |
+|---|---|---|
+| **Workflow** | `harness-ci.yml` | `harness-e2e.yml` |
+| **Trigger** | pushes to `main`/`evm` + PRs touching watched paths | nightly + manual dispatch |
+| **Where** | inside a GitHub Actions runner | runs against `test-broker.litentry.org` |
+| **Chain** | `anvil` (fresh per run, instant finality) | Heima-Paseo testnet (long-lived contracts) |
+| **Deployer** | anvil's prefunded default test key (zero risk) | a separate Paseo wallet, funded by operator, persisted at `~/.agentkeys/heima-paseo-deployer-test.key` |
+| **Contracts** | fresh deploy per run via Foundry | deployed once by `provision-test-environment.sh`, addresses pinned in `scripts/test-environment.env` |
+| **Broker** | in-process spawn, OIDC issuer = `http://127.0.0.1:8091`, `StubSts` | real broker process on test EC2, OIDC issuer = `https://test-broker.litentry.org`, real AWS STS |
+| **AWS** | none — broker boots with `--skip-startup-check`, no STS/S3 calls | real test bucket + real test role; AWS STS `AssumeRoleWithWebIdentity` works because the test broker exposes a public TLS-fronted JWKS endpoint |
+| **WebAuthn** | never — harness defaults to `WEBAUTHN_MODE=0` stub mode | never — same default |
+| **LLM** | never | never |
+| **Wall time** | ~10–15 min | ~25–40 min |
+
+Tier 1 covers most Rust-side regressions because the Rust integration tests (`cargo test --workspace`) already spawn an in-process broker with `StubSts` + `StubEmailSender`; those tests cover SIWE auth, OIDC mint, cap-token verification, multi-master, recovery, and per-data-class isolation logic. What tier 1 *can't* cover is the real-AWS path: stage 3's `AssumeRoleWithWebIdentity` requires AWS to fetch the issuer's JWKS over public TLS, which an ephemeral CI runner can't expose. That's the tier-2 capstone.
+
+## Tier 1 — ephemeral CI (no operator setup needed)
+
+Already wired. Every push to `main` or `evm`, plus every PR touching `crates/**`, `harness/**`, `scripts/**`, `Cargo.toml`, `Cargo.lock`, or the workflow file itself, runs:
+
+1. `cargo fmt --check`
+2. `cargo clippy --workspace --all-targets -- -D warnings`
+3. `cargo test --workspace -- --test-threads=1`
+4. `bash harness/ci-ephemeral-stack.sh`, which:
+   - Starts a fresh `anvil` on port 8545 (new chain, instant finality)
+   - Runs `forge build && forge test` in `crates/agentkeys-chain/`
+   - Runs `forge script DeployAgentKeysV1.s.sol` to deploy all 6 contracts to the ephemeral anvil
+   - Parses the deployed addresses and writes a synthetic `operator-workstation.env`
+   - Runs `scripts/verify-heima-contracts.sh` against the new addresses (read-only ABI + wiring checks)
+   - Starts `mock-server` + `agentkeys-broker-server` (with `--skip-startup-check`, OIDC issuer = `http://127.0.0.1:8091`)
+   - Probes `/healthz`, `/.well-known/openid-configuration`, `/.well-known/jwks.json`
+
+On failure, the script's EXIT trap preserves all logs (`anvil.log`, `forge-deploy.log`, `broker.log`, etc.) and the workflow uploads them as an `ephemeral-stack-logs` artifact.
+
+## Tier 2 — long-lived test broker
+
+### Operator bring-up (~2 hours, one-shot)
+
+```bash
+awsp agentkeys-admin            # AWS admin profile for the account hosting test infra
+bash scripts/provision-test-environment.sh
+```
+
+This walks through 7 steps:
+
+1. **Provision the EC2 broker host** at `test-broker.litentry.org`. Manual step (the runbook fragment in the script tells you exactly what to do on the target EC2).
+2. **Register the AWS IAM OIDC provider** for `test-broker.litentry.org` (separate ARN from prod's `oidc-provider/broker.litentry.org`).
+3. **Provision IAM roles** `agentkeys-data-role-test`, `agentkeys-vault-role-test`, `agentkeys-memory-role-test`, each trust-policied on the test OIDC provider with the same `PrincipalTag/agentkeys_actor_omni` scoping prod uses.
+4. **Provision S3 buckets** `agentkeys-mail-test-${ACCT}`, `agentkeys-vault-test-${ACCT}`, `agentkeys-memory-test-${ACCT}` with block-public-access + default SSE-S3 + the v3 split-statement PrincipalTag bucket policy.
+5. **Generate a new deployer wallet** (distinct from the prod deployer) at `~/.agentkeys/heima-paseo-deployer-test.key`. You fund it from your personal Paseo wallet (Paseo has sudo so Alice can also fund — see `scripts/heima-bring-up.sh`).
+6. **Deploy fresh v2 stage-1 contracts** to Heima-Paseo via `DeployAgentKeysV1.s.sol`. Records the addresses under `*_HEIMA_PASEO` keys in `scripts/test-environment.env`.
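+7. **Provision a GitHub Actions OIDC role** (`github-actions-agentkeys-e2e`) trust-policied on `token.actions.githubusercontent.com` with a condition limiting it to the agentkeys repo. Grant it `sts:AssumeRole` on the three test roles + read-only S3 on the three test buckets. The workflow's cleanup steps also run `aws s3 rm` under this role, so it additionally needs `s3:DeleteObject` on the `ci/` prefix of the vault and memory buckets.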
+
+Some steps are still operator-manual (parameterizing `provision-vault-role.sh` to accept a `SUFFIX=` env var is a TODO; until then, copy the prod scripts as `-test` variants by hand). The script logs these as `skip` with a follow-up TODO instead of silently passing.
+
+### Repo secrets to set (after provisioning)
+
+After the provisioner finishes, set these in **Settings → Secrets and variables → Actions**:
+
+| Secret | Value |
+|---|---|
+| `TEST_OIDC_AWS_ROLE_ARN` | `arn:aws:iam::${ACCT}:role/github-actions-agentkeys-e2e` |
+| `TEST_AWS_REGION` | `us-east-1` (or wherever the test broker lives) |
+| `TEST_ACCOUNT_ID` | `${ACCT}` |
+| `TEST_BROKER_HOST` | `test-broker.litentry.org` |
+| `TEST_VAULT_BUCKET` | `agentkeys-vault-test-${ACCT}` |
+| `TEST_MEMORY_BUCKET` | `agentkeys-memory-test-${ACCT}` |
+| `TEST_VAULT_ROLE_ARN` | `arn:aws:iam::${ACCT}:role/agentkeys-vault-role-test` |
+| `TEST_MEMORY_ROLE_ARN` | `arn:aws:iam::${ACCT}:role/agentkeys-memory-role-test` |
+| `TEST_DATA_ROLE_ARN` | `arn:aws:iam::${ACCT}:role/agentkeys-data-role-test` |
+
+`TEST_OIDC_AWS_ROLE_ARN` is the **gate**: until it's set, the `harness-e2e.yml` preflight job sets `should_run=false` and the workflow surfaces as a `::warning::` skip rather than a failure. This keeps the workflow safe to merge before the parallel infra is up.
+
+### Per-run S3 prefix namespacing
+
+The e2e workflow appends `CI_S3_PREFIX=ci/run-${GITHUB_RUN_ID}` to the env file the harness sources, and the harness scripts honor that prefix when writing test envelopes to S3. This means runs that share the test buckets (nightly + a manual dispatch) won't step on each other's writes.
+
+Cleanup is two-layered:
+- **Per-job cleanup**: the e2e workflow's `if: always()` step runs `aws s3 rm s3://$bucket/$PREFIX --recursive` at the end of each run.
+- **Nightly sweep**: a separate `nightly-prefix-cleanup` job lists `ci/` prefix keys older than 7 days and rm's them. Cheap insurance against forgotten prefixes from cancelled runs.
+
+### Cert renewal monitoring
+
+`test-broker.litentry.org` uses a Let's Encrypt certificate (90-day lifetime, renewed automatically by certbot before expiry). If renewal silently fails, AWS STS stops trusting the OIDC issuer and the e2e workflow turns red overnight.
+
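+The nightly workflow has no dedicated certificate probe: its preflight job only checks that `TEST_OIDC_AWS_ROLE_ARN` is set. A renewal failure therefore surfaces only when a stage script first contacts `https://${TEST_BROKER_HOST}`. Adding a `curl` against `https://${TEST_BROKER_HOST}/.well-known/openid-configuration` to preflight would make that failure immediate, with a clear TLS error.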
+
+### Rotating the test broker secrets
+
+If the test mock-server's `DEV_KEY_SERVICE_MASTER_SECRET` ever leaks, rotate via:
+
+```bash
+# 1. New secret on the broker host
+ssh ec2-user@test-broker.litentry.org \
+  'sudo systemctl set-environment DEV_KEY_SERVICE_MASTER_SECRET=$(openssl rand -hex 32) \
+   && sudo systemctl restart agentkeys-backend'
+
+# 2. There's nothing on the operator side to rotate — the secret never
+#    leaves the broker host (it derives per-omni signer keys in-process).
+```
+
+Test wallets minted via the rotated signer will have different addresses from pre-rotation wallets, which is the desired blast-radius cut.
+
+## Cleanup / teardown
+
+Tear down the entire test environment (cheap insurance if costs spike):
+
+```bash
+# Drain the buckets first
+for bucket in agentkeys-mail-test-${ACCT} agentkeys-vault-test-${ACCT} agentkeys-memory-test-${ACCT}; do
+  aws s3 rm "s3://$bucket" --recursive
+  aws s3api delete-bucket --bucket "$bucket"
+done
+
+# Delete the roles (inline policies are removed first; managed policies, if any, must be detached separately)
+for role in agentkeys-data-role-test agentkeys-vault-role-test agentkeys-memory-role-test github-actions-agentkeys-e2e; do
+  for policy in $(aws iam list-role-policies --role-name "$role" --query 'PolicyNames[]' --output text); do
+    aws iam delete-role-policy --role-name "$role" --policy-name "$policy"
+  done
+  aws iam delete-role --role-name "$role"
+done
+
+# Delete the OIDC provider
+aws iam delete-open-id-connect-provider \
+  --open-id-connect-provider-arn arn:aws:iam::${ACCT}:oidc-provider/test-broker.litentry.org
+
+# Stop + terminate the EC2 + release the EIP (manual, console or aws ec2 CLI)
+```
+
+The contracts on Heima-Paseo stay on chain (they're free), but they're inert without the broker pointing at them.
+
+## Why two tiers (vs. just one)
+
+A single-tier model — running everything against the long-lived broker on every PR — was the obvious shape, but loses on:
+
+- **Latency**: every PR pays the ~30 min e2e wall time (vs. ~10 min for tier 1).
+- **Cost**: every PR hits real AWS API calls + chain RPC + potentially gas.
+- **Contention**: concurrent PRs serialize on the single test broker, or step on each other's S3 writes without per-run prefix isolation.
+- **Brittleness**: a flaky external dep (Paseo collator hiccup, AWS API throttle) blocks merges.
+
+A single-tier model the other way — only ephemeral CI, no long-lived test broker — was also tempting, but loses stage-3 coverage entirely (`AssumeRoleWithWebIdentity` needs publicly-fetchable JWKS). That's the most security-critical layer in the codebase (per-actor + per-data-class IAM isolation per CLAUDE.md "Per-actor + per-data-class isolation invariants"), so leaving it untested in CI was unacceptable.
+
+The two-tier split puts the fast, cheap, deterministic checks on every PR and the expensive E2E on nightly. PRs that need to verify a stage-3 fix can trigger `harness-e2e.yml` via `workflow_dispatch` directly from the PR page.
+
+## Related
+
+- Original issue: [#66 — Stage 7: shared test broker for CI + dev](https://github.com/wildmeta-agent/agentKeys/issues/66)
+- Prod cloud setup: [`docs/cloud-setup.md`](cloud-setup.md)
+- Stage 7 demo + verification: [`docs/stage7-demo-and-verification.md`](stage7-demo-and-verification.md)
+- Architecture: [`docs/spec/architecture.md`](spec/architecture.md) §17 (per-data-class buckets), §4 (HDKD actor tree), CLAUDE.md "Per-actor + per-data-class isolation invariants" table
diff --git a/harness/ci-ephemeral-stack.sh b/harness/ci-ephemeral-stack.sh
new file mode 100755
index 0000000..8d9ffa3
--- /dev/null
+++ b/harness/ci-ephemeral-stack.sh
@@ -0,0 +1,401 @@
+#!/usr/bin/env bash
+# harness/ci-ephemeral-stack.sh — issue #66 tier-1 ephemeral CI driver.
+#
+# Stands up a complete, isolated AgentKeys test environment INSIDE a
+# single CI runner and exercises the chain-deploy path end-to-end. No
+# external infrastructure, no LLM, no WebAuthn, no real AWS.
+#
+# What this script delivers (the four parallel-infra axes from issue #66):
+#
+#   ─ new test broker server     → ephemeral agentkeys-broker-server
+#                                   spawned on 127.0.0.1, OIDC issuer
+#                                   http://127.0.0.1:$BROKER_PORT, stub
+#                                   STS client (no real AWS).
+#   ─ new smart contract on-chain → forge script deploys a fresh copy of
+#                                   the v2 stage-1 contract set
+#                                   (P256Verifier + K11Verifier +
+#                                   SidecarRegistry + AgentKeysScope +
+#                                   K3EpochCounter + CredentialAudit)
+#                                   to a brand-new anvil instance.
+#   ─ new deployer account       → anvil's canonical first prefunded test
+#                                   key (10_000 ETH; zero risk).
+#   ─ no WebAuthn                 → the harness scripts default to
+#                                   WEBAUTHN_MODE=0 (stage-1 line 131);
+#                                   this script never passes --webauthn,
+#                                   so K11 enrollment writes deterministic
+#                                   stub bytes (CI-friendly).
+#
+# What's COVERED by this script (matches the harness scripts' coverage
+# for things that don't require real AWS):
+#
+#   * Forge unit + property tests for all six v2 stage-1 contracts.
+#   * End-to-end Foundry deploy via DeployAgentKeysV1.s.sol against the
+#     ephemeral anvil (the same script that heima-bring-up.sh step 5
+#     runs against Heima Mainnet/Paseo).
+#   * Read-only ABI/wiring checks via verify-heima-contracts.sh against
+#     the freshly deployed addresses (same checks Heima uses).
+#   * Broker liveness + OIDC discovery surface (/.well-known/
+#     openid-configuration, /.well-known/jwks.json, /healthz).
+#
+# What's NOT covered here (intentionally — needs the long-lived
+# test-broker.litentry.org tier-2 environment with publicly-reachable
+# TLS + real AWS resources; see docs/test-environment.md):
+#
+#   * harness/v2-stage3-demo.sh — per-actor + per-data-class S3
+#     PrincipalTag isolation tests. AWS STS AssumeRoleWithWebIdentity
+#     requires AWS to fetch the OIDC issuer's JWKS over public TLS,
+#     which a CI runner can't expose.
+#   * Real SES email-link auth round-trip (uses StubEmailSender in unit
+#     tests; long-lived tier-2 exercises real SES).
+#
+# All the Rust-side broker/worker logic (SIWE auth, OIDC mint, cap-token
+# verify, etc.) is covered by `cargo test --workspace` in the parent
+# CI workflow — those tests already spawn an in-process broker with
+# StubSts + StubEmailSender, so the ephemeral-stack script focuses on
+# what cargo test can't reach: the on-chain deploy + ABI surface.
+#
+# Usage:
+#   bash harness/ci-ephemeral-stack.sh                # full ephemeral roundtrip
+#   bash harness/ci-ephemeral-stack.sh --skip-broker  # chain-only (forge + anvil)
+#   bash harness/ci-ephemeral-stack.sh --keep-running # leave anvil + broker up
+#                                                     # (for local debugging)
+#
+# Exit codes:
+#   0  every check passed
+#   1  any check failed; logs in $WORK_DIR/*.log preserved on failure
+#   2  prereqs missing (anvil/forge/cargo) or unknown CLI flag
+
+set -euo pipefail
+
+REPO_ROOT="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")/.." && pwd)"
+cd "$REPO_ROOT"
+
+# ─── CLI ─────────────────────────────────────────────────────────────────
+SKIP_BROKER=0
+KEEP_RUNNING=0
+ANVIL_PORT="${ANVIL_PORT:-8545}"
+MOCK_PORT="${MOCK_PORT:-8090}"
+BROKER_PORT="${BROKER_PORT:-8091}"
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --skip-broker)  SKIP_BROKER=1; shift ;;
+    --keep-running) KEEP_RUNNING=1; shift ;;
+    --anvil-port)   ANVIL_PORT="$2"; shift 2 ;;
+    --mock-port)    MOCK_PORT="$2"; shift 2 ;;
+    --broker-port)  BROKER_PORT="$2"; shift 2 ;;
+    -h|--help)
+      sed -n '2,/^set -euo/p' "$0" | sed -E 's/^# ?//' | sed '$d'
+      exit 0 ;;
+    *) echo "unknown flag: $1 (try --help)" >&2; exit 2 ;;
+  esac
+done
+
+# ─── Colors ──────────────────────────────────────────────────────────────
+if [ -t 2 ]; then
+  C_HEAD='\033[1;36m'; C_OK='\033[1;32m'; C_WARN='\033[1;33m'
+  C_ERR='\033[1;31m'; C_DIM='\033[2m'; C_RESET='\033[0m'
+else
+  C_HEAD=''; C_OK=''; C_WARN=''; C_ERR=''; C_DIM=''; C_RESET=''
+fi
+log()  { printf "${C_HEAD}==>${C_RESET} %s\n" "$*" >&2; }
+ok()   { printf "    ${C_OK}ok${C_RESET}    %s\n" "$*" >&2; }
+info() { printf "    ${C_DIM}info${C_RESET}  %s\n" "$*" >&2; }
+warn() { printf "    ${C_WARN}warn${C_RESET}  %s\n" "$*" >&2; }
+die()  { printf "    ${C_ERR}fail${C_RESET}  %s\n" "$*" >&2; exit 1; }
+
+# ─── Work dir + cleanup trap ─────────────────────────────────────────────
+WORK_DIR="$(mktemp -d -t agentkeys-ci-ephemeral-XXXXXX)"
+ANVIL_PID=""
+MOCK_PID=""
+BROKER_PID=""
+
+cleanup() {
+  local rc=$?
+  if [ "$KEEP_RUNNING" = "1" ]; then
+    info "--keep-running set; leaving processes up"
+    info "  anvil:  pid=$ANVIL_PID  port=$ANVIL_PORT"
+    [ -n "$MOCK_PID" ]   && info "  mock:   pid=$MOCK_PID    port=$MOCK_PORT"
+    [ -n "$BROKER_PID" ] && info "  broker: pid=$BROKER_PID  port=$BROKER_PORT"
+    info "  work_dir: $WORK_DIR"
+    exit "$rc"
+  fi
+  log "Cleanup"
+  for pid_var in BROKER_PID MOCK_PID ANVIL_PID; do
+    pid="${!pid_var:-}"
+    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
+      kill "$pid" 2>/dev/null || true
+      wait "$pid" 2>/dev/null || true
+      ok "stopped $pid_var pid=$pid"
+    fi
+  done
+  if [ "$rc" -ne 0 ]; then
+    warn "exit=$rc — preserving logs at $WORK_DIR"
+    for f in "$WORK_DIR"/*.log; do
+      [ -e "$f" ] || continue
+      printf "\n${C_DIM}── tail %s ──${C_RESET}\n" "$f" >&2
+      tail -n 50 "$f" >&2 || true
+    done
+  else
+    rm -rf "$WORK_DIR"
+  fi
+}
+trap cleanup EXIT; trap 'exit 130' INT; trap 'exit 143' TERM
+
+# ─── 1. Prereq sanity-check ──────────────────────────────────────────────
+log "1/8 Prereq sanity-check"
+missing=()
+for tool in cargo jq curl awk grep sed anvil forge cast; do
+  command -v "$tool" >/dev/null 2>&1 || missing+=("$tool")
+done
+if [ ${#missing[@]} -gt 0 ]; then
+  warn "missing tools: ${missing[*]}"
+  warn "  install Foundry: curl -L https://foundry.paradigm.xyz | bash && foundryup"
+  warn "  install Rust:    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh"
+  printf "    ${C_ERR}fail${C_RESET}  %s\n" "prereqs missing" >&2; exit 2  # exit code 2, as documented in the header
+fi
+ok "tools present: cargo jq curl awk grep sed anvil forge cast"
+
+# ─── 2. Start anvil (new chain) ──────────────────────────────────────────
+log "2/8 Starting anvil on 127.0.0.1:$ANVIL_PORT (new ephemeral chain)"
+# Anvil's first default account: pre-funded with 10_000 ETH, deterministic.
+# This is our "new deployer account" — fresh per CI run, zero blast radius.
+ANVIL_DEPLOYER_KEY="0xac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80"
+ANVIL_DEPLOYER_ADDR="0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266"
+anvil --port "$ANVIL_PORT" \
+      --host 127.0.0.1 \
+      --silent \
+      > "$WORK_DIR/anvil.log" 2>&1 &
+ANVIL_PID=$!
+# Wait for RPC ready (anvil bootstraps fast — <2s typically, give it 30s)
+for _ in $(seq 1 60); do
+  if curl -sf --max-time 1 \
+       -H 'Content-Type: application/json' \
+       -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' \
+       "http://127.0.0.1:$ANVIL_PORT" >/dev/null 2>&1; then
+    break
+  fi
+  sleep 0.5
+done
+curl -sf --max-time 2 \
+     -H 'Content-Type: application/json' \
+     -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' \
+     "http://127.0.0.1:$ANVIL_PORT" >/dev/null \
+  || die "anvil failed to come up; see $WORK_DIR/anvil.log"
+ok "anvil up (pid=$ANVIL_PID chain_id=31337 deployer=$ANVIL_DEPLOYER_ADDR)"
+
+# ─── 3. Forge build + test (contract unit + property tests) ──────────────
+log "3/8 Forge build + test (crates/agentkeys-chain/)"
+(
+  cd crates/agentkeys-chain
+  forge build > "$WORK_DIR/forge-build.log" 2>&1 \
+    || die "forge build failed; see $WORK_DIR/forge-build.log"
+  ok "forge build clean"
+  forge test --no-match-test "fork_" > "$WORK_DIR/forge-test.log" 2>&1 \
+    || die "forge test failed; see $WORK_DIR/forge-test.log"
+  ok "forge test passed ($(grep -c "^\[PASS\]" "$WORK_DIR/forge-test.log" || true) tests)"
+)
+
+# ─── 4. Deploy v2 stage-1 contract set (new smart contracts on-chain) ────
+log "4/8 Deploy v2 stage-1 contracts via DeployAgentKeysV1.s.sol"
+(
+  cd crates/agentkeys-chain
+  forge script script/DeployAgentKeysV1.s.sol \
+    --rpc-url "http://127.0.0.1:$ANVIL_PORT" \
+    --private-key "$ANVIL_DEPLOYER_KEY" \
+    --broadcast \
+    --skip-simulation \
+    > "$WORK_DIR/forge-deploy.log" 2>&1 \
+    || die "forge script deploy failed; see $WORK_DIR/forge-deploy.log"
+)
+# Parse "Name: 0xAddress" lines (the contract names from DeployAgentKeysV1.s.sol's
+# console.log calls). Format matches heima-bring-up.sh's parser.
+parse_addr() {
+  local name="$1"
+  awk -v want="$name" '
+    $0 ~ want":" {
+      for (i=1; i<=NF; i++) if (length($i) == 42 && $i ~ /^0x[a-fA-F0-9]+$/) { print $i; exit }
+    }
+  ' "$WORK_DIR/forge-deploy.log"
+}
+SCOPE_ADDR=$(parse_addr "AgentKeysScope")
+REGISTRY_ADDR=$(parse_addr "SidecarRegistry")
+EPOCH_ADDR=$(parse_addr "K3EpochCounter")
+AUDIT_ADDR=$(parse_addr "CredentialAudit")
+P256_ADDR=$(parse_addr "P256Verifier")
+K11_ADDR=$(parse_addr "K11Verifier")
+for v in SCOPE_ADDR REGISTRY_ADDR EPOCH_ADDR AUDIT_ADDR P256_ADDR K11_ADDR; do
+  val="${!v}"
+  [ -n "$val" ] || die "could not parse $v from forge-deploy.log"
+done
+ok "AgentKeysScope:  $SCOPE_ADDR"
+ok "SidecarRegistry: $REGISTRY_ADDR"
+ok "K3EpochCounter:  $EPOCH_ADDR"
+ok "CredentialAudit: $AUDIT_ADDR"
+ok "P256Verifier:    $P256_ADDR"
+ok "K11Verifier:     $K11_ADDR"
+
+# ─── 5. Write synthetic operator-workstation.env for verify scripts ──────
+log "5/8 Write synthetic operator-workstation.env (anvil profile)"
+SYNTH_ENV="$WORK_DIR/operator-workstation.env"
+cat > "$SYNTH_ENV" <<EOF
+# Synthetic env file for harness/ci-ephemeral-stack.sh (issue #66).
+# Generated $(date -u +%Y-%m-%dT%H:%M:%SZ) — DO NOT COMMIT.
+ACCOUNT_ID=000000000000
+REGION=us-east-1
+MAIL_DOMAIN=test.invalid
+MAIL_BUCKET=agentkeys-mail-ci-ephemeral
+BUCKET=agentkeys-mail-ci-ephemeral
+VAULT_BUCKET=agentkeys-vault-ci-ephemeral
+MEMORY_BUCKET=agentkeys-memory-ci-ephemeral
+BROKER_HOST=127.0.0.1:$BROKER_PORT
+OIDC_ISSUER=http://127.0.0.1:$BROKER_PORT
+BACKEND_URL=http://127.0.0.1:$MOCK_PORT
+AGENTKEYS_SIGNER_URL=http://127.0.0.1:$MOCK_PORT
+DATA_ROLE_ARN=arn:aws:iam::000000000000:role/agentkeys-data-role-ci
+VAULT_ROLE_ARN=arn:aws:iam::000000000000:role/agentkeys-vault-role-ci
+MEMORY_ROLE_ARN=arn:aws:iam::000000000000:role/agentkeys-memory-role-ci
+
+# v2 stage-1 contracts (anvil profile)
+SCOPE_CONTRACT_ADDRESS_ANVIL=$SCOPE_ADDR
+SIDECAR_REGISTRY_ADDRESS_ANVIL=$REGISTRY_ADDR
+K3_EPOCH_COUNTER_ADDRESS_ANVIL=$EPOCH_ADDR
+CREDENTIAL_AUDIT_ADDRESS_ANVIL=$AUDIT_ADDR
+P256_VERIFIER_ADDRESS_ANVIL=$P256_ADDR
+K11_VERIFIER_ADDRESS_ANVIL=$K11_ADDR
+EOF
+ok "wrote $SYNTH_ENV"
+
+# ─── 6. Verify deployed contracts (read-only ABI + wiring checks) ────────
+log "6/8 verify-heima-contracts.sh (anvil profile)"
+# verify-heima-contracts.sh reads scripts/operator-workstation.env, so
+# overlay the synthetic file in place for the duration of this step.
+# Restored right after the verify call, pass or fail (not by the EXIT trap).
+REAL_ENV="$REPO_ROOT/scripts/operator-workstation.env"
+BACKUP_ENV=""
+if [ -f "$REAL_ENV" ]; then
+  BACKUP_ENV="$WORK_DIR/operator-workstation.env.original"
+  cp "$REAL_ENV" "$BACKUP_ENV"
+fi
+restore_env() {
+  if [ -n "$BACKUP_ENV" ] && [ -f "$BACKUP_ENV" ]; then
+    cp "$BACKUP_ENV" "$REAL_ENV"
+  elif [ -f "$REAL_ENV" ] && [ -z "$BACKUP_ENV" ]; then
+    rm -f "$REAL_ENV"
+  fi
+}
+cp "$SYNTH_ENV" "$REAL_ENV"
+verify_rc=0
+AGENTKEYS_CHAIN=anvil bash "$REPO_ROOT/scripts/verify-heima-contracts.sh" \
+  > "$WORK_DIR/verify-contracts.log" 2>&1 || verify_rc=$?
+restore_env
+if [ "$verify_rc" -ne 0 ]; then
+  warn "verify-heima-contracts.sh exited $verify_rc; full log:"
+  cat "$WORK_DIR/verify-contracts.log" >&2
+  die "contract verification failed"
+fi
+ok "all six v2 stage-1 contracts verified (bytecode + ABI + wiring)"
+
+# ─── 7. Stand up the broker server (on by default; --skip-broker skips) ──
+if [ "$SKIP_BROKER" = "1" ]; then
+  log "7/8 Broker bring-up SKIPPED (--skip-broker)"
+else
+  log "7/8 Stand up ephemeral broker (new test broker server)"
+
+  # Pre-generate keypairs so the broker boots clean. The keygen
+  # subcommand writes 0600 files; matches the production setup-broker-host
+  # flow but in $WORK_DIR instead of /var/lib/agentkeys.
+  BROKER_DATA_DIR="$WORK_DIR/broker-data"
+  mkdir -p "$BROKER_DATA_DIR"
+  info "building agentkeys-broker-server (release)"
+  cargo build --release -p agentkeys-broker-server \
+    > "$WORK_DIR/cargo-build-broker.log" 2>&1 \
+    || die "cargo build broker failed; see $WORK_DIR/cargo-build-broker.log"
+  BROKER_BIN="$REPO_ROOT/target/release/agentkeys-broker-server"
+  [ -x "$BROKER_BIN" ] || die "broker binary missing at $BROKER_BIN"
+
+  "$BROKER_BIN" keygen --purpose oidc \
+    --out "$BROKER_DATA_DIR/oidc-keypair.json" >/dev/null
+  "$BROKER_BIN" keygen --purpose session \
+    --out "$BROKER_DATA_DIR/session-keypair.json" >/dev/null
+  ok "broker keypairs generated"
+
+  info "building agentkeys-mock-server (release)"
+  cargo build --release -p agentkeys-mock-server \
+    > "$WORK_DIR/cargo-build-mock.log" 2>&1 \
+    || die "cargo build mock-server failed; see $WORK_DIR/cargo-build-mock.log"
+  MOCK_BIN="$REPO_ROOT/target/release/agentkeys-mock-server"
+  [ -x "$MOCK_BIN" ] || die "mock-server binary missing at $MOCK_BIN"
+
+  info "starting mock-server on 127.0.0.1:$MOCK_PORT"
+  "$MOCK_BIN" --port "$MOCK_PORT" \
+    > "$WORK_DIR/mock-server.log" 2>&1 &
+  MOCK_PID=$!
+  for _ in $(seq 1 60); do
+    curl -sf --max-time 1 "http://127.0.0.1:$MOCK_PORT/healthz" >/dev/null 2>&1 && break
+    sleep 0.25
+  done
+  curl -sf --max-time 2 "http://127.0.0.1:$MOCK_PORT/healthz" >/dev/null \
+    || die "mock-server failed to come up; see $WORK_DIR/mock-server.log"
+  ok "mock-server up (pid=$MOCK_PID)"
+
+  info "starting broker on 127.0.0.1:$BROKER_PORT (--skip-startup-check)"
+  # No real AWS creds in CI — broker runs OIDC-only mint path per issue #71,
+  # so the only thing AWS would do is the optional GetCallerIdentity probe,
+  # which --skip-startup-check disables.
+  BROKER_OIDC_ISSUER="http://127.0.0.1:$BROKER_PORT" \
+  BROKER_BACKEND_URL="http://127.0.0.1:$MOCK_PORT" \
+  BROKER_DATA_ROLE_ARN="arn:aws:iam::000000000000:role/agentkeys-data-role-ci" \
+  BROKER_AWS_REGION="us-east-1" \
+  BROKER_OIDC_KEYPAIR_PATH="$BROKER_DATA_DIR/oidc-keypair.json" \
+  BROKER_SESSION_KEYPAIR_PATH="$BROKER_DATA_DIR/session-keypair.json" \
+  BROKER_AUDIT_DB_PATH="$BROKER_DATA_DIR/audit.sqlite" \
+  RUST_LOG=info \
+    "$BROKER_BIN" --bind 127.0.0.1 --port "$BROKER_PORT" --skip-startup-check \
+    > "$WORK_DIR/broker.log" 2>&1 &
+  BROKER_PID=$!
+  for _ in $(seq 1 60); do
+    curl -sf --max-time 1 "http://127.0.0.1:$BROKER_PORT/healthz" >/dev/null 2>&1 && break
+    sleep 0.25
+  done
+  curl -sf --max-time 2 "http://127.0.0.1:$BROKER_PORT/healthz" >/dev/null \
+    || die "broker failed to come up; see $WORK_DIR/broker.log"
+  ok "broker up (pid=$BROKER_PID)"
+
+  # OIDC discovery surface — same endpoints AWS would hit in tier-2.
+  info "probing OIDC discovery surface"
+  curl -sf --max-time 2 \
+       "http://127.0.0.1:$BROKER_PORT/.well-known/openid-configuration" \
+       > "$WORK_DIR/oidc-config.json" \
+    || die "openid-configuration unreachable"
+  jq -e '.issuer == "http://127.0.0.1:'"$BROKER_PORT"'"' \
+       "$WORK_DIR/oidc-config.json" >/dev/null \
+    || die "openid-configuration issuer claim mismatch (see $WORK_DIR/oidc-config.json)"
+  ok ".well-known/openid-configuration → issuer matches"
+
+  curl -sf --max-time 2 \
+       "http://127.0.0.1:$BROKER_PORT/.well-known/jwks.json" \
+       > "$WORK_DIR/jwks.json" \
+    || die "jwks.json unreachable"
+  jq -e '.keys | length >= 1' "$WORK_DIR/jwks.json" >/dev/null \
+    || die "jwks.json has no keys (see $WORK_DIR/jwks.json)"
+  ok ".well-known/jwks.json → at least one key present"
+fi
+
+# ─── 8. Summary ──────────────────────────────────────────────────────────
+log "8/8 Summary"
+ok "ephemeral environment passed all checks"
+info "  chain      : anvil  (chain_id 31337, ephemeral)"
+info "  deployer   : $ANVIL_DEPLOYER_ADDR"
+info "  contracts  : 6/6 deployed + verified on chain"
+if [ "$SKIP_BROKER" != "1" ]; then
+  info "  broker     : http://127.0.0.1:$BROKER_PORT"
+  info "  oidc issuer: http://127.0.0.1:$BROKER_PORT"
+  info "  backend    : http://127.0.0.1:$MOCK_PORT (mock-server)"
+fi
+info ""
+info "Not covered here (needs long-lived test-broker.litentry.org —"
+info "see docs/test-environment.md):"
+info "  * stage-3 per-actor + per-data-class S3 PrincipalTag isolation"
+info "  * real AWS STS AssumeRoleWithWebIdentity"
+info "  * real SES email-link auth round-trip"
diff --git a/scripts/provision-test-environment.sh b/scripts/provision-test-environment.sh
new file mode 100755
index 0000000..6155522
--- /dev/null
+++ b/scripts/provision-test-environment.sh
@@ -0,0 +1,276 @@
+#!/usr/bin/env bash
+# scripts/provision-test-environment.sh — issue #66 tier-2 one-shot
+# provisioner for the long-lived parallel test environment.
+#
+# What this script provisions (every resource parallel to prod, every
+# name carrying a -test suffix so misconfigured CI runs targeting prod
+# fail closed):
+#
+#   1. EC2 broker host at test-broker.litentry.org (manual runbook) via:
+#        bash scripts/setup-broker-host.sh \
+#          --issuer-url https://test-broker.litentry.org \
+#          --account-id ${ACCOUNT_ID} \
+#          --signer-host signer-test.litentry.org \
+#          --audit-host audit-test.litentry.org \
+#          --email-host email-test.litentry.org \
+#          --cred-host cred-test.litentry.org \
+#          --memory-host memory-test.litentry.org \
+#          --chain-rpc https://rpc.paseo-parachain.heima.network \
+#          --vault-bucket agentkeys-vault-test-${ACCOUNT_ID} \
+#          --memory-bucket agentkeys-memory-test-${ACCOUNT_ID}
+#   2. AWS IAM OIDC provider for test-broker.litentry.org
+#   3. AWS IAM roles:
+#        - agentkeys-data-role-test    (email subsystem)
+#        - agentkeys-vault-role-test   (credentials, scoped to vault bucket)
+#        - agentkeys-memory-role-test  (long-term memory, scoped to memory bucket)
+#      All three trust-policied on the test OIDC provider, with the same
+#      PrincipalTag/agentkeys_actor_omni scoping that prod uses.
+#   4. AWS S3 buckets (per-data-class, per arch.md §17.2):
+#        - agentkeys-mail-test-${ACCOUNT_ID}
+#        - agentkeys-vault-test-${ACCOUNT_ID}
+#        - agentkeys-memory-test-${ACCOUNT_ID}
+#      Each with block-public-access, default SSE-S3, and prod's v3 split-statement
+#      PrincipalTag policy (scripts/apply-vault-bucket-policy.sh + apply-memory-bucket-policy.sh).
+#   5. A new deployer wallet on Heima-Paseo (distinct from the prod
+#      deployer), persisted at ~/.agentkeys/heima-paseo-deployer-test.key.
+#      Funded from the operator's personal Paseo wallet (no sudo on
+#      mainnet; sudo is fine on Paseo via Alice if collators are up).
+#   6. Fresh v2 stage-1 contracts deployed via DeployAgentKeysV1.s.sol
+#      to Heima-Paseo, distinct addresses from prod; copy the resulting
+#      *_HEIMA_PASEO lines into scripts/test-environment.env.
+#   7. GitHub Actions OIDC role for the e2e workflow (guidance only).
+#
+# Idempotent: re-run safely. Each step pre-checks "is this already done?"
+# before acting. Failed runs leave a paper trail in $WORK_DIR.
+#
+# This is the OPERATOR script — runs once per account. The CI workflow
+# (.github/workflows/harness-e2e.yml) consumes the provisioned env via
+# GitHub Actions secrets + scripts/test-environment.env.
+#
+# Usage:
+#   awsp agentkeys-admin                             # admin profile required
+#   bash scripts/provision-test-environment.sh       # full provisioning
+#   bash scripts/provision-test-environment.sh --dry-run
+#   bash scripts/provision-test-environment.sh --only-step N
+#
+# Per CLAUDE.md, this script is the SINGLE ENTRY POINT for test-env
+# changes. No ad-hoc aws iam / aws s3api edits — extend this script
+# instead and re-run.
+
+set -euo pipefail
+
+REPO_ROOT="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")/.." && pwd)"
+cd "$REPO_ROOT"
+
+# ─── Config defaults ─────────────────────────────────────────────────────
+DRY_RUN=0
+ONLY_STEP=""
+TEST_BROKER_HOST="${TEST_BROKER_HOST:-test-broker.litentry.org}"
+TEST_SIGNER_HOST="${TEST_SIGNER_HOST:-signer-test.litentry.org}"
+TEST_AUDIT_HOST="${TEST_AUDIT_HOST:-audit-test.litentry.org}"
+TEST_EMAIL_HOST="${TEST_EMAIL_HOST:-email-test.litentry.org}"
+TEST_CRED_HOST="${TEST_CRED_HOST:-cred-test.litentry.org}"
+TEST_MEMORY_HOST="${TEST_MEMORY_HOST:-memory-test.litentry.org}"
+TEST_ENV_FILE="$REPO_ROOT/scripts/test-environment.env"
+TEST_ENV_EXAMPLE="$REPO_ROOT/scripts/test-environment.env.example"
+WORK_DIR="$(mktemp -d -t agentkeys-provision-test-XXXXXX)"
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --dry-run)        DRY_RUN=1; shift ;;
+    --only-step)      ONLY_STEP="$2"; shift 2 ;;
+    --test-broker-host) TEST_BROKER_HOST="$2"; shift 2 ;;
+    -h|--help)
+      sed -n '2,/^set -euo/p' "$0" | sed -E 's/^# ?//' | sed '$d'
+      exit 0 ;;
+    *) echo "unknown flag: $1 (try --help)" >&2; exit 2 ;;
+  esac
+done
+
+# ─── Colors ──────────────────────────────────────────────────────────────
+if [ -t 2 ]; then
+  C_HEAD='\033[1;36m'; C_OK='\033[1;32m'; C_SKIP='\033[1;33m'
+  C_WARN='\033[1;33m'; C_ERR='\033[1;31m'; C_RESET='\033[0m'
+else
+  C_HEAD=''; C_OK=''; C_SKIP=''; C_WARN=''; C_ERR=''; C_RESET=''
+fi
+log()  { printf "${C_HEAD}==>${C_RESET} %s\n" "$*" >&2; }
+ok()   { printf "    ${C_OK}ok${C_RESET}    %s\n" "$*" >&2; }
+skip() { printf "    ${C_SKIP}skip${C_RESET}  %s\n" "$*" >&2; }
+warn() { printf "    ${C_WARN}warn${C_RESET}  %s\n" "$*" >&2; }
+die()  { printf "    ${C_ERR}fail${C_RESET}  %s\n" "$*" >&2; exit 1; }
+
+should_run_step() {
+  [ -z "$ONLY_STEP" ] && return 0
+  [ "$1" = "$ONLY_STEP" ]
+}
+
+run_or_dry() {
+  if [ "$DRY_RUN" = "1" ]; then
+    printf "    ${C_WARN}dry-run${C_RESET} %s\n" "$*" >&2
+  else
+    "$@"
+  fi
+}
+
+# ─── Step 0: prerequisite check ──────────────────────────────────────────
+log "0/7 Prereq check"
+caller_arn=$(aws sts get-caller-identity --query Arn --output text 2>&1) \
+  || die "aws sts get-caller-identity failed: $caller_arn — run: awsp agentkeys-admin"
+caller_lc=$(printf '%s' "$caller_arn" | tr '[:upper:]' '[:lower:]')
+case "$caller_lc" in
+  *":user/agentkeys-admin"*) ok "caller: $caller_arn" ;;
+  *) die "caller is $caller_arn — admin required. Run: awsp agentkeys-admin" ;;
+esac
+ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
+REGION="${AWS_REGION:-us-east-1}"
+ok "ACCOUNT_ID=$ACCOUNT_ID REGION=$REGION"
+
+# Seed the env file if missing
+if [ ! -f "$TEST_ENV_FILE" ]; then
+  [ -f "$TEST_ENV_EXAMPLE" ] || die "missing $TEST_ENV_EXAMPLE (committed template)"
+  cp "$TEST_ENV_EXAMPLE" "$TEST_ENV_FILE"
+  ok "seeded $TEST_ENV_FILE from .example"
+fi
+
+env_set() {
+  local key="$1" val="$2" file="$3"
+  if grep -qE "^${key}=" "$file" 2>/dev/null; then
+    if [ "$(uname)" = "Darwin" ]; then
+      sed -i '' -E "s|^${key}=.*|${key}=${val}|" "$file"
+    else
+      sed -i -E "s|^${key}=.*|${key}=${val}|" "$file"
+    fi
+  else
+    printf '%s=%s\n' "$key" "$val" >> "$file"
+  fi
+}
+env_set ACCOUNT_ID "$ACCOUNT_ID" "$TEST_ENV_FILE"
+env_set REGION "$REGION" "$TEST_ENV_FILE"
+
+# ─── Step 1: provision the broker host (mirrors prod §5) ─────────────────
+if should_run_step 1; then
+  log "1/7 Provision broker host (${TEST_BROKER_HOST})"
+  cat >&2 <<EOF
+    This step is OPERATOR-DRIVEN — setup-broker-host.sh runs on the
+    target EC2, not on your laptop. The runbook:
+
+      1. Stand up a fresh t3.micro EC2 with an Elastic IP.
+      2. Add an A record for ${TEST_BROKER_HOST} pointing at the EIP.
+         (Same for signer-test / audit-test / email-test / cred-test /
+          memory-test — five additional A records.)
+      3. SSH into the EC2 as ec2-user, then:
+           git clone https://github.com/<owner>/agentKeys && cd agentKeys
+           bash scripts/setup-broker-host.sh \\
+             --issuer-url https://${TEST_BROKER_HOST} \\
+             --account-id ${ACCOUNT_ID} \\
+             --signer-host ${TEST_SIGNER_HOST} \\
+             --audit-host ${TEST_AUDIT_HOST} \\
+             --email-host ${TEST_EMAIL_HOST} \\
+             --cred-host ${TEST_CRED_HOST} \\
+             --memory-host ${TEST_MEMORY_HOST} \\
+             --chain-rpc https://rpc.paseo-parachain.heima.network \\
+             --vault-bucket agentkeys-vault-test-${ACCOUNT_ID} \\
+             --memory-bucket agentkeys-memory-test-${ACCOUNT_ID} \\
+             --email-from noreply-test@bots-test.litentry.org \\
+             --non-interactive --yes
+      4. Confirm: curl -sf https://${TEST_BROKER_HOST}/healthz
+
+    See docs/test-environment.md §3 for the full host runbook.
+EOF
+  skip "manual operator step; rerun --only-step 2 once the host is up"
+fi
+
+# ─── Step 2: IAM OIDC provider for test-broker ───────────────────────────
+if should_run_step 2; then
+  log "2/7 IAM OIDC provider (oidc-provider/${TEST_BROKER_HOST})"
+  oidc_arn="arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${TEST_BROKER_HOST}"
+  if aws iam get-open-id-connect-provider --open-id-connect-provider-arn "$oidc_arn" \
+       >/dev/null 2>&1; then
+    skip "OIDC provider already registered: $oidc_arn"
+  else
+    # AWS wants the SHA-1 thumbprint of the top intermediate CA, i.e. the
+    # last certificate the server presents (not the leaf); cert is public.
+    thumb=$(openssl s_client -showcerts -servername "$TEST_BROKER_HOST" \
+              -connect "${TEST_BROKER_HOST}:443" </dev/null 2>/dev/null \
+              | awk '/BEGIN CERT/{c=""} {c=c $0 "\n"} /END CERT/{last=c} END{printf "%s", last}' \
+              | openssl x509 -fingerprint -sha1 -noout 2>/dev/null \
+              | awk -F'=' '{print $2}' | tr -d ':' | tr 'A-Z' 'a-z') || true
+    [ -n "$thumb" ] || die "could not fetch TLS thumbprint for ${TEST_BROKER_HOST}; is the broker reachable?"
+    run_or_dry aws iam create-open-id-connect-provider \
+      --url "https://${TEST_BROKER_HOST}" \
+      --client-id-list "sts.amazonaws.com" \
+      --thumbprint-list "$thumb"
+    ok "registered $oidc_arn (thumbprint=$thumb)"
+  fi
+  env_set OIDC_PROVIDER_ARN "$oidc_arn" "$TEST_ENV_FILE"
+fi
+
+# ─── Step 3: IAM roles (data, vault, memory) ─────────────────────────────
+if should_run_step 3; then
+  log "3/7 IAM roles (data-test, vault-test, memory-test)"
+  # These wrap the existing prod provisioning scripts with a -test
+  # suffix on every name. The scripts read role/bucket names from env,
+  # so set env then call.
+  warn "extend scripts/provision-vault-role.sh + provision-memory-role.sh"
+  warn "to accept a SUFFIX env var, or copy them as -test variants."
+  warn "Tracking as a TODO in this script — exercise once the prod"
+  warn "scripts are parameterized (~ 1 PR of work)."
+fi
+
+# ─── Step 4: S3 buckets ──────────────────────────────────────────────────
+if should_run_step 4; then
+  log "4/7 S3 buckets (mail-test, vault-test, memory-test)"
+  warn "same parameterization story as step 3 — see TODO above."
+fi
+
+# ─── Step 5: deployer wallet + funding ───────────────────────────────────
+if should_run_step 5; then
+  log "5/7 Deployer wallet on Heima-Paseo (distinct from prod deployer)"
+  KEYFILE="$HOME/.agentkeys/heima-paseo-deployer-test.key"
+  if [ -f "$KEYFILE" ]; then
+    skip "$KEYFILE exists"
+  elif [ "$DRY_RUN" = "1" ]; then
+    run_or_dry cast wallet new --json   # print only; never write an empty key file
+  else
+    mkdir -p "$(dirname "$KEYFILE")"
+    cast wallet new --json > "$WORK_DIR/wallet.json"
+    (umask 077; jq -r '.[0].private_key' "$WORK_DIR/wallet.json" > "$KEYFILE")
+    addr=$(jq -r '.[0].address' "$WORK_DIR/wallet.json")
+    ok "generated $KEYFILE (addr=$addr); fund this address from your"
+    ok "  personal Paseo wallet, then re-run --only-step 6 to deploy contracts."
+  fi
+fi
+
+# ─── Step 6: deploy v2 stage-1 contracts on Heima-Paseo ──────────────────
+if should_run_step 6; then
+  log "6/7 Deploy v2 stage-1 contracts to Heima-Paseo (new contracts on-chain)"
+  KEYFILE="$HOME/.agentkeys/heima-paseo-deployer-test.key"
+  [ -f "$KEYFILE" ] || die "missing $KEYFILE — run --only-step 5 first"
+  run_or_dry env HEIMA_DEPLOYER_KEY_FILE="$KEYFILE" \
+    AGENTKEYS_CHAIN=heima-paseo \
+    bash "$REPO_ROOT/scripts/heima-bring-up.sh"
+  ok "contract addresses recorded in scripts/operator-workstation.env;"
+  ok "  copy the *_HEIMA_PASEO lines into $TEST_ENV_FILE."
+fi
+
+# ─── Step 7: GitHub Actions OIDC role for the e2e workflow ───────────────
+if should_run_step 7; then
+  log "7/7 GitHub Actions OIDC role (test-only)"
+  warn "Create an additional IAM role 'github-actions-agentkeys-e2e'"
+  warn "with trust policy on token.actions.githubusercontent.com and a"
+  warn "condition limiting to the agentkeys repo + branch ref. Grant"
+  warn "agentkeys-vault-role-test + agentkeys-memory-role-test assume"
+  warn "perms and read-only S3 on the three test buckets."
+  warn ""
+  warn "Then store the role ARN as the TEST_OIDC_AWS_ROLE_ARN repo secret."
+  warn "Until that secret is set, .github/workflows/harness-e2e.yml is"
+  warn "inert (the job is gated on its presence)."
+fi
+
+# ─── Done ────────────────────────────────────────────────────────────────
+log "Done"
+ok "test environment provisioning complete (or skip-noted above)"
+ok "next: bash harness/v2-stage3-demo.sh against \$OIDC_ISSUER=https://${TEST_BROKER_HOST}"
+ok "  with AGENTKEYS_ENV_FILE=$TEST_ENV_FILE"
+rm -rf "$WORK_DIR"
diff --git a/scripts/test-environment.env.example b/scripts/test-environment.env.example
new file mode 100644
index 0000000..68c8787
--- /dev/null
+++ b/scripts/test-environment.env.example
@@ -0,0 +1,92 @@
+# AgentKeys long-lived test environment — env file template (issue #66 tier-2).
+#
+# Companion to scripts/operator-workstation.env, but for the PARALLEL
+# test infrastructure (not prod):
+#
+#   - Hostname:   test-broker.litentry.org     (vs. broker.litentry.org)
+#   - OIDC iss:   https://test-broker.litentry.org
+#   - IAM role:   agentkeys-data-role-test     (vs. agentkeys-data-role)
+#   - Vault role: agentkeys-vault-role-test    (vs. agentkeys-vault-role)
+#   - Mem role:   agentkeys-memory-role-test   (vs. agentkeys-memory-role)
+#   - Mail/vault/memory buckets: -test suffix on every bucket name
+#   - Chain:      heima-paseo (testnet — no real-HEI cost on every CI run)
+#   - Deployer:   separate keypair, persisted only in operator wallet
+#   - Contracts:  deployed fresh by scripts/provision-test-environment.sh,
+#                 distinct addresses from prod (recorded below per chain)
+#
+# Why mirror operator-workstation.env instead of forking it: the harness
+# scripts (harness/v2-stage*.sh) source ONE env file. Setting
+# AGENTKEYS_ENV_FILE=./scripts/test-environment.env before invoking a
+# harness script reuses the entire flow against the test infra unchanged.
+#
+# Bring-up: bash scripts/provision-test-environment.sh
+# Activate: cp scripts/test-environment.env.example scripts/test-environment.env
+#           (then fill in the values below from the provisioner output)
+#
+# This .example file commits as-is. The non-example copy MUST NOT be
+# committed (it carries no secrets in itself, but its contents are the
+# canonical "this account hosts the test infra" pointer — gated behind
+# the operator's deliberate copy).
+#
+# See docs/test-environment.md for the full bring-up runbook.
+
+# ─── AWS account ─────────────────────────────────────────────────────────
+# Same account as prod is fine for cost, but every resource name carries
+# a -test suffix so a misconfigured CI run targeting prod fails closed
+# (the role / bucket / OIDC provider simply won't exist in prod).
+ACCOUNT_ID=000000000000
+REGION=us-east-1
+
+# ─── Hostname + OIDC issuer ──────────────────────────────────────────────
+# DNS A record + TLS cert + nginx + systemd all per scripts/setup-broker-host.sh
+# with --issuer-url https://test-broker.litentry.org. Long-lived because
+# AWS validates the OIDC issuer URL byte-for-byte against the JWT `iss`
+# claim — every reboot must restore the same URL.
+BROKER_HOST=test-broker.litentry.org
+OIDC_ISSUER=https://${BROKER_HOST}
+OIDC_PROVIDER_ARN=arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${BROKER_HOST}
+
+# ─── IAM roles (parallel to prod, distinct ARNs) ─────────────────────────
+DATA_ROLE_ARN=arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-data-role-test
+VAULT_ROLE_ARN=arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-vault-role-test
+MEMORY_ROLE_ARN=arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-memory-role-test
+
+# ─── S3 buckets (parallel to prod, distinct names) ───────────────────────
+MAIL_DOMAIN=bots-test.litentry.org
+MAIL_BUCKET=agentkeys-mail-test-${ACCOUNT_ID}
+BUCKET=${MAIL_BUCKET}
+VAULT_BUCKET=agentkeys-vault-test-${ACCOUNT_ID}
+MEMORY_BUCKET=agentkeys-memory-test-${ACCOUNT_ID}
+
+# ─── Backend (signer) URL ────────────────────────────────────────────────
+# Test env runs the mock-server backend (the production dev_key_service
+# shape). Real TEE workers are out of scope for the test environment —
+# see issue #74 step 2.
+AGENTKEYS_SIGNER_URL=https://signer-test.litentry.org
+BACKEND_URL=${AGENTKEYS_SIGNER_URL}
+
+# ─── Chain (Heima-Paseo testnet) ─────────────────────────────────────────
+# Defaults to Paseo for zero real-HEI cost. Override to `anvil` for
+# fully local runs; never `heima` (mainnet — prod-only).
+AGENTKEYS_CHAIN=heima-paseo
+
+# Contract addresses — populated by scripts/provision-test-environment.sh.
+# Keep one set per chain so re-bring-up against another chain doesn't
+# clobber. The non-test file commits the actual addresses post-deploy.
+SCOPE_CONTRACT_ADDRESS_HEIMA_PASEO=0x0000000000000000000000000000000000000000
+SIDECAR_REGISTRY_ADDRESS_HEIMA_PASEO=0x0000000000000000000000000000000000000000
+K3_EPOCH_COUNTER_ADDRESS_HEIMA_PASEO=0x0000000000000000000000000000000000000000
+CREDENTIAL_AUDIT_ADDRESS_HEIMA_PASEO=0x0000000000000000000000000000000000000000
+P256_VERIFIER_ADDRESS_HEIMA_PASEO=0x0000000000000000000000000000000000000000
+K11_VERIFIER_ADDRESS_HEIMA_PASEO=0x0000000000000000000000000000000000000000
+
+# ─── Deployer key path ───────────────────────────────────────────────────
+# Operator-held only; the test deployer is a DIFFERENT wallet from prod.
+# Provisioner persists it at ~/.agentkeys/heima-paseo-deployer-test.key.
+HEIMA_DEPLOYER_KEY_FILE=${HOME}/.agentkeys/heima-paseo-deployer-test.key
+
+# ─── CI namespacing (per-run S3 prefix isolation) ────────────────────────
+# Set by the e2e workflow at run time so concurrent CI runs don't step
+# on each other's writes. Cleaned up by nightly s3-prefix-rm job (see
+# docs/test-environment.md §Cleanup).
+CI_S3_PREFIX=ci/pr-${PR_NUMBER:-manual}/run-${GITHUB_RUN_ID:-local}

From cd25bdecd549f786856ed5471a6bb7dce652424e Mon Sep 17 00:00:00 2001
From: wildmeta-agent <agent@wildmeta.ai>
Date: Thu, 21 May 2026 09:33:21 +0800
Subject: [PATCH 2/4] issue #66: collapse to one CI file; mirror prod env on
 Heima mainnet
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Per operator feedback:

1. "do not create new files, only add the test file" — drop the
   ephemeral-stack helper, provisioner, env template, e2e workflow,
   and docs. Single deliverable: .github/workflows/harness-ci.yml.

2. "onchain solution should test on Heima mainnet with a new smart
   contract address" — confirmed possible: Solidity compiles
   deterministically and EVM contract addresses derive from
   (deployer, nonce). Identical crates/agentkeys-chain/src/*.sol +
   identical DeployAgentKeysV1.s.sol + a different deployer key on
   Heima mainnet = isolated parallel contract set at new addresses on
   the production chain.

3. "CI mirrors the production env" — the workflow now invokes the
   PRODUCTION harness scripts (harness/v2-stage{1,2,3}-demo.sh)
   unchanged. The only thing CI does differently from a prod operator
   is materialize scripts/operator-workstation.env with TEST_*
   resource names from GitHub secrets:

     - TEST_OIDC_AWS_ROLE_ARN  (gate; until set, harness job skips)
     - TEST_ACCOUNT_ID / TEST_AWS_REGION / TEST_BROKER_HOST
     - TEST_VAULT_BUCKET / TEST_MEMORY_BUCKET
     - TEST_{VAULT,MEMORY,DATA}_ROLE_ARN
     - TEST_HEIMA_DEPLOYER_KEY  (raw 0x-prefixed mainnet key — test
                                 wallet, distinct from prod deployer)
     - TEST_SCOPE_CONTRACT_ADDRESS_HEIMA and
       TEST_{SIDECAR_REGISTRY,K3_EPOCH_COUNTER,CREDENTIAL_AUDIT,
             P256_VERIFIER,K11_VERIFIER}_ADDRESS_HEIMA
       (pre-deployed once per test-env refresh; harness skips deploy
        via --skip-deploy so CI doesn't burn HEI on every push)

   AWS auth via GitHub Actions OIDC (id-token: write), no long-lived AWS
   credentials. Per-run S3 prefix isolation. The workflow gates itself on
   TEST_OIDC_AWS_ROLE_ARN being set so it's inert until the operator
   activates the test infra.
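
Sketch of the gate-and-materialize step from item 3, assuming the
secrets reach the shell as environment variables of the same name (the
workflow itself interpolates them inline). The function names gate,
missing_vars and materialize_env, and the shortened variable list, are
illustrative and not part of this patch:

```shell
#!/usr/bin/env bash
# Sketch only: variable names mirror the TEST_* secrets listed above;
# the function names and output path are hypothetical.
set -euo pipefail

# Emit the preflight decision: run only when the gate secret is present.
gate() {
  if [ -n "${TEST_OIDC_AWS_ROLE_ARN:-}" ]; then
    echo "should_run=true"
  else
    echo "should_run=false"
  fi
}

# Print the name of every listed variable that is unset or empty.
missing_vars() {
  local name
  for name in "$@"; do
    [ -n "${!name:-}" ] || printf '%s\n' "$name"
  done
}

# Write the env file the harness scripts source, or fail closed without
# touching the file when any required value is missing.
materialize_env() {
  local out="$1" run_id="$2" missing
  missing="$(missing_vars TEST_ACCOUNT_ID TEST_BROKER_HOST \
    TEST_VAULT_BUCKET TEST_MEMORY_BUCKET)"
  if [ -n "$missing" ]; then
    echo "refusing to write $out; unset: $(printf '%s' "$missing" | tr '\n' ' ')" >&2
    return 1
  fi
  cat > "$out" <<EOF
ACCOUNT_ID=${TEST_ACCOUNT_ID}
BROKER_HOST=${TEST_BROKER_HOST}
OIDC_ISSUER=https://${TEST_BROKER_HOST}
VAULT_BUCKET=${TEST_VAULT_BUCKET}
MEMORY_BUCKET=${TEST_MEMORY_BUCKET}
AGENTKEYS_CHAIN=heima
CI_S3_PREFIX=ci/run-${run_id}
EOF
}
```

In a workflow step this could be called as
`gate >> "$GITHUB_OUTPUT"` and
`materialize_env scripts/operator-workstation.env "$GITHUB_RUN_ID"`.
Checking for empty values before writing means a half-configured
secret set stops the job instead of producing an env file with blank
resource names.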

WebAuthn: never invoked — harness scripts default to WEBAUTHN_MODE=0
(stage-1 line 131) and stage-2's --stub flag is passed explicitly.
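
The default-to-stub behaviour can be pictured as below. This is a
hypothetical parser written for illustration; the real flag handling
in harness/v2-stage1-demo.sh and v2-stage2-demo.sh may differ, and
parse_mode is not a function in the repo:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: resolve the WebAuthn mode from the environment
# default plus command-line flags. Not the harness scripts' actual code.
parse_mode() {
  local mode="${WEBAUTHN_MODE:-0}"   # stub mode unless the caller opts in
  local arg
  for arg in "$@"; do
    case "$arg" in
      --webauthn) mode=1 ;;          # explicit opt-in to real WebAuthn
      --stub)     mode=0 ;;          # explicit stub mode
    esac
  done
  printf '%s\n' "$mode"
}
```

Under that assumption, the flags CI passes (--skip-deploy --skip-email
for stage 1, --stub --skip-build for stage 2) both resolve to mode 0.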

LLM: zero. Plain cargo/forge/aws-cli/curl orchestration. Distinct from
claude.yml + claude-code-review.yml which intentionally do call @claude.
---
 .github/workflows/harness-ci.yml      | 299 +++++++++++++------
 .github/workflows/harness-e2e.yml     | 204 -------------
 docs/test-environment.md              | 166 -----------
 harness/ci-ephemeral-stack.sh         | 401 --------------------------
 scripts/provision-test-environment.sh | 276 ------------------
 scripts/test-environment.env.example  |  92 ------
 6 files changed, 210 insertions(+), 1228 deletions(-)
 delete mode 100644 .github/workflows/harness-e2e.yml
 delete mode 100644 docs/test-environment.md
 delete mode 100755 harness/ci-ephemeral-stack.sh
 delete mode 100755 scripts/provision-test-environment.sh
 delete mode 100644 scripts/test-environment.env.example

diff --git a/.github/workflows/harness-ci.yml b/.github/workflows/harness-ci.yml
index ec19e66..0505d45 100644
--- a/.github/workflows/harness-ci.yml
+++ b/.github/workflows/harness-ci.yml
@@ -1,39 +1,67 @@
 name: harness CI (no LLM)
 
-# Issue #66 tier-1: deterministic, no-LLM, no-WebAuthn CI that exercises
-# the same code paths the harness scripts run, but against an ephemeral
-# in-CI test environment (anvil + mock-server + stub-STS broker).
+# Issue #66: deterministic, no-LLM, no-WebAuthn CI that runs the SAME
+# production harness scripts (harness/v2-stage{1,2,3}-demo.sh) against
+# a parallel TEST instance of the production environment.
 #
-# Separate from the existing claude.yml / claude-code-review.yml workflows
-# (which invoke @claude on PR comments + reviews). This workflow never
-# spends LLM tokens — it's plain cargo/forge/curl orchestration.
+# "Mirror production" means: same Heima mainnet chain, same Solidity
+# source files, same harness scripts, same broker code, same AWS
+# IAM/STS/S3 surfaces. The only delta is identifiers — a different
+# deployer wallet → different contract addresses; a different OIDC
+# provider URL → different IAM role + bucket. Every test resource
+# carries a -test suffix so a misconfigured run targeting prod fails
+# closed (the role/bucket simply won't exist in prod).
 #
-# Coverage map (matches harness/v2-stage*.sh where ephemeral CI can):
+# Operator-provided GitHub repo secrets (one-shot setup, then immutable
+# for the life of the test environment):
 #
-#   * `cargo fmt --check`                 — formatting gate
-#   * `cargo clippy -D warnings`          — lint gate
-#   * `cargo test --workspace`            — unit + in-process integration
-#                                            tests. The broker tests
-#                                            already spawn a full
-#                                            in-process broker with
-#                                            StubSts + StubEmailSender,
-#                                            so SIWE / OIDC mint / cap
-#                                            verify / multi-master /
-#                                            recovery / per-data-class
-#                                            isolation Rust logic is all
-#                                            covered here. Per CLAUDE.md
-#                                            "all async / #[tokio::test]"
-#                                            convention.
-#   * `harness/ci-ephemeral-stack.sh`     — forge build + forge test +
-#                                            forge script deploy on a
-#                                            fresh anvil + read-only
-#                                            ABI/wiring verification.
-#                                            Plus broker boot smoke +
-#                                            OIDC discovery surface.
+#   TEST_OIDC_AWS_ROLE_ARN  IAM role assumed by this workflow via GitHub
+#                           Actions OIDC. Trust policy:
+#                             "token.actions.githubusercontent.com",
+#                             conditioned on this repo + ref. Grants:
+#                             sts:AssumeRole on the test data roles +
+#                             S3 read + delete under ci/ (cleanup step).
+#   TEST_ACCOUNT_ID         AWS account ID hosting the test infra.
+#                           Same account as prod is fine — isolation is
+#                           by resource name, not by account.
+#   TEST_AWS_REGION         e.g. us-east-1
+#   TEST_BROKER_HOST        test-broker.litentry.org (long-lived; AWS
+#                           validates OIDC issuer URLs byte-for-byte,
+#                           so this must outlast any single CI run).
+#   TEST_VAULT_BUCKET       agentkeys-vault-test-${ACCOUNT_ID}
+#   TEST_MEMORY_BUCKET      agentkeys-memory-test-${ACCOUNT_ID}
+#   TEST_VAULT_ROLE_ARN     arn:aws:iam::${ACCT}:role/agentkeys-vault-role-test
+#   TEST_MEMORY_ROLE_ARN    arn:aws:iam::${ACCT}:role/agentkeys-memory-role-test
+#   TEST_DATA_ROLE_ARN      arn:aws:iam::${ACCT}:role/agentkeys-data-role-test
+#   TEST_HEIMA_DEPLOYER_KEY 0x-prefixed Heima mainnet test wallet private
+#                           key (DIFFERENT from prod deployer). Deploys
+#                           the same crates/agentkeys-chain/src/*.sol to
+#                           new addresses on mainnet via the same
+#                           DeployAgentKeysV1.s.sol script. Solidity
+#                           bytecode is deterministic and contract
+#                           addresses derive from (deployer, nonce), so
+#                           a different key + same source = isolated
+#                           parallel contract set on the production
+#                           chain. Fund this wallet once from the
+#                           operator's personal Heima wallet.
+#   TEST_SCOPE_CONTRACT_ADDRESS_HEIMA      pinned addresses of the
+#   TEST_SIDECAR_REGISTRY_ADDRESS_HEIMA    test-deployer's mainnet deploy
+#   TEST_K3_EPOCH_COUNTER_ADDRESS_HEIMA    (so CI doesn't burn HEI on
+#   TEST_CREDENTIAL_AUDIT_ADDRESS_HEIMA     every run). One-shot deploy
+#   TEST_P256_VERIFIER_ADDRESS_HEIMA        per test-environment refresh.
+#   TEST_K11_VERIFIER_ADDRESS_HEIMA
 #
-# Tier-2 (long-lived test-broker.litentry.org, full stage-3 PrincipalTag
-# isolation, real AWS STS) lives in .github/workflows/harness-e2e.yml
-# and is gated on operator-provisioned infra; see docs/test-environment.md.
+# Gating: until TEST_OIDC_AWS_ROLE_ARN is set, the workflow's preflight
+# job surfaces a ::warning:: skip and exits clean — safe to merge before
+# the operator activates the test infra.
+#
+# WebAuthn: never invoked. harness/v2-stage1-demo.sh defaults to
+# WEBAUTHN_MODE=0 (line 131) and v2-stage2-demo.sh is run with --stub,
+# so neither this workflow nor the harness scripts reach WebAuthn paths.
+#
+# LLM: never invoked. This workflow is plain cargo/forge/aws-cli/curl —
+# distinct from claude.yml + claude-code-review.yml which DO call @claude
+# on PR comments + reviews. This workflow consumes zero LLM tokens.
 
 on:
   push:
@@ -46,14 +74,23 @@ on:
       - ".github/workflows/harness-ci.yml"
       - "Cargo.toml"
       - "Cargo.lock"
+  workflow_dispatch:
+    inputs:
+      stage:
+        description: "Which harness stage to run (1, 2, 3, or all)"
+        required: false
+        default: "all"
+        type: choice
+        options: ["1", "2", "3", "all"]
 
-# Allow only one concurrent run per ref so re-pushes cancel stale runs
-# (saves runner minutes; each ephemeral stack spins up anvil + builds the
-# workspace, so wall-clock matters).
 concurrency:
   group: harness-ci-${{ github.ref }}
   cancel-in-progress: true
 
+permissions:
+  id-token: write   # GitHub Actions OIDC → assume TEST_OIDC_AWS_ROLE_ARN
+  contents: read
+
 jobs:
   rust-checks:
     name: cargo fmt + clippy + test
@@ -62,78 +99,162 @@ jobs:
     steps:
       - uses: actions/checkout@v4
 
-      - name: Install Rust toolchain
-        uses: dtolnay/rust-toolchain@stable
+      - uses: dtolnay/rust-toolchain@stable
         with:
           components: clippy, rustfmt
 
-      - name: Cache cargo registry + target
-        uses: Swatinem/rust-cache@v2
+      - uses: Swatinem/rust-cache@v2
         with:
           shared-key: harness-ci
 
-      - name: cargo fmt --check
-        run: cargo fmt --all -- --check
+      - run: cargo fmt --all -- --check
+      - run: cargo clippy --workspace --all-targets -- -D warnings
+      # --test-threads=1: broker tests mutate shared process env (HOME,
+      # AWS_*) and the keyring tests serialize on a per-process accounts
+      # map — same convention as the existing @claude review workflow.
+      - run: cargo test --workspace -- --test-threads=1
 
-      - name: cargo clippy
-        # -D warnings: any clippy diagnostic blocks merge. Matches the
-        # project's "fix the warning, don't silence it" convention.
-        run: cargo clippy --workspace --all-targets -- -D warnings
-
-      - name: cargo test --workspace
-        # --test-threads=1: the broker tests mutate shared process env
-        # (HOME, AWS_*) and the keyring tests serialize on a per-process
-        # accounts map — same convention as the @claude review workflow.
-        run: cargo test --workspace -- --test-threads=1
+  preflight:
+    # Gate the harness jobs on the test infra credentials being present.
+    # Until the operator sets TEST_OIDC_AWS_ROLE_ARN, the harness jobs
+    # surface as skipped rather than failing.
+    name: gate on test infra availability
+    runs-on: ubuntu-latest
+    needs: rust-checks
+    outputs:
+      should_run: ${{ steps.gate.outputs.should_run }}
+    steps:
+      - id: gate
+        run: |
+          if [ -n "${{ secrets.TEST_OIDC_AWS_ROLE_ARN }}" ]; then
+            echo "should_run=true" >> "$GITHUB_OUTPUT"
+            echo "test infra credentials present; proceeding"
+          else
+            echo "should_run=false" >> "$GITHUB_OUTPUT"
+            echo "::warning::TEST_OIDC_AWS_ROLE_ARN unset — harness E2E skipped. See workflow header for operator setup."
+          fi
 
-  ephemeral-stack:
-    name: ephemeral anvil + chain deploy
+  harness-e2e:
+    name: harness/v2-stage*-demo.sh on Heima mainnet (test deployer)
+    needs: preflight
+    if: needs.preflight.outputs.should_run == 'true'
     runs-on: ubuntu-latest
-    timeout-minutes: 45
-    needs: rust-checks  # don't burn runner minutes on chain checks if Rust is red
+    timeout-minutes: 60
+
     steps:
       - uses: actions/checkout@v4
         with:
-          # forge install reads .gitmodules — need submodules for forge-std etc.
-          submodules: recursive
-
-      - name: Install Rust toolchain
-        uses: dtolnay/rust-toolchain@stable
+          submodules: recursive  # forge install reads .gitmodules
 
-      - name: Cache cargo registry + target
-        uses: Swatinem/rust-cache@v2
+      - uses: dtolnay/rust-toolchain@stable
+      - uses: Swatinem/rust-cache@v2
         with:
-          shared-key: harness-ci  # share with rust-checks job
+          shared-key: harness-ci
 
-      - name: Install Foundry (anvil + forge + cast)
-        uses: foundry-rs/foundry-toolchain@v1
+      - uses: foundry-rs/foundry-toolchain@v1
         with:
           version: stable
 
-      - name: Verify Foundry toolchain
-        run: |
-          anvil --version
-          forge --version
-          cast --version
-
-      - name: Run ephemeral stack (chain + broker smoke)
-        # The script handles its own anvil + broker bring-up/tear-down via
-        # an EXIT trap. Fails the job if any step (forge build/test/deploy,
-        # contract verification, broker boot, OIDC discovery) fails.
-        run: bash harness/ci-ephemeral-stack.sh
-        env:
-          # Pinned ports so the workflow log is reproducible.
-          ANVIL_PORT: "8545"
-          MOCK_PORT: "8090"
-          BROKER_PORT: "8091"
-          # Fail builds on rustc warnings as well (matches clippy job).
-          RUSTFLAGS: "-D warnings"
-
-      - name: Upload logs on failure
-        if: failure()
-        uses: actions/upload-artifact@v4
+      - name: Configure AWS credentials via OIDC (test role)
+        uses: aws-actions/configure-aws-credentials@v4
         with:
-          name: ephemeral-stack-logs
-          path: /tmp/agentkeys-ci-ephemeral-*/
-          if-no-files-found: ignore
-          retention-days: 7
+          role-to-assume: ${{ secrets.TEST_OIDC_AWS_ROLE_ARN }}
+          aws-region: ${{ secrets.TEST_AWS_REGION || 'us-east-1' }}
+          # Session name shows up in CloudTrail — keep traceable per run.
+          role-session-name: gh-ci-${{ github.run_id }}
+
+      - name: Build agentkeys CLI + workers (release)
+        run: cargo build --release --workspace
+
+      - name: Materialize the production env file with TEST values
+        # The harness scripts source scripts/operator-workstation.env
+        # unchanged. We OVERWRITE it with the test resource names so
+        # the entire production harness flow re-points at the test
+        # infra without modifying a single script — that's what
+        # "mirror production env" means.
+        #
+        # Same chain (heima mainnet), same .sol code, same scripts.
+        # Different deployer key → different contract addresses on the
+        # SAME mainnet → fully isolated parallel contract set.
+        run: |
+          cat > scripts/operator-workstation.env <<EOF
+          ACCOUNT_ID=${{ secrets.TEST_ACCOUNT_ID }}
+          REGION=${{ secrets.TEST_AWS_REGION || 'us-east-1' }}
+          BROKER_HOST=${{ secrets.TEST_BROKER_HOST }}
+          OIDC_ISSUER=https://${{ secrets.TEST_BROKER_HOST }}
+          OIDC_PROVIDER_ARN=arn:aws:iam::${{ secrets.TEST_ACCOUNT_ID }}:oidc-provider/${{ secrets.TEST_BROKER_HOST }}
+          MAIL_DOMAIN=bots-test.litentry.org
+          MAIL_BUCKET=agentkeys-mail-test-${{ secrets.TEST_ACCOUNT_ID }}
+          BUCKET=agentkeys-mail-test-${{ secrets.TEST_ACCOUNT_ID }}
+          VAULT_BUCKET=${{ secrets.TEST_VAULT_BUCKET }}
+          MEMORY_BUCKET=${{ secrets.TEST_MEMORY_BUCKET }}
+          DATA_ROLE_ARN=${{ secrets.TEST_DATA_ROLE_ARN }}
+          VAULT_ROLE_ARN=${{ secrets.TEST_VAULT_ROLE_ARN }}
+          MEMORY_ROLE_ARN=${{ secrets.TEST_MEMORY_ROLE_ARN }}
+          AGENTKEYS_SIGNER_URL=https://signer-test.litentry.org
+          BACKEND_URL=https://signer-test.litentry.org
+          AGENTKEYS_CHAIN=heima
+          SCOPE_CONTRACT_ADDRESS_HEIMA=${{ secrets.TEST_SCOPE_CONTRACT_ADDRESS_HEIMA }}
+          SIDECAR_REGISTRY_ADDRESS_HEIMA=${{ secrets.TEST_SIDECAR_REGISTRY_ADDRESS_HEIMA }}
+          K3_EPOCH_COUNTER_ADDRESS_HEIMA=${{ secrets.TEST_K3_EPOCH_COUNTER_ADDRESS_HEIMA }}
+          CREDENTIAL_AUDIT_ADDRESS_HEIMA=${{ secrets.TEST_CREDENTIAL_AUDIT_ADDRESS_HEIMA }}
+          P256_VERIFIER_ADDRESS_HEIMA=${{ secrets.TEST_P256_VERIFIER_ADDRESS_HEIMA }}
+          K11_VERIFIER_ADDRESS_HEIMA=${{ secrets.TEST_K11_VERIFIER_ADDRESS_HEIMA }}
+          HEIMA_DEPLOYER_KEY_FILE=$HOME/.agentkeys/heima-deployer.key
+          # Per-run S3 prefix so concurrent runs don't step on each
+          # other's writes. Nightly cleanup script (operator-side) rm's
+          # ci/run-* prefixes older than 7d.
+          CI_S3_PREFIX=ci/run-${{ github.run_id }}
+          EOF
+
+      - name: Materialize test deployer key
+        # Same path the production heima-bring-up.sh writes to. CI
+        # populates from a GitHub secret instead of operator interaction.
+        run: |
+          mkdir -p "$HOME/.agentkeys"
+          umask 077
+          printf '%s\n' '${{ secrets.TEST_HEIMA_DEPLOYER_KEY }}' \
+            > "$HOME/.agentkeys/heima-deployer.key"
+          chmod 600 "$HOME/.agentkeys/heima-deployer.key"
+
+      - name: Stage 1 — chain reachability + identity bootstrap
+        if: ${{ inputs.stage == 'all' || inputs.stage == '1' || inputs.stage == '' }}
+        # --skip-deploy: contracts are pre-deployed once per test-env
+        # refresh (operator one-shot) and pinned in TEST_*_HEIMA secrets,
+        # so CI doesn't burn HEI on every push.
+        # --skip-email: SES email-link round-trip is exercised separately;
+        # identity bootstrap here uses wallet_sig.
+        # No --webauthn: stub-mode K11 (WEBAUTHN_MODE=0 default).
+        run: |
+          AGENTKEYS_CHAIN=heima \
+            bash harness/v2-stage1-demo.sh --skip-deploy --skip-email
+
+      - name: Stage 2 — multi-master + recovery (stub mode)
+        if: ${{ inputs.stage == 'all' || inputs.stage == '2' || inputs.stage == '' }}
+        run: |
+          AGENTKEYS_CHAIN=heima \
+            bash harness/v2-stage2-demo.sh --stub --skip-build
+
+      - name: Stage 3 — per-actor + per-data-class PrincipalTag isolation
+        if: ${{ inputs.stage == 'all' || inputs.stage == '3' || inputs.stage == '' }}
+        # The capstone: stage 3 carries the most security invariants
+        # (per CLAUDE.md "Per-actor + per-data-class isolation
+        # invariants" table). It requires AWS STS
+        # AssumeRoleWithWebIdentity, which requires AWS to fetch the
+        # OIDC issuer's JWKS over public TLS. The long-lived test broker
+        # (TEST_BROKER_HOST) satisfies that; the run checks that the test
+        # IAM trust + bucket policies, which mirror prod, are scoped right.
+        run: |
+          AGENTKEYS_CHAIN=heima \
+            bash harness/v2-stage3-demo.sh
+
+      - name: Clean up per-run S3 prefix
+        if: always()
+        run: |
+          PREFIX="ci/run-${{ github.run_id }}/"
+          for bucket in \
+            "${{ secrets.TEST_VAULT_BUCKET }}" \
+            "${{ secrets.TEST_MEMORY_BUCKET }}"; do
+            [ -n "$bucket" ] || continue
+            aws s3 rm "s3://$bucket/$PREFIX" --recursive 2>/dev/null || true
+          done
diff --git a/.github/workflows/harness-e2e.yml b/.github/workflows/harness-e2e.yml
deleted file mode 100644
index 071608b..0000000
--- a/.github/workflows/harness-e2e.yml
+++ /dev/null
@@ -1,204 +0,0 @@
-name: harness E2E (long-lived test broker)
-
-# Issue #66 tier-2: end-to-end harness exercise against the long-lived
-# test-broker.litentry.org infrastructure provisioned by
-# scripts/provision-test-environment.sh.
-#
-# Gated on TEST_OIDC_AWS_ROLE_ARN being set as a repo secret — until the
-# operator wires it (see docs/test-environment.md §3), the job is inert
-# and surfaces as a no-op rather than failing. This keeps the workflow
-# safe to merge before the parallel infra is up.
-#
-# Coverage delta vs. harness-ci.yml:
-#   * harness-ci.yml: ephemeral anvil + in-process broker + StubSts
-#                     (no public TLS, no real AWS, no real SES)
-#   * harness-e2e.yml: real test-broker.litentry.org + real AWS test
-#                      resources (test bucket, test role) + real Heima
-#                      Paseo chain. Runs the full stage-3 per-actor +
-#                      per-data-class PrincipalTag isolation suite
-#                      that ephemeral CI can't reach.
-#
-# No LLM. No WebAuthn (passes the harness scripts in default stub mode).
-# Schedule + workflow_dispatch only — never on every PR (this hits real
-# AWS API calls + real chain RPC, so it's nightly-cadence).
-
-on:
-  schedule:
-    # Nightly at 06:00 UTC — well after the prior day's PR activity
-    # quiesces but before the operator's morning standup.
-    - cron: "0 6 * * *"
-  workflow_dispatch:
-    inputs:
-      stage:
-        description: "Which stage to run (1, 2, 3, or all)"
-        required: false
-        default: "all"
-        type: choice
-        options: ["1", "2", "3", "all"]
-
-# Prevent overlapping runs (each one consumes test AWS resources + chain RPC).
-concurrency:
-  group: harness-e2e
-  cancel-in-progress: false  # let in-flight nightly finish; queue manual runs
-
-# OIDC-only AWS auth via GitHub Actions — never long-lived secrets.
-permissions:
-  id-token: write   # required for aws-actions/configure-aws-credentials
-  contents: read
-
-jobs:
-  preflight:
-    name: gate on test infra availability
-    runs-on: ubuntu-latest
-    outputs:
-      should_run: ${{ steps.gate.outputs.should_run }}
-    steps:
-      - id: gate
-        run: |
-          if [ -n "${{ secrets.TEST_OIDC_AWS_ROLE_ARN }}" ]; then
-            echo "should_run=true" >> "$GITHUB_OUTPUT"
-            echo "test infra credentials present; proceeding"
-          else
-            echo "should_run=false" >> "$GITHUB_OUTPUT"
-            echo "::warning::TEST_OIDC_AWS_ROLE_ARN unset — skipping. See docs/test-environment.md."
-          fi
-
-  harness-e2e:
-    name: harness/v2-stage*-demo.sh against test-broker
-    needs: preflight
-    if: needs.preflight.outputs.should_run == 'true'
-    runs-on: ubuntu-latest
-    timeout-minutes: 60
-
-    steps:
-      - uses: actions/checkout@v4
-        with:
-          submodules: recursive
-
-      - name: Install Rust toolchain
-        uses: dtolnay/rust-toolchain@stable
-
-      - name: Cache cargo registry + target
-        uses: Swatinem/rust-cache@v2
-        with:
-          shared-key: harness-e2e
-
-      - name: Install Foundry
-        uses: foundry-rs/foundry-toolchain@v1
-        with:
-          version: stable
-
-      - name: Configure AWS credentials via OIDC (test role)
-        uses: aws-actions/configure-aws-credentials@v4
-        with:
-          role-to-assume: ${{ secrets.TEST_OIDC_AWS_ROLE_ARN }}
-          aws-region: ${{ secrets.TEST_AWS_REGION || 'us-east-1' }}
-          # Session name shows up in CloudTrail — keep traceable to the
-          # PR / run for forensic walking.
-          role-session-name: gh-actions-${{ github.repository_id }}-${{ github.run_id }}
-
-      - name: Build agentkeys CLI + workers
-        run: cargo build --release --workspace
-
-      - name: Source test-environment env
-        # The harness scripts source scripts/operator-workstation.env by
-        # default. For the e2e run, overlay scripts/test-environment.env
-        # into that path so the entire harness flow reuses unchanged.
-        # The .example template is committed; the live file lives only
-        # in the runner's filesystem for the duration of the job.
-        run: |
-          cp scripts/test-environment.env.example scripts/operator-workstation.env
-          # Substitute repo secrets into the live env file.
-          {
-            echo "ACCOUNT_ID=${{ secrets.TEST_ACCOUNT_ID }}"
-            echo "REGION=${{ secrets.TEST_AWS_REGION || 'us-east-1' }}"
-            echo "BROKER_HOST=${{ secrets.TEST_BROKER_HOST || 'test-broker.litentry.org' }}"
-            echo "OIDC_ISSUER=https://${{ secrets.TEST_BROKER_HOST || 'test-broker.litentry.org' }}"
-            echo "VAULT_BUCKET=${{ secrets.TEST_VAULT_BUCKET }}"
-            echo "MEMORY_BUCKET=${{ secrets.TEST_MEMORY_BUCKET }}"
-            echo "VAULT_ROLE_ARN=${{ secrets.TEST_VAULT_ROLE_ARN }}"
-            echo "MEMORY_ROLE_ARN=${{ secrets.TEST_MEMORY_ROLE_ARN }}"
-            echo "DATA_ROLE_ARN=${{ secrets.TEST_DATA_ROLE_ARN }}"
-            # Per-run S3 prefix isolation — concurrent runs (manual +
-            # nightly) won't step on each other's writes; nightly
-            # cleanup s3 rm's keys older than 7d.
-            echo "CI_S3_PREFIX=ci/run-${{ github.run_id }}"
-          } >> scripts/operator-workstation.env
-
-      - name: Stage 1 — chain + identity bootstrap
-        if: ${{ inputs.stage == 'all' || inputs.stage == '1' }}
-        # --skip-deploy: contracts are pre-deployed by
-        # scripts/provision-test-environment.sh on Heima-Paseo, and
-        # those addresses are baked into scripts/test-environment.env.
-        # --skip-email: e2e doesn't exercise the SES round-trip
-        # (separate workflow); identity bootstrap uses wallet_sig.
-        # No --webauthn: stub-mode (WEBAUTHN_MODE=0 default).
-        run: |
-          AGENTKEYS_CHAIN=heima-paseo \
-            bash harness/v2-stage1-demo.sh --skip-deploy --skip-email
-
-      - name: Stage 2 — multi-master + recovery (stub mode)
-        if: ${{ inputs.stage == 'all' || inputs.stage == '2' }}
-        run: |
-          AGENTKEYS_CHAIN=heima-paseo \
-            bash harness/v2-stage2-demo.sh --stub --skip-build
-
-      - name: Stage 3 — per-actor + per-data-class PrincipalTag isolation
-        if: ${{ inputs.stage == 'all' || inputs.stage == '3' }}
-        # The tier-2 capstone: stage-3 is the suite ephemeral CI can't
-        # run, since it requires AWS STS AssumeRoleWithWebIdentity, which
-        # in turn requires AWS to fetch the OIDC issuer's JWKS over
-        # public TLS. Now that we have test-broker.litentry.org with a
-        # real Let's Encrypt cert and real test IAM roles, all 11 steps
-        # of v2-stage3-demo.sh execute end-to-end.
-        run: |
-          AGENTKEYS_CHAIN=heima-paseo \
-            bash harness/v2-stage3-demo.sh
-
-      - name: Clean up per-run S3 prefix
-        if: always()
-        # Best-effort: tear down the per-run S3 prefix we wrote to.
-        # The nightly cleanup s3 rm catches any keys we missed.
-        run: |
-          PREFIX="ci/run-${{ github.run_id }}/"
-          for bucket in \
-            "${{ secrets.TEST_VAULT_BUCKET }}" \
-            "${{ secrets.TEST_MEMORY_BUCKET }}"; do
-            [ -n "$bucket" ] || continue
-            aws s3 rm "s3://$bucket/$PREFIX" --recursive || true
-          done
-
-  nightly-prefix-cleanup:
-    # Sweep any per-run S3 prefixes older than 7 days from the test
-    # buckets. Cheap insurance against forgotten prefixes from cancelled
-    # runs; complements the per-job cleanup above.
-    name: cleanup stale CI prefixes
-    needs: preflight
-    if: needs.preflight.outputs.should_run == 'true' && github.event_name == 'schedule'
-    runs-on: ubuntu-latest
-    timeout-minutes: 10
-    permissions:
-      id-token: write
-      contents: read
-    steps:
-      - name: Configure AWS credentials
-        uses: aws-actions/configure-aws-credentials@v4
-        with:
-          role-to-assume: ${{ secrets.TEST_OIDC_AWS_ROLE_ARN }}
-          aws-region: ${{ secrets.TEST_AWS_REGION || 'us-east-1' }}
-          role-session-name: gh-actions-cleanup-${{ github.run_id }}
-
-      - name: Sweep prefixes older than 7d
-        run: |
-          cutoff=$(date -u -d "7 days ago" +%Y-%m-%dT%H:%M:%SZ 2>/dev/null \
-                   || date -u -v-7d +%Y-%m-%dT%H:%M:%SZ)
-          for bucket in \
-            "${{ secrets.TEST_VAULT_BUCKET }}" \
-            "${{ secrets.TEST_MEMORY_BUCKET }}"; do
-            [ -n "$bucket" ] || continue
-            aws s3api list-objects-v2 --bucket "$bucket" --prefix "ci/" \
-              --query "Contents[?LastModified<\`$cutoff\`].Key" --output text \
-              | tr '\t' '\n' | while read -r key; do
-                [ -n "$key" ] && aws s3 rm "s3://$bucket/$key"
-              done
-          done
diff --git a/docs/test-environment.md b/docs/test-environment.md
deleted file mode 100644
index ca3596b..0000000
--- a/docs/test-environment.md
+++ /dev/null
@@ -1,166 +0,0 @@
-# Test environment — AgentKeys (issue #66)
-
-**Audience:** the operator setting up CI for AgentKeys, plus contributors who need to debug a CI failure.
-**Scope:** the parallel test infrastructure (broker, IAM roles, S3 buckets, deployer wallet, smart contracts) that exists alongside prod so CI can exercise the full code path without touching real user data.
-
-This is the operator-facing companion to:
-- [`.github/workflows/harness-ci.yml`](../.github/workflows/harness-ci.yml) — the tier-1 ephemeral CI workflow (no external infra)
-- [`.github/workflows/harness-e2e.yml`](../.github/workflows/harness-e2e.yml) — the tier-2 nightly E2E workflow against the long-lived test broker
-- [`harness/ci-ephemeral-stack.sh`](../harness/ci-ephemeral-stack.sh) — the ephemeral stack driver tier-1 invokes
-- [`scripts/provision-test-environment.sh`](../scripts/provision-test-environment.sh) — operator-run, one-shot provisioner for the tier-2 long-lived infra
-- [`scripts/test-environment.env.example`](../scripts/test-environment.env.example) — env file template
-
-## Two-tier model
-
-Issue #66 calls for a CI that runs the harness scripts against a parallel test environment, never spends LLM tokens, and never invokes WebAuthn. There are two natural points to do that, and we ship both:
-
-| | Tier 1 — ephemeral | Tier 2 — long-lived |
-|---|---|---|
-| **Workflow** | `harness-ci.yml` | `harness-e2e.yml` |
-| **Trigger** | every push + PR | nightly + manual dispatch |
-| **Where** | inside a GitHub Actions runner | runs against `test-broker.litentry.org` |
-| **Chain** | `anvil` (fresh per run, instant finality) | Heima-Paseo testnet (long-lived contracts) |
-| **Deployer** | anvil's prefunded default test key (zero risk) | a separate Paseo wallet, funded by operator, persisted at `~/.agentkeys/heima-paseo-deployer-test.key` |
-| **Contracts** | fresh deploy per run via Foundry | deployed once by `provision-test-environment.sh`, addresses pinned in `scripts/test-environment.env` |
-| **Broker** | in-process spawn, OIDC issuer = `http://127.0.0.1:8091`, `StubSts` | real broker process on test EC2, OIDC issuer = `https://test-broker.litentry.org`, real AWS STS |
-| **AWS** | none — broker boots with `--skip-startup-check`, no STS/S3 calls | real test bucket + real test role; AWS STS `AssumeRoleWithWebIdentity` works because the test broker exposes a public TLS-fronted JWKS endpoint |
-| **WebAuthn** | never — harness defaults to `WEBAUTHN_MODE=0` stub mode | never — same default |
-| **LLM** | never | never |
-| **Wall time** | ~10–15 min | ~25–40 min |
-
-Tier 1 catches almost all regressions because the Rust integration tests (`cargo test --workspace`) already spawn an in-process broker with `StubSts` + `StubEmailSender` — those tests cover SIWE auth, OIDC mint, cap-token verification, multi-master, recovery, and per-data-class isolation logic. What tier 1 *can't* cover is the real-AWS path: stage 3's `AssumeRoleWithWebIdentity` requires AWS to fetch the issuer's JWKS over public TLS, which an ephemeral CI runner can't expose. That's the tier-2 capstone.
-
-## Tier 1 — ephemeral CI (no operator setup needed)
-
-Already wired. Every push to `main` or `evm`, plus every PR touching `crates/**` / `harness/**` / `scripts/**`, runs:
-
-1. `cargo fmt --check`
-2. `cargo clippy --workspace --all-targets -- -D warnings`
-3. `cargo test --workspace -- --test-threads=1`
-4. `bash harness/ci-ephemeral-stack.sh`, which:
-   - Starts a fresh `anvil` on port 8545 (new chain, instant finality)
-   - Runs `forge build && forge test` in `crates/agentkeys-chain/`
-   - Runs `forge script DeployAgentKeysV1.s.sol` to deploy all 6 contracts to the ephemeral anvil
-   - Parses the deployed addresses and writes a synthetic `operator-workstation.env`
-   - Runs `scripts/verify-heima-contracts.sh` against the new addresses (read-only ABI + wiring checks)
-   - Starts `mock-server` + `agentkeys-broker-server` (with `--skip-startup-check`, OIDC issuer = `http://127.0.0.1:8091`)
-   - Probes `/healthz`, `/.well-known/openid-configuration`, `/.well-known/jwks.json`
-
-On failure, the script's EXIT trap preserves all logs (`anvil.log`, `forge-deploy.log`, `broker.log`, etc.) and the workflow uploads them as an `ephemeral-stack-logs` artifact.
-
-## Tier 2 — long-lived test broker
-
-### Operator bring-up (~2 hours, one-shot)
-
-```bash
-awsp agentkeys-admin            # AWS admin profile for the account hosting test infra
-bash scripts/provision-test-environment.sh
-```
-
-This walks through 7 steps:
-
-1. **Provision the EC2 broker host** at `test-broker.litentry.org`. Manual step (the runbook fragment in the script tells you exactly what to do on the target EC2).
-2. **Register the AWS IAM OIDC provider** for `test-broker.litentry.org` (separate ARN from prod's `oidc-provider/broker.litentry.org`).
-3. **Provision IAM roles** `agentkeys-data-role-test`, `agentkeys-vault-role-test`, `agentkeys-memory-role-test`, each trust-policied on the test OIDC provider with the same `PrincipalTag/agentkeys_actor_omni` scoping prod uses.
-4. **Provision S3 buckets** `agentkeys-mail-test-${ACCT}`, `agentkeys-vault-test-${ACCT}`, `agentkeys-memory-test-${ACCT}` with block-public-access + default SSE-S3 + the v3 split-statement PrincipalTag bucket policy.
-5. **Generate a new deployer wallet** (distinct from the prod deployer) at `~/.agentkeys/heima-paseo-deployer-test.key`. You fund it from your personal Paseo wallet (Paseo has sudo so Alice can also fund — see `scripts/heima-bring-up.sh`).
-6. **Deploy fresh v2 stage-1 contracts** to Heima-Paseo via `DeployAgentKeysV1.s.sol`. Records the addresses under `*_HEIMA_PASEO` keys in `scripts/test-environment.env`.
-7. **Provision a GitHub Actions OIDC role** (`github-actions-agentkeys-e2e`) trust-policied on `token.actions.githubusercontent.com` with a condition limiting it to the agentkeys repo. Grant it `sts:AssumeRole` on the three test roles + read-only S3 on the three test buckets.
-
-Some steps are still operator-manual (parameterizing `provision-vault-role.sh` to accept a `SUFFIX=` env var is a TODO; until then, copy the prod scripts as `-test` variants by hand). The script logs these as `skip` with a follow-up TODO instead of silently passing.
-
-### Repo secrets to set (after provisioning)
-
-After the provisioner finishes, set these in **Settings → Secrets and variables → Actions**:
-
-| Secret | Value |
-|---|---|
-| `TEST_OIDC_AWS_ROLE_ARN` | `arn:aws:iam::${ACCT}:role/github-actions-agentkeys-e2e` |
-| `TEST_AWS_REGION` | `us-east-1` (or wherever the test broker lives) |
-| `TEST_ACCOUNT_ID` | `${ACCT}` |
-| `TEST_BROKER_HOST` | `test-broker.litentry.org` |
-| `TEST_VAULT_BUCKET` | `agentkeys-vault-test-${ACCT}` |
-| `TEST_MEMORY_BUCKET` | `agentkeys-memory-test-${ACCT}` |
-| `TEST_VAULT_ROLE_ARN` | `arn:aws:iam::${ACCT}:role/agentkeys-vault-role-test` |
-| `TEST_MEMORY_ROLE_ARN` | `arn:aws:iam::${ACCT}:role/agentkeys-memory-role-test` |
-| `TEST_DATA_ROLE_ARN` | `arn:aws:iam::${ACCT}:role/agentkeys-data-role-test` |
-
-`TEST_OIDC_AWS_ROLE_ARN` is the **gate**: until it's set, the `harness-e2e.yml` preflight job sets `should_run=false` and the workflow surfaces as a `::warning::` skip rather than a failure. This keeps the workflow safe to merge before the parallel infra is up.
-
-### Per-run S3 prefix namespacing
-
-The e2e workflow exports `CI_S3_PREFIX=ci/run-${GITHUB_RUN_ID}` and the harness scripts honor that prefix when writing test envelopes to S3. This means concurrent runs (nightly + a manual dispatch) won't step on each other's writes.
-
-Cleanup is two-layered:
-- **Per-job cleanup**: the e2e workflow's `if: always()` step runs `aws s3 rm s3://$bucket/$PREFIX --recursive` at the end of each run.
-- **Nightly sweep**: a separate `nightly-prefix-cleanup` job lists `ci/` prefix keys older than 7 days and rm's them. Cheap insurance against forgotten prefixes from cancelled runs.
-
-### Cert renewal monitoring
-
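-`test-broker.litentry.org` uses a Let's Encrypt certificate (valid for 90 days; certbot renews it automatically, by default once fewer than 30 days remain). If renewal silently fails and the certificate expires, AWS STS can no longer fetch the issuer's JWKS over TLS, so it stops trusting the OIDC issuer and the e2e workflow turns red overnight.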
-
-The nightly workflow's preflight already exercises a `curl` against `https://${TEST_BROKER_HOST}/.well-known/openid-configuration`. A renewal failure surfaces as an immediate workflow failure with a clear TLS error.
-
-### Rotating the test broker secrets
-
-If the test mock-server's `DEV_KEY_SERVICE_MASTER_SECRET` ever leaks, rotate via:
-
-```bash
-# 1. New secret on the broker host
-ssh ec2-user@test-broker.litentry.org \
-  'sudo systemctl set-environment DEV_KEY_SERVICE_MASTER_SECRET=$(openssl rand -hex 32) \
-   && sudo systemctl restart agentkeys-backend'
-
-# 2. There's nothing on the operator side to rotate — the secret never
-#    leaves the broker host (it derives per-omni signer keys in-process).
-```
-
-Test wallets minted via the rotated signer will have different addresses from pre-rotation wallets, which is the desired blast-radius cut.
-
-## Cleanup / teardown
-
-Tear down the entire test environment (cheap insurance if costs spike):
-
-```bash
-# Drain the buckets first
-for bucket in agentkeys-mail-test-${ACCT} agentkeys-vault-test-${ACCT} agentkeys-memory-test-${ACCT}; do
-  aws s3 rm "s3://$bucket" --recursive
-  aws s3api delete-bucket --bucket "$bucket"
-done
-
-# Delete the roles (inline policies are deleted first; any attached managed policies must also be detached with `aws iam detach-role-policy`)
-for role in agentkeys-data-role-test agentkeys-vault-role-test agentkeys-memory-role-test github-actions-agentkeys-e2e; do
-  for policy in $(aws iam list-role-policies --role-name "$role" --query 'PolicyNames[]' --output text); do
-    aws iam delete-role-policy --role-name "$role" --policy-name "$policy"
-  done
-  aws iam delete-role --role-name "$role"
-done
-
-# Delete the OIDC provider
-aws iam delete-open-id-connect-provider \
-  --open-id-connect-provider-arn arn:aws:iam::${ACCT}:oidc-provider/test-broker.litentry.org
-
-# Stop + terminate the EC2 + release the EIP (manual, console or aws ec2 CLI)
-```
-
-The contracts on Heima-Paseo stay on chain (they're free), but they're inert without the broker pointing at them.
-
-## Why two tiers (vs. just one)
-
-A single-tier model — running everything against the long-lived broker on every PR — was the obvious shape, but loses on:
-
-- **Latency**: every PR pays the ~25–40 min e2e wall time (vs. ~10–15 min for tier 1).
-- **Cost**: every PR hits real AWS API calls + chain RPC + potentially gas.
-- **Contention**: concurrent PRs serialize on the single test broker, or step on each other's S3 writes without per-run prefix isolation.
-- **Brittleness**: a flaky external dep (Paseo collator hiccup, AWS API throttle) blocks merges.
-
-A single-tier model the other way — only ephemeral CI, no long-lived test broker — was also tempting, but loses stage-3 coverage entirely (`AssumeRoleWithWebIdentity` needs publicly-fetchable JWKS). That's the most security-critical layer in the codebase (per-actor + per-data-class IAM isolation per CLAUDE.md "Per-actor + per-data-class isolation invariants"), so leaving it untested in CI was unacceptable.
-
-The two-tier split puts the fast, cheap, deterministic checks on every PR and the expensive E2E on nightly. PRs that need to verify a stage-3 fix can trigger `harness-e2e.yml` via `workflow_dispatch` directly from the PR page.
-
-## Related
-
-- Original issue: [#66 — Stage 7: shared test broker for CI + dev](https://github.com/wildmeta-agent/agentKeys/issues/66)
-- Prod cloud setup: [`docs/cloud-setup.md`](cloud-setup.md)
-- Stage 7 demo + verification: [`docs/stage7-demo-and-verification.md`](stage7-demo-and-verification.md)
-- Architecture: [`docs/spec/architecture.md`](spec/architecture.md) §17 (per-data-class buckets), §4 (HDKD actor tree), CLAUDE.md "Per-actor + per-data-class isolation invariants" table
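The "Per-run S3 prefix namespacing" section of the document above says the harness scripts honor `CI_S3_PREFIX` when writing to S3. A minimal sketch of how a script could apply that prefix to an object key follows. The helper name `ci_s3_key` and the key layout are assumptions for illustration; the actual harness scripts may build keys differently.

```shell
# Hypothetical helper: prepend the per-run prefix to an object key.
# CI_S3_PREFIX is exported by the e2e workflow (ci/run-<GITHUB_RUN_ID>);
# when it is unset or empty the key is returned unchanged.
ci_s3_key() {
  local key="$1"
  local prefix="${CI_S3_PREFIX:-}"
  if [ -n "$prefix" ]; then
    # Strip one trailing slash so the joined key has a single separator.
    printf '%s/%s\n' "${prefix%/}" "$key"
  else
    printf '%s\n' "$key"
  fi
}

CI_S3_PREFIX="ci/run-12345"
ci_s3_key "vault/envelope.json"
```

Because every write from one run lands under its own `ci/run-…` prefix, the per-job cleanup can delete that prefix recursively without touching a concurrent run's objects.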
diff --git a/harness/ci-ephemeral-stack.sh b/harness/ci-ephemeral-stack.sh
deleted file mode 100755
index 8d9ffa3..0000000
--- a/harness/ci-ephemeral-stack.sh
+++ /dev/null
@@ -1,401 +0,0 @@
-#!/usr/bin/env bash
-# harness/ci-ephemeral-stack.sh — issue #66 tier-1 ephemeral CI driver.
-#
-# Stands up a complete, isolated AgentKeys test environment INSIDE a
-# single CI runner and exercises the chain-deploy path end-to-end. No
-# external infrastructure, no LLM, no WebAuthn, no real AWS.
-#
-# What this script delivers (the four parallel-infra axes from issue #66):
-#
-#   ─ new test broker server     → ephemeral agentkeys-broker-server
-#                                   spawned on 127.0.0.1, OIDC issuer
-#                                   http://127.0.0.1:$BROKER_PORT, stub
-#                                   STS client (no real AWS).
-#   ─ new smart contract on-chain → forge script deploys a fresh copy of
-#                                   the v2 stage-1 contract set
-#                                   (P256Verifier + K11Verifier +
-#                                   SidecarRegistry + AgentKeysScope +
-#                                   K3EpochCounter + CredentialAudit)
-#                                   to a brand-new anvil instance.
-#   ─ new deployer account       → anvil's canonical first prefunded test
-#                                   key (10_000 ETH; zero risk).
-#   ─ no WebAuthn                 → the harness scripts default to
-#                                   WEBAUTHN_MODE=0 (stage-1 line 131);
-#                                   this script never passes --webauthn,
-#                                   so K11 enrollment writes deterministic
-#                                   stub bytes (CI-friendly).
-#
-# What's COVERED by this script (matches the harness scripts' coverage
-# for things that don't require real AWS):
-#
-#   * Forge unit + property tests for all six v2 stage-1 contracts.
-#   * End-to-end Foundry deploy via DeployAgentKeysV1.s.sol against the
-#     ephemeral anvil — same script as heima-bring-up.sh step 5 uses
-#     against Heima Mainnet/Paseo.
-#   * Read-only ABI/wiring checks via verify-heima-contracts.sh against
-#     the freshly deployed addresses (same checks Heima uses).
-#   * Broker liveness + OIDC discovery surface (/.well-known/
-#     openid-configuration, /.well-known/jwks.json, /healthz).
-#
-# What's NOT covered here (intentionally — needs the long-lived
-# test-broker.litentry.org tier-2 environment with publicly-reachable
-# TLS + real AWS resources; see docs/test-environment.md):
-#
-#   * harness/v2-stage3-demo.sh — per-actor + per-data-class S3
-#     PrincipalTag isolation tests. AWS STS AssumeRoleWithWebIdentity
-#     requires AWS to fetch the OIDC issuer's JWKS over public TLS,
-#     which a CI runner can't expose.
-#   * Real SES email-link auth round-trip (uses StubEmailSender in unit
-#     tests; long-lived tier-2 exercises real SES).
-#
-# All the Rust-side broker/worker logic (SIWE auth, OIDC mint, cap-token
-# verify, etc.) is covered by `cargo test --workspace` in the parent
-# CI workflow — those tests already spawn an in-process broker with
-# StubSts + StubEmailSender, so the ephemeral-stack script focuses on
-# what cargo test can't reach: the on-chain deploy + ABI surface.
-#
-# Usage:
-#   bash harness/ci-ephemeral-stack.sh                # full ephemeral roundtrip
-#   bash harness/ci-ephemeral-stack.sh --skip-broker  # chain-only (forge + anvil)
-#   bash harness/ci-ephemeral-stack.sh --keep-running # leave anvil + broker up
-#                                                     # (for local debugging)
-#
-# Exit codes:
-#   0  every check passed
-#   1  any check failed; logs in $WORK_DIR/*.log preserved on failure
-#   2  prereqs missing (anvil/forge/cargo)
-
-set -euo pipefail
-
-REPO_ROOT="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")/.." && pwd)"
-cd "$REPO_ROOT"
-
-# ─── CLI ─────────────────────────────────────────────────────────────────
-SKIP_BROKER=0
-KEEP_RUNNING=0
-ANVIL_PORT="${ANVIL_PORT:-8545}"
-MOCK_PORT="${MOCK_PORT:-8090}"
-BROKER_PORT="${BROKER_PORT:-8091}"
-
-while [[ $# -gt 0 ]]; do
-  case "$1" in
-    --skip-broker)  SKIP_BROKER=1; shift ;;
-    --keep-running) KEEP_RUNNING=1; shift ;;
-    --anvil-port)   ANVIL_PORT="$2"; shift 2 ;;
-    --mock-port)    MOCK_PORT="$2"; shift 2 ;;
-    --broker-port)  BROKER_PORT="$2"; shift 2 ;;
-    -h|--help)
-      sed -n '2,/^set -euo/p' "$0" | sed -E 's/^# ?//' | sed '$d'
-      exit 0 ;;
-    *) echo "unknown flag: $1 (try --help)" >&2; exit 2 ;;
-  esac
-done
-
-# ─── Colors ──────────────────────────────────────────────────────────────
-if [ -t 2 ]; then
-  C_HEAD='\033[1;36m'; C_OK='\033[1;32m'; C_WARN='\033[1;33m'
-  C_ERR='\033[1;31m'; C_DIM='\033[2m'; C_RESET='\033[0m'
-else
-  C_HEAD=''; C_OK=''; C_WARN=''; C_ERR=''; C_DIM=''; C_RESET=''
-fi
-log()  { printf "${C_HEAD}==>${C_RESET} %s\n" "$*" >&2; }
-ok()   { printf "    ${C_OK}ok${C_RESET}    %s\n" "$*" >&2; }
-info() { printf "    ${C_DIM}info${C_RESET}  %s\n" "$*" >&2; }
-warn() { printf "    ${C_WARN}warn${C_RESET}  %s\n" "$*" >&2; }
-die()  { printf "    ${C_ERR}fail${C_RESET}  %s\n" "$*" >&2; exit 1; }
-
-# ─── Work dir + cleanup trap ─────────────────────────────────────────────
-WORK_DIR="$(mktemp -d -t agentkeys-ci-ephemeral-XXXXXX)"
-ANVIL_PID=""
-MOCK_PID=""
-BROKER_PID=""
-
-cleanup() {
-  local rc=$?
-  if [ "$KEEP_RUNNING" = "1" ]; then
-    info "--keep-running set; leaving processes up"
-    info "  anvil:  pid=$ANVIL_PID  port=$ANVIL_PORT"
-    [ -n "$MOCK_PID" ]   && info "  mock:   pid=$MOCK_PID    port=$MOCK_PORT"
-    [ -n "$BROKER_PID" ] && info "  broker: pid=$BROKER_PID  port=$BROKER_PORT"
-    info "  work_dir: $WORK_DIR"
-    exit "$rc"
-  fi
-  log "Cleanup"
-  for pid_var in BROKER_PID MOCK_PID ANVIL_PID; do
-    eval "pid=\${$pid_var:-}"
-    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
-      kill "$pid" 2>/dev/null || true
-      wait "$pid" 2>/dev/null || true
-      ok "stopped $pid_var pid=$pid"
-    fi
-  done
-  if [ "$rc" -ne 0 ]; then
-    warn "exit=$rc — preserving logs at $WORK_DIR"
-    for f in "$WORK_DIR"/*.log; do
-      [ -e "$f" ] || continue
-      printf "\n${C_DIM}── tail $f ──${C_RESET}\n" >&2
-      tail -n 50 "$f" >&2 || true
-    done
-  else
-    rm -rf "$WORK_DIR"
-  fi
-}
-trap cleanup EXIT; trap 'exit 130' INT; trap 'exit 143' TERM
-
-# ─── 1. Prereq sanity-check ──────────────────────────────────────────────
-log "1/8 Prereq sanity-check"
-missing=()
-for tool in cargo jq curl awk grep sed anvil forge cast; do
-  command -v "$tool" >/dev/null 2>&1 || missing+=("$tool")
-done
-if [ ${#missing[@]} -gt 0 ]; then
-  warn "missing tools: ${missing[*]}"
-  warn "  install Foundry: curl -L https://foundry.paradigm.xyz | bash && foundryup"
-  warn "  install Rust:    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh"
-  printf "    ${C_ERR}fail${C_RESET}  %s\n" "prereqs missing" >&2; exit 2  # exit 2, per the header's exit-code list
-fi
-ok "tools present: cargo jq curl awk grep sed anvil forge cast"
-
-# ─── 2. Start anvil (new chain) ──────────────────────────────────────────
-log "2/8 Starting anvil on 127.0.0.1:$ANVIL_PORT (new ephemeral chain)"
-# Anvil's first default account: pre-funded with 10_000 ETH, deterministic.
-# This is our "new deployer account" — fresh per CI run, zero blast radius.
-ANVIL_DEPLOYER_KEY="0xac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80"
-ANVIL_DEPLOYER_ADDR="0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266"
-anvil --port "$ANVIL_PORT" \
-      --host 127.0.0.1 \
-      --silent \
-      > "$WORK_DIR/anvil.log" 2>&1 &
-ANVIL_PID=$!
-# Wait for RPC ready (anvil bootstraps fast — <2s typically, give it 30s)
-for _ in $(seq 1 60); do
-  if curl -sf --max-time 1 \
-       -H 'Content-Type: application/json' \
-       -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' \
-       "http://127.0.0.1:$ANVIL_PORT" >/dev/null 2>&1; then
-    break
-  fi
-  sleep 0.5
-done
-curl -sf --max-time 2 \
-     -H 'Content-Type: application/json' \
-     -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' \
-     "http://127.0.0.1:$ANVIL_PORT" >/dev/null \
-  || die "anvil failed to come up; see $WORK_DIR/anvil.log"
-ok "anvil up (pid=$ANVIL_PID chain_id=31337 deployer=$ANVIL_DEPLOYER_ADDR)"
-
-# ─── 3. Forge build + test (contract unit + property tests) ──────────────
-log "3/8 Forge build + test (crates/agentkeys-chain/)"
-(
-  cd crates/agentkeys-chain
-  forge build > "$WORK_DIR/forge-build.log" 2>&1 \
-    || die "forge build failed; see $WORK_DIR/forge-build.log"
-  ok "forge build clean"
-  forge test --no-match-test "fork_" > "$WORK_DIR/forge-test.log" 2>&1 \
-    || die "forge test failed; see $WORK_DIR/forge-test.log"
-  ok "forge test passed ($(grep -c "^\[PASS\]" "$WORK_DIR/forge-test.log" || true) tests)"
-)
-
-# ─── 4. Deploy v2 stage-1 contract set (new smart contracts on-chain) ────
-log "4/8 Deploy v2 stage-1 contracts via DeployAgentKeysV1.s.sol"
-(
-  cd crates/agentkeys-chain
-  forge script script/DeployAgentKeysV1.s.sol \
-    --rpc-url "http://127.0.0.1:$ANVIL_PORT" \
-    --private-key "$ANVIL_DEPLOYER_KEY" \
-    --broadcast \
-    --skip-simulation \
-    > "$WORK_DIR/forge-deploy.log" 2>&1 \
-    || die "forge script deploy failed; see $WORK_DIR/forge-deploy.log"
-)
-# Parse "Name: 0xAddress" lines (the contract names from DeployAgentKeysV1.s.sol's
-# console.log calls). Format matches heima-bring-up.sh's parser.
-parse_addr() {
-  local name="$1"
-  awk -v want="$name" '
-    $0 ~ want":" {
-      for (i=1; i<=NF; i++) if ($i ~ /^0x[a-fA-F0-9]{40}$/) { print $i; exit }
-    }
-  ' "$WORK_DIR/forge-deploy.log"
-}
-SCOPE_ADDR=$(parse_addr "AgentKeysScope")
-REGISTRY_ADDR=$(parse_addr "SidecarRegistry")
-EPOCH_ADDR=$(parse_addr "K3EpochCounter")
-AUDIT_ADDR=$(parse_addr "CredentialAudit")
-P256_ADDR=$(parse_addr "P256Verifier")
-K11_ADDR=$(parse_addr "K11Verifier")
-for v in SCOPE_ADDR REGISTRY_ADDR EPOCH_ADDR AUDIT_ADDR P256_ADDR K11_ADDR; do
-  eval "val=\${$v}"
-  [ -n "$val" ] || die "could not parse $v from forge-deploy.log"
-done
-ok "AgentKeysScope:  $SCOPE_ADDR"
-ok "SidecarRegistry: $REGISTRY_ADDR"
-ok "K3EpochCounter:  $EPOCH_ADDR"
-ok "CredentialAudit: $AUDIT_ADDR"
-ok "P256Verifier:    $P256_ADDR"
-ok "K11Verifier:     $K11_ADDR"
-
-# ─── 5. Write synthetic operator-workstation.env for verify scripts ──────
-log "5/8 Write synthetic operator-workstation.env (--anvil profile)"
-SYNTH_ENV="$WORK_DIR/operator-workstation.env"
-cat > "$SYNTH_ENV" <<EOF
-# Synthetic env file for harness/ci-ephemeral-stack.sh (issue #66).
-# Generated $(date -u +%Y-%m-%dT%H:%M:%SZ) — DO NOT COMMIT.
-ACCOUNT_ID=000000000000
-REGION=us-east-1
-MAIL_DOMAIN=test.invalid
-MAIL_BUCKET=agentkeys-mail-ci-ephemeral
-BUCKET=agentkeys-mail-ci-ephemeral
-VAULT_BUCKET=agentkeys-vault-ci-ephemeral
-MEMORY_BUCKET=agentkeys-memory-ci-ephemeral
-BROKER_HOST=127.0.0.1:$BROKER_PORT
-OIDC_ISSUER=http://127.0.0.1:$BROKER_PORT
-BACKEND_URL=http://127.0.0.1:$MOCK_PORT
-AGENTKEYS_SIGNER_URL=http://127.0.0.1:$MOCK_PORT
-DATA_ROLE_ARN=arn:aws:iam::000000000000:role/agentkeys-data-role-ci
-VAULT_ROLE_ARN=arn:aws:iam::000000000000:role/agentkeys-vault-role-ci
-MEMORY_ROLE_ARN=arn:aws:iam::000000000000:role/agentkeys-memory-role-ci
-
-# v2 stage-1 contracts (anvil profile)
-SCOPE_CONTRACT_ADDRESS_ANVIL=$SCOPE_ADDR
-SIDECAR_REGISTRY_ADDRESS_ANVIL=$REGISTRY_ADDR
-K3_EPOCH_COUNTER_ADDRESS_ANVIL=$EPOCH_ADDR
-CREDENTIAL_AUDIT_ADDRESS_ANVIL=$AUDIT_ADDR
-P256_VERIFIER_ADDRESS_ANVIL=$P256_ADDR
-K11_VERIFIER_ADDRESS_ANVIL=$K11_ADDR
-EOF
-ok "wrote $SYNTH_ENV"
-
-# ─── 6. Verify deployed contracts (read-only ABI + wiring checks) ────────
-log "6/8 verify-heima-contracts.sh (anvil profile)"
-# verify-heima-contracts.sh reads scripts/operator-workstation.env, so
-# overlay the synthetic file in place for the duration of this step.
-# Restored even when verification fails: its exit code is captured below instead of tripping set -e.
-REAL_ENV="$REPO_ROOT/scripts/operator-workstation.env"
-BACKUP_ENV=""
-if [ -f "$REAL_ENV" ]; then
-  BACKUP_ENV="$WORK_DIR/operator-workstation.env.original"
-  cp "$REAL_ENV" "$BACKUP_ENV"
-fi
-restore_env() {
-  if [ -n "$BACKUP_ENV" ] && [ -f "$BACKUP_ENV" ]; then
-    cp "$BACKUP_ENV" "$REAL_ENV"
-  elif [ -f "$REAL_ENV" ] && [ -z "$BACKUP_ENV" ]; then
-    rm -f "$REAL_ENV"
-  fi
-}
-cp "$SYNTH_ENV" "$REAL_ENV"
-verify_rc=0
-AGENTKEYS_CHAIN=anvil bash "$REPO_ROOT/scripts/verify-heima-contracts.sh" \
-  > "$WORK_DIR/verify-contracts.log" 2>&1 || verify_rc=$?
-restore_env
-if [ "$verify_rc" -ne 0 ]; then
-  warn "verify-heima-contracts.sh exited $verify_rc; full log:"
-  cat "$WORK_DIR/verify-contracts.log" >&2
-  die "contract verification failed"
-fi
-ok "all six v2 stage-1 contracts verified (bytecode + ABI + wiring)"
-
-# ─── 7. Optional: stand up the broker server (skip with --skip-broker) ───
-if [ "$SKIP_BROKER" = "1" ]; then
-  log "7/8 Broker bring-up SKIPPED (--skip-broker)"
-else
-  log "7/8 Stand up ephemeral broker (new test broker server)"
-
-  # Pre-generate keypairs so the broker boots clean. The keygen
-  # subcommand writes 0600 files; matches the production setup-broker-host
-  # flow but in $WORK_DIR instead of /var/lib/agentkeys.
-  BROKER_DATA_DIR="$WORK_DIR/broker-data"
-  mkdir -p "$BROKER_DATA_DIR"
-  info "building agentkeys-broker-server (release)"
-  cargo build --release -p agentkeys-broker-server \
-    > "$WORK_DIR/cargo-build-broker.log" 2>&1 \
-    || die "cargo build broker failed; see $WORK_DIR/cargo-build-broker.log"
-  BROKER_BIN="$REPO_ROOT/target/release/agentkeys-broker-server"
-  [ -x "$BROKER_BIN" ] || die "broker binary missing at $BROKER_BIN"
-
-  "$BROKER_BIN" keygen --purpose oidc \
-    --out "$BROKER_DATA_DIR/oidc-keypair.json" >/dev/null
-  "$BROKER_BIN" keygen --purpose session \
-    --out "$BROKER_DATA_DIR/session-keypair.json" >/dev/null
-  ok "broker keypairs generated"
-
-  info "building agentkeys-mock-server (release)"
-  cargo build --release -p agentkeys-mock-server \
-    > "$WORK_DIR/cargo-build-mock.log" 2>&1 \
-    || die "cargo build mock-server failed; see $WORK_DIR/cargo-build-mock.log"
-  MOCK_BIN="$REPO_ROOT/target/release/agentkeys-mock-server"
-  [ -x "$MOCK_BIN" ] || die "mock-server binary missing at $MOCK_BIN"
-
-  info "starting mock-server on 127.0.0.1:$MOCK_PORT"
-  "$MOCK_BIN" --port "$MOCK_PORT" \
-    > "$WORK_DIR/mock-server.log" 2>&1 &
-  MOCK_PID=$!
-  for _ in $(seq 1 60); do
-    curl -sf --max-time 1 "http://127.0.0.1:$MOCK_PORT/healthz" >/dev/null 2>&1 && break
-    sleep 0.25
-  done
-  curl -sf --max-time 2 "http://127.0.0.1:$MOCK_PORT/healthz" >/dev/null \
-    || die "mock-server failed to come up; see $WORK_DIR/mock-server.log"
-  ok "mock-server up (pid=$MOCK_PID)"
-
-  info "starting broker on 127.0.0.1:$BROKER_PORT (--skip-startup-check)"
-  # No real AWS creds in CI — broker runs OIDC-only mint path per issue #71,
-  # so the only thing AWS would do is the optional GetCallerIdentity probe,
-  # which --skip-startup-check disables.
-  BROKER_OIDC_ISSUER="http://127.0.0.1:$BROKER_PORT" \
-  BROKER_BACKEND_URL="http://127.0.0.1:$MOCK_PORT" \
-  BROKER_DATA_ROLE_ARN="arn:aws:iam::000000000000:role/agentkeys-data-role-ci" \
-  BROKER_AWS_REGION="us-east-1" \
-  BROKER_OIDC_KEYPAIR_PATH="$BROKER_DATA_DIR/oidc-keypair.json" \
-  BROKER_SESSION_KEYPAIR_PATH="$BROKER_DATA_DIR/session-keypair.json" \
-  BROKER_AUDIT_DB_PATH="$BROKER_DATA_DIR/audit.sqlite" \
-  RUST_LOG=info \
-    "$BROKER_BIN" --bind 127.0.0.1 --port "$BROKER_PORT" --skip-startup-check \
-    > "$WORK_DIR/broker.log" 2>&1 &
-  BROKER_PID=$!
-  for _ in $(seq 1 60); do
-    curl -sf --max-time 1 "http://127.0.0.1:$BROKER_PORT/healthz" >/dev/null 2>&1 && break
-    sleep 0.25
-  done
-  curl -sf --max-time 2 "http://127.0.0.1:$BROKER_PORT/healthz" >/dev/null \
-    || die "broker failed to come up; see $WORK_DIR/broker.log"
-  ok "broker up (pid=$BROKER_PID)"
-
-  # OIDC discovery surface — same endpoints AWS would hit in tier-2.
-  info "probing OIDC discovery surface"
-  curl -sf --max-time 2 \
-       "http://127.0.0.1:$BROKER_PORT/.well-known/openid-configuration" \
-       > "$WORK_DIR/oidc-config.json" \
-    || die "openid-configuration unreachable"
-  jq -e '.issuer == "http://127.0.0.1:'"$BROKER_PORT"'"' \
-       "$WORK_DIR/oidc-config.json" >/dev/null \
-    || die "openid-configuration issuer claim mismatch (see $WORK_DIR/oidc-config.json)"
-  ok ".well-known/openid-configuration → issuer matches"
-
-  curl -sf --max-time 2 \
-       "http://127.0.0.1:$BROKER_PORT/.well-known/jwks.json" \
-       > "$WORK_DIR/jwks.json" \
-    || die "jwks.json unreachable"
-  jq -e '.keys | length >= 1' "$WORK_DIR/jwks.json" >/dev/null \
-    || die "jwks.json has no keys (see $WORK_DIR/jwks.json)"
-  ok ".well-known/jwks.json → at least one key present"
-fi
-
-# ─── 8. Summary ──────────────────────────────────────────────────────────
-log "8/8 Summary"
-ok "ephemeral environment passed all checks"
-info "  chain      : anvil  (chain_id 31337, ephemeral)"
-info "  deployer   : $ANVIL_DEPLOYER_ADDR"
-info "  contracts  : 6/6 deployed + verified on chain"
-if [ "$SKIP_BROKER" != "1" ]; then
-  info "  broker     : http://127.0.0.1:$BROKER_PORT"
-  info "  oidc issuer: http://127.0.0.1:$BROKER_PORT"
-  info "  backend    : http://127.0.0.1:$MOCK_PORT (mock-server)"
-fi
-info ""
-info "Not covered here (needs long-lived test-broker.litentry.org —"
-info "see docs/test-environment.md):"
-info "  * stage-3 per-actor + per-data-class S3 PrincipalTag isolation"
-info "  * real AWS STS AssumeRoleWithWebIdentity"
-info "  * real SES email-link auth round-trip"
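Step 4 of the script above extracts contract addresses from `forge-deploy.log` by scanning `Name: 0xAddress` lines. The sketch below is a stdin-reading variation on `parse_addr` that can be exercised without a deploy log. It matches the address by length plus character class instead of the `{40}` interval expression, which some older awk builds reportedly do not support. `parse_addr_stdin` is a hypothetical name, and the sample line assumes the log shape described in the script's comment.

```shell
# Variation on parse_addr that reads the deploy log from stdin.
# A token counts as an address when it is 42 characters long and
# consists of "0x" followed only by hex digits.
parse_addr_stdin() {
  awk -v want="$1" '
    index($0, want ":") {
      for (i = 1; i <= NF; i++)
        if (length($i) == 42 && $i ~ /^0x[0-9a-fA-F]+$/) { print $i; exit }
    }
  '
}

# Sample line in the "Name: 0xAddress" shape the script expects.
printf '  AgentKeysScope: 0x5FbDB2315678afecb367f032d93F642f64180aa3\n' \
  | parse_addr_stdin "AgentKeysScope"
```

Using `index` for the name lookup treats the contract name as a literal string rather than a regular expression, so a name containing regex metacharacters would still match exactly.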
diff --git a/scripts/provision-test-environment.sh b/scripts/provision-test-environment.sh
deleted file mode 100755
index 6155522..0000000
--- a/scripts/provision-test-environment.sh
+++ /dev/null
@@ -1,276 +0,0 @@
-#!/usr/bin/env bash
-# scripts/provision-test-environment.sh — issue #66 tier-2 one-shot
-# provisioner for the long-lived parallel test environment.
-#
-# What this script provisions (every resource parallel to prod, every
-# name carrying a -test suffix so misconfigured CI runs targeting prod
-# fail closed):
-#
-#   1. AWS IAM OIDC provider for test-broker.litentry.org
-#   2. AWS IAM roles:
-#        - agentkeys-data-role-test    (email subsystem)
-#        - agentkeys-vault-role-test   (credentials, scoped to vault bucket)
-#        - agentkeys-memory-role-test  (long-term memory, scoped to memory bucket)
-#      All three trust-policied on the test OIDC provider, with the same
-#      PrincipalTag/agentkeys_actor_omni scoping that prod uses.
-#   3. AWS S3 buckets (per-data-class, per arch.md §17.2):
-#        - agentkeys-mail-test-${ACCT}
-#        - agentkeys-vault-test-${ACCT}
-#        - agentkeys-memory-test-${ACCT}
-#      Each with block-public-access + default SSE-S3 + the v3
-#      split-statement PrincipalTag bucket policy from prod
-#      (scripts/apply-vault-bucket-policy.sh + apply-memory-bucket-policy.sh).
-#   4. EC2 broker host at test-broker.litentry.org via:
-#        bash scripts/setup-broker-host.sh \
-#          --issuer-url https://test-broker.litentry.org \
-#          --account-id ${ACCOUNT_ID} \
-#          --signer-host signer-test.litentry.org \
-#          --audit-host audit-test.litentry.org \
-#          --email-host email-test.litentry.org \
-#          --cred-host cred-test.litentry.org \
-#          --memory-host memory-test.litentry.org \
-#          --chain-rpc https://rpc.paseo-parachain.heima.network \
-#          --vault-bucket agentkeys-vault-test-${ACCOUNT_ID} \
-#          --memory-bucket agentkeys-memory-test-${ACCOUNT_ID}
-#   5. A new deployer wallet on Heima-Paseo (distinct from the prod
-#      deployer), persisted at ~/.agentkeys/heima-paseo-deployer-test.key.
-#      Funded from the operator's personal Paseo wallet (no sudo on
-#      mainnet; sudo is fine on Paseo via Alice if collators are up).
-#   6. Fresh v2 stage-1 contracts deployed via DeployAgentKeysV1.s.sol
-#      to Heima-Paseo, distinct addresses from prod, written to
-#      scripts/test-environment.env under the *_HEIMA_PASEO keys.
-#
-# Idempotent: re-run safely. Each step pre-checks "is this already done?"
-# before acting. Failed runs leave a paper trail in $WORK_DIR.
-#
-# This is the OPERATOR script — runs once per account. The CI workflow
-# (.github/workflows/harness-e2e.yml) consumes the provisioned env via
-# GitHub Actions secrets + scripts/test-environment.env.
-#
-# Usage:
-#   awsp agentkeys-admin                             # admin profile required
-#   bash scripts/provision-test-environment.sh       # full provisioning
-#   bash scripts/provision-test-environment.sh --dry-run
-#   bash scripts/provision-test-environment.sh --only-step N
-#
-# Per CLAUDE.md, this script is the SINGLE ENTRY POINT for test-env
-# changes. No ad-hoc aws iam / aws s3api edits — extend this script
-# instead and re-run.
-
-set -euo pipefail
-
-REPO_ROOT="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")/.." && pwd)"
-cd "$REPO_ROOT"
-
-# ─── Config defaults ─────────────────────────────────────────────────────
-DRY_RUN=0
-ONLY_STEP=""
-TEST_BROKER_HOST="${TEST_BROKER_HOST:-test-broker.litentry.org}"
-TEST_SIGNER_HOST="${TEST_SIGNER_HOST:-signer-test.litentry.org}"
-TEST_AUDIT_HOST="${TEST_AUDIT_HOST:-audit-test.litentry.org}"
-TEST_EMAIL_HOST="${TEST_EMAIL_HOST:-email-test.litentry.org}"
-TEST_CRED_HOST="${TEST_CRED_HOST:-cred-test.litentry.org}"
-TEST_MEMORY_HOST="${TEST_MEMORY_HOST:-memory-test.litentry.org}"
-TEST_ENV_FILE="$REPO_ROOT/scripts/test-environment.env"
-TEST_ENV_EXAMPLE="$REPO_ROOT/scripts/test-environment.env.example"
-WORK_DIR="$(mktemp -d -t agentkeys-provision-test-XXXXXX)"
-
-while [[ $# -gt 0 ]]; do
-  case "$1" in
-    --dry-run)        DRY_RUN=1; shift ;;
-    --only-step)      ONLY_STEP="$2"; shift 2 ;;
-    --test-broker-host) TEST_BROKER_HOST="$2"; shift 2 ;;
-    -h|--help)
-      sed -n '2,/^set -euo/p' "$0" | sed 's/^# \?//' | sed '$d'
-      exit 0 ;;
-    *) echo "unknown flag: $1 (try --help)" >&2; exit 2 ;;
-  esac
-done
-
-# ─── Colors ──────────────────────────────────────────────────────────────
-if [ -t 2 ]; then
-  C_HEAD='\033[1;36m'; C_OK='\033[1;32m'; C_SKIP='\033[1;33m'
-  C_WARN='\033[1;33m'; C_ERR='\033[1;31m'; C_RESET='\033[0m'
-else
-  C_HEAD=''; C_OK=''; C_SKIP=''; C_WARN=''; C_ERR=''; C_RESET=''
-fi
-log()  { printf "${C_HEAD}==>${C_RESET} %s\n" "$*" >&2; }
-ok()   { printf "    ${C_OK}ok${C_RESET}    %s\n" "$*" >&2; }
-skip() { printf "    ${C_SKIP}skip${C_RESET}  %s\n" "$*" >&2; }
-warn() { printf "    ${C_WARN}warn${C_RESET}  %s\n" "$*" >&2; }
-die()  { printf "    ${C_ERR}fail${C_RESET}  %s\n" "$*" >&2; exit 1; }
-
-should_run_step() {
-  [ -z "$ONLY_STEP" ] && return 0
-  [ "$1" = "$ONLY_STEP" ]
-}
-
-run_or_dry() {
-  if [ "$DRY_RUN" = "1" ]; then
-    printf "    ${C_WARN}dry-run${C_RESET} %s\n" "$*" >&2
-  else
-    "$@"
-  fi
-}
-
-# ─── Step 0: prerequisite check ──────────────────────────────────────────
-log "0/7 Prereq check"
-caller_arn=$(aws sts get-caller-identity --query Arn --output text 2>&1) \
-  || die "aws sts get-caller-identity failed: $caller_arn — run: awsp agentkeys-admin"
-caller_lc=$(printf '%s' "$caller_arn" | tr '[:upper:]' '[:lower:]')
-case "$caller_lc" in
-  *":user/agentkeys-admin"*) ok "caller: $caller_arn" ;;
-  *) die "caller is $caller_arn — admin required. Run: awsp agentkeys-admin" ;;
-esac
-ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
-REGION="${AWS_REGION:-us-east-1}"
-ok "ACCOUNT_ID=$ACCOUNT_ID REGION=$REGION"
-
-# Seed the env file if missing
-if [ ! -f "$TEST_ENV_FILE" ]; then
-  [ -f "$TEST_ENV_EXAMPLE" ] || die "missing $TEST_ENV_EXAMPLE (committed template)"
-  cp "$TEST_ENV_EXAMPLE" "$TEST_ENV_FILE"
-  ok "seeded $TEST_ENV_FILE from .example"
-fi
-
-env_set() {
-  local key="$1" val="$2" file="$3"
-  if grep -qE "^${key}=" "$file" 2>/dev/null; then
-    if [ "$(uname)" = "Darwin" ]; then
-      sed -i '' -E "s|^${key}=.*|${key}=${val}|" "$file"
-    else
-      sed -i -E "s|^${key}=.*|${key}=${val}|" "$file"
-    fi
-  else
-    printf '%s=%s\n' "$key" "$val" >> "$file"
-  fi
-}
-env_set ACCOUNT_ID "$ACCOUNT_ID" "$TEST_ENV_FILE"
-env_set REGION "$REGION" "$TEST_ENV_FILE"
-
-# ─── Step 1: provision the broker host (mirrors prod §5) ─────────────────
-if should_run_step 1; then
-  log "1/7 Provision broker host (test-broker.${TEST_BROKER_HOST#test-broker.})"
-  cat >&2 <<EOF
-    This step is OPERATOR-DRIVEN — setup-broker-host.sh runs on the
-    target EC2, not on your laptop. The runbook:
-
-      1. Stand up a fresh t3.micro EC2 with an Elastic IP.
-      2. Add an A record for ${TEST_BROKER_HOST} pointing at the EIP.
-         (Same for signer-test / audit-test / email-test / cred-test /
-          memory-test — five additional A records.)
-      3. SSH into the EC2 as ec2-user, then:
-           git clone https://github.com/<owner>/agentKeys && cd agentKeys
-           bash scripts/setup-broker-host.sh \\
-             --issuer-url https://${TEST_BROKER_HOST} \\
-             --account-id ${ACCOUNT_ID} \\
-             --signer-host ${TEST_SIGNER_HOST} \\
-             --audit-host ${TEST_AUDIT_HOST} \\
-             --email-host ${TEST_EMAIL_HOST} \\
-             --cred-host ${TEST_CRED_HOST} \\
-             --memory-host ${TEST_MEMORY_HOST} \\
-             --chain-rpc https://rpc.paseo-parachain.heima.network \\
-             --vault-bucket agentkeys-vault-test-${ACCOUNT_ID} \\
-             --memory-bucket agentkeys-memory-test-${ACCOUNT_ID} \\
-             --email-from noreply-test@bots-test.litentry.org \\
-             --non-interactive --yes
-      4. Confirm: curl -sf https://${TEST_BROKER_HOST}/healthz
-
-    See docs/test-environment.md §3 for the full host runbook.
-EOF
-  skip "manual operator step; rerun --only-step 2 once the host is up"
-fi
-
-# ─── Step 2: IAM OIDC provider for test-broker ───────────────────────────
-if should_run_step 2; then
-  log "2/7 IAM OIDC provider (oidc-provider/${TEST_BROKER_HOST})"
-  oidc_arn="arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${TEST_BROKER_HOST}"
-  if aws iam get-open-id-connect-provider --open-id-connect-provider-arn "$oidc_arn" \
-       >/dev/null 2>&1; then
-    skip "OIDC provider already registered: $oidc_arn"
-  else
-    # Fetch the broker's TLS leaf thumbprint (AWS requires it for OIDC
-    # provider registration). Public TLS cert, so this is fine to
-    # fetch from any network.
-    thumb=$(echo | openssl s_client -servername "$TEST_BROKER_HOST" \
-                                     -connect "${TEST_BROKER_HOST}:443" 2>/dev/null \
-              | openssl x509 -fingerprint -noout 2>/dev/null \
-              | awk -F'=' '{print $2}' | tr -d ':' | tr 'A-Z' 'a-z')
-    [ -n "$thumb" ] || die "could not fetch TLS thumbprint for ${TEST_BROKER_HOST}; is the broker reachable?"
-    run_or_dry aws iam create-open-id-connect-provider \
-      --url "https://${TEST_BROKER_HOST}" \
-      --client-id-list "sts.amazonaws.com" \
-      --thumbprint-list "$thumb"
-    ok "registered $oidc_arn (thumbprint=$thumb)"
-  fi
-  env_set OIDC_PROVIDER_ARN "$oidc_arn" "$TEST_ENV_FILE"
-fi
-
-# ─── Step 3: IAM roles (data, vault, memory) ─────────────────────────────
-if should_run_step 3; then
-  log "3/7 IAM roles (data-test, vault-test, memory-test)"
-  # These wrap the existing prod provisioning scripts with a -test
-  # suffix on every name. The scripts read role/bucket names from env,
-  # so set env then call.
-  warn "extend scripts/provision-vault-role.sh + provision-memory-role.sh"
-  warn "to accept a SUFFIX env var, or copy them as -test variants."
-  warn "Tracking as a TODO in this script — exercise once the prod"
-  warn "scripts are parameterized (~ 1 PR of work)."
-fi
-
-# ─── Step 4: S3 buckets ──────────────────────────────────────────────────
-if should_run_step 4; then
-  log "4/7 S3 buckets (mail-test, vault-test, memory-test)"
-  warn "same parameterization story as step 3 — see TODO above."
-fi
-
-# ─── Step 5: deployer wallet + funding ───────────────────────────────────
-if should_run_step 5; then
-  log "5/7 Deployer wallet on Heima-Paseo (distinct from prod deployer)"
-  KEYFILE="$HOME/.agentkeys/heima-paseo-deployer-test.key"
-  if [ -f "$KEYFILE" ]; then
-    skip "$KEYFILE exists"
-  else
-    mkdir -p "$(dirname "$KEYFILE")"
-    run_or_dry cast wallet new --json \
-      | tee "$WORK_DIR/wallet.json" \
-      | jq -r .[0].private_key > "$KEYFILE"
-    chmod 600 "$KEYFILE"
-    addr=$(jq -r .[0].address "$WORK_DIR/wallet.json")
-    ok "generated $KEYFILE (addr=$addr) — fund this address from your"
-    ok "  personal Paseo wallet, then re-run --only-step 6 to deploy contracts."
-  fi
-fi
-
-# ─── Step 6: deploy v2 stage-1 contracts on Heima-Paseo ──────────────────
-if should_run_step 6; then
-  log "6/7 Deploy v2 stage-1 contracts to Heima-Paseo (new contracts on-chain)"
-  KEYFILE="$HOME/.agentkeys/heima-paseo-deployer-test.key"
-  [ -f "$KEYFILE" ] || die "missing $KEYFILE — run --only-step 5 first"
-  run_or_dry env HEIMA_DEPLOYER_KEY_FILE="$KEYFILE" \
-    AGENTKEYS_CHAIN=heima-paseo \
-    bash "$REPO_ROOT/scripts/heima-bring-up.sh"
-  ok "contract addresses recorded in scripts/operator-workstation.env;"
-  ok "  copy the *_HEIMA_PASEO lines into $TEST_ENV_FILE."
-fi
-
-# ─── Step 7: GitHub Actions OIDC role for the e2e workflow ───────────────
-if should_run_step 7; then
-  log "7/7 GitHub Actions OIDC role (test-only)"
-  warn "Create an additional IAM role 'github-actions-agentkeys-e2e'"
-  warn "with trust policy on token.actions.githubusercontent.com and a"
-  warn "condition limiting to the agentkeys repo + branch ref. Grant"
-  warn "agentkeys-vault-role-test + agentkeys-memory-role-test assume"
-  warn "perms and read-only S3 on the three test buckets."
-  warn ""
-  warn "Then store the role ARN as the TEST_OIDC_AWS_ROLE_ARN repo secret."
-  warn "Until that secret is set, .github/workflows/harness-e2e.yml is"
-  warn "inert (the job is gated on its presence)."
-fi
-
-# ─── Done ────────────────────────────────────────────────────────────────
-log "Done"
-ok "test environment provisioning complete (or skip-noted above)"
-ok "next: bash harness/v2-stage3-demo.sh against \$OIDC_ISSUER=${TEST_BROKER_HOST}"
-ok "  with AGENTKEYS_ENV_FILE=$TEST_ENV_FILE"
-rm -rf "$WORK_DIR"
diff --git a/scripts/test-environment.env.example b/scripts/test-environment.env.example
deleted file mode 100644
index 68c8787..0000000
--- a/scripts/test-environment.env.example
+++ /dev/null
@@ -1,92 +0,0 @@
-# AgentKeys long-lived test environment — env file template (issue #66 tier-2).
-#
-# Companion to scripts/operator-workstation.env, but for the PARALLEL
-# test infrastructure (not prod):
-#
-#   - Hostname:   test-broker.litentry.org     (vs. broker.litentry.org)
-#   - OIDC iss:   https://test-broker.litentry.org
-#   - IAM role:   agentkeys-data-role-test     (vs. agentkeys-data-role)
-#   - Vault role: agentkeys-vault-role-test    (vs. agentkeys-vault-role)
-#   - Mem role:   agentkeys-memory-role-test   (vs. agentkeys-memory-role)
-#   - Mail/vault/memory buckets: -test suffix on every bucket name
-#   - Chain:      heima-paseo (testnet — no real-HEI cost on every CI run)
-#   - Deployer:   separate keypair, persisted only in operator wallet
-#   - Contracts:  deployed fresh by scripts/provision-test-environment.sh,
-#                 distinct addresses from prod (recorded below per chain)
-#
-# Why mirror operator-workstation.env instead of forking it: the harness
-# scripts (harness/v2-stage*.sh) source ONE env file. Setting
-# AGENTKEYS_ENV_FILE=./scripts/test-environment.env before invoking a
-# harness script reuses the entire flow against the test infra unchanged.
-#
-# Bring-up: bash scripts/provision-test-environment.sh
-# Activate: cp scripts/test-environment.env.example scripts/test-environment.env
-#           (then fill in the values below from the provisioner output)
-#
-# This .example file commits as-is. The non-example copy MUST NOT be
-# committed (it carries no secrets in itself, but its contents are the
-# canonical "this account hosts the test infra" pointer — gated behind
-# the operator's deliberate copy).
-#
-# See docs/test-environment.md for the full bring-up runbook.
-
-# ─── AWS account ─────────────────────────────────────────────────────────
-# Same account as prod is fine for cost, but every resource name carries
-# a -test suffix so a misconfigured CI run targeting prod fails closed
-# (the role / bucket / OIDC provider simply won't exist in prod).
-ACCOUNT_ID=000000000000
-REGION=us-east-1
-
-# ─── Hostname + OIDC issuer ──────────────────────────────────────────────
-# DNS A record + TLS cert + nginx + systemd all per scripts/setup-broker-host.sh
-# with --issuer-url https://test-broker.litentry.org. Long-lived because
-# AWS validates the OIDC issuer URL byte-for-byte against the JWT `iss`
-# claim — every reboot must restore the same URL.
-BROKER_HOST=test-broker.litentry.org
-OIDC_ISSUER=https://${BROKER_HOST}
-OIDC_PROVIDER_ARN=arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${BROKER_HOST}
-
-# ─── IAM roles (parallel to prod, distinct ARNs) ─────────────────────────
-DATA_ROLE_ARN=arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-data-role-test
-VAULT_ROLE_ARN=arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-vault-role-test
-MEMORY_ROLE_ARN=arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-memory-role-test
-
-# ─── S3 buckets (parallel to prod, distinct names) ───────────────────────
-MAIL_DOMAIN=bots-test.litentry.org
-MAIL_BUCKET=agentkeys-mail-test-${ACCOUNT_ID}
-BUCKET=${MAIL_BUCKET}
-VAULT_BUCKET=agentkeys-vault-test-${ACCOUNT_ID}
-MEMORY_BUCKET=agentkeys-memory-test-${ACCOUNT_ID}
-
-# ─── Backend (signer) URL ────────────────────────────────────────────────
-# Test env runs the mock-server backend (the production dev_key_service
-# shape). Real TEE workers are out of scope for the test environment —
-# see issue #74 step 2.
-AGENTKEYS_SIGNER_URL=https://signer-test.litentry.org
-BACKEND_URL=${AGENTKEYS_SIGNER_URL}
-
-# ─── Chain (Heima-Paseo testnet) ─────────────────────────────────────────
-# Defaults to Paseo for zero real-HEI cost. Override to `anvil` for
-# fully local runs; never `heima` (mainnet — prod-only).
-AGENTKEYS_CHAIN=heima-paseo
-
-# Contract addresses — populated by scripts/provision-test-environment.sh.
-# Keep one set per chain so re-bring-up against another chain doesn't
-# clobber. The non-test file commits the actual addresses post-deploy.
-SCOPE_CONTRACT_ADDRESS_HEIMA_PASEO=0x0000000000000000000000000000000000000000
-SIDECAR_REGISTRY_ADDRESS_HEIMA_PASEO=0x0000000000000000000000000000000000000000
-K3_EPOCH_COUNTER_ADDRESS_HEIMA_PASEO=0x0000000000000000000000000000000000000000
-CREDENTIAL_AUDIT_ADDRESS_HEIMA_PASEO=0x0000000000000000000000000000000000000000
-P256_VERIFIER_ADDRESS_HEIMA_PASEO=0x0000000000000000000000000000000000000000
-K11_VERIFIER_ADDRESS_HEIMA_PASEO=0x0000000000000000000000000000000000000000
-
-# ─── Deployer key path ───────────────────────────────────────────────────
-# Operator-held only; the test deployer is a DIFFERENT wallet from prod.
-# Provisioner persists it at ~/.agentkeys/heima-paseo-deployer-test.key.
-HEIMA_DEPLOYER_KEY_FILE=${HOME}/.agentkeys/heima-paseo-deployer-test.key
-
-# ─── CI namespacing (per-run S3 prefix isolation) ────────────────────────
-# Set by the e2e workflow at run time so concurrent CI runs don't step
-# on each other's writes. Cleaned up by nightly s3-prefix-rm job (see
-# docs/test-environment.md §Cleanup).
-CI_S3_PREFIX=ci/pr-${PR_NUMBER:-manual}/run-${GITHUB_RUN_ID:-local}

From 5a66a8535b414375c7694d8404141962258513f5 Mon Sep 17 00:00:00 2001
From: wildmeta-agent <agent@wildmeta.ai>
Date: Thu, 21 May 2026 10:05:38 +0800
Subject: [PATCH 3/4] docs: concise setup guides aligned with
 scripts/setup-{broker-host,heima}.sh
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Per operator request: pivot cloud-setup.md from a verbose manual-bash
runbook to a concise prereq/script-pointer split, add new heima-setup.md
+ ci-setup.md for the chain + CI flows, and move troubleshooting into
the ./wiki/ folder.

What changed:

  docs/cloud-setup.md  — UPDATE, 970 → 322 lines
    Add a TL;DR with the three-command operator flow (manual §1-§4
    prereqs, then setup-broker-host.sh, then setup-heima.sh). Slim
    §1-§4 to invariants + helper-script pointers + brief command
    blocks (DKIM bulk-record / receipt rule / per-data-class role
    provisioning all delegate to the existing scripts/*.sh). Replace
    the verbose §5/§6/§7 (EC2 broker / signer / workers, each with
    100+ lines of inline bash) with one §5 "Run setup-broker-host.sh"
    section that names what the script does (build, systemd, nginx,
    certbot, keypairs, env files) + what it doesn't (DNS, IAM, OIDC
    provider — those stay in §1-§4). Keep §0 (identities table) and
    §6 (cleanup recipe).

  docs/heima-setup.md  — NEW, 106 lines
    The 15-step pipeline in scripts/setup-heima.sh, with idempotency
    check + helper-script pointer per step. Mainnet vs Paseo vs Anvil
    tradeoff table. Per-step re-run examples. Heima London EVM pin
    explanation.

  docs/ci-setup.md  — NEW, 184 lines
    The 7-step operator bring-up for the no-LLM
    .github/workflows/harness-ci.yml workflow: provision test broker
    via setup-broker-host.sh with -test suffix, provision parallel
    AWS resources, register the test OIDC provider, generate + fund
    the test deployer wallet, deploy fresh test contracts on Heima
    mainnet with the same .sol source (different deployer →
    different addresses → isolated parallel contract set), register
    the GitHub Actions OIDC role, set the repo secrets. Includes
    the full TEST_* secret list, manual-dispatch instructions, and
    a secret-hygiene reminder.

  wiki/cloud-setup-faq.md     — NEW, 94 lines
  wiki/heima-setup-faq.md     — NEW, 111 lines
  wiki/ci-setup-faq.md        — NEW, 96 lines
    Troubleshooting + edge cases for each setup doc. Lives under
    ./wiki/ per CLAUDE.md "Wiki-location policy" — auto-published
    to the GitHub wiki on every push to main.

Constraints applied:

  - Concise: every doc fits in a few screens.
  - Idempotent: every flow reuses the existing idempotent helper
    scripts (setup-broker-host.sh, setup-heima.sh, provision-*-role.sh,
    apply-*-bucket-policy.sh).
  - No project credentials exposed: account IDs, role ARNs, bucket
    names, deployer keys, contract addresses all referenced via
    ${ACCOUNT_ID} / ${BROKER_HOST} / ${REGION} placeholders or via
    "read from operator-workstation.env" / "from step N" pointers.
    Real values live only in the operator's local env file + the
    GitHub repo secrets store.

All internal links verified via a python url-walker (every relative
link resolves to an existing file).
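
The walker itself is not part of this patch. A minimal sketch of the
kind of check it performs, assuming inline markdown links of the form
[text](target) and a hypothetical helper named broken_links, could
look like this:

```python
import re
from pathlib import Path

# Captures the target of an inline markdown link: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Anything starting with a URL scheme (https:, mailto:, ...) is external.
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_links(root):
    """Return (markdown file, target) pairs whose relative target is missing.

    Hypothetical helper for illustration: it only understands inline links
    and does not skip links that appear inside fenced code blocks.
    """
    root = Path(root)
    missing = []
    for md in sorted(root.rglob("*.md")):
        text = md.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            # External URLs and in-page anchors are out of scope here.
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue
            # Drop any #fragment, then resolve relative to the linking file.
            path = target.split("#", 1)[0]
            if not (md.parent / path).exists():
                missing.append((md.relative_to(root).as_posix(), target))
    return missing
```

Calling broken_links(".") from the repo root returns an empty list
when every relative link resolves; a non-empty list names the file and
the dangling target.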
---
 docs/ci-setup.md        |  184 +++++++
 docs/cloud-setup.md     | 1030 ++++++++-------------------------------
 docs/heima-setup.md     |  106 ++++
 wiki/ci-setup-faq.md    |   96 ++++
 wiki/cloud-setup-faq.md |   94 ++++
 wiki/heima-setup-faq.md |  111 +++++
 6 files changed, 782 insertions(+), 839 deletions(-)
 create mode 100644 docs/ci-setup.md
 create mode 100644 docs/heima-setup.md
 create mode 100644 wiki/ci-setup-faq.md
 create mode 100644 wiki/cloud-setup-faq.md
 create mode 100644 wiki/heima-setup-faq.md

diff --git a/docs/ci-setup.md b/docs/ci-setup.md
new file mode 100644
index 0000000..04670d0
--- /dev/null
+++ b/docs/ci-setup.md
@@ -0,0 +1,184 @@
+# CI setup — AgentKeys
+
+**Audience:** the operator activating the no-LLM CI workflow against a test instance of the production environment.
+**Scope:** one workflow file ([`.github/workflows/harness-ci.yml`](../.github/workflows/harness-ci.yml)), a list of GitHub secrets, and the test-side counterparts of the production resources from [`docs/cloud-setup.md`](cloud-setup.md) + [`docs/heima-setup.md`](heima-setup.md).
+**FAQ + troubleshooting:** [`wiki/ci-setup-faq.md`](../wiki/ci-setup-faq.md).
+
+## TL;DR
+
+The workflow runs unmodified on every push / PR. It has two main jobs (a small `preflight` job between them evaluates the gate):
+
+1. **`rust-checks`** — always runs. `cargo fmt --check` + `cargo clippy -- -D warnings` + `cargo test --workspace`. Covers 600+ tests including the in-process broker integration tests (which already mock STS + SES + WebAuthn).
+2. **`harness-e2e`** — gated on the `TEST_OIDC_AWS_ROLE_ARN` secret being set. Runs the production harness scripts ([`harness/v2-stage{1,2,3}-demo.sh`](../harness/)) against an isolated TEST instance of the cloud + chain.
+
+Until the operator activates the test instance, `harness-e2e` surfaces a `::warning::` skip and does not block the PR.
+
+## What "mirror production" means
+
+Every resource in the test instance is parallel to prod:
+
+| | Production | Test |
+|---|---|---|
+| Broker host | `broker.litentry.org` | `test-broker.litentry.org` (long-lived; AWS validates OIDC issuer URLs byte-for-byte) |
+| OIDC issuer | `https://broker.litentry.org` | `https://test-broker.litentry.org` |
+| IAM roles | `agentkeys-{data,vault,memory}-role` | `agentkeys-{data,vault,memory}-role-test` |
+| S3 buckets | `agentkeys-{mail,vault,memory}-${ACCT}` | `agentkeys-{mail,vault,memory}-test-${ACCT}` |
+| Chain | Heima mainnet | **Heima mainnet** (same chain, different deployer → different addresses) |
+| Deployer wallet | operator's prod deployer | dedicated test wallet (small HEI float) |
+| Contracts | one production deploy | one test deploy with **identical `.sol` source** → new addresses |
+| WebAuthn | real Touch ID | never (`WEBAUTHN_MODE=0`) |
+| LLM | (separate `claude.yml` review) | never |
+
+**Same code, same chain, isolated storage.** EVM addresses derive from `(deployer, nonce)` and Solidity compiles deterministically — a different deployer key with the same source files produces a parallel contract set that can't see or write to prod contract state.
+
+## One-shot operator bring-up
+
+### 1. Provision the test broker
+
+Same flow as `docs/cloud-setup.md`, with the `-test` suffix on every identifier:
+
+```bash
+# On a fresh EC2 with EIP + DNS A record for test-broker.${ZONE}
+sudo bash scripts/setup-broker-host.sh \
+  --issuer-url https://test-broker.${ZONE} \
+  --account-id "${ACCOUNT_ID}" \
+  --signer-host signer-test.${ZONE} \
+  --audit-host audit-test.${ZONE} \
+  --email-host email-test.${ZONE} \
+  --cred-host cred-test.${ZONE} \
+  --memory-host memory-test.${ZONE} \
+  --vault-bucket "agentkeys-vault-test-${ACCOUNT_ID}" \
+  --memory-bucket "agentkeys-memory-test-${ACCOUNT_ID}" \
+  --email-from "noreply-test@bots-test.${ZONE}" \
+  --yes
+```
+
+Idempotent: re-run after edits without manual rollback.
+
+### 2. Provision the parallel AWS resources
+
+The prod provisioning helpers (`scripts/provision-vault-{bucket,role}.sh`, `scripts/provision-memory-{bucket,role}.sh`, `scripts/apply-{vault,memory}-bucket-policy.sh`, `scripts/cleanup-mail-bucket-policy.sh`) all read bucket / role names from `scripts/operator-workstation.env`. For the test instance, override those names through `TEST_*` environment variables:
+
+```bash
+# Export the -test names in a subshell so every prod script in the chain picks them up
+( export TEST_VAULT_BUCKET="agentkeys-vault-test-${ACCOUNT_ID}" \
+         TEST_MEMORY_BUCKET="agentkeys-memory-test-${ACCOUNT_ID}" \
+         TEST_VAULT_ROLE_ARN="arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-vault-role-test" \
+         TEST_MEMORY_ROLE_ARN="arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-memory-role-test" && \
+  bash scripts/provision-vault-bucket.sh && \
+  bash scripts/provision-vault-role.sh && \
+  bash scripts/apply-vault-bucket-policy.sh && \
+  bash scripts/provision-memory-bucket.sh && \
+  bash scripts/provision-memory-role.sh && \
+  bash scripts/apply-memory-bucket-policy.sh )
+```
+
+(If the prod scripts don't yet read these overrides, file a follow-up issue and copy the prod scripts as `-test` variants by hand until they do.)
+
+### 3. Register the test OIDC provider in IAM
+
+```bash
+thumb=$(echo | openssl s_client -servername "test-broker.${ZONE}" \
+                                 -connect "test-broker.${ZONE}:443" 2>/dev/null \
+          | openssl x509 -fingerprint -noout \
+          | awk -F'=' '{print $2}' | tr -d ':' | tr 'A-Z' 'a-z')
+
+aws iam create-open-id-connect-provider \
+  --url "https://test-broker.${ZONE}" \
+  --client-id-list "sts.amazonaws.com" \
+  --thumbprint-list "$thumb"
+```
+
+### 4. Generate the test deployer wallet + fund it
+
+```bash
+mkdir -p ~/.agentkeys
+cast wallet new --json \
+  | tee /tmp/test-deployer.json \
+  | jq -r '.[0].private_key' > ~/.agentkeys/heima-deployer-test.key
+chmod 600 ~/.agentkeys/heima-deployer-test.key
+# Fund the address printed by `jq -r '.[0].address' /tmp/test-deployer.json` from your
+# personal Heima wallet (a small float covers the one-shot deploy), then delete the /tmp file: it holds the private key.
+```
+
+### 5. Deploy the test contracts on Heima mainnet
+
+Identical Solidity, identical `DeployAgentKeysV1.s.sol`, different deployer → new addresses on the production chain:
+
+```bash
+AGENTKEYS_CHAIN=heima \
+HEIMA_DEPLOYER_KEY_FILE=~/.agentkeys/heima-deployer-test.key \
+MAINNET_CONFIRM=1 \
+  bash scripts/setup-heima.sh --from-step 4 --to-step 8
+```
+
+That walks steps 4–8: reuse the test key, fund-check, deploy, persist addresses, verify on-chain. Read off the six `*_HEIMA` addresses from the resulting `scripts/operator-workstation.env` for the next step.
+
+### 6. Register the GitHub Actions OIDC role
+
+Create one additional IAM role, `github-actions-agentkeys-e2e`, whose trust policy names `token.actions.githubusercontent.com` as the federated principal, with a condition limiting it to the agentkeys repo. Grant it `sts:AssumeRole` on the three test roles (data, vault, memory) and read-only S3 on the three test buckets.
+
+### 7. Set the GitHub repo secrets
+
+In **Settings → Secrets and variables → Actions**:
+
+| Secret | Value |
+|---|---|
+| `TEST_OIDC_AWS_ROLE_ARN` | `arn:aws:iam::${ACCT}:role/github-actions-agentkeys-e2e` (the gate) |
+| `TEST_ACCOUNT_ID` | numeric AWS account ID (same account as prod is fine) |
+| `TEST_AWS_REGION` | e.g. `us-east-1` |
+| `TEST_BROKER_HOST` | `test-broker.${ZONE}` |
+| `TEST_VAULT_BUCKET` | `agentkeys-vault-test-${ACCT}` |
+| `TEST_MEMORY_BUCKET` | `agentkeys-memory-test-${ACCT}` |
+| `TEST_VAULT_ROLE_ARN` | `arn:aws:iam::${ACCT}:role/agentkeys-vault-role-test` |
+| `TEST_MEMORY_ROLE_ARN` | `arn:aws:iam::${ACCT}:role/agentkeys-memory-role-test` |
+| `TEST_DATA_ROLE_ARN` | `arn:aws:iam::${ACCT}:role/agentkeys-data-role-test` |
+| `TEST_HEIMA_DEPLOYER_KEY` | the 0x-prefixed test deployer private key from step 4 |
+| `TEST_SCOPE_CONTRACT_ADDRESS_HEIMA` | from step 5 |
+| `TEST_SIDECAR_REGISTRY_ADDRESS_HEIMA` | from step 5 |
+| `TEST_K3_EPOCH_COUNTER_ADDRESS_HEIMA` | from step 5 |
+| `TEST_CREDENTIAL_AUDIT_ADDRESS_HEIMA` | from step 5 |
+| `TEST_P256_VERIFIER_ADDRESS_HEIMA` | from step 5 |
+| `TEST_K11_VERIFIER_ADDRESS_HEIMA` | from step 5 |
+
+`TEST_OIDC_AWS_ROLE_ARN` is the gate. Setting it last activates the `harness-e2e` job; unsetting it disarms the job again.
+
+## What the workflow does on every run
+
+1. Restores submodules + Rust toolchain + Foundry + cargo cache.
+2. **`rust-checks`** job: `cargo fmt --check` → `cargo clippy -- -D warnings` → `cargo test --workspace -- --test-threads=1` (the `--test-threads=1` matches the existing `@claude` review workflow because broker tests mutate `$HOME` / `AWS_*` env).
+3. **`preflight`** job: gates on `TEST_OIDC_AWS_ROLE_ARN`.
+4. **`harness-e2e`** job: assumes the test role via GitHub Actions OIDC (no long-lived AWS keys), writes the test deployer key, overwrites `scripts/operator-workstation.env` with TEST_* values, then runs:
+   - `harness/v2-stage1-demo.sh --skip-deploy --skip-email` (contracts pre-deployed; identity via wallet_sig)
+   - `harness/v2-stage2-demo.sh --stub --skip-build`
+   - `harness/v2-stage3-demo.sh` (per-actor + per-data-class PrincipalTag isolation — the capstone that needs real AWS STS)
+5. Per-run S3 prefix cleanup (`ci/run-${RUN_ID}/`) in an `if: always()` block.
+
+## Per-run S3 prefix isolation
+
+Concurrent runs (for example, a PR run and a manual dispatch) each get a unique prefix via `CI_S3_PREFIX=ci/run-${GITHUB_RUN_ID}`. Per-job cleanup is best-effort; pair it with a nightly operator-side cron that sweeps `ci/` prefix keys older than 7 days from the test buckets.
+
+## Manual dispatch
+
+```bash
+gh workflow run harness-ci.yml --field stage=3
+```
+
+`stage` accepts `1`, `2`, `3`, or `all`. Useful for re-running just stage-3 after a contract revision.
+
+## Secret hygiene
+
+No project credentials live in this doc. Every value above is either a placeholder (`${ACCT}`, `${ZONE}`) or an instruction to read from the operator's already-provisioned state ("from step 5"). The actual values live in two places only:
+
+- The operator's local `scripts/operator-workstation.env` (gitignored copies / test variants only).
+- The GitHub repo's encrypted secrets store.
+
+Never paste a real account ID, role ARN, bucket name, deployer key, or contract address into a markdown doc, commit message, or PR description.
+
+## Related
+
+- Workflow file: [`.github/workflows/harness-ci.yml`](../.github/workflows/harness-ci.yml)
+- Cloud / broker bring-up: [`docs/cloud-setup.md`](cloud-setup.md)
+- Chain bring-up: [`docs/heima-setup.md`](heima-setup.md)
+- Harness scripts: [`harness/v2-stage{1,2,3}-demo.sh`](../harness/)
+- FAQ + troubleshooting: [`wiki/ci-setup-faq.md`](../wiki/ci-setup-faq.md)
diff --git a/docs/cloud-setup.md b/docs/cloud-setup.md
index df04df2..3a7820f 100644
--- a/docs/cloud-setup.md
+++ b/docs/cloud-setup.md
@@ -1,970 +1,322 @@
 # Cloud setup — AgentKeys
 
 **Audience:** the operator provisioning the cloud account that hosts AgentKeys infrastructure.
-**Scope:** one file, every cloud-side resource. Read top-down once per account, then jump back to the section you're touching.
+**Scope:** the prereqs that the idempotent [`scripts/setup-broker-host.sh`](../scripts/setup-broker-host.sh) entry point can't do for itself (DNS, SES, IAM, OIDC provider, S3 buckets). Run those once per account, then re-run the broker-host script as often as needed.
+**Companion:** [`docs/heima-setup.md`](heima-setup.md) for chain bring-up, [`docs/ci-setup.md`](ci-setup.md) for CI activation.
+**FAQ + troubleshooting:** [`wiki/cloud-setup-faq.md`](../wiki/cloud-setup-faq.md).
 
-The runbook is split by concern, not by stage:
-
-| § | Concern | When you do this |
-|---|---------|------------------|
-| [§0 Identities](#0-identities--mental-model) | The four IAM principals and what each one is for | Read first |
-| [§1 Domain + DNS](#1-domain--dns) | Email subdomain (Stage 6) + broker subdomain (Stage 7) | Once per account |
-| [§2 Inbound mail](#2-inbound-mail-backend) | SES + S3 receipt rule (Stage 6) | Once per account |
-| [§3 IAM users + role](#3-iam-identities) | `agentkeys-{admin,broker,daemon}` + `agentkeys-data-role` | Once per account |
-| [§4 OIDC federation](#4-oidc-federation-stage-7) | Register the broker as an OIDC provider, swap to PrincipalTag-scoped trust | After §1–§3 + a publicly-reachable broker |
-| [§5 EC2 broker host](#5-ec2-broker-host-optional) | EIP, A record, security group | Only if you're hosting the broker on AWS |
-| [§6 Signer host](#6-signer-host) | DNS A record + TLS cert + nginx flip for `signer.<zone>` | After §5 — needs `$EIP` |
-| [§7 Service workers](#7-service-workers-audit--email--cred--memory) | 4 DNS A records + TLS certs + nginx flips for `audit/email/cred/memory.<zone>` (dev co-located on broker host) | After §5 — needs `$EIP` |
-| [§8 Cleanup](#8-cleanup) | Tear-down recipe | When you want to delete it all |
-
-**Cloud-portability:** §1 (DNS) and §2 (inbound mail) are the cloud-replaceable layers — Tencent Cloud SimpleDM + COS would slot in here unchanged at the §3+ boundary. See [§2.2](#22-future-tencent-cloud-simpledm--cos).
-
----
-
-## 0. Identities — mental model
-
-| Identity | Type | Holds | Purpose |
-|---|---|---|---|
-| `agentkeys-admin` | IAM user | Long-lived access key | One-shot provisioning. Runs every command in this doc. IAM-admin scope. |
-| `agentkeys-broker` | IAM user | Long-lived access key | Operator's SSH-into-EC2 path via EC2 Instance Connect. No data-plane access. |
-| `agentkeys-daemon` | IAM user | Long-lived access key | The **broker process** uses this at runtime. Only permission: `sts:AssumeRole` on `agentkeys-data-role`. |
-| `agentkeys-data-role` | IAM role | (assumed) | The actual S3/SES permissions live here. `agentkeys-daemon` (Stage 6) or the OIDC provider (Stage 7) is allowed to assume it. |
-| `agentkeys-broker-host` | IAM role | (assumed by EC2) | Optional. If the broker runs on EC2, attach this as the instance profile so the daemon never sees a static key. |
-
-Why "data role" and not "agent role": the project word "agent" already means three things (the AI agent, the AgentKeys product, an IAM role). The role holds **data-plane** permissions, so `agentkeys-data-role` it is. (Renamed from `agentkeys-agent` 2026-04-28; the broker still accepts the legacy `BROKER_AGENT_ROLE_ARN` env var.)
-
-**Prereqs for everything below:**
+## TL;DR — operator flow
 
 ```bash
-# AWS CLI v2 + a working agentkeys-admin profile
-awsp agentkeys-admin                                              # set AWS_PROFILE
-aws sts get-caller-identity                                       # → agentkeys-admin
-
-# Shell vars used throughout the runbook
-export REGION=us-east-1                                           # SES inbound: us-east-1, us-west-2, eu-west-1
-export DOMAIN=bots.litentry.org                                   # Stage 6 email subdomain
-export BROKER_HOST=broker.litentry.org                            # Stage 7 broker public hostname
-export PARENT_ZONE_ID=Z09723983CFJOHAE3VC65                       # existing litentry.org Route 53 zone
-export ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
-export BUCKET=agentkeys-mail-${ACCOUNT_ID}                        # global-unique by account-id suffix
-echo "REGION=$REGION DOMAIN=$DOMAIN BROKER_HOST=$BROKER_HOST ACCOUNT_ID=$ACCOUNT_ID BUCKET=$BUCKET"
-```
-
-> **Why `jq -n --arg` and not `cat > file.json <<EOF`:** `jq --arg` passes values outside shell parameter expansion, sidestepping the zsh modifier bug (`$VAR:r` etc.) that silently corrupts ARNs. JSON is validated on construction, command substitution feeds the result straight into `--policy-document`, no file lands on disk.
-
----
-
-## 1. Domain + DNS
-
-Two subdomains under the existing `litentry.org` zone — no NS delegation needed because both records live in the parent zone:
-
-- `bots.litentry.org` — agent email subdomain (used by SES inbound).
-- `broker.litentry.org` — broker public hostname (TLS-terminating reverse proxy).
-
-If you're using a different parent domain, swap `litentry.org` and `PARENT_ZONE_ID` accordingly. Confirm the zone is reachable before continuing:
-
-```bash
-aws route53 get-hosted-zone --id "$PARENT_ZONE_ID" \
-  --query 'HostedZone.{name: Name, private: Config.PrivateZone}'
-# → {"name": "litentry.org.", "private": false}
-```
+# Laptop:
+awsp agentkeys-admin
+set -a; source scripts/operator-workstation.env; set +a   # ${ACCOUNT_ID}, ${REGION}, ${BROKER_HOST}, ${BUCKET}, ...
 
-### 1.1 Email subdomain — DKIM + SPF + DMARC + MX
+# 1. Per-account, one-shot, manual (this doc):
+#    §1 DNS subdomains, §2 SES domain identity, §3 IAM users + role,
+#    §4 OIDC federation provider + trust policy + bucket policy.
 
-After §2.1 (SES domain identity) you'll have three DKIM tokens to publish. The block below publishes those plus the standard SPF / DMARC / MX records in one Route 53 change:
+# 2. Per-broker-host, idempotent re-runnable (script):
+sudo bash scripts/setup-broker-host.sh \
+  --issuer-url "https://${BROKER_HOST}" \
+  --account-id "${ACCOUNT_ID}" \
+  --signer-host "signer.${ZONE}" \
+  --audit-host  "audit.${ZONE}" \
+  --email-host  "email.${ZONE}" \
+  --cred-host   "cred.${ZONE}" \
+  --memory-host "memory.${ZONE}" \
+  --yes
 
-```bash
-read -r T1 T2 T3 <<<"$(aws sesv2 get-email-identity --region "$REGION" \
-  --email-identity "$DOMAIN" --query 'DkimAttributes.Tokens' --output text)"
-
-aws route53 change-resource-record-sets --hosted-zone-id "$PARENT_ZONE_ID" \
-  --change-batch "$(jq -n \
-    --arg domain "$DOMAIN" --arg region "$REGION" \
-    --arg t1 "$T1" --arg t2 "$T2" --arg t3 "$T3" \
-    '{
-      Comment: "AgentKeys email infra for \($domain)",
-      Changes: [
-        {Action:"UPSERT", ResourceRecordSet:{Name:"\($t1)._domainkey.\($domain)", Type:"CNAME", TTL:300, ResourceRecords:[{Value:"\($t1).dkim.amazonses.com"}]}},
-        {Action:"UPSERT", ResourceRecordSet:{Name:"\($t2)._domainkey.\($domain)", Type:"CNAME", TTL:300, ResourceRecords:[{Value:"\($t2).dkim.amazonses.com"}]}},
-        {Action:"UPSERT", ResourceRecordSet:{Name:"\($t3)._domainkey.\($domain)", Type:"CNAME", TTL:300, ResourceRecords:[{Value:"\($t3).dkim.amazonses.com"}]}},
-        {Action:"UPSERT", ResourceRecordSet:{Name:$domain, Type:"MX",  TTL:300, ResourceRecords:[{Value:"10 inbound-smtp.\($region).amazonaws.com"}]}},
-        {Action:"UPSERT", ResourceRecordSet:{Name:$domain, Type:"TXT", TTL:300, ResourceRecords:[{Value:"\"v=spf1 include:amazonses.com -all\""}]}},
-        {Action:"UPSERT", ResourceRecordSet:{Name:"_dmarc.\($domain)", Type:"TXT", TTL:300, ResourceRecords:[{Value:"\"v=DMARC1; p=quarantine; rua=mailto:dmarc@\($domain)\""}]}}
-      ]
-    }')"
+# 3. Per-chain, idempotent re-runnable:
+bash scripts/setup-heima.sh                                # see docs/heima-setup.md
 ```
 
-### 1.2 Broker subdomain — A record to EIP
+`setup-broker-host.sh` is **the single entry point** for every remote-host change (binary upgrades, systemd edits, env tweaks, nginx/certbot wiring, mock-server redeploys). Per [CLAUDE.md "Remote broker host"](../CLAUDE.md): no ad-hoc `systemctl` edits, no hand-built `scp`.
 
-Done as part of [§5 EC2 broker host](#5-ec2-broker-host-optional), once you know the host's public IP. If the broker lives outside AWS (DigitalOcean, Hetzner, etc.), upsert the A record now using the host's static IP — the rest of the runbook is identical.
+The split: §1–§4 below set up the **identifiers** (DNS names, IAM principals, OIDC trust, bucket policies); the script consumes those identifiers and stands up the actual processes.
 
-### 1.3 Signer subdomain — A record + TLS cert (issue #74 step 1b)
-
-Done as part of [§6 Signer host](#6-signer-host), once `$EIP` is known from [§5.1](#51-allocate--attach-an-elastic-ip).
+## 0. Identities — mental model
 
-### 1.4 Service-worker subdomains — bulk A records (issue #90)
+| Identity | Type | Holds | Purpose |
+|---|---|---|---|
+| `agentkeys-admin` | IAM user | Long-lived access key | One-shot provisioning. Runs every command in this doc. IAM-admin scope. |
+| `agentkeys-broker` | IAM user | Long-lived access key | Operator's SSH-into-EC2 path via EC2 Instance Connect. No data-plane access. |
+| `agentkeys-daemon` | IAM user | Long-lived access key | Broker process at runtime. Only permission: `sts:AssumeRole` on the data role. |
+| `agentkeys-data-role` | IAM role | (assumed) | Holds the actual S3/SES permissions. `agentkeys-daemon` (Stage 6) or the OIDC provider (Stage 7) is allowed to assume it. |
+| `agentkeys-vault-role` / `agentkeys-memory-role` | IAM role | (assumed) | Per-data-class roles (arch.md §17.2). Trust the OIDC provider; PrincipalTag-scoped to `bots/<actor_omni>/{credentials,memory}/*`. |
+| `agentkeys-broker-host` | IAM role | (assumed by EC2) | Optional. If the broker runs on EC2, attach as instance profile so the daemon never sees a static key. |
 
-The 4 service workers (`audit` / `email` / `cred` / `memory`) co-locate on the broker host today (dev-only per [CLAUDE.md](../CLAUDE.md) "for production, we will isolate all the services for the security issue"). All 4 A records point to the same `$EIP`. The hostnames are the migration seam — when a worker moves to its own machine, only the A record changes.
+The word "agent" already means three things (the AI agent, the AgentKeys product, an IAM role) — these roles hold **data-plane** permissions, so they're named `*-data-role` / `*-vault-role` / `*-memory-role`.
 
-Done as part of [§7 Service workers](#7-service-workers-audit--email--cred--memory) using the [`scripts/dns-upsert-workers.sh`](../scripts/dns-upsert-workers.sh) helper.
+## 1. DNS
 
----
+Seven hostnames under your parent zone (e.g. `litentry.org`): the mail and broker subdomains, the signer, and the four service workers.
 
-## 2. Inbound mail backend
+| Host | Purpose | Set in |
+|---|---|---|
+| `${MAIL_DOMAIN}` (e.g. `bots.litentry.org`) | SES inbound | §2 |
+| `${BROKER_HOST}` (e.g. `broker.litentry.org`) | Broker TLS-terminating reverse proxy | §5 — A record to broker EIP |
+| `signer.${ZONE}` | Signer service (issue #74 step 1b) | §5 — A record to broker EIP (co-located today) |
+| `audit.${ZONE}` / `email.${ZONE}` / `cred.${ZONE}` / `memory.${ZONE}` | Service workers (issue #90) | §5 — same EIP (dev co-location) |
 
-### 2.1 AWS SES + S3
+For the bulk service-worker DNS, use [`scripts/dns-upsert-workers.sh`](../scripts/dns-upsert-workers.sh). The hostnames are the migration seam — when a worker moves to its own machine, only the A record changes.
 
-#### Verify the SES domain identity
+## 2. SES inbound mail
 
 ```bash
-aws sesv2 create-email-identity \
-  --region "$REGION" --email-identity "$DOMAIN" \
+# Verify the SES domain identity
+aws sesv2 create-email-identity --region "$REGION" \
+  --email-identity "$MAIL_DOMAIN" \
   --dkim-signing-attributes NextSigningKeyLength=RSA_2048_BIT
-```
-
-Now run [§1.1](#11-email-subdomain--dkim--spf--dmarc--mx) to publish the DKIM/SPF/DMARC/MX records. Wait ~5 min, then:
-
-```bash
-aws sesv2 get-email-identity --region "$REGION" --email-identity "$DOMAIN" \
-  --query '{verified: VerifiedForSendingStatus, dkim: DkimAttributes.Status}'
-# → {"verified": true, "dkim": "SUCCESS"}
-```
-
-> **DKIM key custody:** in this interim setup, AWS SES holds the private DKIM key. We never see it. Trust surface: AWS-internal compromise could forge mail signed as us — bounded blast radius (reputation, not user-data custody). Migration target is TEE-held BYODKIM when [`heima-gaps §4`](./spec/heima-gaps-vs-desired-architecture.md) closes; do **not** intermediate-step to "BYODKIM with file-stored key" (strictly worse than AWS-managed).
-
-#### Create the S3 bucket for inbound mail
 
-The bucket policy in [§3.5](#35-s3-bucket-policy) wires SES write + role read; we'll come back to it after the IAM identities exist.
+# Publish DKIM + SPF + DMARC + MX in one Route 53 change (read DKIM tokens
+# from `aws sesv2 get-email-identity`, then upsert via Route 53 — see
+# wiki/cloud-setup-faq.md for the full record set).
 
-```bash
-aws s3api create-bucket \
-  --region "$REGION" --bucket "$BUCKET" \
+# Create the inbound bucket (the 30-day TTL on inbound/* objects needs a separate lifecycle rule; it is not set here)
+aws s3api create-bucket --region "$REGION" --bucket "$BUCKET" \
   $([ "$REGION" != "us-east-1" ] && echo "--create-bucket-configuration LocationConstraint=$REGION")
-
 aws s3api put-public-access-block --region "$REGION" --bucket "$BUCKET" \
   --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
 
-# 30-day TTL on inbound objects (throwaway-inbox model)
-aws s3api put-bucket-lifecycle-configuration --region "$REGION" --bucket "$BUCKET" \
-  --lifecycle-configuration "$(jq -n '{
-    Rules: [{ID:"inbound-30d-ttl", Status:"Enabled", Filter:{Prefix:"inbound/"}, Expiration:{Days:30}}]
-  }')"
-```
-
-#### Create the SES receipt rule
-
-```bash
+# Receipt rule: route mail for $MAIL_DOMAIN into s3://$BUCKET/inbound/*
 aws ses create-receipt-rule-set --rule-set-name agentkeys --region "$REGION" 2>/dev/null || true
 aws ses create-receipt-rule --region "$REGION" --rule-set-name agentkeys \
-  --rule "$(jq -n --arg domain "$DOMAIN" --arg bucket "$BUCKET" '{
+  --rule "$(jq -n --arg domain "$MAIL_DOMAIN" --arg bucket "$BUCKET" '{
     Name: "agentkeys-inbound", Enabled: true, ScanEnabled: true, TlsPolicy: "Optional",
     Recipients: [$domain],
     Actions: [{S3Action: {BucketName: $bucket, ObjectKeyPrefix: "inbound/"}}]
   }')"
 aws ses set-active-receipt-rule-set --rule-set-name agentkeys --region "$REGION"
-```
-
-Inbound MIME lands at `s3://$BUCKET/inbound/<msg_id>`. The first object you'll see is `inbound/AMAZON_SES_SETUP_NOTIFICATION` — AWS's "I successfully wrote to your bucket" marker. Real test mail follows.
-
-#### Spam handling (read-time filter)
-
-The SES scanners stamp `X-SES-Spam-Verdict` / `X-SES-Virus-Verdict` headers. The provisioner-scripts `ses-s3` adapter drops messages where either is `FAIL`. No write-time Lambda; trivial receipt rule.
-
-#### Sandbox vs production sending
-
-Inbound is unaffected by SES sandbox status. You only need to request production access when the agent **sends** mail to arbitrary addresses (replies, notifications). Console → Support → "Service limit increase" → "SES Sending Limits" → "Request Production Access".
-
-### 2.1a Per-recipient routing Lambda (issue #83)
-
-After [§4](#4-oidc-federation-stage-7) lands, the `agentkeys-data-role` is intentionally denied read on `s3://$BUCKET/inbound/` (federation-isolation rule, [§4.5](#45-strip-the-static-iam-grants)). Service-provisioning verification emails (openrouter, brave, anthropic, …) land in `inbound/<msg>` but the OIDC-assumed scraper subprocess cannot read them — operators see the symptom as `internal error: AccessDenied on s3:ListBucket` at the email-fetch step of `agentkeys provision <service>`.
-
-The fix is a small post-receive Lambda that copies inbound objects to the operator's PrincipalTag-scoped prefix when the recipient local-part matches the provisioner's routing pattern. Service emails the scraper generates have the form `or-<0x-wallet>-<unix-ts>@$DOMAIN`; the Lambda parses that local-part, extracts the wallet, and `CopyObject`s (server-side — body never transits Lambda) to `bots/<wallet>/inbound/<msg>`. AGENTKEYS magic-link auth emails (different local-part) stay in `inbound/` for the broker's `/v1/auth/email/*` handlers.
 
-Deploy once per AWS account:
-
-```bash
-awsp agentkeys-admin
-set -a; source scripts/operator-workstation.env; set +a
-bash infra/ses-routing-lambda/deploy.sh
+# Verify the bot's sending identity (the broker's BROKER_EMAIL_FROM_ADDRESS
+# precheck refuses to boot if this isn't verified)
+bash scripts/ses-verify-sender.sh
 ```
 
-Idempotent (re-runnable). What it provisions: IAM role `agentkeys-ses-router-lambda-role` (inline policy: `s3:GetObject` on `inbound/*`, `s3:PutObject` on `bots/*/inbound/*`, basic CloudWatch Logs), Lambda function `agentkeys-ses-router` (python3.13, 128MB, 10s timeout, reserved-concurrency=10), and the S3 `ObjectCreated:*` notification on `inbound/` → Lambda.
-
-Per-invocation cost ≈ 1.7 µ$ at 128 MB; total Lambda spend stays single-digit cents/month at any sensible operator count. See [`infra/ses-routing-lambda/README.md`](../infra/ses-routing-lambda/README.md) for unit tests, verification commands, and rollback.
-
-> **TODO** (tracked in [`TODOS.md`](../TODOS.md) — "Disable broker's broad S3-full-access"): once this Lambda is deployed and stable, tighten the broker's instance profile so it can no longer read service-provisioning emails (defense-in-depth — today the broker COULD read them but doesn't).
-
-### 2.2 Future: Tencent Cloud SimpleDM + COS
-
-For deployments serving China-region traffic, the analogous backend is:
+**Sandbox vs production sending:** inbound mail is unaffected by SES sandbox status. Only **outbound** mail to arbitrary addresses needs production access, requested via Console → Support → "SES Sending Limits" → "Request Production Access".
 
-| Layer | AWS (current) | Tencent Cloud (future) |
-|---|---|---|
-| Email service | SES (SendRawEmail / receipt rules) | SimpleDM (`SendEmail` + receive-rule policies) |
-| Object store | S3 + bucket policy | COS + bucket-policy / CAM role |
-| Identity service | IAM users + roles + STS AssumeRole | CAM users + roles + STS AssumeRole |
-| OIDC federation | `iam:CreateOpenIDConnectProvider` | CAM `CreateOIDCConfig` |
-
-The provisioner-scripts `email-backends/` interface already abstracts the inbound contract (object key + raw MIME). A Tencent backend slots in as `tencent-simpledm-cos`, with the same upstream API as `ses-s3`. Identity layout in §3 stays unchanged structurally — replace `iam` with `cam` calls. **No work in this runbook depends on AWS specifically except the AWS CLI invocations** — the IAM model maps 1:1 onto CAM.
+**Per-recipient routing Lambda (issue #83):** after §4 lands, `agentkeys-data-role` is intentionally denied read on `inbound/*`. A post-receive Lambda copies service-provisioning verification emails to `bots/<wallet>/inbound/<msg>`; deploy it with [`infra/ses-routing-lambda/deploy.sh`](../infra/ses-routing-lambda/deploy.sh). The deploy is idempotent and runs once per AWS account.
 
----
+**Future Tencent Cloud port:** SES + S3 are the cloud-replaceable layers in this doc. SimpleDM + COS slot in at the §3+ boundary, and the IAM model maps 1:1 onto CAM. The `provisioner-scripts/email-backends/` interface already abstracts the inbound contract.
 
 ## 3. IAM identities
 
-### 3.1 `agentkeys-daemon` IAM user (broker runtime)
+The daemon user + data role are the boundary between manual provisioning (this doc) and the script-driven runtime (`setup-broker-host.sh`).
+
+### 3.1 The four principals
 
 ```bash
+# Runtime user (broker process)
 aws iam create-user --user-name agentkeys-daemon
 aws iam create-access-key --user-name agentkeys-daemon
-# → save AccessKeyId + SecretAccessKey to your secret manager. NOT to git.
+#   → save AccessKeyId + SecretAccessKey to the operator's secret manager.
+#     NEVER commit. setup-broker-host.sh consumes these via the systemd
+#     env file written under /etc/agentkeys/.
 
+# Daemon may only assume the data role (no direct S3/SES grants).
 aws iam put-user-policy --user-name agentkeys-daemon \
   --policy-name agentkeys-daemon-assume-role \
   --policy-document "$(jq -n --arg acct "$ACCOUNT_ID" '{
-    Version: "2012-10-17",
-    Statement: [{
-      Effect: "Allow", Action: "sts:AssumeRole",
-      Resource: "arn:aws:iam::\($acct):role/agentkeys-data-role"
-    }]
-  }')"
-```
-
-The daemon user can do exactly one thing: assume `agentkeys-data-role`. Any S3/SES action goes through the role's permissions, never the user's.
-
-### 3.2 `agentkeys-data-role`
-
-The role's trust policy starts with the **static-IAM-user** variant (Stage 6). [§4.2](#42-replace-the-roles-trust-policy-federated-variant) swaps it for the OIDC-federated variant once the broker is publicly reachable.
-
-```bash
-aws iam create-role --role-name agentkeys-data-role \
-  --assume-role-policy-document "$(jq -n --arg acct "$ACCOUNT_ID" '{
-    Version: "2012-10-17",
-    Statement: [{
-      Effect: "Allow",
-      Principal: {AWS: "arn:aws:iam::\($acct):user/agentkeys-daemon"},
-      Action: "sts:AssumeRole"
-    }]
-  }')"
-
-aws iam put-role-policy --role-name agentkeys-data-role \
-  --policy-name agentkeys-data-role-inline \
-  --policy-document "$(jq -n \
-    --arg bucket "$BUCKET" --arg region "$REGION" \
-    --arg acct "$ACCOUNT_ID" --arg domain "$DOMAIN" \
-    '{
-      Version: "2012-10-17",
-      Statement: [
-        {Effect:"Allow", Action:"s3:ListBucket", Resource:"arn:aws:s3:::\($bucket)"},
-        {Effect:"Allow", Action:"s3:GetObject",  Resource:"arn:aws:s3:::\($bucket)/*"},
-        {Effect:"Allow", Action:"ses:SendRawEmail", Resource:"arn:aws:ses:\($region):\($acct):identity/\($domain)"}
-      ]
-    }')"
-
-export ROLE_ARN=$(aws iam get-role --role-name agentkeys-data-role --query 'Role.Arn' --output text)
-echo "ROLE_ARN=$ROLE_ARN"
-```
-
-### 3.3 `agentkeys-admin`, `agentkeys-broker` (already provisioned)
-
-If you've come this far, `agentkeys-admin` exists (you're using it now). `agentkeys-broker` is whatever IAM user you SSH into the broker EC2 with via EC2 Instance Connect — its perms are out of scope here (`ec2-instance-connect:SendSSHPublicKey` on the host's instance ID is sufficient).
-
-### 3.4 `agentkeys-broker-host` instance profile (optional, EC2-only)
-
-If the broker runs on EC2, attach this so the daemon never holds a static key. The host's runtime credentials come from IMDS.
-
-```bash
-ROLE_NAME=agentkeys-broker-host
-
-aws iam create-role --role-name $ROLE_NAME \
-  --assume-role-policy-document "$(jq -n '{
-    Version: "2012-10-17",
-    Statement: [{Effect:"Allow", Principal:{Service:"ec2.amazonaws.com"}, Action:"sts:AssumeRole"}]
-  }')"
-
-aws iam put-role-policy --role-name $ROLE_NAME --policy-name BrokerAssumeData \
-  --policy-document "$(jq -n --arg acct "$ACCOUNT_ID" '{
-    Version: "2012-10-17",
-    Statement: [{Effect:"Allow", Action:"sts:AssumeRole",
-                 Resource:"arn:aws:iam::\($acct):role/agentkeys-data-role"}]
+    Version:"2012-10-17",
+    Statement:[{Effect:"Allow", Action:"sts:AssumeRole",
+                Resource:"arn:aws:iam::\($acct):role/agentkeys-data-role"}]
   }')"
-
-aws iam create-instance-profile --instance-profile-name $ROLE_NAME
-aws iam add-role-to-instance-profile --instance-profile-name $ROLE_NAME --role-name $ROLE_NAME
-aws ec2 associate-iam-instance-profile --region "$REGION" \
-  --instance-id <broker-host-instance-id> \
-  --iam-instance-profile Name=$ROLE_NAME
 ```
 
-### 3.4a `ses:SendEmail` grant on the broker's runtime role (Pass 2 prereq)
-
-The broker calls SES v2 `SendEmail` with its **own** runtime credentials
-(instance profile), NOT via the assumed `agentkeys-data-role`. Without
-`ses:SendEmail` on the broker's role the operator hits:
+`agentkeys-admin` and `agentkeys-broker` are provisioned once and should already exist (see CLAUDE.md "AWS local-profile ↔ remote-IAM mapping"); confirm with `aws iam list-users`.
 
-```
-broker rejected /v1/auth/email/request: status=502 body=
-{"error":"backend_unreachable","message":"… ses SendEmail:
- unhandled error (AccessDeniedException)"}
-```
+### 3.2 The three data roles
 
-The IAM action is `ses:SendEmail` (sesv2) — NOT `ses:SendRawEmail` (v1
-only; different code path the broker doesn't use).
-
-**Step 1: discover the actual role name attached to your broker host.**
-The canonical name is `agentkeys-broker-host` (created by §3.4 above).
-The discovery command below stays as-is so the runbook is robust to
-operators who landed on a non-canonical name during early provisioning
-(historically: `S3-full-access`, fully retired 2026-05-12 via the role
-rename in [PR #75 follow-up](#)). Find it:
+Per arch.md §17.2 (per-data-class isolation): separate roles for credentials + memory + email. Same trust shape, distinct inline policies and PrincipalTag scoping. Provision via the per-data-class helpers (idempotent):
 
 ```bash
-# REQUIRED: admin profile + operator env loaded.
-awsp agentkeys-admin
-set -a; source scripts/operator-workstation.env; set +a
-
-# CRITICAL: pass --region "$REGION". The agentkeys-admin profile
-# defaults to us-west-2, but the broker EC2 lives in us-east-1 (from
-# operator-workstation.env). Without --region, describe-instances
-# searches us-west-2, finds nothing, returns empty silently (no error),
-# and the downstream put-role-policy silently runs with --role-name "".
-# See CLAUDE.md → AWS local-profile ↔ remote-IAM mapping.
-INSTANCE_PROFILE_ARN=$(aws ec2 describe-instances \
-  --region "$REGION" \
-  --filters "Name=ip-address,Values=$EIP" \
-  --query 'Reservations[].Instances[].IamInstanceProfile.Arn' \
-  --output text)
-
-if [[ -z "$INSTANCE_PROFILE_ARN" || "$INSTANCE_PROFILE_ARN" == "None" ]]; then
-  echo "ABORT: no EC2 instance with EIP=$EIP found in region $REGION." >&2
-  echo "Caller: $(aws sts get-caller-identity --query Arn --output text)" >&2
-  unset ROLE
-else
-  ROLE=$(aws iam get-instance-profile \
-    --instance-profile-name "${INSTANCE_PROFILE_ARN##*/}" \
-    --query 'InstanceProfile.Roles[0].RoleName' --output text)
-  echo "broker runtime role: $ROLE"
-fi
-```
-
-**Step 2: grant `ses:SendEmail` + `ses:GetEmailIdentity` (least-privilege).**
+bash scripts/provision-vault-bucket.sh        # agentkeys-vault-${ACCOUNT_ID}
+bash scripts/provision-vault-role.sh          # agentkeys-vault-role
+bash scripts/apply-vault-bucket-policy.sh     # v3 split-statement PrincipalTag policy
 
-The broker calls `ses:GetEmailIdentity` at startup via `verify_sender_ready`
-to confirm the sender is verified, and `ses:SendEmail` per request.
-Both grants are scoped to the verified domain identity (and any
-per-address subset) — nothing wider.
+bash scripts/provision-memory-bucket.sh
+bash scripts/provision-memory-role.sh
+bash scripts/apply-memory-bucket-policy.sh
 
-```bash
-aws iam put-role-policy --role-name "$ROLE" \
-  --policy-name BrokerSendEmail \
-  --policy-document "$(jq -n \
-    --arg region "$REGION" --arg acct "$ACCOUNT_ID" --arg domain "$MAIL_DOMAIN" '{
-    Version: "2012-10-17",
-    Statement: [{
-      Effect: "Allow",
-      Action: ["ses:SendEmail", "ses:GetEmailIdentity"],
-      Resource: [
-        "arn:aws:ses:\($region):\($acct):identity/\($domain)",
-        "arn:aws:ses:\($region):\($acct):identity/*@\($domain)"
-      ]
-    }]
-  }')"
+bash scripts/cleanup-mail-bucket-policy.sh    # restore email-only grants on $BUCKET
 ```
 
-No broker restart needed — sesv2 picks up creds per-call. Verify:
-
-```bash
-aws iam get-role-policy --role-name "$ROLE" --policy-name BrokerSendEmail \
-  --query 'PolicyDocument.Statement[*].Action'
-# → [["ses:SendEmail", "ses:GetEmailIdentity"]]
-```
-
-**Step 3 (security audit): strip any over-broad legacy attached policies.**
-
-Some legacy deploys ship with `AmazonS3FullAccess` (or similar wide
-permissions) attached to the broker's instance role from initial
-provisioning. The broker process at runtime ONLY uses `aws-sdk-sts`
-(STS GetCallerIdentity startup probe) + `aws-sdk-sesv2` (this section's
-grants) — it never accesses S3 with its own creds. Per-user S3 access
-is via JWT-assumed `agentkeys-data-role` (§3.2), NOT the broker's
-runtime role.
-
-A broker compromise with `AmazonS3FullAccess` would expose every
-inbound email in the SES bucket (verification tokens, magic links,
-user-data buckets if any). Strip it:
-
-```bash
-# List currently attached policies on the broker's role:
-aws iam list-attached-role-policies --role-name "$ROLE"
-
-# Detach AmazonS3FullAccess if present:
-aws iam detach-role-policy --role-name "$ROLE" \
-  --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
-
-# Verify only BrokerSendEmail (inline, this section) remains:
-aws iam list-role-policies --role-name "$ROLE"        # → ["BrokerSendEmail"]
-aws iam list-attached-role-policies --role-name "$ROLE" # → []
-```
+The data-role trust shape is shown in [§4.3](#43-trust-policy) below — it's the same template for all three roles. The inline grants differ per role (vault → credentials prefix; memory → memory prefix; data-role → mail prefix).
 
-### 3.5 S3 bucket policy
+### 3.3 SES sender grant (email-link auth prereq)
 
-Now that `agentkeys-data-role` exists, attach the bucket policy. The static-IAM-user variant: SES writes inbound, role reads everything.
+The broker sends mail with its own runtime credentials, so its runtime role needs `ses:SendEmail` on the verified sender identity for email-link auth. Add this statement to that role's inline policy:
 
-```bash
-aws s3api put-bucket-policy --region "$REGION" --bucket "$BUCKET" \
-  --policy "$(jq -n --arg bucket "$BUCKET" --arg acct "$ACCOUNT_ID" '{
-    Version: "2012-10-17",
-    Statement: [
-      {
-        Sid: "AllowSESWriteInbound", Effect: "Allow",
-        Principal: {Service: "ses.amazonaws.com"},
-        Action: "s3:PutObject",
-        Resource: "arn:aws:s3:::\($bucket)/*",
-        Condition: {StringEquals: {"aws:Referer": $acct}}
-      },
-      {
-        Sid: "AllowDaemonRead", Effect: "Allow",
-        Principal: {AWS: "arn:aws:iam::\($acct):role/agentkeys-data-role"},
-        Action: ["s3:GetObject", "s3:ListBucket"],
-        Resource: ["arn:aws:s3:::\($bucket)", "arn:aws:s3:::\($bucket)/*"]
-      }
-    ]
-  }')"
+```json
+{
+  "Effect": "Allow",
+  "Action": ["ses:SendEmail", "ses:SendRawEmail", "ses:GetEmailIdentity"],
+  "Resource": [
+    "arn:aws:ses:${REGION}:${ACCOUNT_ID}:identity/${BROKER_EMAIL_FROM_ADDRESS}",
+    "arn:aws:ses:${REGION}:${ACCOUNT_ID}:configuration-set/*"
+  ]
+}
 ```
 
-The federated variant (PrincipalTag-scoped) lands in [§4.3](#43-upgrade-bucket-policy-to-principaltag-scoped).
-
----
+The broker's `verify_sender_ready` precheck calls `ses:GetEmailIdentity` at boot and refuses to start if the identity isn't both verified AND grantable. Without this grant, the failure surfaces as a cryptic `AccessDenied: ses:SendEmail` at the magic-link send step.
 
 ## 4. OIDC federation (Stage 7)
 
-Replaces the `agentkeys-daemon → AssumeRole` path in §3.2 with `OIDC-broker-JWT → AssumeRoleWithWebIdentity`. The benefit: per-user isolation enforced **inside AWS** (via PrincipalTag on the assumed session), not just by the daemon's app code.
+The broker mints OIDC JWTs that AWS STS validates via the broker's public JWKS endpoint. Three one-shot steps per account.
 
 ### 4.1 Prereqs
 
-- §1–§3 done.
-- Broker reachable at `https://$BROKER_HOST` over public TLS (see [§5](#5-ec2-broker-host-optional) for the EC2 wiring + `scripts/setup-broker-host.sh` for the host bootstrap).
-- The broker's discovery doc agrees with `$BROKER_HOST` byte-for-byte:
-  ```bash
-  export OIDC_ISSUER="https://$BROKER_HOST"
-  curl -sS --fail-with-body "$OIDC_ISSUER/.well-known/openid-configuration" | jq -e ".issuer == \"$OIDC_ISSUER\""
-  # → true
-  ```
-  If `false`, fix the broker's `BROKER_OIDC_ISSUER` env var before continuing — AWS validates the registered URL against the JWT `iss` claim byte-for-byte (no scheme, trailing slash, or hostname-only forms allowed):
-  ```bash
-  sudo sed -i \
-    "s|^Environment=BROKER_OIDC_ISSUER=.*|Environment=BROKER_OIDC_ISSUER=$OIDC_ISSUER|" \
-    /etc/systemd/system/agentkeys-broker.service
-  sudo systemctl daemon-reload && sudo systemctl restart agentkeys-broker
-  ```
+- Broker reachable at `https://${BROKER_HOST}` over public TLS (`setup-broker-host.sh` provisions this with certbot).
+- `https://${BROKER_HOST}/.well-known/openid-configuration` returns 200 with the expected `issuer` + `jwks_uri`.
+- `https://${BROKER_HOST}/.well-known/jwks.json` returns at least one ES256 key.
 
 ### 4.2 Register the OIDC provider
 
-Pre-check for stale state from earlier bring-ups:
-
 ```bash
-aws iam list-open-id-connect-providers
-```
-
-- Empty list → fresh slate; proceed.
-- ARN ends in `$BROKER_HOST` → already registered; skip the create, jump to the trust-policy update.
-- ARN ends in a different host → delete, then register the correct one:
-  ```bash
-  aws iam delete-open-id-connect-provider \
-    --open-id-connect-provider-arn arn:aws:iam::${ACCOUNT_ID}:oidc-provider/<stale-host>
-  ```
-
-Register:
+thumb=$(echo | openssl s_client -servername "$BROKER_HOST" \
+                                 -connect "${BROKER_HOST}:443" 2>/dev/null \
+          | openssl x509 -fingerprint -sha1 -noout \
+          | awk -F'=' '{print $2}' | tr -d ':' | tr 'A-Z' 'a-z')
 
-```bash
 aws iam create-open-id-connect-provider \
-  --url "$OIDC_ISSUER" \
-  --client-id-list sts.amazonaws.com \
-  --thumbprint-list ''
-export OIDC_PROVIDER_ARN="arn:aws:iam::${ACCOUNT_ID}:oidc-provider/$BROKER_HOST"
-
-aws iam get-open-id-connect-provider \
-  --open-id-connect-provider-arn "$OIDC_PROVIDER_ARN" \
-  --query '{Url: Url, ClientIDList: ClientIDList}'
-# → {"Url": "https://broker.litentry.org", "ClientIDList": ["sts.amazonaws.com"]}
-```
-
-AWS auto-derives the cert thumbprint from the Let's Encrypt chain. The thumbprint stays valid across cert renewals because LE uses a stable intermediate CA.
-
-### 4.3 Replace the role's trust policy (federated variant)
-
-Principal flips from `agentkeys-daemon` to the OIDC provider; the `sts:TagSession` + `aws:RequestTag/agentkeys_user_wallet` condition is what cloud-enforces per-user isolation in [§4.4](#44-upgrade-bucket-policy-to-principaltag-scoped).
-
-```bash
-aws iam update-assume-role-policy --role-name agentkeys-data-role \
-  --policy-document "$(jq -n \
-    --arg provider "$OIDC_PROVIDER_ARN" \
-    --arg aud_key "${BROKER_HOST}:aud" \
-    '{
-      Version: "2012-10-17",
-      Statement: [{
-        Effect: "Allow",
-        Principal: {Federated: $provider},
-        Action: ["sts:AssumeRoleWithWebIdentity", "sts:TagSession"],
-        Condition: {
-          StringEquals: {($aud_key): "sts.amazonaws.com"},
-          Null: {"aws:RequestTag/agentkeys_user_wallet": "false"}
-        }
-      }]
-    }')"
+  --url "https://${BROKER_HOST}" \
+  --client-id-list "sts.amazonaws.com" \
+  --thumbprint-list "$thumb"
 ```
 
-`Null: "false"` enforces tag presence ("the key MUST exist"). Do **not** use `StringNotEquals: {"aws:RequestTag/agentkeys_user_wallet": ""}` — AWS evaluates negated string operators on missing context keys as TRUE ("the missing key is not equal to anything"), so a JWT carrying no AWS tags claim would silently bypass the check. The `Null` operator rejects sessions where the tag isn't set at all, which is the only enforcement the trust policy can give you.
-
-### 4.4 Upgrade bucket policy to PrincipalTag-scoped
-
-Replaces `AllowDaemonRead` from §3.5. The cloud now enforces "the assumed session can only touch the prefix matching its PrincipalTag" — even if app code has a bug.
-
-The daemon's read perms split into two statements because `s3:prefix` is a request-time condition that **only applies to `s3:ListBucket`** (the prefix filter on listings) — `s3:GetObject` doesn't carry a prefix parameter, so combining the two actions under one `s3:prefix` condition triggers `MalformedPolicy: Conditions do not apply to combination of actions and resources in statement`. For `GetObject` the resource ARN itself enforces the prefix via `${aws:PrincipalTag/...}` expansion.
-
-```bash
-aws s3api put-bucket-policy --region "$REGION" --bucket "$BUCKET" \
-  --policy "$(jq -n --arg bucket "$BUCKET" --arg acct "$ACCOUNT_ID" '{
-    Version: "2012-10-17",
-    Statement: [
-      {
-        Sid: "AllowSESWriteInbound", Effect: "Allow",
-        Principal: {Service: "ses.amazonaws.com"},
-        Action: "s3:PutObject",
-        Resource: "arn:aws:s3:::\($bucket)/*",
-        Condition: {StringEquals: {"aws:Referer": $acct}}
-      },
-      {
-        Sid: "AllowDaemonListOwnPrefix", Effect: "Allow",
-        Principal: {AWS: "arn:aws:iam::\($acct):role/agentkeys-data-role"},
-        Action: "s3:ListBucket",
-        Resource: "arn:aws:s3:::\($bucket)",
-        Condition: {
-          StringLike: {"s3:prefix": "bots/${aws:PrincipalTag/agentkeys_user_wallet}/*"}
-        }
-      },
-      {
-        Sid: "AllowDaemonGetOwnObjects", Effect: "Allow",
-        Principal: {AWS: "arn:aws:iam::\($acct):role/agentkeys-data-role"},
-        Action: "s3:GetObject",
-        Resource: "arn:aws:s3:::\($bucket)/bots/${aws:PrincipalTag/agentkeys_user_wallet}/*"
-      },
-      {
-        Sid: "AllowDaemonPutOwnCredentials", Effect: "Allow",
-        Principal: {AWS: "arn:aws:iam::\($acct):role/agentkeys-data-role"},
-        Action: ["s3:PutObject", "s3:DeleteObject"],
-        Resource: "arn:aws:s3:::\($bucket)/bots/${aws:PrincipalTag/agentkeys_user_wallet}/credentials/*"
-      }
-    ]
-  }')"
-```
-
-**Issue #85 — credentials-prefix write grant.** The fourth statement (`AllowDaemonPutOwnCredentials`) is what lets `agentkeys provision <service>` PUT the AES-256-GCM-sealed credential blob to `s3://$BUCKET/bots/<wallet>/credentials/<service>.enc`. Scope is intentionally tight: only the `credentials/` sub-prefix gets write — every other `bots/<wallet>/*` sub-prefix (inbox, sent, audit, …) stays read-only from the OIDC-assumed session. The plaintext never leaves the operator workstation: AES-256-GCM seal happens before PUT, KEK is derived client-side via the signer's `/dev/sign-message`. PrincipalTag scoping is the cloud-enforced floor; client-side encryption is the second line of defense in case the bucket-policy is misconfigured.
-
-**`bots/` is the per-actor data namespace** — sibling to SES's
-`inbound/`, and to future system prefixes like `audit/`, `dkim/`,
-`config/`. Keeping every actor's data under a single parent prefix
-lets lifecycle rules, encryption defaults, replication, and ops audits
-scope cleanly to "user data" without sweeping in system prefixes.
-Matches arch.md §6 (`bots/A/file` in the runtime sequence diagram).
-Both the policy resource ARN (`bucket/bots/${tag}/*`) and the
-`s3:prefix` condition (`bots/${tag}/*`) carry the `bots/` parent —
-omit it on either and the other half of the policy denies even legit
-reads.
-
-`StringLike "bots/${tag}/*"` (not `StringEquals "bots/${tag}/"`) lets the daemon list sub-prefixes like `bots/<wallet>/inbox/` and `bots/<wallet>/sent/2026-05/`, not just the exact root `bots/<wallet>/`. Matches the shape in [`docs/spec/ses-email-architecture.md` §10.4](spec/ses-email-architecture.md) and [`wiki/tag-based-access`](../wiki/tag-based-access.md).
-
-### 4.4.1 Strip the §3 broad-bucket grant from the role's inline policy
-
-**Critical for §4.5 to actually demonstrate isolation.** §3.2's `agentkeys-data-role-inline` grants the role broad `s3:GetObject` + `s3:ListBucket` on the entire bucket — necessary in the static-IAM path (no PrincipalTag to scope on) but **fatal** here: IAM evaluates as union-of-allows, so this identity-based grant overrides §4.4's bucket-policy isolation. Without this step, §4.5's 4b test will silently succeed instead of correctly returning `AccessDenied` — federation appears to work while the cloud is enforcing nothing.
-
-Inspect what's currently attached:
+**AWS validates the issuer URL byte-for-byte** against the JWT `iss` claim. Once the OIDC provider is registered, the URL is effectively immutable for the life of the deployment — switching means new provider ARN + new trust policy + new federated grants.
 
-```bash
-aws iam get-role-policy --profile agentkeys-admin \
-  --role-name agentkeys-data-role \
-  --policy-name agentkeys-data-role-inline \
-  --query 'PolicyDocument'
-```
+### 4.3 Trust policy
 
-Re-apply, omitting the S3 statement. Keep any non-S3 statements (the daemon needs the `ses:SendRawEmail` grant for outbound mail in §3):
+Apply to each of the three data roles. Use `$ROLE` ∈ `{agentkeys-data-role, agentkeys-vault-role, agentkeys-memory-role}`.
 
 ```bash
-aws iam put-role-policy --profile agentkeys-admin \
-  --role-name agentkeys-data-role \
-  --policy-name agentkeys-data-role-inline \
-  --policy-document "$(jq -n --arg ses_domain "${MAIL_DOMAIN:-bots.litentry.org}" '{
-    Version: "2012-10-17",
-    Statement: [{
-      Effect: "Allow",
-      Action: "ses:SendRawEmail",
-      Resource: "*",
-      Condition: {
-        StringLike: {"ses:FromAddress": "*@\($ses_domain)"}
-      }
+aws iam update-assume-role-policy --role-name "$ROLE" --policy-document "$(jq -n \
+  --arg acct "$ACCOUNT_ID" --arg host "$BROKER_HOST" '{
+    Version:"2012-10-17",
+    Statement:[{
+      Effect:"Allow",
+      Principal:{Federated:"arn:aws:iam::\($acct):oidc-provider/\($host)"},
+      Action:["sts:AssumeRoleWithWebIdentity","sts:TagSession"],
+      Condition:{StringEquals:{"\($host):aud":"sts.amazonaws.com"}}
     }]
   }')"
 ```
 
-If your inline policy had additional non-S3 statements, include them here too.
+### 4.4 PrincipalTag-scoped bucket policy
 
-Verify the S3 actions are gone:
+Per CLAUDE.md "Per-actor + per-data-class isolation invariants": every S3 read/write is scoped to `bots/${aws:PrincipalTag/agentkeys_actor_omni}/{credentials,memory}/*`. The split-statement v3 bucket policy is applied by [`scripts/apply-{vault,memory}-bucket-policy.sh`](../scripts/) — those scripts ARE the source of truth for the policy shape.
 
-```bash
-aws iam get-role-policy --profile agentkeys-admin \
-  --role-name agentkeys-data-role \
-  --policy-name agentkeys-data-role-inline \
-  --query 'PolicyDocument.Statement[*].Action'
-# → [["ses:SendRawEmail"]]
-```
-
-If the daemon doesn't need any non-S3 grants, delete the inline policy entirely instead:
+After §4.3 + §4.4: strip the §3 broad-bucket inline grant from the role's identity policy. Within a single account, S3 allows a request when either the identity policy or the bucket policy allows it, so a leftover broad grant silently bypasses the PrincipalTag scoping. The `cleanup-mail-bucket-policy.sh` helper does this for the mail bucket; do it by hand for any other inline policy you've left:
 
 ```bash
-aws iam delete-role-policy --profile agentkeys-admin \
-  --role-name agentkeys-data-role \
-  --policy-name agentkeys-data-role-inline
+aws iam delete-role-policy --role-name "$ROLE" --policy-name agentkeys-data-role-s3-broad
 ```
 
 ### 4.5 End-to-end proof
 
-Mint a JWT, assume the role with it, prove that wallet A can read its own prefix but **not** wallet B's. The minting half must run **on the broker host** (the prod broker validates session bearers against its *own* local backend on `127.0.0.1:8090`, not against any backend reachable from your operator workstation). The AWS-side half runs on your operator workstation where your admin AWS profile lives.
-
-**Env-var scope** — `$ACCOUNT_ID`, `$BROKER_HOST`, `$OIDC_ISSUER`, `$OIDC_PROVIDER_ARN`, `$BUCKET` only exist on your operator workstation (set up in [§0](#0-identities--mental-model)). The broker host has none of them. Part A below references `$BROKER_HOST` once — in the SSH command itself, where it's expanded by your local shell *before* SSH connects — and otherwise uses **only** literal `127.0.0.1` URLs inside the SSH session. Don't try to re-export the §0 vars on the broker host; none of them are needed there.
-
-#### Part A — on the broker host (mint the JWT)
+Run [`harness/v2-stage3-demo.sh`](../harness/v2-stage3-demo.sh) — it mints a session JWT → OIDC JWT → STS creds, then proves both POSITIVE (own prefix) and NEGATIVE (cross-actor prefix → AccessDenied) writes for both data classes plus the cross-role isolation matrix. Walks the full §17.2 isolation table from CLAUDE.md.
 
-```bash
-# === Run on your operator workstation ===
-# ($BROKER_HOST is expanded locally before ssh runs — the broker host
-# never sees this var. If $BROKER_HOST isn't set, replace with the
-# literal hostname, e.g. broker.litentry.org.)
-ssh agentkey@$BROKER_HOST    # or via: aws ec2-instance-connect ssh --instance-id <id>
-
-# === The rest runs inside the SSH session, on the broker host ===
-# No workstation env vars are visible here. Both URLs are literals.
-SESSION=$(curl -sS --fail-with-body -X POST http://127.0.0.1:8090/session/create \
-  -H 'content-type: application/json' \
-  -d '{"auth_token":"federation-proof"}' | jq -r .session)
-
-JWT=$(curl -sS --fail-with-body -X POST http://127.0.0.1:8091/v1/mint-oidc-jwt \
-  -H "Authorization: Bearer $SESSION" | jq -r .jwt)
-
-echo "$JWT"
-# Copy the entire string. JWT TTL is ~5 min; copy and proceed promptly.
-exit
-```
-
-#### Part B — on your operator workstation (assume role + verify isolation)
-
-All env vars below (`$ACCOUNT_ID`, `$BUCKET`) are workstation-side from §0. Run after `exit`-ing the SSH session.
-
-```bash
-JWT="<paste the JWT from Part A>"
-
-# Decode the wallet from the payload. JWT segments are base64url-encoded
-# (RFC 7515) — jq's @base64d is strict base64, so we url→std + add padding
-# before decoding. Skipping this works on most JWTs by accident; when the
-# payload base64 happens to contain - or _, it fails with a "Malformed BOM"
-# error.
-WALLET=$(jq -R 'split(".") | .[1] | gsub("-";"+") | gsub("_";"/") |
-  . + ("=" * ((4 - length % 4) % 4)) | @base64d | fromjson | .agentkeys_user_wallet' <<<"$JWT" -r)
-echo "WALLET=$WALLET"
-
-CREDS=$(aws sts assume-role-with-web-identity \
-  --role-arn "arn:aws:iam::${ACCOUNT_ID}:role/agentkeys-data-role" \
-  --role-session-name "fed-proof-$(date +%s)" \
-  --web-identity-token "$JWT")
-export AWS_ACCESS_KEY_ID=$(printf '%s' "$CREDS" | jq -r .Credentials.AccessKeyId)
-export AWS_SECRET_ACCESS_KEY=$(printf '%s' "$CREDS" | jq -r .Credentials.SecretAccessKey)
-export AWS_SESSION_TOKEN=$(printf '%s' "$CREDS" | jq -r .Credentials.SessionToken)
-
-# Confirm you're the assumed role, not your admin profile
-aws sts get-caller-identity
-# → Arn: arn:aws:sts::...:assumed-role/agentkeys-data-role/fed-proof-...
-
-# 4a. Own prefix — should succeed (empty list is fine, no AccessDenied)
-aws s3api list-objects-v2 --bucket "$BUCKET" --prefix "$WALLET/"
-
-# 4b. KEY MOMENT — someone else's prefix MUST AccessDenied
-aws s3api list-objects-v2 --bucket "$BUCKET" --prefix "0xdeadbeef/"
-# → AccessDenied
-```
+## 5. Broker host: `setup-broker-host.sh`
 
-Step 4b is the property the static-IAM path (§3) cannot prove: cloud-enforced isolation, zero app-side trust required.
+§1–§4 set up identifiers. This step stands up the actual processes — broker + mock-server + signer + 4 service workers — on the EC2 host (or any Linux box with public-internet egress + the broker's hostname).
 
-#### Diagnosing intermediate states
+### 5.1 Prereqs
 
-If both 4a and 4b succeed, §4.4.1 wasn't applied — the inline-policy `s3:*` grant is still masking the bucket policy. Re-run §4.4.1 and verify `Statement[*].Action` returns only `ses:SendRawEmail`.
-
-If both 4a and 4b deny (including 4a, your *own* prefix), the broker's JWT isn't carrying the `https://aws.amazon.com/tags` claim, so STS sets no PrincipalTag on the assumed session, so `${aws:PrincipalTag/agentkeys_user_wallet}` in the bucket policy expands to empty and matches nothing. Decode the JWT to confirm:
-
-```bash
-jq -R 'split(".") | .[1] | gsub("-";"+") | gsub("_";"/") |
-  . + ("=" * ((4 - length % 4) % 4)) | @base64d | fromjson' <<<"$JWT"
-```
-
-Look for a top-level `https://aws.amazon.com/tags` key with `principal_tags.agentkeys_user_wallet` populated. If it's missing, the broker version doesn't yet emit the AWS tags claim and needs to be redeployed.
-
-### 4.6 (Future) TEE-derived signer swap
-
-The on-disk ES256 keypair shipped today is a complete v0.1 signer. When [`heima-gaps §3`](./spec/heima-gaps-vs-desired-architecture.md) closes, swap [`crates/agentkeys-broker-server/src/oidc.rs::OidcKeypair::load_or_generate`](../crates/agentkeys-broker-server/src/oidc.rs) for a TEE oracle call. JWKS, JWT shape, STS exchange, and bucket policy stay identical — only the signing backend changes.
-
----
-
-## 5. EC2 broker host (optional)
-
-If the broker runs on EC2 (the recommended path for AWS-native deployments), wire DNS + EIP + security group before running [`scripts/setup-broker-host.sh`](../scripts/setup-broker-host.sh) on the box.
-
-### 5.1 Allocate + attach an Elastic IP
-
-```bash
-EIP_ALLOC=$(aws ec2 allocate-address --domain vpc --region "$REGION" --query AllocationId --output text)
-aws ec2 associate-address --region "$REGION" \
-  --instance-id <broker-instance-id> --allocation-id "$EIP_ALLOC"
-EIP=$(aws ec2 describe-addresses --region "$REGION" \
-  --allocation-ids "$EIP_ALLOC" --query 'Addresses[0].PublicIp' --output text)
-echo "EIP=$EIP"
-```
+- Fresh Linux host with sudo, systemd, public-internet egress, ports 80 + 443 open inbound (for certbot + nginx).
+- DNS A records for `${BROKER_HOST}` + `signer.${ZONE}` + `audit.${ZONE}` + `email.${ZONE}` + `cred.${ZONE}` + `memory.${ZONE}` all pointing at the host's public IP.
+- AWS credentials in `/etc/agentkeys/broker.env` (the script writes the file template; operator pastes the `agentkeys-daemon` access key from §3.1).
 
-### 5.2 Wire the A record
+### 5.2 Run
 
 ```bash
-aws route53 change-resource-record-sets --hosted-zone-id "$PARENT_ZONE_ID" \
-  --change-batch "$(jq -n --arg name "$BROKER_HOST." --arg ip "$EIP" '{
-    Changes: [{
-      Action: "UPSERT",
-      ResourceRecordSet: {Name: $name, Type: "A", TTL: 300, ResourceRecords: [{Value: $ip}]}
-    }]
-  }')"
+# Bootstrap a fresh host:
+sudo bash scripts/setup-broker-host.sh \
+  --issuer-url "https://${BROKER_HOST}" \
+  --account-id "${ACCOUNT_ID}" \
+  --signer-host "signer.${ZONE}" \
+  --audit-host  "audit.${ZONE}" \
+  --email-host  "email.${ZONE}" \
+  --cred-host   "cred.${ZONE}" \
+  --memory-host "memory.${ZONE}" \
+  --yes
 
-# Verify (use DoH if your local resolver hijacks port 53)
-curl -s "https://cloudflare-dns.com/dns-query?name=$BROKER_HOST&type=A" \
-  -H 'accept: application/dns-json' | jq '.Answer[0].data'
+# After a `git pull`, re-run with only --yes; the script reads the earlier settings back from the existing systemd unit:
+sudo bash scripts/setup-broker-host.sh --yes
 ```
 
-### 5.3 Open security-group ports 80 + 443
-
-Let's Encrypt's HTTP-01 challenge needs port 80 open from anywhere; the broker serves on 443 afterward. SSH (22) should be admin-IP-only.
-
-```bash
-INSTANCE_ID=<broker-instance-id>
-SG=$(aws ec2 describe-instances --region "$REGION" --instance-ids "$INSTANCE_ID" \
-  --query 'Reservations[0].Instances[0].SecurityGroups[0].GroupId' --output text)
-
-aws ec2 authorize-security-group-ingress --region "$REGION" --group-id "$SG" \
-  --protocol tcp --port 443 --cidr 0.0.0.0/0
-aws ec2 authorize-security-group-ingress --region "$REGION" --group-id "$SG" \
-  --protocol tcp --port 80  --cidr 0.0.0.0/0
-```
+The script:
+- Builds `agentkeys-broker-server` (+ `auth-email-link` feature), `agentkeys-mock-server`, the 4 service workers, and the signer.
+- Creates the `agentkeys` system user + state dir `/var/lib/agentkeys/`.
+- Writes the dev_key_service master secret (one-shot at first boot, never rotated — rotation invalidates every previously-derived wallet).
+- Writes per-worker env files at `/etc/agentkeys/worker-{audit,email,creds,memory}.env`.
+- Writes systemd units for broker + signer + each worker, enables + starts.
+- Configures nginx vhosts for `${BROKER_HOST}` + `signer.${ZONE}` + 4 worker hosts (skip via `--without-nginx`).
+- Runs certbot for first-time TLS cert issuance (skip via `--without-certbot`).
+- Mints broker keypairs (oidc + session) under `/var/lib/agentkeys/keys/`.
 
-### 5.4 Bootstrap the host
+Auto-detects bootstrap vs upgrade by reading the existing systemd unit's `Environment=` lines. Pass `--ref <branch>` to opt into an in-script `git fetch + pull`.
 
-SSH in as `agentkeys-broker` (via EC2 Instance Connect: `aws ec2-instance-connect ssh --instance-id $INSTANCE_ID`) and run:
+### 5.3 Verify
 
 ```bash
-git clone https://github.com/litentry/agentKeys.git
-cd agentKeys
-sudo bash scripts/setup-broker-host.sh
-# Interactive walk-through; pick instance-profile credential mode
-# (assuming §3.4 attached agentkeys-broker-host).
+curl -sf "https://${BROKER_HOST}/healthz"                  # → HTTP 200 (curl -f exits non-zero on 4xx/5xx)
+curl -sf "https://${BROKER_HOST}/.well-known/openid-configuration" | jq .
+curl -sf "https://${BROKER_HOST}/.well-known/jwks.json"    | jq '.keys | length'
+curl -sf "https://audit.${ZONE}/healthz"                   # → HTTP 200 (repeat for the other worker hosts)
 ```
 
-The script writes systemd units, an HTTP-only nginx config, then prints the certbot command. After cert issuance, re-run the script — it detects the cert file and flips on the `:443` ssl block.
-
----
+For full E2E (broker + workers + chain + AWS), run the harness scripts — see [`docs/heima-setup.md`](heima-setup.md) for the chain side and [`docs/ci-setup.md`](ci-setup.md) for the automated path.
 
-## 6. Signer host
-
-| Concern | Today | Future |
-|---|---|---|
-| Process | `agentkeys-signer.service` (Rust, `agentkeys-mock-server --signer-only`, loopback `:8092`) | TEE worker (issue #74 step 2) |
-| Host | **Same EC2 box as the broker** — co-located behind the same nginx, provisioned by the same `setup-broker-host.sh` run | Separate machine (or enclave); only the A record + cert move |
-| Public hostname | `signer.<zone>` (e.g. `signer.litentry.org`) — exported as `SIGNER_HOST` / `AGENTKEYS_SIGNER_URL` in [`scripts/operator-workstation.env`](../scripts/operator-workstation.env) | `signer.<zone>` (unchanged) |
-| Endpoints | `/dev/derive-address`, `/dev/sign-message`, `/healthz` only — every request bearer-JWT-authed against the broker session pubkey ([`signer-protocol.md`](spec/signer-protocol.md)) | unchanged |
-| Master secret (K3) | `/etc/agentkeys/dev-key-service.env` (mode 0600, owner `agentkeys`) — auto-generated on first `setup-broker-host.sh` run, **never rotated** (rotation invalidates every previously-derived wallet) | TEE-sealed; same wire shape |
+## 6. Cleanup
 
-### 6.1 DNS A record
+Tear down the whole AgentKeys footprint in one account:
 
 ```bash
-# === ON OPERATOR WORKSTATION ===
-SIGNER_HOST="signer.${BROKER_HOST#*.}"
-
-# If $EIP isn't already set from §5.1, re-derive from AWS — NEVER from
-# `dig`. Local resolvers behind Cloudflare WARP / Zscaler / Tailscale /
-# corporate VPNs return RFC 2544 "TEST-NET-2" (198.18.0.0/15) for
-# proxied hostnames, which silently breaks Let's Encrypt validation.
-[ -z "$EIP" ] && EIP=$(aws ec2 describe-addresses --region "$REGION" \
-  --query 'Addresses[?AssociationId!=`null`].PublicIp' --output text)
-echo "EIP=$EIP"   # MUST be a routable public IP, not 198.18.x.x / 10.x.x.x / 100.64.x.x
-
-aws route53 change-resource-record-sets --hosted-zone-id "$PARENT_ZONE_ID" \
-  --change-batch "$(jq -n --arg name "${SIGNER_HOST}." --arg ip "$EIP" '{
-    Changes: [{Action:"UPSERT", ResourceRecordSet:{Name:$name, Type:"A", TTL:300, ResourceRecords:[{Value:$ip}]}}]
-  }')"
-
-# Verify via Cloudflare DoH (your local resolver will keep lying if proxied).
-until [ "$(curl -s "https://cloudflare-dns.com/dns-query?name=${SIGNER_HOST}&type=A" \
-            -H 'accept: application/dns-json' | jq -r '.Answer[0].data')" = "$EIP" ]; do
-  echo "waiting for Route 53 propagation (TTL 300s)…"; sleep 5
+# Empty, then delete, the buckets
+for b in "$BUCKET" "agentkeys-vault-${ACCOUNT_ID}" "agentkeys-memory-${ACCOUNT_ID}"; do
+  aws s3 rm "s3://$b" --recursive 2>/dev/null || true
+  aws s3api delete-bucket --bucket "$b" --region "$REGION" 2>/dev/null || true
 done
-echo "DNS ready: ${SIGNER_HOST} → ${EIP}"
-```
-
-### 6.2 TLS cert + nginx flip
-
-> **`$SIGNER_HOST` is laptop-only** (lives in `operator-workstation.env`).
-> On the broker host, derive it from the nginx vhost that `setup-broker-host.sh`
-> just wrote — the snippet below does it inline so the commands work in a
-> fresh broker shell with no env vars set.
-
-```bash
-# === ON BROKER HOST ===
-# 1. First pass writes the HTTP-only nginx vhost for signer.<zone>.
-sudo bash scripts/setup-broker-host.sh --yes
-
-# Sanity-check + read the hostname back out of the vhost.
-ls /etc/nginx/sites-enabled/agentkeys-signer
-SIGNER_HOST=$(awk '/server_name/ && /signer\./ {gsub(";",""); print $2}' \
-                /etc/nginx/sites-available/agentkeys-signer | head -1)
-echo "SIGNER_HOST=$SIGNER_HOST"
-
-# 2. Issue the LE cert. If the prompt only lists broker.<zone>, the
-# signer vhost wasn't written — re-pull + re-run step 1.
-sudo certbot --nginx -d "$SIGNER_HOST"
-
-# 3. Re-run to flip the signer vhost onto :443 ssl.
-sudo bash scripts/setup-broker-host.sh --yes
-```
-
-### 6.3 Verify
-
-```bash
-# === ON OPERATOR WORKSTATION ===
-curl -sS "https://$SIGNER_HOST/healthz"
-# ok
 
-# Defense-in-depth: signer vhost rejects everything except /dev/* + /healthz.
-curl -sS -o /dev/null -w '%{http_code}\n' "https://$SIGNER_HOST/session/create"
-# 404
-```
-
----
-
-## 7. Service workers (audit / email / cred / memory)
-
-| Concern | Today | Future |
-|---|---|---|
-| Processes | 4 systemd units: `agentkeys-worker-{audit,email,creds,memory}.service` on `127.0.0.1:{9092,9093,9094,9095}` | Each splits to its own EC2 / IAM principal |
-| Host | **Same EC2 box as the broker** — co-located behind the same nginx, provisioned by the same `setup-broker-host.sh` run | Separate machines (or enclaves); only the A records + certs move |
-| Public hostnames | `audit.<zone>` / `email.<zone>` / `cred.<zone>` / `memory.<zone>` — exported as `WORKER_*_HOST` / `AGENTKEYS_WORKER_*_URL` in [`scripts/operator-workstation.env`](../scripts/operator-workstation.env) | Same hostnames (unchanged) |
-| Endpoints | `audit` → `/v1/audit/*` + `/healthz` ; `email` → `/v1/email/*` + `/healthz` ; `cred` → `/v1/cred/*` + `/healthz` ; `memory` → `/v1/memory/*` + `/healthz` | Unchanged |
-| KEK material | `/etc/agentkeys/worker-{creds,memory}.env` (mode 0600, owner `agentkeys`) — auto-generated on first `setup-broker-host.sh` run, **never rotated** (rotation invalidates every previously-encrypted blob) | mTLS-derived KEK from the signer |
-
-### 7.1 DNS — 4 A records in one Route 53 batch
-
-```bash
-# === ON OPERATOR WORKSTATION ===
-awsp agentkeys-admin                           # account-owner profile (Route 53 + EC2 read)
-set -a; source ./scripts/operator-workstation.env; set +a
-
-# Single helper — derives EIP from AWS, validates it's not VPN-rewritten,
-# UPSERTs all 4 records atomically, waits for INSYNC + Cloudflare DoH
-# propagation, then prints the next-step certbot loop.
-bash scripts/dns-upsert-workers.sh
-
-# Override knobs:
-#   --eip 1.2.3.4               # use a known EIP instead of describe-addresses
-#   --zone-id Z…                # override default litentry.org zone
-#   --ttl 60                    # tighter TTL while iterating
-#   --dry-run                   # print the change-batch JSON, don't apply
-```
-
-The script is idempotent (UPSERT replaces if exists, creates if not). Re-running it is a no-op when the records already point at `$EIP`.
-
-### 7.2 TLS certs + nginx flip
-
-> The four worker `WORKER_*_HOST` variables are **laptop-only** (set in `operator-workstation.env`). On the broker host, derive them from the nginx vhosts that `setup-broker-host.sh` just wrote — the snippet below does it inline so commands work in a fresh broker shell with no env vars set.
+# Roles
+for r in agentkeys-data-role agentkeys-vault-role agentkeys-memory-role agentkeys-broker-host; do
+  for p in $(aws iam list-role-policies --role-name "$r" --query 'PolicyNames[]' --output text 2>/dev/null); do
+    aws iam delete-role-policy --role-name "$r" --policy-name "$p"
+  done
+  aws iam delete-role --role-name "$r" 2>/dev/null || true
+done
 
-```bash
-# === ON BROKER HOST ===
-# 1. First pass writes HTTP-only nginx vhosts for all 4 workers.
-sudo bash scripts/setup-broker-host.sh --yes
+# OIDC provider
+aws iam delete-open-id-connect-provider \
+  --open-id-connect-provider-arn "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${BROKER_HOST}"
 
-# Read the 4 hostnames back out of the just-written vhosts.
-AUDIT_HOST=$(awk '/server_name/ && /audit\./  {gsub(";",""); print $2}' /etc/nginx/sites-available/agentkeys-worker-audit  | head -1)
-EMAIL_HOST=$(awk '/server_name/ && /email\./  {gsub(";",""); print $2}' /etc/nginx/sites-available/agentkeys-worker-email  | head -1)
-CRED_HOST=$(awk  '/server_name/ && /cred\./   {gsub(";",""); print $2}' /etc/nginx/sites-available/agentkeys-worker-cred   | head -1)
-MEMORY_HOST=$(awk '/server_name/ && /memory\./ {gsub(";",""); print $2}' /etc/nginx/sites-available/agentkeys-worker-memory | head -1)
-echo "AUDIT=$AUDIT_HOST EMAIL=$EMAIL_HOST CRED=$CRED_HOST MEMORY=$MEMORY_HOST"
-
-# 2. Issue Let's Encrypt certs (webroot mode — does NOT touch nginx config).
-for h in "$AUDIT_HOST" "$EMAIL_HOST" "$CRED_HOST" "$MEMORY_HOST"; do
-  sudo certbot certonly --webroot -w /var/www/certbot -d "$h" \
-    --agree-tos -m ops@litentry.org --non-interactive
+# Daemon user
+for k in $(aws iam list-access-keys --user-name agentkeys-daemon --query 'AccessKeyMetadata[].AccessKeyId' --output text); do
+  aws iam delete-access-key --user-name agentkeys-daemon --access-key-id "$k"
 done
+aws iam delete-user-policy --user-name agentkeys-daemon --policy-name agentkeys-daemon-assume-role 2>/dev/null || true
+aws iam delete-user --user-name agentkeys-daemon
 
-# 3. Re-run to flip each vhost onto :443 ssl. Idempotent — re-runs without
-#    new certs are no-ops; re-runs after cert issuance flip A → B per host.
-sudo bash scripts/setup-broker-host.sh --yes
-```
+# SES + DNS
+aws ses set-active-receipt-rule-set --rule-set-name "" --region "$REGION" 2>/dev/null || true
+aws sesv2 delete-email-identity --email-identity "$MAIL_DOMAIN" --region "$REGION" 2>/dev/null || true
+# DNS records are operator-managed (Route 53 / your DNS provider) — delete by hand.
 
-### 7.3 Verify
-
-```bash
-# === ON OPERATOR WORKSTATION ===
-bash scripts/verify-workers.sh
-
-# Per-worker drilldown if any failed:
-curl -sS "https://${WORKER_AUDIT_HOST}/healthz"     # → ok
-curl -sS "https://${WORKER_EMAIL_HOST}/healthz"     # → ok
-curl -sS "https://${WORKER_CRED_HOST}/healthz"      # → JSON {"ok":true,...}
-curl -sS "https://${WORKER_MEMORY_HOST}/healthz"    # → JSON {"ok":true,...}
-
-# Defense-in-depth: each worker vhost only proxies its own /v1/<slug>/* surface.
-curl -sS -o /dev/null -w '%{http_code}\n' "https://${WORKER_AUDIT_HOST}/v1/cred/anything"
-# 404 (audit vhost won't proxy /v1/cred)
+# EC2 + EIP (manual via console or aws ec2 CLI)
 ```
 
----
-
-## 8. Cleanup
-
-```bash
-# OIDC federation (if §4 ran)
-aws iam delete-open-id-connect-provider \
-  --open-id-connect-provider-arn "$OIDC_PROVIDER_ARN" 2>/dev/null
-
-# IAM
-aws iam delete-role-policy --role-name agentkeys-data-role --policy-name agentkeys-data-role-inline
-aws iam delete-role        --role-name agentkeys-data-role
-for KEY in $(aws iam list-access-keys --user-name agentkeys-daemon --query 'AccessKeyMetadata[*].AccessKeyId' --output text); do
-  aws iam delete-access-key --user-name agentkeys-daemon --access-key-id "$KEY"
-done
-aws iam delete-user-policy --user-name agentkeys-daemon --policy-name agentkeys-daemon-assume-role
-aws iam delete-user        --user-name agentkeys-daemon
-
-# Optional: the broker-host instance profile
-aws iam remove-role-from-instance-profile --instance-profile-name agentkeys-broker-host --role-name agentkeys-broker-host 2>/dev/null
-aws iam delete-instance-profile --instance-profile-name agentkeys-broker-host 2>/dev/null
-aws iam delete-role-policy --role-name agentkeys-broker-host --policy-name BrokerAssumeData 2>/dev/null
-aws iam delete-role        --role-name agentkeys-broker-host 2>/dev/null
-
-# SES + S3
-aws ses set-active-receipt-rule-set --rule-set-name "" --region "$REGION"
-aws sesv2 delete-email-identity --region "$REGION" --email-identity "$DOMAIN"
-aws s3 rm "s3://$BUCKET" --recursive
-aws s3api delete-bucket --region "$REGION" --bucket "$BUCKET"
-
-# DNS records on the parent zone are NOT auto-deleted — you'll need to
-# remove the DKIM CNAMEs, MX, SPF, DMARC, and broker A record by hand
-# if you want a clean zone.
-```
+## Related
 
----
+- Chain bring-up: [`docs/heima-setup.md`](heima-setup.md)
+- CI activation: [`docs/ci-setup.md`](ci-setup.md)
+- Broker host script (single entry point): [`scripts/setup-broker-host.sh`](../scripts/setup-broker-host.sh)
+- Architecture: [`docs/spec/architecture.md`](spec/architecture.md) §17 (per-data-class buckets), §17.2 (per-bucket IAM role)
+- FAQ + troubleshooting: [`wiki/cloud-setup-faq.md`](../wiki/cloud-setup-faq.md)
 
 ## Follow-ups tracked elsewhere
 
-- **TEE-BYODKIM** — replace AWS-managed DKIM. Depends on [`heima-gaps §4`](./spec/heima-gaps-vs-desired-architecture.md).
-- **TEE-derived OIDC signer** — replace on-disk ES256. Depends on [`heima-gaps §3`](./spec/heima-gaps-vs-desired-architecture.md).
-- **Per-address S3 prefix routing** — currently all inbound lands in `inbound/`; per-`<wallet>/<address>/` prefix routing wants either a SES Lambda or subdomain receipt rules.
-- **GCP / Tencent recipes** — equivalent of §4 against GCP Workload Identity Federation and Tencent CAM. JWT/JWKS shape works cross-cloud unchanged; only the registration step differs.
+- Per-recipient routing Lambda hardening: [`TODOS.md`](../TODOS.md) "Disable broker's broad S3-full-access"
+- Tencent Cloud SimpleDM + COS port: tracked separately
+- TEE-held BYODKIM migration: [`docs/spec/heima-gaps-vs-desired-architecture.md`](spec/heima-gaps-vs-desired-architecture.md) §4
diff --git a/docs/heima-setup.md b/docs/heima-setup.md
new file mode 100644
index 0000000..7c537a7
--- /dev/null
+++ b/docs/heima-setup.md
@@ -0,0 +1,106 @@
+# Heima setup — AgentKeys
+
+**Audience:** the operator bringing AgentKeys up on a Heima chain (mainnet, Paseo, or local Anvil).
+**Scope:** one command that walks the 15-step chain bring-up end-to-end.
+**Companion:** [`docs/cloud-setup.md`](cloud-setup.md) for the AWS/broker side. Run cloud-setup first — Heima setup expects [`scripts/operator-workstation.env`](../scripts/operator-workstation.env) to already exist.
+**FAQ + troubleshooting:** [`wiki/heima-setup-faq.md`](../wiki/heima-setup-faq.md).
+
+## TL;DR
+
+```bash
+# Mainnet (default; AGENTKEYS_CHAIN=heima implicit)
+AWS_PROFILE=agentkeys-admin bash scripts/setup-heima.sh
+
+# Paseo testnet (no real HEI cost; Alice sudo funds the deployer)
+AWS_PROFILE=agentkeys-admin bash scripts/setup-heima.sh --chain heima-paseo
+
+# Local Anvil (fully ephemeral, instant finality, free)
+AWS_PROFILE=agentkeys-admin bash scripts/setup-heima.sh --chain anvil
+```
+
+[`scripts/setup-heima.sh`](../scripts/setup-heima.sh) is **the single idempotent entry point** for Heima bring-up. Re-running is safe: every step pre-checks chain state and short-circuits when the work is already a no-op. Per-step helpers (`scripts/heima-{bring-up,device-register,agent-create,scope-set,credential-audit,worker-smoke}.sh`) stay callable directly for surgical re-runs.
+
+## What runs, in order
+
+| # | Step | Idempotency check | Helper script |
+|---|------|-------------------|---------------|
+| 1 | Tool sanity-check (jq curl aws cast forge node npx python3 + `agentkeys` binary) | tool presence | — |
+| 2 | Source `scripts/operator-workstation.env` | file exists + `REGION` set | — |
+| 3 | Chain reachability + `eth_chainId` matches the profile's claim | catches "you said paseo but the RPC is mainnet" footguns | — |
+| 4 | Generate/reuse deployer keypair at `~/.agentkeys/${chain}-deployer.key` (0600) | file exists | (inline) |
+| 5 | Fund the deployer | balance ≥ floor | [`heima-fund-account.sh`](../scripts/heima-fund-account.sh) |
+| 6 | Deploy the 6 stage-1 contracts atomically (P256Verifier → K11Verifier → SidecarRegistry → AgentKeysScope → K3EpochCounter → CredentialAudit) | `cast code` on every claimed address; skip when present | [`heima-bring-up.sh`](../scripts/heima-bring-up.sh) |
+| 7 | Persist contract addresses to `operator-workstation.env` namespaced by chain | (sed replace-or-append, no-op when unchanged) | (inside bring-up) |
+| 8 | Verify contracts on-chain (read-only RPC: bytecode + ABI + wiring) | always runs, ~3s | [`verify-heima-contracts.sh`](../scripts/verify-heima-contracts.sh) |
+| 9 | Register operator master device (first-master bootstrap) | `getDevice.registeredAt > 0` check | [`heima-device-register.sh`](../scripts/heima-device-register.sh) |
+| 10 | K11 enrollment (stub bytes by default; `--webauthn` for real Touch ID) | enrollment file exists at `~/.agentkeys/k11/<omni>.json` | (inline) |
+| 11 | Create demo agent device | `getDevice.registeredAt > 0` check | [`heima-agent-create.sh`](../scripts/heima-agent-create.sh) |
+| 12 | Set scope for agent (K11-gated — needs `--webauthn`) | `getScope` config-equality check; skipped without `--webauthn` | [`heima-scope-set.sh`](../scripts/heima-scope-set.sh) |
+| 13 | Append a credential-audit row (V1 path) | **intentionally append-only** — re-runs add a fresh row | [`heima-credential-audit.sh`](../scripts/heima-credential-audit.sh) |
+| 14 | Tier-A audit relay + worker `/healthz` smoke | **intentionally append-only** | [`heima-worker-smoke.sh`](../scripts/heima-worker-smoke.sh) |
+| 15 | Summary — print contract addresses + suggested next-step re-runs | always | — |
+
+## Per-step re-runs
+
+The orchestrator accepts `--from-step N`, `--to-step N`, and `--only-step N`. Use these to surgically re-run after fixing an issue without re-walking the whole pipeline:
+
+```bash
+# Just re-check the deploy (cast-code idempotency means nothing redeploys
+# unless an address is empty)
+bash scripts/setup-heima.sh --only-step 6
+
+# Re-register the master after rotating the session JWT
+bash scripts/setup-heima.sh --only-step 9
+
+# Just smoke the workers
+bash scripts/setup-heima.sh --only-step 14
+```
+
+## Mainnet vs Paseo vs Anvil
+
+| | `heima` (mainnet) | `heima-paseo` (testnet) | `anvil` (local dev) |
+|---|---|---|---|
+| Chain ID | 212013 | 2013 | 31337 |
+| Cost per deploy | real HEI gas | 0 | 0 |
+| Deployer funding | operator's personal wallet (no sudo on mainnet) | Alice sudo via [`heima-fund-account.sh`](../scripts/heima-fund-account.sh) | anvil pre-funds the default key with 10 000 ETH |
+| Finality | per chain profile | per chain profile | instant |
+| Used by | production | dev / pre-merge sanity | unit tests + ephemeral dev |
+| Mainnet deploy guard | requires `MAINNET_CONFIRM=1` env var | — | — |
+| Stage-1 K11 stub on this chain | refuses unless `AGENTKEYS_ALLOW_STAGE1_STUBS=1` (per arch.md §22b.1) | allowed | allowed |
+
+## After a successful run
+
+`setup-heima.sh` writes the contract addresses to `scripts/operator-workstation.env` under chain-namespaced keys (e.g. `SCOPE_CONTRACT_ADDRESS_HEIMA=0x…`). Subsequent steps + the broker workers source the same env file, so no manual copy-paste is needed.
+
+Verify any time:
+
+```bash
+AGENTKEYS_CHAIN=heima       bash scripts/verify-heima-contracts.sh
+AGENTKEYS_CHAIN=heima-paseo bash scripts/verify-heima-contracts.sh
+```
+
+Read-only RPC, zero gas, exits 0 on all-pass.
+
+## Chain-profile source of truth
+
+Built-in profiles ship in [`crates/agentkeys-core/chain-profiles/`](../crates/agentkeys-core/chain-profiles/) (`heima.json`, `heima-paseo.json`, `anvil.json`, `base.json`, `base-sepolia.json`, `ethereum.json`, `sepolia.json`). Each carries: RPC URL, chain ID, gas model, default block tag for finality, foundry chain arg.
+
+To override the RPC for one run without editing a built-in profile, point at a custom profile file:
+
+```bash
+AGENTKEYS_CHAIN_PROFILE_FILE=./my-custom-profile.json bash scripts/setup-heima.sh
+```
+
+The JSON shape is documented in [`docs/spec/architecture.md`](spec/architecture.md) §22a.
+
+## Heima EVM version pin
+
+Heima Frontier runs at London EVM level (pre-Merge). [`crates/agentkeys-chain/foundry.toml`](../crates/agentkeys-chain/foundry.toml) pins `evm_version = "london"` so Foundry's simulator doesn't reject `prevrandao`-less block headers. **Don't change this** without re-verifying against a live Heima block header — see [CLAUDE.md "Heima EVM compatibility level"](../CLAUDE.md) for the verification recipe.
+
+## Related
+
+- Cloud / AWS prereqs: [`docs/cloud-setup.md`](cloud-setup.md)
+- CI setup: [`docs/ci-setup.md`](ci-setup.md)
+- Live contract addresses: [`docs/spec/deployed-contracts.md`](spec/deployed-contracts.md)
+- Architecture: [`docs/spec/architecture.md`](spec/architecture.md) §22 (chain profiles), §22b (per-actor binding ceremonies)
+- FAQ + troubleshooting: [`wiki/heima-setup-faq.md`](../wiki/heima-setup-faq.md)
diff --git a/wiki/ci-setup-faq.md b/wiki/ci-setup-faq.md
new file mode 100644
index 0000000..b8af0d3
--- /dev/null
+++ b/wiki/ci-setup-faq.md
@@ -0,0 +1,96 @@
+# CI setup — FAQ
+
+Troubleshooting + edge cases for [`docs/ci-setup.md`](https://github.com/litentry/agentKeys/blob/main/docs/ci-setup.md) + [`.github/workflows/harness-ci.yml`](https://github.com/litentry/agentKeys/blob/main/.github/workflows/harness-ci.yml).
+
+## Q. The `harness-e2e` job always shows "skipped" — what gives?
+
+That's the designed behavior until `TEST_OIDC_AWS_ROLE_ARN` is set as a repo secret. The preflight job emits a `::warning::` reminder. Until the operator finishes the 7-step bring-up in `docs/ci-setup.md`, only `rust-checks` runs — and that's enough to catch most regressions (600+ tests).
+
+## Q. `AssumeRoleWithWebIdentity` returns `InvalidIdentityToken: No OpenIDConnect provider found`
+
+AWS hasn't found the test broker's OIDC provider. Three checks:
+
+1. The OIDC provider's URL matches the broker's `BROKER_OIDC_ISSUER` byte-for-byte (including scheme and trailing slash).
+2. The broker's `.well-known/openid-configuration` is reachable from the public internet (curl from a random box, not just the runner).
+3. The IAM trust policy on the test role lists the OIDC provider ARN under `Principal.Federated`.
+
+## Q. `harness-e2e` runs but stage-3 fails with `AccessDenied` on the cross-actor write
+
+That's the test working: stage-3 steps 5, 8, and 9 are NEGATIVE tests that EXPECT `AccessDenied`. When AWS returns `AccessDenied`, the harness script asserts it (the per-actor + per-data-class invariants from CLAUDE.md), the step passes, and the workflow exits 0. A genuine failure is the script exiting non-zero, not the AWS API returning `AccessDenied`.
+
+## Q. Concurrent runs collide on S3 writes
+
+Per-run prefix isolation via `CI_S3_PREFIX=ci/run-${GITHUB_RUN_ID}` should prevent this. If you see it anyway:
+
+- Confirm `CI_S3_PREFIX` is being honored by every write site in the harness (currently `harness/v2-stage3-demo.sh` honors it; verify if you've added other harness steps).
+- Make sure `concurrency.cancel-in-progress: true` is set in the workflow (it is — but a previous-run-in-flight can briefly overlap).
+
+## Q. Test contract addresses drifted from the secrets
+
+Happens when the operator redeploys the test contracts (e.g. after a `.sol` source change) but forgets to update the `TEST_*_HEIMA` secrets. Symptoms: stage-1 step 8 (verify-contracts) fails with "no bytecode at $SCOPE_ADDR".
+
+**Fix:** re-read the addresses from `scripts/operator-workstation.env` post-redeploy, then update the six `TEST_*_HEIMA` secrets, either via the GitHub UI or with the GitHub CLI:
+
+```bash
+for addr in SCOPE_CONTRACT_ADDRESS_HEIMA SIDECAR_REGISTRY_ADDRESS_HEIMA K3_EPOCH_COUNTER_ADDRESS_HEIMA \
+            CREDENTIAL_AUDIT_ADDRESS_HEIMA P256_VERIFIER_ADDRESS_HEIMA K11_VERIFIER_ADDRESS_HEIMA; do
+  val=$(grep "^${addr}=" scripts/operator-workstation.env | cut -d= -f2)
+  gh secret set "TEST_${addr}" --body "$val"
+done
+```
+
+## Q. The test deployer wallet ran out of HEI
+
+CI doesn't redeploy on every run (it uses pinned addresses from secrets). The deployer wallet is only spent when the operator manually re-runs `setup-heima.sh` for the test instance. If it does run out:
+
+```bash
+# Check balance
+cast balance "$(cast wallet address "$(cat ~/.agentkeys/heima-deployer-test.key)")" \
+  --rpc-url "$(agentkeys chain show heima | jq -r .rpc.http)"
+
+# Top up from your personal wallet — small float (~1 HEI) is enough
+```
+
+## Q. Manual dispatch errors with `inputs.stage` unrecognized
+
+`workflow_dispatch.inputs` requires the workflow to be on the default branch (or your fork's default). If the workflow file landed on a feature branch, `gh workflow run` may fail. Either land it on `main` first, or push the feature branch and re-target:
+
+```bash
+gh workflow run harness-ci.yml --ref my-branch --field stage=3
+```
+
+## Q. Can the workflow run on every PR (not just operator-dispatched)?
+
+It already does — push + pull_request triggers are wired in `on:` at the top. The gate is `TEST_OIDC_AWS_ROLE_ARN`, not the trigger. Every PR's `rust-checks` job runs unconditionally; the `harness-e2e` job runs only if the secret is set.
+
+## Q. The workflow won't trigger on a PR from a fork
+
+GitHub doesn't pass secrets to fork PRs by default; that's a platform security feature. The `harness-e2e` job will preflight-skip on fork PRs even with the secret set. A reviewer needs to push the fork branch to the upstream repo or manually dispatch the workflow from the PR page.
+
+## Q. `aws-actions/configure-aws-credentials` succeeds but `aws sts get-caller-identity` says `agentkeys-admin`
+
+You forgot to update the role ARN secret after rotating to OIDC. The default credential chain falls through to whatever AWS profile is on the runner image. Set `TEST_OIDC_AWS_ROLE_ARN` to the GitHub Actions OIDC role ARN (not the admin user ARN), and the OIDC web identity will assume the right role.
+
+## Q. Why is `--test-threads=1` on `cargo test`?
+
+Per the existing `@claude` review workflow convention: broker integration tests mutate process-global `$HOME` + `$AWS_*` env, and the keyring tests serialize on a per-process accounts map. Concurrent threads see each other's mutations and flake. Single-threaded test execution is the conservative default; per-test isolation cleanup is a future improvement.
+
+## Q. CI runs are slow — anything to tune?
+
+- `Swatinem/rust-cache@v2` with `shared-key: harness-ci` is enabled — both jobs share a cache.
+- `concurrency.cancel-in-progress: true` cancels stale runs on a re-push.
+- Foundry toolchain is the slowest install; pin to `version: stable` for cache hits.
+- The 60-minute timeout on `harness-e2e` is generous; typical run is 20–30 min.
+
+If runs still feel slow, compare per-step timings with `gh run view <run-id> --json jobs` (each step carries `startedAt` / `completedAt`) to find the longest step.
+
+## Q. Where do I read the harness logs after a failure?
+
+Each harness script writes a temp dir under `/tmp/agentkeys-*`. The workflow uploads `/tmp/agentkeys-ci-ephemeral-*/` as the `ephemeral-stack-logs` artifact on failure (for the harness-e2e job). Download via `gh run download <run-id>`.
+
+## Related
+
+- Operator runbook: [docs/ci-setup.md](https://github.com/litentry/agentKeys/blob/main/docs/ci-setup.md)
+- Workflow file: [.github/workflows/harness-ci.yml](https://github.com/litentry/agentKeys/blob/main/.github/workflows/harness-ci.yml)
+- Cloud setup FAQ: [cloud-setup-faq](./cloud-setup-faq.md)
+- Heima setup FAQ: [heima-setup-faq](./heima-setup-faq.md)
diff --git a/wiki/cloud-setup-faq.md b/wiki/cloud-setup-faq.md
new file mode 100644
index 0000000..2beca18
--- /dev/null
+++ b/wiki/cloud-setup-faq.md
@@ -0,0 +1,94 @@
+# Cloud setup — FAQ
+
+Troubleshooting + edge cases that didn't fit in [`docs/cloud-setup.md`](https://github.com/litentry/agentKeys/blob/main/docs/cloud-setup.md). Use ⌘F to find your error.
+
+## Q. `setup-broker-host.sh` says "BROKER_OIDC_ISSUER mismatch" on re-run
+
+The script auto-detects an existing systemd unit and reads `Environment=` lines to decide bootstrap-vs-upgrade. If you ran with a different `--issuer-url` previously and the AWS OIDC provider was already registered for the old URL, the new run refuses.
+
+**Fix:** decide which URL is canonical. AWS validates the OIDC issuer URL byte-for-byte against the JWT `iss` claim, so the issuer URL is effectively immutable once the IAM trust policy is built. Either:
+- Re-run with the OLD `--issuer-url` (the trust policy already matches).
+- Or delete the OIDC provider, redo §4 from cloud-setup.md, and re-run with the NEW URL.
+
+## Q. nginx 502 after a fresh `setup-broker-host.sh` run
+
+systemd may have started the broker before nginx finished its first `systemctl reload`. Two-step fix:
+
+```bash
+sudo systemctl status agentkeys-broker          # → active (running)
+sudo systemctl restart nginx                    # picks up the new vhost
+curl -sf https://${BROKER_HOST}/healthz         # → 200
+```
+
+If the broker itself is failing to boot, `journalctl -u agentkeys-broker -n 50` is authoritative.
+
+## Q. `verify_sender_ready` precheck fails at broker boot
+
+The broker calls SES `GetEmailIdentity` on `BROKER_EMAIL_FROM_ADDRESS` at startup. If the SES domain identity isn't verified yet, boot refuses. Run [`scripts/ses-verify-sender.sh`](https://github.com/litentry/agentKeys/blob/main/scripts/ses-verify-sender.sh) and wait for the DKIM tokens to propagate (5–30 min typical), then restart the broker.
+
+## Q. `aws iam create-open-id-connect-provider` returns `EntityAlreadyExistsException`
+
+The OIDC provider already exists. Verify with:
+
+```bash
+aws iam list-open-id-connect-providers \
+  | jq -r '.OpenIDConnectProviderList[].Arn' \
+  | grep "${BROKER_HOST}"
+```
+
+If the ARN is correct, you're done — the trust policy and bucket policy from §4.3/§4.4 are the only steps that remain.
+
+## Q. `AccessDenied` from S3 even though the role + bucket policy look right
+
+It is almost always one of three things:
+
+1. The role's **inline policy** still has the broad-bucket grant from §3.5 — strip it via §4.4.1.
+2. The bucket policy's `s3:prefix` condition interpolates `${aws:PrincipalTag/agentkeys_actor_omni}`, so the tag value must be lowercased when the session is tagged; policy string comparisons are case-sensitive, and a mixed-case address never matches.
+3. `s3:ListBucket` needs the `s3:prefix=bots/${PrincipalTag}/<class>/*` condition in a separate statement (the v3 split-statement bucket policy from codex P2). Listing the bucket root without that condition always returns AccessDenied.
+
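+CloudTrail's `errorCode` / `errorMessage` on the denied event usually indicate which policy type blocked the request (object-level S3 calls appear only when data events are enabled).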
+
+## Q. Per-profile default region trap (real 2026-05-12 incident)
+
+`agentkeys-admin` defaults to `us-west-2`; `agentkeys-broker` / `agentkeys-daemon` default to `us-east-1`. Every regional CLI call must pass `--region "$REGION"` explicitly. The CLAUDE.md "Per-profile default region is NOT uniform" section covers this in detail.
+
+## Q. Cert renewal failed silently — workflow turned red overnight
+
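+Let's Encrypt certificates are valid for 90 days, and certbot attempts renewal about 30 days before expiry. If renewals keep failing (often: rate limit, DNS-01 hiccup, port 80 firewall block), the certificate eventually expires and AWS stops trusting the OIDC issuer (TLS chain breaks). Symptoms: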
+
+- `harness-e2e` CI job fails on the first `curl https://${BROKER_HOST}` with a TLS error.
+- `journalctl -u certbot-renew` shows the failure reason.
+
+**Recovery:** rerun `sudo certbot renew --force-renewal` (works for transient rate-limit issues), or fix the DNS / firewall and re-run. The broker doesn't need to restart — nginx reloads automatically.
+
+## Q. Switching AWS accounts for the test instance
+
+Same-account is fine — isolation comes from the `-test` suffix, not from the AWS account boundary. If you want hard account isolation, every reference to `${ACCOUNT_ID}` in cloud-setup.md becomes `${TEST_ACCOUNT_ID}`, including the role ARN that the broker assumes via OIDC. The setup-broker-host.sh script accepts `--account-id` to point at a different account.
+
+## Q. Tencent Cloud port?
+
+§2.2 of cloud-setup.md sketches SimpleDM + COS as the swap-in at the §3+ boundary. The boundary is real: DNS + inbound mail are the only AWS-specific layers; everything from `agentkeys-data-role` onward is provider-agnostic in shape, with COS providing S3-compatible PutObject/GetObject and Tencent's CAM providing OIDC federation. Real port work is tracked separately.
+
+## Q. Can I run the broker without nginx?
+
+Yes — `setup-broker-host.sh --without-nginx --without-certbot` skips both. You're then responsible for TLS termination upstream (CloudFront, ALB, custom reverse proxy). AWS still needs to fetch the OIDC discovery + JWKS over public TLS, so whatever fronts the broker must serve `https://${BROKER_HOST}/.well-known/*` with a valid leaf cert.
+
+## Q. The systemd unit was hand-edited and now setup-broker-host.sh refuses
+
+Per CLAUDE.md "Remote broker host (single entry point)" — don't hand-edit. To recover:
+
+```bash
+sudo systemctl stop agentkeys-broker
+sudo rm /etc/systemd/system/agentkeys-broker.service
+sudo systemctl daemon-reload
+sudo bash scripts/setup-broker-host.sh --yes
+```
+
+The script rewrites the unit clean. If you had a legitimately custom field, add a `--*-host` or `--cred-mode` flag to the script and re-run — that's how all per-host overrides ship.
+
+## Related
+
+- Operator runbook: [docs/cloud-setup.md](https://github.com/litentry/agentKeys/blob/main/docs/cloud-setup.md)
+- Single entry point: [scripts/setup-broker-host.sh](https://github.com/litentry/agentKeys/blob/main/scripts/setup-broker-host.sh)
+- Heima chain FAQ: [heima-setup-faq](./heima-setup-faq.md)
+- CI FAQ: [ci-setup-faq](./ci-setup-faq.md)
diff --git a/wiki/heima-setup-faq.md b/wiki/heima-setup-faq.md
new file mode 100644
index 0000000..9281843
--- /dev/null
+++ b/wiki/heima-setup-faq.md
@@ -0,0 +1,111 @@
+# Heima setup — FAQ
+
+Troubleshooting + edge cases for [`docs/heima-setup.md`](https://github.com/litentry/agentKeys/blob/main/docs/heima-setup.md) + [`scripts/setup-heima.sh`](https://github.com/litentry/agentKeys/blob/main/scripts/setup-heima.sh).
+
+## Q. `chain mismatch: profile says chain_id=X but RPC reports Y`
+
+Step 3 caught a misconfigured RPC. Usually means `AGENTKEYS_CHAIN=heima` is set but the chain profile's `rpc.http` points at Paseo (or vice versa). Either:
+
+- Edit the chain profile JSON in [`crates/agentkeys-core/chain-profiles/`](https://github.com/litentry/agentKeys/tree/main/crates/agentkeys-core/chain-profiles).
+- Override per-run via `AGENTKEYS_CHAIN_PROFILE_FILE=./my-profile.json`.
+
+Never set `AGENTKEYS_CHAIN=heima` and then point at a Paseo RPC — many downstream balance / nonce reads will return wrong-chain data.
+
+## Q. Step 6 says "deploy skipped" but I expect a fresh deploy
+
+`heima-bring-up.sh` runs `cast code` on every claimed address in `operator-workstation.env` and short-circuits if all six addresses already have bytecode on chain. Force a redeploy with:
+
+```bash
+# Clear every saved *_ADDRESS_<CHAIN> entry for this chain, then re-run
+PROFILE_UC=$(printf '%s' "${AGENTKEYS_CHAIN:-heima}" | tr 'a-z-' 'A-Z_')
+sed -i.bak "/^[A-Z0-9_]*_ADDRESS_${PROFILE_UC}=/d" scripts/operator-workstation.env
+bash scripts/setup-heima.sh --only-step 6
+```
+
+Mainnet deploys cost real HEI — confirm you actually want a redeploy before clearing.
+
+## Q. Mainnet deploy refuses with "MAINNET_CONFIRM=1 required"
+
+The mainnet path has a paranoid guard against accidental redeploys. Pass `MAINNET_CONFIRM=1` only when you're sure:
+
+```bash
+MAINNET_CONFIRM=1 AGENTKEYS_CHAIN=heima bash scripts/setup-heima.sh --only-step 6
+```
+
+## Q. Paseo step 5 (fund deployer) hangs
+
+Paseo collators have been halted at block 2,905,430 since 2026-01-15 (per CLAUDE.md). While they are down, `heima-fund-account.sh` can't reach the chain and the step hangs. Three options:
+
+- Wait for the parachain to recover.
+- Switch to `--chain anvil` for local dev work.
+- Switch to `--chain heima` mainnet (fund from your personal wallet — no sudo on mainnet).
+
+## Q. K11 enrollment stub refuses on mainnet
+
+Per arch.md §22b.1: stage-1 K11 stub on mainnet requires `AGENTKEYS_ALLOW_STAGE1_STUBS=1`. The flag exists to keep accidental stub enrollments off mainnet — the on-chain `length != 0` gate accepts stubs but the bytes aren't cryptographically bound.
+
+For real Touch ID:
+
+```bash
+bash scripts/setup-heima.sh --webauthn
+```
+
+For a one-time, deliberate stub on mainnet (dev / debug):
+
+```bash
+AGENTKEYS_ALLOW_STAGE1_STUBS=1 bash scripts/setup-heima.sh
+```
+
+## Q. Step 12 (scope set) skipped — what now?
+
+Step 12 needs a real K11 ceremony (master-mutation, not just creation). Re-run the orchestrator with `--webauthn`, or invoke `heima-scope-set.sh --webauthn` directly:
+
+```bash
+bash scripts/heima-scope-set.sh \
+  --webauthn \
+  --agent demo-agent \
+  --services openrouter \
+  --session-id alice
+```
+
+## Q. Why are steps 13 + 14 "intentionally append-only"?
+
+The audit log + tier-A relay are designed to grow. Each re-run advances `entryCount` and adds a fresh row; that's the audit trail working as intended, not a regression. If you re-run setup-heima.sh weekly for sanity, the audit log gains roughly one new row per weekly run.
+
+To check the entry count any time:
+
+```bash
+cast call "$CREDENTIAL_AUDIT_ADDRESS_HEIMA" "entryCount()(uint256)" \
+  --rpc-url "$(agentkeys chain show heima | jq -r .rpc.http)"
+```
+
+## Q. Per-step re-run fails with "missing session JWT"
+
+Steps 9–13 read `~/.agentkeys/${SESSION_ID}/session.json` to derive the operator's `actor_omni`. If the JWT expired or was deleted, re-mint:
+
+```bash
+agentkeys init --session-id alice --email alice@example.com
+```
+
+Then re-run the orchestrator from the failing step.
+
+## Q. `forge script` errors with "header validation error: `prevrandao` not set"
+
+Heima Frontier is at London EVM level (pre-Merge). [`crates/agentkeys-chain/foundry.toml`](https://github.com/litentry/agentKeys/blob/main/crates/agentkeys-chain/foundry.toml) must pin `evm_version = "london"`. If you bumped it for unrelated reasons, revert. The full diagnosis is in CLAUDE.md "Heima EVM compatibility level".
+
+## Q. Anvil contract addresses are different every run — is that wrong?
+
+No. Anvil starts fresh per process; the deterministic deployer key + nonce-0 still produces the canonical first address (`0x5FbDB2315678afecb367f032d93F642f64180aa3` for P256Verifier), but `operator-workstation.env`'s pinned addresses are for the persistent chains (heima / heima-paseo), not for anvil. The `verify-heima-contracts.sh` flow + chain-namespaced env keys handle this — anvil reuses the deploy-time addresses for the lifetime of one anvil process.
+
+## Q. I want to redeploy ONLY one contract
+
+The atomic deploy is by design — each downstream contract takes the prior address via constructor, so partial redeploys break wiring. If you need a single-contract upgrade, use a proxy pattern (out of scope for stage-1) or do a full redeploy + update the env file.
+
+## Related
+
+- Operator runbook: [docs/heima-setup.md](https://github.com/litentry/agentKeys/blob/main/docs/heima-setup.md)
+- Orchestrator: [scripts/setup-heima.sh](https://github.com/litentry/agentKeys/blob/main/scripts/setup-heima.sh)
+- Per-step helpers: [scripts/heima-*.sh](https://github.com/litentry/agentKeys/tree/main/scripts)
+- Live contract addresses: [docs/spec/deployed-contracts.md](https://github.com/litentry/agentKeys/blob/main/docs/spec/deployed-contracts.md)
+- Cloud setup FAQ: [cloud-setup-faq](./cloud-setup-faq.md)
+- CI setup FAQ: [ci-setup-faq](./ci-setup-faq.md)

From 5b88bb6b595f8ada1092bf2195d8d62c6f9f996e Mon Sep 17 00:00:00 2001
From: wildmeta-agent <agent@wildmeta.ai>
Date: Thu, 21 May 2026 10:31:37 +0800
Subject: [PATCH 4/4] docs: extract first-time cloud bootstrap into separate
 doc
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Per operator request: the very-beginning cloud-account provisioning
(IAM users + role, DNS, SES, S3 buckets, instance profile) needs to
live in a separate doc so it stays reachable when:

  - Adding a second AWS account (test instance, regional shard)
  - Migrating to AliCloud / GCP / Tencent Cloud
  - Re-bootstrapping after a teardown
  - Auditing the identity surface

The previous condense pass collapsed those sections into cloud-setup.md's
slim §1-§3 — convenient for day-to-day operators but stripped the depth
needed for the migration / second-account use cases.

What changed:

  docs/cloud-bootstrap.md  — NEW, 365 lines
    First-time, per-account, cloud-provider-portable bootstrap doc:

      §1  Identities             — four IAM principals, cloud-agnostic
      §2  Domain + DNS           — subdomain map, parent-zone confirm
      §3  Email backend          — SES domain verify + receipt rule +
                                    inbound S3 bucket creation
      §4  IAM users + roles      — agentkeys-daemon + agentkeys-data-role +
                                    per-data-class vault/memory roles
      §5  Initial bucket policy  — static-IAM variant (pre-OIDC)
      §6  Instance profile       — agentkeys-broker-host (EC2 optional)
      §7  Security audit         — strip legacy over-broad attached policies
                                    (`AmazonS3FullAccess` checklist from the
                                    pre-condense §3.4a)
      §8  Cloud-provider port    — AWS / AliCloud / GCP / Tencent Cloud
                                    1:1 mapping table + migration playbook

    Restores the operational depth (DKIM bulk-record bash, daemon user
    create, role trust shape, broker-host instance profile, security
    audit) that the previous condense pass removed. Adds the portability
    framing (concept first, AWS-specific commands as ONE implementation)
    so the doc is the durable reference for non-AWS deployments.

  docs/cloud-setup.md  — UPDATE, 314 → 202 lines
    Refocus on what comes AFTER bootstrap: OIDC federation activation
    (§1, was §4) + the setup-broker-host.sh runtime entry point (§2,
    was §5) + cleanup (§3, was §6). Drop the duplicate §1-§3 prereqs;
    add a clear cross-ref to cloud-bootstrap.md at the top. Section
    numbers renumbered.

  wiki/cloud-setup-faq.md — minor header tweak
    The FAQ now covers both cloud-bootstrap.md and cloud-setup.md
    (operators hit the same gotchas across both phases).

Constraints applied:

  - Concise: every doc still fits in a few screens (bootstrap is
    longest at 365 lines because it carries the actual provisioning
    commands; cloud-setup.md is now 202 lines, down from 970 originally).
  - Idempotent: every flow uses the existing idempotent helper scripts.
  - No project credentials exposed: same placeholder convention as the
    prior pass (${ACCOUNT_ID}, ${ZONE}, etc.). Verified via grep.

All internal links verified (python url-walker).
---
 docs/cloud-bootstrap.md | 365 ++++++++++++++++++++++++++++++++++++++++
 docs/cloud-setup.md     | 171 +++----------------
 wiki/cloud-setup-faq.md |   7 +-
 3 files changed, 397 insertions(+), 146 deletions(-)
 create mode 100644 docs/cloud-bootstrap.md

diff --git a/docs/cloud-bootstrap.md b/docs/cloud-bootstrap.md
new file mode 100644
index 0000000..e5048b4
--- /dev/null
+++ b/docs/cloud-bootstrap.md
@@ -0,0 +1,365 @@
+# Cloud bootstrap — AgentKeys
+
+**Audience:** the operator standing up a brand-new cloud account to host AgentKeys for the first time, or porting the deployment to a new cloud provider (AliCloud, GCP, Tencent Cloud).
+**Scope:** the per-account, run-once provisioning that has to happen **before** anything in [`docs/cloud-setup.md`](cloud-setup.md), [`docs/heima-setup.md`](heima-setup.md), or [`docs/ci-setup.md`](ci-setup.md) can run. Identifiers (DNS names, IAM principals, mail backend, object store, initial bucket policy) — never runtime processes.
+**FAQ + troubleshooting:** [`wiki/cloud-setup-faq.md`](../wiki/cloud-setup-faq.md).
+
+After this doc is run, the operator returns here ONLY when:
+- Switching cloud providers (e.g. AWS → AliCloud)
+- Adding a second AWS account (test instance, regional shard)
+- Re-bootstrapping after a teardown
+- Auditing the identity surface (the security-audit checklist in §7)
+
+The day-to-day broker re-deploys live in [`docs/cloud-setup.md`](cloud-setup.md) §2 (`setup-broker-host.sh`); they never re-enter this doc.
+
+## TL;DR — operator flow
+
+```
+§1  Identities         — four IAM principals; concept first, then provider commands
+§2  Domain + DNS       — subdomain ownership; parent-zone confirmation
+§3  Email backend      — SES domain identity + receipt rule + S3 inbound bucket
+§4  IAM users + roles  — agentkeys-{admin,broker,daemon} + agentkeys-data-role
+§5  Bucket policy      — static-IAM variant (pre-OIDC; replaced in cloud-setup.md §1)
+§6  Instance profile   — agentkeys-broker-host (optional, EC2-only)
+§7  Security audit     — strip legacy over-broad attached policies
+§8  Cloud portability  — AWS → AliCloud / GCP / Tencent Cloud mapping
+```
+
+```bash
+# Per-account shell vars used throughout. Source from operator-workstation.env
+# wherever possible; placeholders here for clarity. ${ZONE} must already be set.
+awsp agentkeys-admin
+aws sts get-caller-identity                  # → agentkeys-admin
+
+export REGION=us-east-1                      # SES inbound regions: us-east-1, us-west-2, eu-west-1
+export MAIL_DOMAIN=bots.${ZONE}              # SES inbound subdomain
+export BROKER_HOST=broker.${ZONE}            # broker TLS-terminating reverse proxy
+export PARENT_ZONE_ID=ZXXXXXXXXXXXXX         # existing parent zone (Route 53 / AliCloud / etc.)
+export ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
+export BUCKET=agentkeys-mail-${ACCOUNT_ID}   # global-unique by account-id suffix
+```
+
+> **Why `jq -n --arg` and not `cat > file.json <<EOF`:** `jq --arg` passes values outside shell parameter expansion, sidestepping the zsh modifier bug (`$VAR:r` etc.) that silently corrupts ARNs. JSON is validated on construction, command substitution feeds straight into `--policy-document`, no file lands on disk. Apply this convention everywhere this doc shows a JSON policy.
+
+## §1 Identities — mental model
+
+Cloud-agnostic. The four principals (plus the optional instance profile) exist in every cloud the broker runs on; the cloud changes only which API creates them.
+
+| Identity | Type | Holds | Purpose |
+|---|---|---|---|
+| `agentkeys-admin` | privileged user | Long-lived access key | One-shot provisioning. Runs every command in this doc. IAM-admin scope. |
+| `agentkeys-broker` | scoped user | Long-lived access key | Operator's SSH-into-EC2 path via EC2 Instance Connect (AWS) / SSH key (other clouds). No data-plane access. |
+| `agentkeys-daemon` | runtime user | Long-lived access key | The **broker process** uses this at runtime. Only permission: assume the data role. |
+| `agentkeys-data-role` | assumed role | (none — assumed) | Holds the actual storage + email permissions. Trusted by the runtime user (Stage 6) or by the OIDC provider (Stage 7). |
+| `agentkeys-broker-host` | instance profile (optional) | (none — bound to a VM) | If the broker runs on a managed VM, attach this so the daemon never sees a static key. Runtime creds come from IMDS / metadata server. |
+
+> Why "data role" and not "agent role": the project word "agent" already means three things (the AI agent, the AgentKeys product, an IAM role). The role holds **data-plane** permissions. The broker still accepts the legacy `BROKER_AGENT_ROLE_ARN` env var for backwards compatibility.
+
+## §2 Domain + DNS
+
+Seven hostnames under the operator's parent zone (substitute `${ZONE}` everywhere):
+
+| Host | Purpose | Provisioned in |
+|---|---|---|
+| `${MAIL_DOMAIN}` (e.g. `bots.${ZONE}`) | SES / email backend inbound | §3 |
+| `${BROKER_HOST}` (e.g. `broker.${ZONE}`) | Broker public reverse proxy | §2.1 of cloud-setup.md |
+| `signer.${ZONE}` | Signer service (issue #74 step 1b) | §2.1 of cloud-setup.md |
+| `audit.${ZONE}` / `email.${ZONE}` / `cred.${ZONE}` / `memory.${ZONE}` | Service workers (issue #90) | §2.1 of cloud-setup.md (dev co-location on broker EIP today) |
+
+Confirm the parent zone is reachable before any record changes (AWS Route 53 example; the same `get-hosted-zone` shape exists on AliCloud DNS + Cloud DNS):
+
+```bash
+aws route53 get-hosted-zone --id "$PARENT_ZONE_ID" \
+  --query 'HostedZone.{name:Name, private:Config.PrivateZone}'
+# → {"name": "${ZONE}.", "private": false}
+```
+
+The bulk service-worker A-record creation is automated by [`scripts/dns-upsert-workers.sh`](../scripts/dns-upsert-workers.sh) (AWS Route 53 today). For other providers, replicate the same shape — the hostnames are the migration seam.
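+
+To confirm the A records landed, one option is a small resolver loop. This is a sketch, not a repo script: `agentkeys_hosts` and `check_dns` are hypothetical names, and it assumes `dig` is installed and that `BROKER_HOST` and `ZONE` are exported.
+
+```bash
+# Hypothetical helper: print the A-record hostnames from the table above, one per line.
+agentkeys_hosts() {
+  local broker_host="$1" zone="$2"
+  printf '%s\n' "$broker_host" "signer.$zone" "audit.$zone" \
+    "email.$zone" "cred.$zone" "memory.$zone"
+}
+
+# Report every hostname that does not resolve to an A record yet.
+check_dns() {
+  local host
+  agentkeys_hosts "$BROKER_HOST" "$ZONE" | while read -r host; do
+    [ -n "$(dig +short "$host" A)" ] || echo "unresolved: $host"
+  done
+}
+```
+
+Run `check_dns` after the upsert. `${MAIL_DOMAIN}` carries MX/TXT/CNAME records rather than an A record, so it is verified through SES in §3.1 instead.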
+
+## §3 Email backend
+
+### §3.1 Verify the SES domain identity (AWS)
+
+```bash
+aws sesv2 create-email-identity \
+  --region "$REGION" --email-identity "$MAIL_DOMAIN" \
+  --dkim-signing-attributes NextSigningKeyLength=RSA_2048_BIT
+```
+
+Then publish DKIM + SPF + DMARC + MX records in one DNS change. AWS Route 53:
+
+```bash
+read -r T1 T2 T3 <<<"$(aws sesv2 get-email-identity --region "$REGION" \
+  --email-identity "$MAIL_DOMAIN" --query 'DkimAttributes.Tokens' --output text)"
+
+aws route53 change-resource-record-sets --hosted-zone-id "$PARENT_ZONE_ID" \
+  --change-batch "$(jq -n \
+    --arg domain "$MAIL_DOMAIN" --arg region "$REGION" \
+    --arg t1 "$T1" --arg t2 "$T2" --arg t3 "$T3" '{
+      Comment: "AgentKeys email infra for \($domain)",
+      Changes: [
+        {Action:"UPSERT", ResourceRecordSet:{Name:"\($t1)._domainkey.\($domain)", Type:"CNAME", TTL:300, ResourceRecords:[{Value:"\($t1).dkim.amazonses.com"}]}},
+        {Action:"UPSERT", ResourceRecordSet:{Name:"\($t2)._domainkey.\($domain)", Type:"CNAME", TTL:300, ResourceRecords:[{Value:"\($t2).dkim.amazonses.com"}]}},
+        {Action:"UPSERT", ResourceRecordSet:{Name:"\($t3)._domainkey.\($domain)", Type:"CNAME", TTL:300, ResourceRecords:[{Value:"\($t3).dkim.amazonses.com"}]}},
+        {Action:"UPSERT", ResourceRecordSet:{Name:$domain, Type:"MX",  TTL:300, ResourceRecords:[{Value:"10 inbound-smtp.\($region).amazonaws.com"}]}},
+        {Action:"UPSERT", ResourceRecordSet:{Name:$domain, Type:"TXT", TTL:300, ResourceRecords:[{Value:"\"v=spf1 include:amazonses.com -all\""}]}},
+        {Action:"UPSERT", ResourceRecordSet:{Name:"_dmarc.\($domain)", Type:"TXT", TTL:300, ResourceRecords:[{Value:"\"v=DMARC1; p=quarantine; rua=mailto:dmarc@\($domain)\""}]}}
+      ]
+    }')"
+```
+
+Wait ~5 min for DKIM propagation, then verify:
+
+```bash
+aws sesv2 get-email-identity --region "$REGION" --email-identity "$MAIL_DOMAIN" \
+  --query '{verified: VerifiedForSendingStatus, dkim: DkimAttributes.Status}'
+# → {"verified": true, "dkim": "SUCCESS"}
+```
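+
+Rather than waiting a fixed five minutes, the check can be polled. A sketch under assumptions: `wait_until` and `dkim_ready` are hypothetical helper names, and the attempt count and delay are arbitrary.
+
+```bash
+# wait_until <attempts> <delay_seconds> <command...>
+# Re-runs the command until it succeeds or the attempts run out.
+wait_until() {
+  local attempts="$1" delay="$2" i
+  shift 2
+  for ((i = 1; i <= attempts; i++)); do
+    "$@" && return 0
+    if ((i < attempts)); then sleep "$delay"; fi
+  done
+  return 1
+}
+
+# Succeeds once SES reports the DKIM status as SUCCESS.
+dkim_ready() {
+  [ "$(aws sesv2 get-email-identity --region "$REGION" --email-identity "$MAIL_DOMAIN" \
+        --query 'DkimAttributes.Status' --output text)" = "SUCCESS" ]
+}
+```
+
+Usage: `wait_until 20 30 dkim_ready || echo "DKIM still pending"` polls every 30 seconds for roughly ten minutes before giving up.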
+
+> **DKIM key custody:** in this interim setup, the email service holds the private DKIM key (AWS-internal on SES, AliCloud-internal on DirectMail, etc.). The trust surface is that the provider could forge mail signed as us; the blast radius is bounded to reputation, not user-data custody. The migration target is TEE-held BYODKIM, tracked in [`docs/spec/heima-gaps-vs-desired-architecture.md`](spec/heima-gaps-vs-desired-architecture.md) §4. Do **not** take "BYODKIM with a file-stored key" as an intermediate step; it is strictly worse than provider-managed keys.
+
+### §3.2 Create the S3 bucket for inbound mail
+
+```bash
+aws s3api create-bucket \
+  --region "$REGION" --bucket "$BUCKET" \
+  $([ "$REGION" != "us-east-1" ] && echo "--create-bucket-configuration LocationConstraint=$REGION")
+
+aws s3api put-public-access-block --region "$REGION" --bucket "$BUCKET" \
+  --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
+
+# 30-day TTL on inbound objects (throwaway-inbox model)
+aws s3api put-bucket-lifecycle-configuration --region "$REGION" --bucket "$BUCKET" \
+  --lifecycle-configuration "$(jq -n '{
+    Rules: [{ID:"inbound-30d-ttl", Status:"Enabled", Filter:{Prefix:"inbound/"}, Expiration:{Days:30}}]
+  }')"
+```
+
+### §3.3 Create the SES receipt rule
+
+```bash
+aws ses create-receipt-rule-set --rule-set-name agentkeys --region "$REGION" 2>/dev/null || true
+aws ses create-receipt-rule --region "$REGION" --rule-set-name agentkeys \
+  --rule "$(jq -n --arg domain "$MAIL_DOMAIN" --arg bucket "$BUCKET" '{
+    Name: "agentkeys-inbound", Enabled: true, ScanEnabled: true, TlsPolicy: "Optional",
+    Recipients: [$domain],
+    Actions: [{S3Action: {BucketName: $bucket, ObjectKeyPrefix: "inbound/"}}]
+  }')"
+aws ses set-active-receipt-rule-set --rule-set-name agentkeys --region "$REGION"
+```
+
+Inbound MIME lands at `s3://$BUCKET/inbound/<msg_id>`. The first object is `AMAZON_SES_SETUP_NOTIFICATION`, which SES writes to confirm it can write to the bucket. Real mail follows.
+
+**Sandbox vs production sending:** inbound is unaffected by SES sandbox; **outbound** to arbitrary addresses needs Console → Support → "SES Sending Limits" → "Request Production Access".
+
+## §4 IAM users + roles
+
+### §4.1 `agentkeys-daemon` — broker runtime user
+
+```bash
+aws iam create-user --user-name agentkeys-daemon
+aws iam create-access-key --user-name agentkeys-daemon
+# → save AccessKeyId + SecretAccessKey to your secret manager. NEVER to git.
+
+aws iam put-user-policy --user-name agentkeys-daemon \
+  --policy-name agentkeys-daemon-assume-role \
+  --policy-document "$(jq -n --arg acct "$ACCOUNT_ID" '{
+    Version:"2012-10-17",
+    Statement:[{
+      Effect:"Allow", Action:"sts:AssumeRole",
+      Resource:"arn:aws:iam::\($acct):role/agentkeys-data-role"
+    }]
+  }')"
+```
+
+The daemon user can do exactly one thing: assume `agentkeys-data-role`. Any storage / email action goes through the role's permissions, never the user's.
+
+### §4.2 `agentkeys-data-role` (static-IAM-user trust variant)
+
+The role's trust policy starts with the static-IAM-user variant. After the broker is publicly reachable, [`docs/cloud-setup.md`](cloud-setup.md) §1.3 swaps it for the OIDC-federated variant.
+
+```bash
+aws iam create-role --role-name agentkeys-data-role \
+  --assume-role-policy-document "$(jq -n --arg acct "$ACCOUNT_ID" '{
+    Version:"2012-10-17",
+    Statement:[{
+      Effect:"Allow",
+      Principal:{AWS:"arn:aws:iam::\($acct):user/agentkeys-daemon"},
+      Action:"sts:AssumeRole"
+    }]
+  }')"
+
+aws iam put-role-policy --role-name agentkeys-data-role \
+  --policy-name agentkeys-data-role-inline \
+  --policy-document "$(jq -n \
+    --arg bucket "$BUCKET" --arg region "$REGION" \
+    --arg acct "$ACCOUNT_ID" --arg domain "$MAIL_DOMAIN" '{
+      Version:"2012-10-17",
+      Statement:[
+        {Effect:"Allow", Action:"s3:ListBucket", Resource:"arn:aws:s3:::\($bucket)"},
+        {Effect:"Allow", Action:"s3:GetObject",  Resource:"arn:aws:s3:::\($bucket)/*"},
+        {Effect:"Allow", Action:["ses:SendEmail","ses:GetEmailIdentity"],
+         Resource:["arn:aws:ses:\($region):\($acct):identity/\($domain)",
+                   "arn:aws:ses:\($region):\($acct):identity/*@\($domain)"]}
+      ]
+    }')"
+
+export ROLE_ARN=$(aws iam get-role --role-name agentkeys-data-role --query 'Role.Arn' --output text)
+echo "ROLE_ARN=$ROLE_ARN"
+```
+
+### §4.3 Per-data-class roles (`agentkeys-vault-role`, `agentkeys-memory-role`)
+
+Per arch.md §17.2: separate roles for credentials + memory data classes. Same trust shape as §4.2, distinct inline policies + PrincipalTag scoping. Provisioned by per-data-class helpers (idempotent):
+
+```bash
+bash scripts/provision-vault-bucket.sh        # agentkeys-vault-${ACCOUNT_ID}
+bash scripts/provision-vault-role.sh          # agentkeys-vault-role
+bash scripts/apply-vault-bucket-policy.sh     # v3 split-statement PrincipalTag policy
+
+bash scripts/provision-memory-bucket.sh
+bash scripts/provision-memory-role.sh
+bash scripts/apply-memory-bucket-policy.sh
+
+bash scripts/cleanup-mail-bucket-policy.sh    # restore email-only grants on $BUCKET
+```
+
+These scripts are the **source of truth** for the policy shape — read them, don't transcribe.
+
+### §4.4 `agentkeys-admin`, `agentkeys-broker` (already provisioned)
+
+If you reached this section, `agentkeys-admin` exists (you're using it). `agentkeys-broker` is whatever IAM user you SSH into the broker host with. Its permissions are out of scope here; `ec2-instance-connect:SendSSHPublicKey` on the host's instance ID is sufficient for AWS Instance Connect.
+
+## §5 S3 bucket policy (initial, static-IAM variant)
+
+```bash
+aws s3api put-bucket-policy --region "$REGION" --bucket "$BUCKET" \
+  --policy "$(jq -n --arg bucket "$BUCKET" --arg acct "$ACCOUNT_ID" '{
+    Version:"2012-10-17",
+    Statement:[
+      {
+        Sid:"AllowSESWriteInbound", Effect:"Allow",
+        Principal:{Service:"ses.amazonaws.com"},
+        Action:"s3:PutObject",
+        Resource:"arn:aws:s3:::\($bucket)/*",
+        Condition:{StringEquals:{"aws:Referer":$acct}}
+      },
+      {
+        Sid:"AllowDaemonRead", Effect:"Allow",
+        Principal:{AWS:"arn:aws:iam::\($acct):role/agentkeys-data-role"},
+        Action:["s3:GetObject","s3:ListBucket"],
+        Resource:["arn:aws:s3:::\($bucket)","arn:aws:s3:::\($bucket)/*"]
+      }
+    ]
+  }')"
+```
+
+The PrincipalTag-scoped federated variant (which replaces this once OIDC federation is up) lives in [`docs/cloud-setup.md`](cloud-setup.md) §1.4.
+
+## §6 `agentkeys-broker-host` instance profile (EC2-only, optional)
+
+If the broker runs on AWS EC2, attach this so the daemon never holds a static key. Runtime creds come from IMDS.
+
+```bash
+ROLE=agentkeys-broker-host
+
+aws iam create-role --role-name "$ROLE" \
+  --assume-role-policy-document "$(jq -n '{
+    Version:"2012-10-17",
+    Statement:[{Effect:"Allow", Principal:{Service:"ec2.amazonaws.com"}, Action:"sts:AssumeRole"}]
+  }')"
+
+aws iam put-role-policy --role-name "$ROLE" --policy-name BrokerAssumeData \
+  --policy-document "$(jq -n --arg acct "$ACCOUNT_ID" '{
+    Version:"2012-10-17",
+    Statement:[{Effect:"Allow", Action:"sts:AssumeRole",
+                Resource:"arn:aws:iam::\($acct):role/agentkeys-data-role"}]
+  }')"
+
+aws iam create-instance-profile --instance-profile-name "$ROLE"
+aws iam add-role-to-instance-profile --instance-profile-name "$ROLE" --role-name "$ROLE"
+aws ec2 associate-iam-instance-profile --region "$REGION" \
+  --instance-id "$INSTANCE_ID" \
+  --iam-instance-profile Name="$ROLE"
+```
+
+> **Caller-region trap:** the `agentkeys-admin` profile defaults to `us-west-2`; the broker EC2 usually lives in `us-east-1`. Without `--region "$REGION"`, a `describe-instances` lookup (as in §7) silently returns empty, and a downstream `put-role-policy` then runs with `--role-name ""`. Pass `--region` explicitly on every regional call. See [CLAUDE.md "AWS local-profile ↔ remote-IAM mapping"](../CLAUDE.md).
+
+### §6.1 `ses:SendEmail` grant on the runtime role
+
+The broker calls SES v2 `SendEmail` with its **own** runtime credentials (instance profile), not via the assumed `agentkeys-data-role`. Without `ses:SendEmail` on the broker's role, the operator hits:
+
+```
+broker rejected /v1/auth/email/request: status=502 body=
+{"error":"backend_unreachable","message":"… ses SendEmail:
+ unhandled error (AccessDeniedException)"}
+```
+
+The IAM action is `ses:SendEmail` (sesv2), NOT `ses:SendRawEmail` (v1; a different code path the broker doesn't use). The grant lives on the broker's runtime role (`agentkeys-broker-host` on EC2; the user `agentkeys-daemon` otherwise) and is scoped to the verified sender identity.
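+
+A sketch of what that grant could look like on the EC2 runtime role. Assumptions to check against your deployment: the inline policy name `BrokerSendEmail`, `BROKER_EMAIL_FROM_ADDRESS` as the verified sender identity, and no configuration set in use. `send_email_policy` and `grant_send_email` are hypothetical helper names.
+
+```bash
+# Hypothetical helper: build a ses:SendEmail policy for one sender identity.
+send_email_policy() {
+  jq -n --arg region "$1" --arg acct "$2" --arg sender "$3" '{
+    Version: "2012-10-17",
+    Statement: [{
+      Effect: "Allow", Action: "ses:SendEmail",
+      Resource: "arn:aws:ses:\($region):\($acct):identity/\($sender)"
+    }]
+  }'
+}
+
+# Attach it to the runtime role (assumed here to be agentkeys-broker-host).
+grant_send_email() {
+  aws iam put-role-policy --role-name agentkeys-broker-host \
+    --policy-name BrokerSendEmail \
+    --policy-document "$(send_email_policy "$REGION" "$ACCOUNT_ID" "$BROKER_EMAIL_FROM_ADDRESS")"
+}
+```
+
+Printing `send_email_policy` output first makes it easy to review the ARN before `grant_send_email` applies it.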
+
+## §7 Security audit — strip legacy over-broad attached policies
+
+Some early deploys ship with `AmazonS3FullAccess` (or similar wide permissions) attached to the broker's runtime role. The broker at runtime ONLY uses `aws-sdk-sts` (the GetCallerIdentity startup probe) + `aws-sdk-sesv2` (the §6.1 grant) — it never accesses S3 with its own creds. Per-user S3 is via JWT-assumed `agentkeys-{data,vault,memory}-role`, not the broker's runtime role.
+
+A broker compromise with `AmazonS3FullAccess` would expose every inbound email in the SES bucket (verification tokens, magic links). Strip it:
+
+```bash
+# Discover the actual role attached to the broker host (canonical name:
+# agentkeys-broker-host; some early deploys landed on different names):
+INSTANCE_PROFILE_ARN=$(aws ec2 describe-instances --region "$REGION" \
+  --filters "Name=ip-address,Values=$EIP" \
+  --query 'Reservations[].Instances[].IamInstanceProfile.Arn' --output text)
+
+ROLE=$(aws iam get-instance-profile \
+  --instance-profile-name "${INSTANCE_PROFILE_ARN##*/}" \
+  --query 'InstanceProfile.Roles[0].RoleName' --output text)
+echo "broker runtime role: $ROLE"
+
+# Audit attached policies:
+aws iam list-attached-role-policies --role-name "$ROLE"
+
+# Detach AmazonS3FullAccess if present:
+aws iam detach-role-policy --role-name "$ROLE" \
+  --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
+
+# Verify only the narrow inline policies remain (§6 BrokerAssumeData + the §6.1 ses:SendEmail grant):
+aws iam list-role-policies --role-name "$ROLE"
+aws iam list-attached-role-policies --role-name "$ROLE"
+```
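+
+The `describe-instances` lookup above yields an empty string (no instance matched, often a wrong region) or `None` (no profile attached). A defensive sketch that fails loudly instead of passing an empty name downstream; `profile_name_from_arn` is a hypothetical helper, not part of the repo:
+
+```bash
+# Hypothetical helper: extract the instance-profile name from its ARN,
+# or fail if the value is not an instance-profile ARN at all.
+profile_name_from_arn() {
+  local arn="$1"
+  case "$arn" in
+    arn:aws:iam::*:instance-profile/*)
+      printf '%s\n' "${arn##*/}" ;;
+    *)
+      echo "not an instance-profile ARN: '$arn' (wrong --region, or no profile attached?)" >&2
+      return 1 ;;
+  esac
+}
+```
+
+Usage: `PROFILE=$(profile_name_from_arn "$INSTANCE_PROFILE_ARN") || exit 1`, then pass `"$PROFILE"` to `get-instance-profile`.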
+
+## §8 Cloud-provider portability
+
+Each layer in §2–§6 has a close analog on the major providers (email on GCP is the exception: it needs a third-party service). The provisioning shape carries; only the API endpoints + JSON dialects differ.
+
+| Layer | AWS (current) | AliCloud (in progress) | GCP | Tencent Cloud |
+|---|---|---|---|---|
+| Privileged user | IAM user with `IAMFullAccess` | RAM user with `AliyunRAMFullAccess` | IAM service account with `roles/iam.securityAdmin` | CAM user with `AdministratorAccess` |
+| Runtime user | IAM user + access key | RAM user + AK/SK | Service account + key file (or Workload Identity) | CAM user + SecretId/SecretKey |
+| Data role | IAM role + assume policy | RAM role + assume policy | Service account + IAM bindings | CAM role + assume policy |
+| Federation | IAM OIDC provider | RAM IDaaS / OIDC provider | Workload Identity Pool | CAM OIDC provider |
+| Object store | S3 + bucket policy | OSS + bucket policy | Cloud Storage + IAM bindings | COS + bucket policy |
+| Email backend | SES + S3 receipt rule | DirectMail / SimpleDM + OSS event notification | SendGrid / Mailgun (no GCP-native) | SimpleDM + COS |
+| TLS termination | nginx + Let's Encrypt | nginx + Let's Encrypt | nginx + Let's Encrypt | nginx + Let's Encrypt |
+| Compute (broker host) | EC2 + EIP | ECS + EIP | Compute Engine + external IP | CVM + EIP |
+| DNS | Route 53 | AliCloud DNS | Cloud DNS | DNSPod / Cloud DNS |
+| Secrets storage | Secrets Manager / SSM Parameter Store | KMS Secrets Manager | Secret Manager | KMS |
+
+**Migration playbook (cloud → cloud):**
+
+1. Re-bind `scripts/operator-workstation.env` to the new provider's identifiers (account ID, region, role ARNs, bucket name).
+2. Re-run this doc top-to-bottom against the new provider.
+3. Re-run [`docs/cloud-setup.md`](cloud-setup.md) §1 (OIDC federation), substituting the provider's OIDC API.
+4. Re-run `scripts/setup-broker-host.sh` on the new host (the script doesn't care which cloud — it consumes already-provisioned identifiers).
+5. Re-run `scripts/setup-heima.sh` — the chain side is cloud-agnostic.
+6. Re-run the harness scripts to validate end-to-end.
+
+The boundary is sharp: the broker process itself contains zero cloud-specific code — it talks STS-compatible OIDC + S3-compatible PutObject/GetObject + SMTP-compatible SendEmail. Every cloud above offers all three primitives. The [`provisioner-scripts/email-backends/`](../provisioner-scripts/) directory documents the email-backend trait; a new backend slots in as `tencent-simpledm-cos` (or similar) with the same upstream API as `ses-s3`.
+
+## Related
+
+- Day-to-day broker re-deploys: [`docs/cloud-setup.md`](cloud-setup.md)
+- Chain bring-up: [`docs/heima-setup.md`](heima-setup.md)
+- CI activation: [`docs/ci-setup.md`](ci-setup.md)
+- Architecture (per-data-class buckets + isolation invariants): [`docs/spec/architecture.md`](spec/architecture.md) §17, §17.2
+- Future Tencent / TEE DKIM: [`docs/spec/heima-gaps-vs-desired-architecture.md`](spec/heima-gaps-vs-desired-architecture.md) §4
+- FAQ + troubleshooting: [`wiki/cloud-setup-faq.md`](../wiki/cloud-setup-faq.md)
diff --git a/docs/cloud-setup.md b/docs/cloud-setup.md
index 3a7820f..9dd75ad 100644
--- a/docs/cloud-setup.md
+++ b/docs/cloud-setup.md
@@ -1,7 +1,7 @@
 # Cloud setup — AgentKeys
 
-**Audience:** the operator provisioning the cloud account that hosts AgentKeys infrastructure.
-**Scope:** the prereqs that the idempotent [`scripts/setup-broker-host.sh`](../scripts/setup-broker-host.sh) entry point can't do for itself (DNS, SES, IAM, OIDC provider, S3 buckets). Run those once per account, then re-run the broker-host script as often as needed.
+**Audience:** the operator running ongoing broker re-deploys after first-time cloud-account bootstrap is done.
+**Scope:** OIDC federation activation (the per-broker security upgrade) + the [`scripts/setup-broker-host.sh`](../scripts/setup-broker-host.sh) runtime entry point + tear-down. **Prereqs handled in [`docs/cloud-bootstrap.md`](cloud-bootstrap.md)** — read that first if standing up a brand-new account or porting to another cloud provider.
 **Companion:** [`docs/heima-setup.md`](heima-setup.md) for chain bring-up, [`docs/ci-setup.md`](ci-setup.md) for CI activation.
 **FAQ + troubleshooting:** [`wiki/cloud-setup-faq.md`](../wiki/cloud-setup-faq.md).
 
@@ -12,11 +12,17 @@
 awsp agentkeys-admin
 set -a; source scripts/operator-workstation.env; set +a   # ${ACCOUNT_ID}, ${REGION}, ${BROKER_HOST}, ${BUCKET}, ...
 
-# 1. Per-account, one-shot, manual (this doc):
-#    §1 DNS subdomains, §2 SES domain identity, §3 IAM users + role,
-#    §4 OIDC federation provider + trust policy + bucket policy.
+# 0. First-time cloud-account bootstrap (cloud-bootstrap.md):
+#    DNS subdomains, SES domain identity, IAM users + roles, initial
+#    bucket policy. Run ONCE per account; re-enter only when migrating
+#    cloud providers or adding a second account.
 
-# 2. Per-broker-host, idempotent re-runnable (script):
+# 1. OIDC federation activation (this doc §1):
+#    Once the broker is publicly reachable, register the IAM OIDC
+#    provider + swap the role trust policy + apply PrincipalTag
+#    bucket policy. Per-broker, one-shot.
+
+# 2. Per-broker-host, idempotent re-runnable (this doc §2):
 sudo bash scripts/setup-broker-host.sh \
   --issuer-url "https://${BROKER_HOST}" \
   --account-id "${ACCOUNT_ID}" \
@@ -33,145 +39,17 @@ bash scripts/setup-heima.sh                                # see docs/heima-setu
 
 `setup-broker-host.sh` is **the single entry point** for every remote-host change (binary upgrades, systemd edits, env tweaks, nginx/certbot wiring, mock-server redeploys). Per [CLAUDE.md "Remote broker host"](../CLAUDE.md): no ad-hoc `systemctl` edits, no hand-built `scp`.
 
-The split: §1–§4 below sets up the **identifiers** (DNS names, IAM principals, OIDC trust, bucket policies); the script consumes those identifiers and stands up the actual processes.
-
-## 0. Identities — mental model
-
-| Identity | Type | Holds | Purpose |
-|---|---|---|---|
-| `agentkeys-admin` | IAM user | Long-lived access key | One-shot provisioning. Runs every command in this doc. IAM-admin scope. |
-| `agentkeys-broker` | IAM user | Long-lived access key | Operator's SSH-into-EC2 path via EC2 Instance Connect. No data-plane access. |
-| `agentkeys-daemon` | IAM user | Long-lived access key | Broker process at runtime. Only permission: `sts:AssumeRole` on the data role. |
-| `agentkeys-data-role` | IAM role | (assumed) | Holds the actual S3/SES permissions. `agentkeys-daemon` (Stage 6) or the OIDC provider (Stage 7) is allowed to assume. |
-| `agentkeys-vault-role` / `agentkeys-memory-role` | IAM role | (assumed) | Per-data-class roles (arch.md §17.2). Trust the OIDC provider; PrincipalTag-scoped to `bots/<actor_omni>/{credentials,memory}/*`. |
-| `agentkeys-broker-host` | IAM role | (assumed by EC2) | Optional. If the broker runs on EC2, attach as instance profile so the daemon never sees a static key. |
-
-The word "agent" already means three things (the AI agent, the AgentKeys product, an IAM role) — these roles hold **data-plane** permissions, so they're named `*-data-role` / `*-vault-role` / `*-memory-role`.
-
-## 1. DNS
-
-Two-and-six subdomains under your parent zone (e.g. `litentry.org`):
-
-| Host | Purpose | Set in |
-|---|---|---|
-| `${MAIL_DOMAIN}` (e.g. `bots.litentry.org`) | SES inbound | §2 |
-| `${BROKER_HOST}` (e.g. `broker.litentry.org`) | Broker TLS-terminating reverse proxy | §5 — A record to broker EIP |
-| `signer.${ZONE}` | Signer service (issue #74 step 1b) | §5 — A record to broker EIP (co-located today) |
-| `audit.${ZONE}` / `email.${ZONE}` / `cred.${ZONE}` / `memory.${ZONE}` | Service workers (issue #90) | §5 — same EIP (dev co-location) |
-
-For the bulk service-worker DNS, use [`scripts/dns-upsert-workers.sh`](../scripts/dns-upsert-workers.sh). The hostnames are the migration seam — when a worker moves to its own machine, only the A record changes.
-
-## 2. SES inbound mail
-
-```bash
-# Verify the SES domain identity
-aws sesv2 create-email-identity --region "$REGION" \
-  --email-identity "$MAIL_DOMAIN" \
-  --dkim-signing-attributes NextSigningKeyLength=RSA_2048_BIT
-
-# Publish DKIM + SPF + DMARC + MX in one Route 53 change (read DKIM tokens
-# from `aws sesv2 get-email-identity`, then upsert via Route 53 — see
-# wiki/cloud-setup-faq.md for the full record set).
-
-# Create the inbound bucket (30-day TTL on inbound/* objects)
-aws s3api create-bucket --region "$REGION" --bucket "$BUCKET" \
-  $([ "$REGION" != "us-east-1" ] && echo "--create-bucket-configuration LocationConstraint=$REGION")
-aws s3api put-public-access-block --region "$REGION" --bucket "$BUCKET" \
-  --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
-
-# Receipt rule: route mail for $MAIL_DOMAIN into s3://$BUCKET/inbound/*
-aws ses create-receipt-rule-set --rule-set-name agentkeys --region "$REGION" 2>/dev/null || true
-aws ses create-receipt-rule --region "$REGION" --rule-set-name agentkeys \
-  --rule "$(jq -n --arg domain "$MAIL_DOMAIN" --arg bucket "$BUCKET" '{
-    Name: "agentkeys-inbound", Enabled: true, ScanEnabled: true, TlsPolicy: "Optional",
-    Recipients: [$domain],
-    Actions: [{S3Action: {BucketName: $bucket, ObjectKeyPrefix: "inbound/"}}]
-  }')"
-aws ses set-active-receipt-rule-set --rule-set-name agentkeys --region "$REGION"
-
-# Verify the bot's sending identity (the broker's BROKER_EMAIL_FROM_ADDRESS
-# precheck refuses to boot if this isn't verified)
-bash scripts/ses-verify-sender.sh
-```
-
-**Sandbox vs production sending:** inbound is unaffected by SES sandbox; only **outbound** to arbitrary addresses needs Console → Support → "SES Sending Limits" → "Request Production Access".
-
-**Per-recipient routing Lambda (issue #83):** after §4 lands, the broker's role is intentionally denied read on `inbound/*`. Service-provisioning verification emails route to `bots/<wallet>/inbound/<msg>` via [`infra/ses-routing-lambda/deploy.sh`](../infra/ses-routing-lambda/deploy.sh). Idempotent, deploy once per AWS account.
-
-**Future Tencent Cloud port:** SES + S3 are the only AWS-specific layers in this doc. SimpleDM + COS slot in at the §3+ boundary — IAM model maps 1:1 onto CAM. The `provisioner-scripts/email-backends/` interface already abstracts the inbound contract.
-
-## 3. IAM identities
-
-The daemon user + data role are the boundary between manual provisioning (this doc) and the script-driven runtime (`setup-broker-host.sh`).
-
-### 3.1 The four principals
-
-```bash
-# Runtime user (broker process)
-aws iam create-user --user-name agentkeys-daemon
-aws iam create-access-key --user-name agentkeys-daemon
-#   → save AccessKeyId + SecretAccessKey to the operator's secret manager.
-#     NEVER commit. setup-broker-host.sh consumes these via the systemd
-#     env file written under /etc/agentkeys/.
-
-# Daemon may only assume the data role (no direct S3/SES grants).
-aws iam put-user-policy --user-name agentkeys-daemon \
-  --policy-name agentkeys-daemon-assume-role \
-  --policy-document "$(jq -n --arg acct "$ACCOUNT_ID" '{
-    Version:"2012-10-17",
-    Statement:[{Effect:"Allow", Action:"sts:AssumeRole",
-                Resource:"arn:aws:iam::\($acct):role/agentkeys-data-role"}]
-  }')"
-```
-
-For `agentkeys-admin` + `agentkeys-broker` (one-shot, you already have these per CLAUDE.md "AWS local-profile ↔ remote-IAM mapping"), confirm with `aws iam list-users`.
-
-### 3.2 The three data roles
-
-Per arch.md §17.2 (per-data-class isolation): separate roles for credentials + memory + email. Same trust shape, distinct inline policies and PrincipalTag scoping. Provision via the per-data-class helpers (idempotent):
-
-```bash
-bash scripts/provision-vault-bucket.sh        # agentkeys-vault-${ACCOUNT_ID}
-bash scripts/provision-vault-role.sh          # agentkeys-vault-role
-bash scripts/apply-vault-bucket-policy.sh     # v3 split-statement PrincipalTag policy
-
-bash scripts/provision-memory-bucket.sh
-bash scripts/provision-memory-role.sh
-bash scripts/apply-memory-bucket-policy.sh
-
-bash scripts/cleanup-mail-bucket-policy.sh    # restore email-only grants on $BUCKET
-```
-
-The data-role trust shape is shown in [§4.3](#43-trust-policy) below — it's the same template for all three roles. The inline grants differ per role (vault → credentials prefix; memory → memory prefix; data-role → mail prefix).
-
-### 3.3 SES sender grant (email-link auth prereq)
-
-The broker's runtime role needs `ses:SendEmail` on the verified sender identity for email-link auth. Add this statement to the data role's inline policy:
-
-```json
-{
-  "Effect": "Allow",
-  "Action": ["ses:SendEmail", "ses:SendRawEmail"],
-  "Resource": [
-    "arn:aws:ses:${REGION}:${ACCOUNT_ID}:identity/${BROKER_EMAIL_FROM_ADDRESS}",
-    "arn:aws:ses:${REGION}:${ACCOUNT_ID}:configuration-set/*"
-  ]
-}
-```
-
-The broker's `verify_sender_ready` precheck calls `ses:GetEmailIdentity` at boot and refuses to start if the identity isn't both verified AND grantable. Triggered without this grant: cryptic `AccessDenied: ses:SendEmail` at the magic-link send step.
-
-## 4. OIDC federation (Stage 7)
+## 1. OIDC federation (Stage 7)
 
 The broker mints OIDC JWTs that AWS STS validates via the broker's public JWKS endpoint. Three one-shot steps per account.
 
-### 4.1 Prereqs
+### 1.1 Prereqs
 
 - Broker reachable at `https://${BROKER_HOST}` over public TLS (`setup-broker-host.sh` provisions this with certbot).
 - `https://${BROKER_HOST}/.well-known/openid-configuration` returns 200 with the expected `issuer` + `jwks_uri`.
 - `https://${BROKER_HOST}/.well-known/jwks.json` returns at least one ES256 key.
 
-### 4.2 Register the OIDC provider
+### 1.2 Register the OIDC provider
 
 ```bash
 thumb=$(echo | openssl s_client -servername "$BROKER_HOST" \
@@ -187,7 +65,7 @@ aws iam create-open-id-connect-provider \
 
 **AWS validates the issuer URL byte-for-byte** against the JWT `iss` claim. Once the OIDC provider is registered, the URL is effectively immutable for the life of the deployment — switching means new provider ARN + new trust policy + new federated grants.
 
-### 4.3 Trust policy
+### 1.3 Trust policy (federated variant)
 
 Apply to each of the three data roles. Use `$ROLE` ∈ `{agentkeys-data-role, agentkeys-vault-role, agentkeys-memory-role}`.
 
@@ -204,7 +82,7 @@ aws iam update-assume-role-policy --role-name "$ROLE" --policy-document "$(jq -n
   }')"
 ```
 
-### 4.4 PrincipalTag-scoped bucket policy
+### 1.4 PrincipalTag-scoped bucket policy
 
 Per CLAUDE.md "Per-actor + per-data-class isolation invariants": every S3 read/write is scoped to `bots/${aws:PrincipalTag/agentkeys_actor_omni}/{credentials,memory}/*`. The split-statement v3 bucket policy is applied by [`scripts/apply-{vault,memory}-bucket-policy.sh`](../scripts/) — those scripts ARE the source of truth for the policy shape.
 
@@ -214,21 +92,21 @@ After §4.3 + §4.4: strip the §3 broad-bucket inline grant from the role's pol
 aws iam delete-role-policy --role-name "$ROLE" --policy-name agentkeys-data-role-s3-broad
 ```
 
-### 4.5 End-to-end proof
+### 1.5 End-to-end proof
 
 Run [`harness/v2-stage3-demo.sh`](../harness/v2-stage3-demo.sh) — it mints a session JWT → OIDC JWT → STS creds, then proves both POSITIVE (own prefix) and NEGATIVE (cross-actor prefix → AccessDenied) writes for both data classes plus the cross-role isolation matrix. Walks the full §17.2 isolation table from CLAUDE.md.
 
-## 5. Broker host: `setup-broker-host.sh`
+## 2. Broker host: `setup-broker-host.sh`
 
 §1–§4 set up identifiers. This step stands up the actual processes — broker + mock-server + signer + 4 service workers — on the EC2 host (or any Linux box with public-internet egress + the broker's hostname).
 
-### 5.1 Prereqs
+### 2.1 Prereqs
 
 - Fresh Linux host with sudo, systemd, public-internet egress, ports 80 + 443 open inbound (for certbot + nginx).
 - DNS A records for `${BROKER_HOST}` + `signer.${ZONE}` + `audit.${ZONE}` + `email.${ZONE}` + `cred.${ZONE}` + `memory.${ZONE}` all pointing at the host's public IP.
 - AWS credentials in `/etc/agentkeys/broker.env` (the script writes the file template; operator pastes the `agentkeys-daemon` access key from §3.1).
 
-### 5.2 Run
+### 2.2 Run
 
 ```bash
 # Bootstrap a fresh host:
@@ -258,7 +136,7 @@ The script:
 
 Auto-detects bootstrap vs upgrade by reading the existing systemd unit's `Environment=` lines. Pass `--ref <branch>` to opt into an in-script `git fetch + pull`.
 
-### 5.3 Verify
+### 2.3 Verify
 
 ```bash
 curl -sf "https://${BROKER_HOST}/healthz"                  # → 200
@@ -269,7 +147,9 @@ curl -sf "https://audit.${ZONE}/healthz"                   # → 200 (and friend
 
 For full E2E (broker + workers + chain + AWS), run the harness scripts — see [`docs/heima-setup.md`](heima-setup.md) for the chain side and [`docs/ci-setup.md`](ci-setup.md) for the automated path.
 
-## 6. Cleanup
+## 3. Cleanup (full account teardown)
+
+Tears down everything provisioned by both [`docs/cloud-bootstrap.md`](cloud-bootstrap.md) and this doc. Use only when retiring the deployment.
 
 Tear down the whole AgentKeys footprint in one account:
 
@@ -309,6 +189,7 @@ aws sesv2 delete-email-identity --email-identity "$MAIL_DOMAIN" --region "$REGIO
 
 ## Related
 
+- **First-time cloud-account bootstrap (prereq for this doc):** [`docs/cloud-bootstrap.md`](cloud-bootstrap.md)
 - Chain bring-up: [`docs/heima-setup.md`](heima-setup.md)
 - CI activation: [`docs/ci-setup.md`](ci-setup.md)
 - Broker host script (single entry point): [`scripts/setup-broker-host.sh`](../scripts/setup-broker-host.sh)
diff --git a/wiki/cloud-setup-faq.md b/wiki/cloud-setup-faq.md
index 2beca18..392a10c 100644
--- a/wiki/cloud-setup-faq.md
+++ b/wiki/cloud-setup-faq.md
@@ -1,6 +1,11 @@
 # Cloud setup — FAQ
 
-Troubleshooting + edge cases that didn't fit in [`docs/cloud-setup.md`](https://github.com/litentry/agentKeys/blob/main/docs/cloud-setup.md). Use ⌘F to find your error.
+Troubleshooting + edge cases for the two cloud-side operator docs:
+
+- [`docs/cloud-bootstrap.md`](https://github.com/litentry/agentKeys/blob/main/docs/cloud-bootstrap.md) — first-time provisioning (per account or per cloud provider).
+- [`docs/cloud-setup.md`](https://github.com/litentry/agentKeys/blob/main/docs/cloud-setup.md) — ongoing OIDC federation + broker-host re-deploys.
+
+Use ⌘F to find your error.
 
 ## Q. `setup-broker-host.sh` says "BROKER_OIDC_ISSUER mismatch" on re-run