config: add secret_source=metadata mode and ${secret:NAME} interpolation#92
config: add secret_source=metadata mode and ${secret:NAME} interpolation#92
Conversation
Phase C Cloud tenants need to fetch the provider API key from the host metadata gateway instead of reading process.env. The loader now supports two modes: the existing default secret_source: env continues to read process.env unchanged, and the new secret_source: metadata fetches the named secret from http://169.254.169.254/v1/secrets/<name>, populates process.env.ANTHROPIC_API_KEY and ANTHROPIC_AUTH_TOKEN, and walks the parsed config replacing ${secret:NAME} references with their resolved plaintext. Self-host installs that omit secret_source see no behaviour change. Whole-string interpolation only, so partial-string references in URLs and other composed values are intentionally left untouched to avoid leaking plaintext through any code path that logs the original string. Adds MetadataSecretFetcher (60s TTL cache, ETag/If-None-Match against X-Phantom-Rotation-Id) and updates two callers (src/index.ts, src/cli/doctor.ts) to await the now-async loadConfig. The original sync code path is preserved as loadConfigSync for any context that genuinely cannot await.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0264481cce
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // are set, mirroring the dual-header pattern in buildProviderEnv: the bundled | ||
| // Agent SDK auth factory prefers ANTHROPIC_API_KEY, but third-party proxies | ||
| // sometimes accept only ANTHROPIC_AUTH_TOKEN. | ||
| const providerToken = await fetcher.get(config.provider.secret_name); |
There was a problem hiding this comment.
Avoid forcing provider_token for no-auth providers
This unconditional fetch makes secret_source: metadata fail fast for providers that intentionally do not use credentials (for example ollama, vllm, or custom without api_key_env). In those configs, a missing/unreachable provider_token now blocks startup even though the provider path itself is valid without a token, which is a functional regression from previous behavior. Gate this fetch on providers that actually require auth (or on explicit opt-in).
Useful? React with 👍 / 👎.
| // but a future caller could invoke get() directly. Reject anything that | ||
| // could smuggle a path or query into the gateway URL. | ||
| if (!VALID_SECRET_NAME.test(name)) { | ||
| throw new Error(`metadata: invalid secret name: ${name}`); |
There was a problem hiding this comment.
Remove raw secret-name echo from validation errors
The invalid-name error includes the unredacted name value. If an operator mistakenly pastes a real token into provider.secret_name, that plaintext is reflected into logs/diagnostics via this message, undermining the no-plaintext-in-errors security invariant. Prefer a generic validation message (or masked identifier) rather than interpolating user input.
Useful? React with 👍 / 👎.
… tests) Tighten secret_name schema regex to match the metadata fetcher's defense-in-depth check so invalid names fail at parse time rather than crashing at boot inside the fetcher (finding 1). Split loader.test.ts into loader.test.ts and loader-metadata.test.ts so neither file blows the 300-line cap by much, with duplicated writeYaml/cleanup helpers and distinct TEST_DIR paths to avoid races (finding 2). Document the single as-unknown-as cast at the loader.ts header so future maintainers know it is the only typed/generic boundary in the file (finding 3). Add two error-path regression tests to metadata-fetcher.test.ts for the 304-without-cache server-bug branch and the wrapped network-error branch (finding 5).
Summary
secret_source(defaultenv) andsecret_source_urltoPhantomConfigSchema, plusprovider.secret_name(defaultprovider_token).MetadataSecretFetcher(src/config/metadata-fetcher.ts) speaks the host metadata gateway wire contract: GET/v1/secrets/<name>, 60s in-process TTL cache,If-None-MatchagainstX-Phantom-Rotation-Id, and errors that include status and secret name but never plaintext.loadConfigbecomes async. The defaultsecret_source: envpath is sync-fast and byte-identical to today's behaviour. Whensecret_source: metadata, the loader pre-populatesprocess.env.ANTHROPIC_API_KEY/ANTHROPIC_AUTH_TOKENso the unchangedbuildProviderEnvJust Works, then walks the parsed config replacing${secret:NAME}whole-string references with their resolved plaintext.${secret:NAME}regex matches the entire string value only. Partial matches likehttps://hook/?token=${secret:provider_token}are intentionally NOT resolved; this prevents secret leakage through any code path that logs URLs or composed strings.loadConfigSync. Two callers (src/index.ts:79,src/cli/doctor.ts:93) updated toawait loadConfig(). Test fixtures updated where TypeScript fixture literals needed the new fields explicit.Test plan
bun test(1841 tests, 1831 pass, 10 skip, 0 fail; +12 over baseline)bun run typecheckbun run lintsecret_sourcecontinue to load with no behaviour changeSpec
local/2026-04-24-phase-c-research/phase-c-implementation-plan.md./v1/secrets/<name>), error model at 7.7, ETag/rotation_id at 7.8.