diff --git a/docs/docs/designs/index.md b/docs/docs/designs/index.md
index f022b90..9fa7248 100644
--- a/docs/docs/designs/index.md
+++ b/docs/docs/designs/index.md
@@ -25,6 +25,7 @@ Current designs:
 - [App RGD Design](./app-rgd.md)
 - [AWS Lab Account](./aws-lab-account.md)
 - [Keycloak](./keycloak.md)
+- [Secrets and PKI](./secrets-and-pki.md)
 
 Once a design is implemented and considered durable, its steady-state shape
 should be folded back into the architecture overview and any relevant runbooks.
diff --git a/docs/docs/designs/secrets-and-pki.md b/docs/docs/designs/secrets-and-pki.md
new file mode 100644
index 0000000..e43872d
--- /dev/null
+++ b/docs/docs/designs/secrets-and-pki.md
@@ -0,0 +1,424 @@
+---
+title: Secrets and PKI
+description: Proposed design for bootstrap secrets, per-cluster Vault, and the lab PKI split.
+---
+
+# Secrets and PKI
+
+## Status
+
+Proposed.
+
+This document defines the target model for secrets and PKI across the lab. It
+connects the AWS bootstrap foundation from [AWS Lab Account](./aws-lab-account.md)
+to the future per-cluster Vault model and replaces the older router-hosted
+`step-ca` internal PKI design.
+
+## Purpose
+
+The lab needs two different secret systems:
+
+- a **bootstrap** system that works before any cluster or Vault instance exists
+- a **runtime** system that belongs to each cluster after that cluster exists
+
+The bootstrap system must be small, externally anchored, and able to recover
+the lab from a cold start. The runtime system must be local to the cluster it
+serves, because `platform`, `nonprod`, and `prod` are independent recovery and
+security domains.
+
+PKI follows the same split. Public HTTP TLS uses public trust through Let's
+Encrypt. Internal workload identity and mTLS use lab-managed authorities rooted
+in an AWS KMS-held root CA and delegated to cluster-local Vault or future SPIRE
+issuers.
+
+## Goals
+
+- Make AWS the authoritative control surface for bootstrap secret access.
+- Remove standing PGP and age decryptors from SOPS-encrypted bootstrap
+  material.
+- Allow short-lived, scope-limited access to selected bootstrap secrets.
+- Keep runtime secrets in each cluster's own Vault instance.
+- Prevent cross-cluster secret reads between `platform`, `nonprod`, and `prod`.
+- Use Let's Encrypt for browser-facing HTTP TLS.
+- Use cluster-local subordinate CAs for internal mTLS and future SPIFFE/SPIRE.
+- Retire the router-hosted `step-ca` internal issuer from the target design.
+
+## Non-Goals
+
+- This document does not define exact IAM policy JSON, Vault policy HCL, or
+  Kubernetes manifests.
+- This document does not define a general secret synchronization mechanism
+  between SOPS and Vault.
+- This document does not make the platform cluster a central secret broker for
+  downstream clusters.
+- This document does not federate SPIRE trust domains between clusters.
+- This document does not cover Keycloak runtime backups or realm
+  configuration; those remain in [Keycloak](./keycloak.md).
+
+## Design Summary
+
+Secrets are split into two layers:
+
+| Layer | Authority | Scope | Purpose |
+| --- | --- | --- | --- |
+| Bootstrap | AWS + GitHub App + SOPS | Pre-cluster and recovery | Retrieve enough material to create or recover clusters, Vault, and base platform services. |
+| Runtime | Per-cluster Vault | One cluster | Store and issue secrets for workloads in that cluster. |
+
+PKI is split into two layers:
+
+| Use case | Authority | Issuer |
+| --- | --- | --- |
+| HTTP TLS | Public WebPKI | Let's Encrypt through cert-manager DNS-01 and Route 53 |
+| Internal mTLS / workload identity | Lab internal PKI | Per-cluster Vault subordinate CAs, with future SPIRE authorities |
+
+The root invariant is:
+
+```text
+AWS grants the workload identity.
+GitHub App grants temporary repository read access.
+AWS KMS grants temporary scoped SOPS decrypt access.
+Cluster-local Vault grants runtime secret and certificate access.
+```
+
+## Bootstrap Secrets
+
+### Source of truth
+
+Bootstrap secret payloads live in the private `GilmanLab/secrets` repository.
+Public repositories keep only templates, variable contracts, and lookup logic.
+
+The `secrets` repository holds only material that is needed before a cluster's
+Vault is ready, or material needed to recover that Vault. Examples include:
+
+- Talos and cluster bootstrap material
+- initial Vault unseal/recovery storage configuration
+- GitHub App bootstrap material
+- credentials needed to create the first runtime secret sources
+- emergency recovery material that cannot live inside the thing it recovers
+
+Runtime application secrets do not belong in SOPS once Vault exists.
+
+### SOPS over AWS KMS
+
+All existing SOPS files are rewrapped with the customer-managed KMS key in the
+current `lab` AWS account:
+
+```text
+alias/glab-sops
+arn:aws:kms:us-west-2:186067932323:key/2aba1d94-6eaf-4d80-8d26-2077f32fd7c5
+```
+
+The existing PGP and age recipients are removed from SOPS metadata. After the
+cutover, AWS is the only routine decrypt control surface.
+
+This is an intentional break from the current state. Today, `secrets/.sops.yaml`
+contains only age and PGP recipients. The KMS key exists, but the SOPS
+recipient rollout has not been performed.
+
+### Scoped decryption
+
+SOPS files are encrypted with AWS KMS encryption context so IAM can grant
+decrypt access by scope. Scopes are file/path oriented; SOPS is not a
+field-level authorization system.
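
As a rough illustration of scope-based decrypt grants, the sketch below builds an IAM-style statement that allows `kms:Decrypt` only for one encryption context, then checks a request context against it. It is a minimal sketch under stated assumptions, not this design's policy: the helper names `scoped_decrypt_statement` and `context_allows`, the placeholder key ARN, and the `Repo`/`Scope` context keys are illustrative (the exact context keys remain an open thread), and the matcher is a toy stand-in for real IAM evaluation. The `kms:EncryptionContext:<key>` condition key form is the one AWS KMS documents.

```python
import json

# Prefix of the AWS KMS condition keys that match encryption context pairs.
CONTEXT_PREFIX = "kms:EncryptionContext:"


def scoped_decrypt_statement(key_arn: str, repo: str, scope: str) -> dict:
    """Build one IAM-style statement allowing kms:Decrypt for a single scope.

    The Repo/Scope context key names are assumptions for illustration.
    """
    return {
        "Effect": "Allow",
        "Action": "kms:Decrypt",
        "Resource": key_arn,
        "Condition": {
            "StringEquals": {
                CONTEXT_PREFIX + "Repo": repo,
                CONTEXT_PREFIX + "Scope": scope,
            }
        },
    }


def context_allows(statement: dict, request_context: dict) -> bool:
    """Toy check: every required context pair must match the request exactly.

    This only models the StringEquals condition; it is not an IAM evaluator.
    """
    required = statement["Condition"]["StringEquals"]
    return all(
        request_context.get(key[len(CONTEXT_PREFIX):]) == value
        for key, value in required.items()
    )


if __name__ == "__main__":
    # Placeholder ARN, not the lab's real key.
    stmt = scoped_decrypt_statement(
        "arn:aws:kms:us-west-2:111122223333:key/EXAMPLE",
        "GilmanLab/secrets",
        "network-tailscale",
    )
    print(json.dumps(stmt, indent=2))
    print(context_allows(stmt, {"Repo": "GilmanLab/secrets", "Scope": "vault-prod"}))
```

The point of the shape is that the role never names files; it names a context, so a file encrypted under a different scope fails the condition regardless of where it sits in the repository.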
+
+Example scope layout:
+
+```text
+network/tailscale/*       Scope=network-tailscale
+network/vyos/*            Scope=network-vyos
+compute/talos/platform/*  Scope=talos-platform
+vault/platform/*          Scope=vault-platform
+vault/nonprod/*           Scope=vault-nonprod
+vault/prod/*              Scope=vault-prod
+```
+
+Each encrypted file includes KMS context similar to:
+
+```yaml
+Repo: GilmanLab/secrets
+Scope: network-tailscale
+```
+
+A workload that needs only `network/tailscale/*` receives short-lived AWS
+credentials for a role that can call `kms:Decrypt` only when the request's
+encryption context has `Repo=GilmanLab/secrets` and
+`Scope=network-tailscale`.
+
+Because KMS encryption context is authenticated data bound to the encrypted
+data key, changing the SOPS file metadata to a different scope does not allow
+the ciphertext to decrypt.
+
+### Repository access
+
+Private repository access uses a GitHub App owned by `GilmanLab` and installed
+on `GilmanLab/secrets`.
+
+The App private signing key is stored in SSM Parameter Store as a SecureString
+in the `lab` account. A bootstrap workload uses its AWS identity to read that
+specific SSM parameter, generates a GitHub App JWT, exchanges it for a
+short-lived installation token, and clones the repository over HTTPS.
+
+Installation tokens are requested with the narrowest useful shape:
+
+```text
+repositories = ["secrets"]
+permissions = {"contents": "read"}
+ttl = 1 hour
+```
+
+The GitHub token grants access to encrypted files. AWS KMS grants access to
+plaintext. A workload may be able to clone the whole private repository while
+still being unable to decrypt files outside its KMS context scope.
+
+### Historical exposure
+
+Removing PGP and age recipients from current SOPS files does not remove their
+ability to decrypt old git revisions. Any bootstrap secret that must become
+AWS-authoritative retroactively is rotated after the KMS cutover.
+
+History rewrite is possible but is not the default.
For this lab, rotating
+affected secrets is the simpler and more auditable path.
+
+## Runtime Secrets
+
+### Per-cluster Vault
+
+Each cluster runs its own HashiCorp Vault instance managed by `bank-vaults`.
+Vault is the runtime source of truth for secrets in that cluster.
+
+The intended cluster split is:
+
+| Cluster | Vault scope |
+| --- | --- |
+| `platform` | Platform-cluster services and platform control-plane needs |
+| `nonprod` | Non-production workloads |
+| `prod` | Production workloads |
+
+Vault instances do not read each other's storage, policies, tokens, or secret
+paths. `prod` does not depend on `nonprod`; `nonprod` does not depend on
+`prod`; `platform` does not become a universal secret broker.
+
+### Environment separation
+
+Clusters that host multiple environments segregate secrets by path and policy.
+For `nonprod`, the baseline shape is:
+
+```text
+dev/*
+staging/*
+```
+
+Path naming is not the security boundary by itself. Vault auth roles and
+policies enforce which workloads, namespaces, and service accounts can read or
+write each prefix.
+
+### Bootstrap-to-runtime handoff
+
+SOPS may seed initial Vault configuration and initial secret material during
+cluster bootstrap. Once the cluster is operating, runtime mutation belongs in
+Vault.
+
+There is no bidirectional SOPS-to-Vault synchronization loop. That would create
+two sources of truth. The direction is:
+
+```text
+SOPS bootstrap material -> initialize/configure Vault -> Vault owns runtime
+```
+
+If a runtime secret must be recovered from bootstrap material, the recovery
+process is explicit and documented for that secret class.
+
+### Vault unseal material
+
+For cost management, Vault unseal and root/recovery material may share one
+customer-managed AWS KMS key across clusters. The KMS key wraps distinct
+per-cluster Vault material; it is not a shared Vault unseal key.
+
+Isolation is enforced with:
+
+- per-cluster S3 prefixes or buckets for bank-vaults storage
+- per-cluster IAM roles
+- KMS encryption context such as `Purpose=vault-unseal` and
+  `Cluster=nonprod`
+
+Example:
+
+```text
+KMS key: alias/glab-vault-unseal
+
+S3:
+  s3://glab-vault-unseal/platform/*
+  s3://glab-vault-unseal/nonprod/*
+  s3://glab-vault-unseal/prod/*
+
+KMS context:
+  Purpose = vault-unseal
+  Cluster = platform | nonprod | prod
+```
+
+One KMS key per cluster would provide cleaner blast-radius isolation, but the
+fixed monthly KMS cost is not worth it at this lab scale.
+
+## Public HTTP TLS
+
+HTTP TLS certificates are always issued by Let's Encrypt through ACME DNS-01
+against Route 53.
+
+Cluster responsibilities:
+
+- ExternalDNS manages service DNS records.
+- cert-manager manages ACME orders, challenges, and certificate renewal.
+
+ExternalDNS does not manage `_acme-challenge` TXT records. Those belong to
+cert-manager.
+
+The AWS account already contains a public Route 53 ACME validation zone,
+`acme.glab.lol`, delegated from Cloudflare. The cluster TLS design uses that
+zone rather than granting cluster workloads broad Cloudflare DNS access.
+
+The challenge delegation convention is:
+
+```text
+_acme-challenge..glab.lol
+  CNAME _acme-challenge...acme.glab.lol
+```
+
+cert-manager is configured to follow CNAMEs and write TXT records into the
+Route 53 ACME zone using short-lived AWS credentials. Each cluster's AWS role
+is scoped to its own challenge names.
+
+No wildcard certificate is assumed. Individual services receive individual
+certificates unless a future workload proves a wildcard is worth the broader
+blast radius.
+
+## Internal PKI
+
+### Root CA
+
+The internal PKI root is an AWS KMS asymmetric signing key. The private key
+never leaves KMS.
+
+The existing `infra/security/pki/root-ca` stack was applied against an old AWS
+account and is not the target root.
The target implementation recreates the
+root CA in the current `lab` account, then cleans up the old-account root key
+and state.
+
+During recreation, the root certificate's path length is increased from the
+current `pathlen:1` model. The recommended target is `pathlen:2`:
+
+```text
+Root CA pathlen:2
+  -> cluster Vault intermediate pathlen:1
+    -> SPIRE intermediate pathlen:0
+      -> workload SVID leaves
+```
+
+This keeps the future SPIRE path open without requiring another root rotation.
+For clusters where Vault directly issues mTLS leaves, the same hierarchy still
+works:
+
+```text
+Root CA pathlen:2
+  -> cluster Vault intermediate pathlen:1
+    -> workload mTLS leaves
+```
+
+Root signing is operationally offline. No always-on lab workload has standing
+permission to use the root key. Root signing is used only to mint or rotate
+cluster subordinate CAs.
+
+### Cluster subordinate CAs
+
+Each cluster gets its own subordinate CA, generated and held by that cluster's
+Vault instance. Vault generates the intermediate private key and CSR; the AWS
+KMS root signs the CSR; the signed intermediate is imported back into Vault.
+
+The cluster subordinate CA identity includes the cluster name. Example common
+names:
+
+```text
+glab platform Vault CA
+glab nonprod Vault CA
+glab prod Vault CA
+```
+
+Vault PKI roles issue short-lived certificates for internal use cases such as:
+
+- service-to-service mTLS
+- database client authentication
+- internal controllers that need X.509 credentials
+- future SPIRE upstream authority material
+
+### SPIFFE and SPIRE
+
+SPIRE is a future addition, not a baseline dependency.
+
+The first SPIRE deployment in each cluster uses an independent trust domain and
+does not federate with other clusters. That keeps `platform`, `nonprod`, and
+`prod` aligned with the Vault isolation model.
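
One way to picture independent trust domains: a workload in one cluster accepts only SPIFFE IDs whose trust domain matches its own. The sketch below is a hedged illustration, assuming hypothetical trust domain names such as `nonprod.glab.example` and hypothetical helper names `trust_domain` and `is_local`. This design does not fix trust domain naming, and a real deployment would rely on SPIRE's own SVID validation rather than string checks.

```python
from urllib.parse import urlsplit


def trust_domain(spiffe_id: str) -> str:
    """Return the trust domain (authority component) of a SPIFFE ID URI."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    return parts.netloc


def is_local(spiffe_id: str, local_trust_domain: str) -> bool:
    """True only when the ID belongs to this cluster's own trust domain.

    With no federation, IDs from any other trust domain are rejected.
    """
    return trust_domain(spiffe_id) == local_trust_domain


if __name__ == "__main__":
    # Hypothetical names; the design leaves trust domain naming undecided.
    local = "nonprod.glab.example"
    for candidate in (
        "spiffe://nonprod.glab.example/ns/app/sa/api",
        "spiffe://prod.glab.example/ns/app/sa/api",
    ):
        print(candidate, is_local(candidate, local))
```

Because each cluster has its own trust domain and its own issuing chain, a `prod` identity presented inside `nonprod` fails both the name check sketched here and certificate chain validation against the local trust bundle.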
+
+When SPIRE is introduced, the preferred shape is:
+
+```text
+cluster Vault CA -> SPIRE intermediate -> workload SVIDs
+```
+
+The SPIRE intermediate is local to the cluster. Federation is deferred until a
+real cross-cluster workload requires it.
+
+## Retiring step-ca
+
+The older architecture placed `Smallstep step-ca` on `VP6630` as the online
+internal intermediate CA. That was the right bootstrap-oriented first shape,
+but it is not the target model for this design.
+
+In this design:
+
+- public HTTP TLS moves to Let's Encrypt and Route 53 DNS-01
+- internal runtime issuance moves to per-cluster Vault
+- future workload identity moves to per-cluster SPIRE
+- `step-ca` is removed once its remaining consumers have migrated
+
+The root CA migration is the natural time to make this break. Old chains can
+expire or be replaced as consumers move to the new issuers.
+
+## Implementation Slices
+
+This design should be implemented in small slices:
+
+1. Rewrap existing SOPS files with `alias/glab-sops`, add encryption context,
+   and remove PGP/age recipients.
+2. Rotate bootstrap secrets that previously depended on PGP/age-only history.
+3. Create the GitHub App + SSM bootstrap path for `secrets` repo access.
+4. Recreate the internal root CA in the current `lab` AWS account with the new
+   path length.
+5. Clean up the old-account PKI root after the new root is usable.
+6. Add the shared Vault unseal KMS key and bank-vaults storage layout.
+7. Stand up Vault in one cluster and prove SOPS -> Vault bootstrap handoff.
+8. Add cert-manager DNS-01 with Route 53 ACME delegation for one cluster.
+9. Migrate internal PKI consumers away from `step-ca`.
+
+## Open Threads
+
+- Exact KMS encryption context keys and scope names.
+- Exact GitHub App name, installation ID storage, and SSM parameter path.
+- Whether Vault unseal material uses one shared S3 bucket with prefixes or
+  separate per-cluster buckets.
+- Whether the new root CA should use `pathlen:2` exactly or a larger value.
+- How trust bundles are distributed to workloads that need to trust internal
+  Vault or SPIRE issuers.
+- Whether public TLS certificates should ever use wildcards.
+- When old `step-ca` certificates are allowed to expire versus being actively
+  replaced.
+
+## References
+
+- [AWS Lab Account](./aws-lab-account.md)
+- [Keycloak](./keycloak.md)
+- [Multi-Cluster GitOps Model](./gitops-multi-cluster.md)
+- [cert-manager Route 53 DNS-01](https://cert-manager.io/docs/configuration/acme/dns01/route53/)
+- [cert-manager delegated DNS-01](https://cert-manager.io/docs/configuration/acme/dns01/#delegated-domains-for-dns01)
+- [Vault PKI secrets engine](https://developer.hashicorp.com/vault/docs/secrets/pki)
+- [Vault PKI intermediate guidance](https://developer.hashicorp.com/vault/docs/secrets/pki/considerations)
+- [Bank-Vaults unseal keys](https://bank-vaults.dev/docs/concepts/unseal-keys/)
+- [SPIRE configuration](https://spiffe.io/docs/latest/deploying/configuring/)