diff --git a/docs/html-protocol.md b/docs/html-protocol.md
index 0de34eb..79649ef 100644
--- a/docs/html-protocol.md
+++ b/docs/html-protocol.md
@@ -10,18 +10,26 @@ Signed content uses the `` custom HTML element, as defined in th
 
 ### Required Attributes
 
+Per spec §2.1, the wrapper element carries exactly four required attributes:
+
 | Attribute | Description | Example |
 |---|---|---|
-| `signature` | Base64-encoded cryptographic signature of the content hash + domain + author ID | `signature="aBcDeF123..."` |
-| `keyid` | URL where the author's public key can be fetched, or a DID | `keyid="https://api.example.com/authors/123/public-key"` |
-| `algorithm` | Cryptographic algorithm used for the signature | `algorithm="ed25519"` |
-| `content-hash` | Hash of the canonicalized content, prefixed with the algorithm | `content-hash="sha256:abc123def456..."` |
+| `keyid` | Identifies the signer; resolved per the rules in **Identity and Key Resolution** below. May be a DID, a direct URL to a public key document, or a trust-directory reference. | `keyid="did:web:author.example"` |
+| `signature` | Base64-encoded (unpadded) cryptographic signature over the canonical binding string defined in **Signature Data Format** | `signature="aBcDeF123..."` |
+| `content-hash` | Hash of the canonicalized text content, prefixed with the hash algorithm | `content-hash="sha256:abc123def456..."` |
+| `algorithm` | Signature algorithm. Required by the spec; implementations MAY default to `ed25519` when the attribute is omitted, but producers SHOULD always emit it explicitly. | `algorithm="ed25519"` |
+
+### Optional Attributes
+
+There are **no** optional attributes on the `` wrapper itself in this revision. All claim and contextual metadata (author name, signed-at timestamp, license, content type, AI assistance, etc.) belongs in inner `` elements as documented under **Inner Metadata** below. This keeps the wrapper's attribute surface narrow and easy to validate.
+
+Presentational attributes such as `style` and `class` SHOULD NOT be set inline on ``. Styling is the user agent's responsibility (see the **CSS** section at the bottom of this document); inline presentational attributes mix concerns and are unnecessary for protocol conformance.
 
 ### Supported Algorithms
 
 | Value | Description |
 |---|---|
-| `ed25519` | Ed25519 (recommended) |
+| `ed25519` | Ed25519 (recommended; the default if the `algorithm` attribute is omitted) |
 | `rsa` | RSA with SHA-256 |
 | `ecdsa` | ECDSA with secp256k1 |
 
@@ -92,43 +100,131 @@ Or appear as a **standalone marker** alongside content (e.g., when added by a CM
 
 Both forms are valid. Verifying clients should handle either case.
 
-## Content Canonicalization
+## Identity and Key Resolution
+
+The `keyid` attribute identifies the signer but the resolution mechanism is deliberately **pluggable** (spec §2.2). Implementations MUST accept multiple resolution methods and SHOULD treat none as canonical or privileged. Three forms are defined:
+
+| Form | Example | How it resolves |
+|---|---|---|
+| **Decentralized Identifier (DID)** | `did:web:author.example` | The user agent fetches the DID document at the author's origin (`https://author.example/.well-known/did.json` for `did:web`) and extracts the public key. Places no dependency on any third party. |
+| **Direct URL to a public key** | `https://author.example/key.json` | The user agent fetches the URL and parses the response as either a JSON `{ publicKey, algorithm }` object or raw PEM. Simple to host as a static file with no extra tooling. |
+| **Trust directory reference** | `https://directory.example/keys/abc123` | The user agent fetches the URL from a federated trust directory acting as a convenience key registry. Useful for less-technical authors who prefer a sign-up workflow over self-hosted identity publication. |
+
+**No resolver is privileged by the protocol.** Authors freely choose a resolution mechanism, and verifiers freely choose which methods they accept. Verifiers typically compose the three resolvers as a fallback chain in whatever order suits their threat model. The `keyid` is opaque to the signature protocol itself; only the resolved public key matters for cryptographic verification, and that verification is a local operation in the user agent that never requires contacting a directory.
+
+User agents MAY cache resolved keys (with appropriate freshness and revocation handling) so that signature verification scales to pages with many signed sections without repeated network calls.
+
+## Canonical Content Extraction
+
+The hash that the signature covers is taken from the **text content** of the signed region, after the extraction and normalization process described below (spec §2.1). This is performed in two stages: HTML extraction, then text normalization.
+
+### Stage 1: HTML extraction
+
+Given the inner contents of a `` element:
+
+1. **Strip excluded elements** entirely, including their text content: `