diff --git a/.context/DECISIONS.md b/.context/DECISIONS.md index 706fa02ab..94da0b42d 100644 --- a/.context/DECISIONS.md +++ b/.context/DECISIONS.md @@ -3,6 +3,7 @@ | Date | Decision | |----|--------| +| 2026-05-23 | MCP gateway not worth the coupling cost; companion tools stay peer-MCP and remain not-vouched-for-by-ctx | | 2026-05-23 | Keep `i18n.Fold` strict; add `i18n.MatchKey` as the separate diacritic-insensitive primitive | | 2026-05-22 | OpenCode plugin: agent shell tool not anchored to project root under cwd-anchored | | 2026-05-21 | Substrate vs. artifact placement: .context/ vs. project root | @@ -149,6 +150,34 @@ For significant decisions: --> +## [2026-05-23-020000] MCP gateway not worth the coupling cost; companion tools stay peer-MCP and remain not-vouched-for-by-ctx + +**Status**: Accepted + +**Context**: Builds on the 2026-03-12 "Recommend companion RAGs as peer MCP servers not bridge through ctx" and the earlier 2026-03-06 "Peer MCP model for external tool integration" decisions. Those framed the choice as architectural (markdown-on-filesystem invariant, avoid plugin registries). The new framing, surfaced during the triage of architecture-pipeline tasks, names a stronger ownership-shaped reason: an MCP gateway through ctx would couple ctx to the lifecycle of every gatewayed tool. If ctx proxied GitNexus, users couldn't independently `pip install gitnexus` or uninstall it — ctx would become the install/uninstall surface, the upgrade path, the version-compatibility owner. That coupling is a tax we don't want to pay for a tool we don't ship. + +**Decision**: MCP gateway not worth the coupling cost; companion tools stay peer-MCP and remain not-vouched-for-by-ctx. + +**Rationale**: Three independent considerations converge: + +1. **Composition is already MCP's job.** Agents already compose multiple MCP servers. Adding a gateway through ctx duplicates the composition layer without adding capability — the agent could just talk to GitNexus directly. The peer model preserves that property. +2. **Ownership coupling is bidirectional.** A gateway makes ctx vouch for the peer (install, uninstall, version compatibility, error surface translation). It also makes the peer's failures surface as ctx failures from the agent's perspective, blurring the diagnostic boundary. Both directions add support burden disproportionate to the value of "one extra abstraction layer". +3. **The skills already work without it.** `/ctx-architecture-enrich` and `/ctx-architecture-failure-analysis` reference GitNexus by name in their SKILL.md instructions. The agent invokes GitNexus directly via its own MCP client. No gateway involved, no abstraction needed — the skill names the tool it expects and the agent either has it configured or doesn't. Doctor-style checks (existing TASKS.md item at line 1346) handle the "is it there?" surface without proxying. + +Alternatives considered and rejected: (1) Gateway through ctx — rejected for the ownership reasons above. (2) Pluggable graph-tool abstraction with multiple candidate implementations (the now-skipped TASKS.md item) — implies ctx vouches for the interface contract across implementations, same ownership trap. (3) Optional gateway as opt-in — added complexity without removing the coupling for users who opt in; cleaner to have no gateway at all. + +**Consequence**: + +- **Pluggable graph tool interface task** (TASKS.md "Explore pluggable graph tool interface", `#added:2026-03-25-120000`) **skipped** as a direct consequence — pluggability without ownership is incoherent. +- **GitNexus stays named-by-convention** in skill text. SKILL.md instructions can reference `gitnexus.*` MCP tool names directly; agents either have the configuration or fail explicitly. +- **Architecture pipeline 4th step** (`ctx-architecture-next`, added today) is *itself* gateway-free: it consumes only the Markdown artifacts produced by the prior three steps, so the synthesis layer has no MCP dependency at all. That's the right shape for any future pipeline-completing skill: read what's on disk, write a new artifact. +- **Doctor / preflight checks** for companion-tool availability remain valid (TASKS.md line 1346, "Update `ctx doctor` to check for graph tool availability"). Checking that a peer exists is not the same as proxying through it. +- **The earlier 2026-03-12 peer-MCP decision is not superseded** — it's reinforced. This entry adds the ownership lens; the architectural reasoning from that entry still applies. + +See also: `ideas/spec-companion-intelligence.md` (the original peer-MCP design), `ideas/gitnexus-contextmode-analysis.md`, the now-skipped pluggable-interface task in TASKS.md. + +--- + ## [2026-05-23-001500] Keep `i18n.Fold` strict; add `i18n.MatchKey` as the separate diacritic-insensitive primitive **Status**: Accepted diff --git a/.context/LEARNINGS.md b/.context/LEARNINGS.md index 6d80a89ec..549239042 100644 --- a/.context/LEARNINGS.md +++ b/.context/LEARNINGS.md @@ -17,6 +17,7 @@ DO NOT UPDATE FOR: | Date | Learning | |----|--------| +| 2026-05-23 | Closing a stale TASKS.md item often means writing the test, not the code — verify before assuming the work is undone | | 2026-05-23 | Unicode block separation makes diacritic-stripping surgical — no per-script handling needed for Arabic/Indic/Hebrew/CJK | | 2026-05-22 | vitest's mocked `execFile` fires callbacks synchronously; real Node defers to `process.nextTick` — closure-capture patterns can TDZ-trap under the mock | | 2026-05-22 | Double-excluded tests rot compounding — re-enable cost = sum of all drift since last green, not just the original bug | @@ -158,6 +159,16 @@ DO NOT UPDATE FOR: --- +## [2026-05-23-003000] Closing a stale TASKS.md item often means writing the test, not the code — verify before assuming the work is undone + +**Context**: TASKS.md line 375 ("Improve hub failover client: distinguish auth errors from connection errors") had been open since 2026-04-08. On triage, `internal/hub/failover.go:61-63` already called `authErr(callErr)` and returned immediately on Unauthenticated/PermissionDenied; `internal/hub/err_check.go:22-30` `authErr()` checked exactly those two codes. The behavior was implemented in the original failover feature commit (8bcb6208) without the task being closed. But the test suite never asserted the invariant — three existing failover tests covered happy path, skip-bad-peer, and all-bad-peers, none of them exercised "auth fails → walk stops". A future refactor could have silently deleted the auth-fast-fail branch and all three would still pass. Commit 22cffc27 added `TestFailoverClient_FailsFastOnAuthError` and closed the task. + +**Lesson**: Stale TASKS.md items frequently describe work that's *already done in code* but *not asserted in tests*. The task stays open not because nothing happened but because nothing pinned the behavior down so the task author could mark it complete. Reading a task description and assuming the code surface is missing is a misdiagnosis. The right pattern: `git log` / `git blame` / grep the symbols the task names; if the implementation exists, the task's value shifts from "build the thing" to "lock the thing down with a test that would catch its regression". Closes the task AND defends the behavior. + +**Application**: When triaging TASKS.md, especially items older than a few weeks, run a "what's the implementation status?" sweep before scoping work. For each candidate: grep the function/file/behavior the task names; if it exists, check the test file for an assertion that exercises the named invariant (not just adjacent invariants). If the assertion is missing, the task closes by writing the regression test — frequently a single test function. This pattern applies to behavior-named tasks ("X should fail fast on Y", "Z should reject malformed W") much more than to feature-named tasks ("add the X command"). For ctx specifically, hub/connect/replication-adjacent tasks accreted this way during the original implementation push; the failover-auth task was one example, others (file locking on connect sync, fanout broadcast entry loss) are still on TASKS.md and may warrant the same triage. + +--- + ## [2026-05-23-001000] Unicode block separation makes diacritic-stripping surgical — no per-script handling needed for Arabic/Indic/Hebrew/CJK **Context**: While building `i18n.MatchKey` (commit 978582f5) for diacritic-insensitive placeholder matching, the natural reflex was "this is going to need per-script special cases — CJK doesn't have case, Arabic has shadda/fatha that are meaning-changing, Bengali vowel signs are script-essential, Hebrew niqqud distinguishes words." I sized the work assuming we'd need a script-aware policy, possibly with a locale config or an opt-in flag for "strip all combining marks" vs "strip only Latin-style decoration". Empirical test across Turkish/German/French/Spanish/Catalan/Czech/Vietnamese (should collapse) and Arabic/Bengali/Devanagari/Hindi/Hebrew/Chinese/Korean (should preserve) showed the entire policy fits in one numeric range: U+0300..U+036F. diff --git a/.context/TASKS.md b/.context/TASKS.md index 0c6eeba00..5064c81b2 100644 --- a/.context/TASKS.md +++ b/.context/TASKS.md @@ -372,30 +372,23 @@ TASK STATUS LABELS: - [ ] Human: test `ctx init` on a fresh ubuntu install. -- [ ] Improve hub failover client: distinguish auth errors +- [x] Improve hub failover client: distinguish auth errors (Unauthenticated/PermissionDenied) from connection errors. Fail fast on auth failures instead of cycling through all peers with the same invalid token. - #priority:low #added:2026-04-08-194612 - -- [ ] Add file locking to ctx connect sync state to prevent concurrent sync - races. Two sync processes (hook + manual) can both load the same LastSequence, - process the same entries, and write duplicate content to .context/shared/. - #priority:medium #added:2026-04-08-194557 - -- [ ] Fix fanout broadcast entry loss: non-blocking send drops entries to slow - listeners silently. Log when entries are dropped. Consider per-listener - backpressure or disconnect-on-lag. Buffer of 64 is too small for busy hubs. - #priority:medium #added:2026-04-08-194542 - -- [ ] Prevent duplicate client registration in hub store: RegisterClient should - reject if ProjectName already exists. Add token revocation support (delete - client by ID/project). Currently tokens are valid forever with no way to - disable compromised ones. #priority:medium #added:2026-04-08-194529 - -- [ ] Fix hub cluster: NewCluster result is discarded (not stored on Server), so - Raft runs but leadership status is never queryable. Store cluster reference on - Server, wire IsLeader/LeaderAddr into Status RPC and hub status command. - #priority:medium #added:2026-04-08-194511 + #priority:low #added:2026-04-08-194612 #completed:2026-05-23 + Implementation already landed (commit 8bcb6208, the original failover + feature): `internal/hub/failover.go:61-63` calls `authErr(callErr)` and + returns immediately on auth errors; `internal/hub/err_check.go:22-30` + `authErr()` checks both `codes.Unauthenticated` and + `codes.PermissionDenied`. The task was open because no test specifically + asserted the auth-fast-fail path — the three existing failover tests + cover happy-path, skip-bad-peer, and all-bad-peers but not the + "stop walking on auth failure" invariant. Added + `TestFailoverClient_FailsFastOnAuthError`: seeds a bogus token, lists + two peers (real server first, unrouted port second), asserts the + returned gRPC code is Unauthenticated/PermissionDenied rather than + Unavailable — an Unavailable would prove the walk cycled past auth + into the unrouted second peer (the exact regression to catch). - [ ] Use crypto/subtle.ConstantTimeCompare for hub token validation instead of string equality. Current Store.ValidateToken uses == which is vulnerable to @@ -522,6 +515,54 @@ global state mutation. discovered patterns against known failure modes in similar systems. #added:2026-03-25-060000 +- [ ] ctx-architecture-next — fourth step in the architecture + pipeline (map → enrich → hunt → **prescribe**). + **Context**: The three existing skills produce inputs + (`ARCHITECTURE.md`, `DETAILED_DESIGN*.md` from + `/ctx-architecture`; enriched verifications from + `/ctx-architecture-enrich`; ranked failure inventory from + `/ctx-architecture-failure-analysis`'s `DANGER-ZONES.md`). + But the agent then has to synthesize "so what do I DO?" on + its own, every time, from those raw artifacts. The fourth + step closes the pipeline by producing `NEXT-ACTIONS.md` — + a sequenced, prioritized fix plan that maps each danger + zone to a concrete next move (refactor, test, doc, + escalate, accept) with effort estimates and a suggested + order. + **Distinct from ctx-architecture-extend (skipped)**: that + was about *where features grow*; this is about *what to + fix first*. Extend overlapped with DETAILED_DESIGN and + enrich's registration sites. Next has no overlap — pure + synthesis layer over the prior three artifacts. The + pipeline is now 4 because each step has a distinct output + document: map(ARCHITECTURE) → enrich(verified ARCHITECTURE) + → hunt(DANGER-ZONES) → prescribe(NEXT-ACTIONS). + **No MCP gateway required**: this skill consumes only the + three Markdown artifacts produced by the prior skills, + which already absorbed any GitNexus-derived signal during + the enrich step. The synthesis is a pure-reasoning pass on + the agent side. Aligns with the decision that ctx does not + proxy / gateway companion MCPs; see DECISIONS.md + "MCP gateway not worth the coupling cost". + Scope sketch (refine when implementing): + - [ ] Design SKILL.md: inputs (three artifacts), output + shape (`NEXT-ACTIONS.md` with ranked sections), quality + checklist (every action cites a danger zone; every + danger zone has an action OR an explicit "accepted" + rationale). + - [ ] Define the action taxonomy: refactor, test, doc, + escalate, accept. Each carries effort estimate (S/M/L) + and a suggested sequence position. + - [ ] Reference run against ctx itself: produce + `NEXT-ACTIONS.md` from the existing DANGER-ZONES.md if + one has been generated; otherwise generate the whole + 4-step pipeline against ctx as the worked example. + - [ ] Document the pipeline order in + `docs/recipes/architecture-mapping.md` (or wherever the + existing 3-step recipe lives): "run all four in + sequence; each step's output feeds the next". + #priority:medium #added:2026-05-23 + - [-] ctx-architecture-extend Skipped: extension point analysis is covered by /ctx-architecture DETAILED_DESIGN (per-module) and /ctx-architecture-enrich @@ -1354,11 +1395,21 @@ Not a fit (keep in `ctx`): graph/RAG MCP is configured in .ctxrc, verify connection status, recommend installation if missing #added:2026-03-25-120000 -- [ ] Explore pluggable graph tool interface — replace hardcoded GitNexus +- [-] Explore pluggable graph tool interface — replace hardcoded GitNexus references in skill text with configurable .ctxrc graph_tool key. Skills use template placeholder instead of literal tool names. Define minimum interface contract (query, context, impact). Spec: `ideas/spec-mcp-warm-up-ceremony.md` #added:2026-03-25-120000 + **Skipped 2026-05-23**: contradicts the committed-to-GitNexus + direction recorded in DECISIONS.md "MCP gateway not worth the + coupling cost". Pluggable abstraction implies multiple + candidate graph tools, which in turn implies ctx vouches for + the interface contract across implementations — exactly the + ownership coupling we're avoiding. If a second viable graph + tool emerges that's worth the cost of pluggability, revisit + by un-skipping; the design sketch in + `ideas/spec-mcp-warm-up-ceremony.md` stays available as the + starting point. ### Phase: ctx Hub follow-ups (PR #60) diff --git a/internal/hub/failover_test.go b/internal/hub/failover_test.go index 3c80d46ec..374b6f22d 100644 --- a/internal/hub/failover_test.go +++ b/internal/hub/failover_test.go @@ -8,6 +8,9 @@ package hub import ( "testing" + + "google.golang.org/grpc/codes" + "google.golang.org/grpc/status" ) // TestFailoverClient_FirstPeerWorks verifies that the @@ -102,3 +105,43 @@ func TestFailoverClient_AllBad(t *testing.T) { t.Fatal("expected error when all peers bad") } } + +// TestFailoverClient_FailsFastOnAuthError verifies that an +// auth failure on the first reachable peer halts the +// failover walk: subsequent peers are not contacted, since +// the same token would fail there too. +// +// The reachable first peer is a real server that rejects +// the invalid bearer with codes.Unauthenticated. The +// second peer is `127.0.0.1:1` — an unrouted port that +// would surface a connection-class error (Unavailable) if +// the walk continued past the first auth failure. So an +// Unauthenticated return code proves the walk stopped at +// the first peer; an Unavailable return code would prove a +// regression (the walk cycled past auth and tried the +// second peer, where dial succeeded but the unrouted port +// produced a different error class). +func TestFailoverClient_FailsFastOnAuthError(t *testing.T) { + _, conn, _ := startTestServer(t) + addr := conn.Target() + + _, foErr := newFailoverClient( + []string{addr, "127.0.0.1:1"}, + "bogus-token-that-the-server-will-reject", + ) + if foErr == nil { + t.Fatal("expected auth error on first peer; got nil") + } + s, ok := status.FromError(foErr) + if !ok { + t.Fatalf("expected gRPC status error; got %T: %v", foErr, foErr) + } + if s.Code() != codes.Unauthenticated && s.Code() != codes.PermissionDenied { + t.Errorf( + "got code %s; want Unauthenticated or PermissionDenied "+ + "(if Unavailable, the walk cycled past auth into the "+ + "unrouted second peer — auth-fast-fail regression)", + s.Code(), + ) + } +} diff --git a/internal/hub/store_test.go b/internal/hub/store_test.go index 3bbf1161e..a79d01503 100644 --- a/internal/hub/store_test.go +++ b/internal/hub/store_test.go @@ -145,6 +145,70 @@ func TestStoreRegisterAndValidate(t *testing.T) { } } +// TestStoreValidateToken_RejectsNearMissTokens pins the +// timing-attack-resistance contract of Store.ValidateToken. +// +// The implementation uses an O(1) map lookup on +// s.tokenIdx[bearerToken] followed by a defensive +// crypto/subtle.ConstantTimeCompare against the stored +// token. The CTC is technically redundant once the map +// lookup hits — Go map keys match exact-byte by +// definition — but it's the explicit signal of intent +// that would catch a future "simplification" PR +// collapsing both checks back to a single == or +// strings.HasPrefix. +// +// This test exercises the *behavior* that the CTC +// defends: no near-miss token (one byte off, prefix +// only, extra suffix bytes, case-shifted) ever +// validates. If a regression replaces the careful +// chain with a prefix matcher or a non-constant +// comparison, these cases start passing when they +// shouldn't. +func TestStoreValidateToken_RejectsNearMissTokens(t *testing.T) { + dir := t.TempDir() + s, err := NewStore(dir) + if err != nil { + t.Fatal(err) + } + const valid = "tok_abc123_with_some_length" + if regErr := s.RegisterClient(ClientInfo{ + ID: "c1", ProjectName: "near-miss-proj", Token: valid, + }); regErr != nil { + t.Fatal(regErr) + } + + // Sanity: the valid token still validates. + if s.ValidateToken(valid) == nil { + t.Fatal("valid token should validate; suite bug") + } + + rejected := []struct { + name, token string + }{ + {"empty", ""}, + {"last byte changed", valid[:len(valid)-1] + "X"}, + {"first byte changed", "X" + valid[1:]}, + {"middle byte changed", valid[:len(valid)/2] + "X" + valid[len(valid)/2+1:]}, + {"prefix only", valid[:len(valid)/2]}, + {"extra suffix appended", valid + "X"}, + {"case-shifted", "TOK_ABC123_WITH_SOME_LENGTH"}, + {"whitespace-padded", " " + valid + " "}, + {"all-X same length", "XXXXXXXXXXXXXXXXXXXXXXXXXXX"}, + } + for _, tc := range rejected { + t.Run(tc.name, func(t *testing.T) { + if got := s.ValidateToken(tc.token); got != nil { + t.Errorf( + "ValidateToken(%q) = %+v; want nil "+ + "(near-miss/partial-match must not validate)", + tc.token, got, + ) + } + }) + } +} + func TestStoreStats(t *testing.T) { dir := t.TempDir() s, err := NewStore(dir)