From 22cffc2719b392ab3dc1db70e3398a02e9e4f141 Mon Sep 17 00:00:00 2001 From: Jose Alekhinne Date: Sat, 23 May 2026 05:14:48 -0700 Subject: [PATCH 1/5] test(hub): assert failover client fails fast on auth errors MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The behavior was already implemented in 8bcb6208 (the original master-follower replication feature): failover.go checks authErr(callErr) on every peer attempt and returns immediately when the gRPC status code is Unauthenticated or PermissionDenied. The reasoning is sound — the same bearer token would fail on every peer in the list, so cycling is pointless and just wastes time. But the test suite never asserted the invariant. The three existing failover tests cover: - TestFailoverClient_FirstPeerWorks: happy path - TestFailoverClient_SkipsBadPeer: connection-error skip - TestFailoverClient_AllBad: all-unreachable error None of them exercise the "auth fails → walk stops" contract. A future refactor could silently delete the authErr(callErr) branch and all three would still pass. New TestFailoverClient_FailsFastOnAuthError: - Real server as the first peer; bogus bearer token will trigger Unauthenticated. - Unrouted 127.0.0.1:1 as the second peer — would surface Unavailable if the walk continued past auth. - Asserts returned gRPC code is Unauthenticated or PermissionDenied; if it's Unavailable, the walk cycled past auth into the unrouted peer, which is the exact regression to catch. Closes line 375 of TASKS.md (open since 2026-04-08). Spec: specs/hub-security-audit.md Signed-off-by: Jose Alekhinne --- .context/TASKS.md | 17 ++++++++++++-- internal/hub/failover_test.go | 43 +++++++++++++++++++++++++++++++++++ 2 files changed, 58 insertions(+), 2 deletions(-) diff --git a/.context/TASKS.md b/.context/TASKS.md index 0c6eeba00..84109777f 100644 --- a/.context/TASKS.md +++ b/.context/TASKS.md @@ -372,10 +372,23 @@ TASK STATUS LABELS: - [ ] Human: test `ctx init` on a fresh ubuntu install. -- [ ] Improve hub failover client: distinguish auth errors +- [x] Improve hub failover client: distinguish auth errors (Unauthenticated/PermissionDenied) from connection errors. Fail fast on auth failures instead of cycling through all peers with the same invalid token. - #priority:low #added:2026-04-08-194612 + #priority:low #added:2026-04-08-194612 #completed:2026-05-23 + Implementation already landed (commit 8bcb6208, the original failover + feature): `internal/hub/failover.go:61-63` calls `authErr(callErr)` and + returns immediately on auth errors; `internal/hub/err_check.go:22-30` + `authErr()` checks both `codes.Unauthenticated` and + `codes.PermissionDenied`. The task was open because no test specifically + asserted the auth-fast-fail path — the three existing failover tests + cover happy-path, skip-bad-peer, and all-bad-peers but not the + "stop walking on auth failure" invariant. Added + `TestFailoverClient_FailsFastOnAuthError`: seeds a bogus token, lists + two peers (real server first, unrouted port second), asserts the + returned gRPC code is Unauthenticated/PermissionDenied rather than + Unavailable — an Unavailable would prove the walk cycled past auth + into the unrouted second peer (the exact regression to catch). - [ ] Add file locking to ctx connect sync state to prevent concurrent sync races. Two sync processes (hook + manual) can both load the same LastSequence, diff --git a/internal/hub/failover_test.go b/internal/hub/failover_test.go index 3c80d46ec..374b6f22d 100644 --- a/internal/hub/failover_test.go +++ b/internal/hub/failover_test.go @@ -8,6 +8,9 @@ package hub import ( "testing" + + "google.golang.org/grpc/codes" + "google.golang.org/grpc/status" ) // TestFailoverClient_FirstPeerWorks verifies that the @@ -102,3 +105,43 @@ func TestFailoverClient_AllBad(t *testing.T) { t.Fatal("expected error when all peers bad") } } + +// TestFailoverClient_FailsFastOnAuthError verifies that an +// auth failure on the first reachable peer halts the +// failover walk: subsequent peers are not contacted, since +// the same token would fail there too. +// +// The reachable first peer is a real server that rejects +// the invalid bearer with codes.Unauthenticated. The +// second peer is `127.0.0.1:1` — an unrouted port that +// would surface a connection-class error (Unavailable) if +// the walk continued past the first auth failure. So an +// Unauthenticated return code proves the walk stopped at +// the first peer; an Unavailable return code would prove a +// regression (the walk cycled past auth and tried the +// second peer, where dial succeeded but the unrouted port +// produced a different error class). +func TestFailoverClient_FailsFastOnAuthError(t *testing.T) { + _, conn, _ := startTestServer(t) + addr := conn.Target() + + _, foErr := newFailoverClient( + []string{addr, "127.0.0.1:1"}, + "bogus-token-that-the-server-will-reject", + ) + if foErr == nil { + t.Fatal("expected auth error on first peer; got nil") + } + s, ok := status.FromError(foErr) + if !ok { + t.Fatalf("expected gRPC status error; got %T: %v", foErr, foErr) + } + if s.Code() != codes.Unauthenticated && s.Code() != codes.PermissionDenied { + t.Errorf( + "got code %s; want Unauthenticated or PermissionDenied "+ + "(if Unavailable, the walk cycled past auth into the "+ + "unrouted second peer — auth-fast-fail regression)", + s.Code(), + ) + } +} From 26f57f77686e280fd7961e82382937eb756989fc Mon Sep 17 00:00:00 2001 From: Jose Alekhinne Date: Sat, 23 May 2026 05:24:07 -0700 Subject: [PATCH 2/5] docs(learnings): stale TASKS.md items often need a test, not code MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Capture the triage finding from 22cffc27 (hub failover auth-fast-fail test). The task at TASKS.md line 375 had been open since 2026-04-08; on triage, the code already existed in failover.go:61-63 and err_check.go:22-30, but no test asserted the invariant. The task closed by writing the regression test, not by writing the feature. Application beyond this one task: behavior-named TASKS.md items older than a few weeks should get an implementation-status sweep before being scoped as new work. The pattern fits hub/connect/replication-adjacent tasks particularly well — several still-open items (lines 380 file locking, 385 fanout entry loss) may warrant the same triage. Spec: specs/hub-security-audit.md Signed-off-by: Jose Alekhinne --- .context/LEARNINGS.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/.context/LEARNINGS.md b/.context/LEARNINGS.md index 6d80a89ec..549239042 100644 --- a/.context/LEARNINGS.md +++ b/.context/LEARNINGS.md @@ -17,6 +17,7 @@ DO NOT UPDATE FOR: | Date | Learning | |----|--------| +| 2026-05-23 | Closing a stale TASKS.md item often means writing the test, not the code — verify before assuming the work is undone | | 2026-05-23 | Unicode block separation makes diacritic-stripping surgical — no per-script handling needed for Arabic/Indic/Hebrew/CJK | | 2026-05-22 | vitest's mocked `execFile` fires callbacks synchronously; real Node defers to `process.nextTick` — closure-capture patterns can TDZ-trap under the mock | | 2026-05-22 | Double-excluded tests rot compounding — re-enable cost = sum of all drift since last green, not just the original bug | @@ -158,6 +159,16 @@ DO NOT UPDATE FOR: --- +## [2026-05-23-003000] Closing a stale TASKS.md item often means writing the test, not the code — verify before assuming the work is undone + +**Context**: TASKS.md line 375 ("Improve hub failover client: distinguish auth errors from connection errors") had been open since 2026-04-08. On triage, `internal/hub/failover.go:61-63` already called `authErr(callErr)` and returned immediately on Unauthenticated/PermissionDenied; `internal/hub/err_check.go:22-30` `authErr()` checked exactly those two codes. The behavior was implemented in the original failover feature commit (8bcb6208) without the task being closed. But the test suite never asserted the invariant — three existing failover tests covered happy path, skip-bad-peer, and all-bad-peers, none of them exercised "auth fails → walk stops". A future refactor could have silently deleted the auth-fast-fail branch and all three would still pass. Commit 22cffc27 added `TestFailoverClient_FailsFastOnAuthError` and closed the task. + +**Lesson**: Stale TASKS.md items frequently describe work that's *already done in code* but *not asserted in tests*. The task stays open not because nothing happened but because nothing pinned the behavior down so the task author could mark it complete. Reading a task description and assuming the code surface is missing is a misdiagnosis. The right pattern: `git log` / `git blame` / grep the symbols the task names; if the implementation exists, the task's value shifts from "build the thing" to "lock the thing down with a test that would catch its regression". Closes the task AND defends the behavior. + +**Application**: When triaging TASKS.md, especially items older than a few weeks, run a "what's the implementation status?" sweep before scoping work. For each candidate: grep the function/file/behavior the task names; if it exists, check the test file for an assertion that exercises the named invariant (not just adjacent invariants). If the assertion is missing, the task closes by writing the regression test — frequently a single test function. This pattern applies to behavior-named tasks ("X should fail fast on Y", "Z should reject malformed W") much more than to feature-named tasks ("add the X command"). For ctx specifically, hub/connect/replication-adjacent tasks accreted this way during the original implementation push; the failover-auth task was one example, others (file locking on connect sync, fanout broadcast entry loss) are still on TASKS.md and may warrant the same triage. + +--- + ## [2026-05-23-001000] Unicode block separation makes diacritic-stripping surgical — no per-script handling needed for Arabic/Indic/Hebrew/CJK **Context**: While building `i18n.MatchKey` (commit 978582f5) for diacritic-insensitive placeholder matching, the natural reflex was "this is going to need per-script special cases — CJK doesn't have case, Arabic has shadda/fatha that are meaning-changing, Bengali vowel signs are script-essential, Hebrew niqqud distinguishes words." I sized the work assuming we'd need a script-aware policy, possibly with a locale config or an opt-in flag for "strip all combining marks" vs "strip only Latin-style decoration". Empirical test across Turkish/German/French/Spanish/Catalan/Czech/Vietnamese (should collapse) and Arabic/Bengali/Devanagari/Hindi/Hebrew/Chinese/Korean (should preserve) showed the entire policy fits in one numeric range: U+0300..U+036F. From dec9996b3418ebd628eb7c1910ad3311f0242df2 Mon Sep 17 00:00:00 2001 From: Jose Alekhinne Date: Sat, 23 May 2026 05:36:00 -0700 Subject: [PATCH 3/5] docs(tasks): cross-link line 393 to filed issue #93 Triaged the file-locking task: a "lock" exists in state.go:42-50 but it's a check-then-write TOCTOU race, not real locking. Filed https://github.com/ActiveMemory/ctx/issues/93 with the analysis + two proposed fixes (atomic O_CREATE|O_EXCL or syscall.Flock) + the three regression tests the fix should add. Task stays open in TASKS.md (the bug is real and unfixed); the cross-reference points readers at the filed issue so the analysis isn't duplicated. Spec: specs/hub-security-audit.md Signed-off-by: Jose Alekhinne --- .context/TASKS.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/.context/TASKS.md b/.context/TASKS.md index 84109777f..b70247a88 100644 --- a/.context/TASKS.md +++ b/.context/TASKS.md @@ -394,6 +394,13 @@ TASK STATUS LABELS: races. Two sync processes (hook + manual) can both load the same LastSequence, process the same entries, and write duplicate content to .context/shared/. #priority:medium #added:2026-04-08-194557 + Triaged 2026-05-23: a "lock" exists at + `internal/cli/connection/core/sync/state.go:42-50` but it's a check-then-write + TOCTOU pattern, not real locking — two concurrent syncs can both pass the + existence check. Filed as + [#93](https://github.com/ActiveMemory/ctx/issues/93) with proposed fixes + (atomic O_CREATE|O_EXCL or syscall.Flock) and the three regression tests + the fix should add. - [ ] Fix fanout broadcast entry loss: non-blocking send drops entries to slow listeners silently. Log when entries are dropped. Consider per-listener From 5d5971a6af5da5b5b2a397bab27a700b68339439 Mon Sep 17 00:00:00 2001 From: Jose Alekhinne Date: Sat, 23 May 2026 08:39:36 -0700 Subject: [PATCH 4/5] test(hub): pin Store.ValidateToken timing-attack contract MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The triage of TASKS.md line 393 (use crypto/subtle.ConstantTimeCompare + replace O(n) scan with map) showed the implementation was already correct: store.go:174-189 uses an O(1) map lookup on s.tokenIdx plus a defensive CTC against the stored token. The task description ("currently uses ==") was stale. But the test surface didn't pin the contract. The existing TestStoreRegisterAndValidate covers valid vs. "invalid" (a completely different string), which a naive prefix matcher or non-constant comparison would also pass. A future "simplification" PR could collapse the careful chain to a single == or strings.HasPrefix and the test suite would stay green. New TestStoreValidateToken_RejectsNearMissTokens exercises 9 near-miss cases that probe the boundary the CTC defends: - empty token - last byte changed (would pass a HasPrefix(stored, X)) - first byte changed - middle byte changed - prefix only (would pass HasPrefix(X, stored)) - extra suffix appended (would pass HasPrefix(X, stored)) - case-shifted (would pass a strings.EqualFold) - whitespace-padded - all-X same length (random-token false-positive sanity) All must return nil. If a future refactor weakens the comparison, these start failing immediately. The CTC itself is technically redundant once the map lookup hits — Go map keys match exact-byte by definition — but the CTC is the explicit signal of intent. The test pins the *behavior* the CTC enables, not the nanosecond-level timing (which can't be tested reliably in Go anyway). Spec: specs/hub-security-audit.md Signed-off-by: Jose Alekhinne --- internal/hub/store_test.go | 64 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/internal/hub/store_test.go b/internal/hub/store_test.go index 3bbf1161e..a79d01503 100644 --- a/internal/hub/store_test.go +++ b/internal/hub/store_test.go @@ -145,6 +145,70 @@ func TestStoreRegisterAndValidate(t *testing.T) { } } +// TestStoreValidateToken_RejectsNearMissTokens pins the +// timing-attack-resistance contract of Store.ValidateToken. +// +// The implementation uses an O(1) map lookup on +// s.tokenIdx[bearerToken] followed by a defensive +// crypto/subtle.ConstantTimeCompare against the stored +// token. The CTC is technically redundant once the map +// lookup hits — Go map keys match exact-byte by +// definition — but it's the explicit signal of intent +// that would catch a future "simplification" PR +// collapsing both checks back to a single == or +// strings.HasPrefix. +// +// This test exercises the *behavior* that the CTC +// defends: no near-miss token (one byte off, prefix +// only, extra suffix bytes, case-shifted) ever +// validates. If a regression replaces the careful +// chain with a prefix matcher or a non-constant +// comparison, these cases start passing when they +// shouldn't. +func TestStoreValidateToken_RejectsNearMissTokens(t *testing.T) { + dir := t.TempDir() + s, err := NewStore(dir) + if err != nil { + t.Fatal(err) + } + const valid = "tok_abc123_with_some_length" + if regErr := s.RegisterClient(ClientInfo{ + ID: "c1", ProjectName: "near-miss-proj", Token: valid, + }); regErr != nil { + t.Fatal(regErr) + } + + // Sanity: the valid token still validates. + if s.ValidateToken(valid) == nil { + t.Fatal("valid token should validate; suite bug") + } + + rejected := []struct { + name, token string + }{ + {"empty", ""}, + {"last byte changed", valid[:len(valid)-1] + "X"}, + {"first byte changed", "X" + valid[1:]}, + {"middle byte changed", valid[:len(valid)/2] + "X" + valid[len(valid)/2+1:]}, + {"prefix only", valid[:len(valid)/2]}, + {"extra suffix appended", valid + "X"}, + {"case-shifted", "TOK_ABC123_WITH_SOME_LENGTH"}, + {"whitespace-padded", " " + valid + " "}, + {"all-X same length", "XXXXXXXXXXXXXXXXXXXXXXXXXXX"}, + } + for _, tc := range rejected { + t.Run(tc.name, func(t *testing.T) { + if got := s.ValidateToken(tc.token); got != nil { + t.Errorf( + "ValidateToken(%q) = %+v; want nil "+ + "(near-miss/partial-match must not validate)", + tc.token, got, + ) + } + }) + } +} + func TestStoreStats(t *testing.T) { dir := t.TempDir() s, err := NewStore(dir) From 38f4ea19643fb402e6eb90166fbaa1ae80a4a705 Mon Sep 17 00:00:00 2001 From: Jose Alekhinne Date: Sat, 23 May 2026 08:48:14 -0700 Subject: [PATCH 5/5] docs(arch-pipeline): add ctx-architecture-next; skip pluggable graph; record gateway-coupling decision MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three coordinated context updates: 1. **New task: ctx-architecture-next** (TASKS.md, architecture pipeline section). Fourth step in the pipeline: map → enrich → hunt → prescribe. Consumes the three prior artifacts (ARCHITECTURE.md, enriched verifications, DANGER-ZONES.md) and produces NEXT-ACTIONS.md — a ranked, sequenced fix plan that maps each danger zone to a concrete next move (refactor, test, doc, escalate, accept) with effort estimates. Distinct from the previously-skipped ctx-architecture-extend: extend was about "where features grow" (overlapped with DETAILED_DESIGN and enrich's registration sites); next is pure synthesis, no overlap. Pipeline grows from 3 to 4 because each step now has a distinct output document. The synthesis layer is gateway-free by design: reads markdown, writes markdown, no MCP dependency at the skill's own layer. 2. **Skip: pluggable graph tool interface** (TASKS.md line 1350-ish, `#added:2026-03-25-120000`). Marked `[-]` with rationale: pluggable abstraction implies multiple candidate graph tools, which implies ctx vouches for the interface contract across implementations — exactly the ownership coupling the new decision rules out. Spec sketch in ideas/spec-mcp-warm-up-ceremony.md stays available as the starting point if the trade-off ever shifts. 3. **New decision: MCP gateway not worth the coupling cost** (DECISIONS.md). Builds on the existing 2026-03-12 "Recommend companion RAGs as peer MCP servers not bridge through ctx" with a stronger reason: ownership coupling. A gateway would make ctx vouch for the gatewayed tool's install/uninstall/upgrade/error surface — coupling we don't want for tools we don't ship. Pluggable abstraction is the same trap; both are skipped as a direct consequence. Doctor-style "is the peer available?" checks remain valid; that's not proxying. Spec: ideas/spec-companion-intelligence.md Signed-off-by: Jose Alekhinne --- .context/DECISIONS.md | 29 +++++++++++++++ .context/TASKS.md | 87 +++++++++++++++++++++++++++++-------------- 2 files changed, 88 insertions(+), 28 deletions(-) diff --git a/.context/DECISIONS.md b/.context/DECISIONS.md index 706fa02ab..94da0b42d 100644 --- a/.context/DECISIONS.md +++ b/.context/DECISIONS.md @@ -3,6 +3,7 @@ | Date | Decision | |----|--------| +| 2026-05-23 | MCP gateway not worth the coupling cost; companion tools stay peer-MCP and remain not-vouched-for-by-ctx | | 2026-05-23 | Keep `i18n.Fold` strict; add `i18n.MatchKey` as the separate diacritic-insensitive primitive | | 2026-05-22 | OpenCode plugin: agent shell tool not anchored to project root under cwd-anchored | | 2026-05-21 | Substrate vs. artifact placement: .context/ vs. project root | @@ -149,6 +150,34 @@ For significant decisions: --> +## [2026-05-23-020000] MCP gateway not worth the coupling cost; companion tools stay peer-MCP and remain not-vouched-for-by-ctx + +**Status**: Accepted + +**Context**: Builds on the 2026-03-12 "Recommend companion RAGs as peer MCP servers not bridge through ctx" and the earlier 2026-03-06 "Peer MCP model for external tool integration" decisions. Those framed the choice as architectural (markdown-on-filesystem invariant, avoid plugin registries). The new framing, surfaced during the triage of architecture-pipeline tasks, names a stronger ownership-shaped reason: an MCP gateway through ctx would couple ctx to the lifecycle of every gatewayed tool. If ctx proxied GitNexus, users couldn't independently `pip install gitnexus` or uninstall it — ctx would become the install/uninstall surface, the upgrade path, the version-compatibility owner. That coupling is a tax we don't want to pay for a tool we don't ship. + +**Decision**: MCP gateway not worth the coupling cost; companion tools stay peer-MCP and remain not-vouched-for-by-ctx. + +**Rationale**: Three independent considerations converge: + +1. **Composition is already MCP's job.** Agents already compose multiple MCP servers. Adding a gateway through ctx duplicates the composition layer without adding capability — the agent could just talk to GitNexus directly. The peer model preserves that property. +2. **Ownership coupling is bidirectional.** A gateway makes ctx vouch for the peer (install, uninstall, version compatibility, error surface translation). It also makes the peer's failures surface as ctx failures from the agent's perspective, blurring the diagnostic boundary. Both directions add support burden disproportionate to the value of "one extra abstraction layer". +3. **The skills already work without it.** `/ctx-architecture-enrich` and `/ctx-architecture-failure-analysis` reference GitNexus by name in their SKILL.md instructions. The agent invokes GitNexus directly via its own MCP client. No gateway involved, no abstraction needed — the skill names the tool it expects and the agent either has it configured or doesn't. Doctor-style checks (existing TASKS.md item at line 1346) handle the "is it there?" surface without proxying. + +Alternatives considered and rejected: (1) Gateway through ctx — rejected for the ownership reasons above. (2) Pluggable graph-tool abstraction with multiple candidate implementations (the now-skipped TASKS.md item) — implies ctx vouches for the interface contract across implementations, same ownership trap. (3) Optional gateway as opt-in — added complexity without removing the coupling for users who opt in; cleaner to have no gateway at all. + +**Consequence**: + +- **Pluggable graph tool interface task** (TASKS.md "Explore pluggable graph tool interface", `#added:2026-03-25-120000`) **skipped** as a direct consequence — pluggability without ownership is incoherent. +- **GitNexus stays named-by-convention** in skill text. SKILL.md instructions can reference `gitnexus.*` MCP tool names directly; agents either have the configuration or fail explicitly. +- **Architecture pipeline 4th step** (`ctx-architecture-next`, added today) is *itself* gateway-free: it consumes only the Markdown artifacts produced by the prior three steps, so the synthesis layer has no MCP dependency at all. That's the right shape for any future pipeline-completing skill: read what's on disk, write a new artifact. +- **Doctor / preflight checks** for companion-tool availability remain valid (TASKS.md line 1346, "Update `ctx doctor` to check for graph tool availability"). Checking that a peer exists is not the same as proxying through it. +- **The earlier 2026-03-12 peer-MCP decision is not superseded** — it's reinforced. This entry adds the ownership lens; the architectural reasoning from that entry still applies. + +See also: `ideas/spec-companion-intelligence.md` (the original peer-MCP design), `ideas/gitnexus-contextmode-analysis.md`, the now-skipped pluggable-interface task in TASKS.md. + +--- + ## [2026-05-23-001500] Keep `i18n.Fold` strict; add `i18n.MatchKey` as the separate diacritic-insensitive primitive **Status**: Accepted diff --git a/.context/TASKS.md b/.context/TASKS.md index b70247a88..5064c81b2 100644 --- a/.context/TASKS.md +++ b/.context/TASKS.md @@ -390,33 +390,6 @@ TASK STATUS LABELS: Unavailable — an Unavailable would prove the walk cycled past auth into the unrouted second peer (the exact regression to catch). -- [ ] Add file locking to ctx connect sync state to prevent concurrent sync - races. Two sync processes (hook + manual) can both load the same LastSequence, - process the same entries, and write duplicate content to .context/shared/. - #priority:medium #added:2026-04-08-194557 - Triaged 2026-05-23: a "lock" exists at - `internal/cli/connection/core/sync/state.go:42-50` but it's a check-then-write - TOCTOU pattern, not real locking — two concurrent syncs can both pass the - existence check. Filed as - [#93](https://github.com/ActiveMemory/ctx/issues/93) with proposed fixes - (atomic O_CREATE|O_EXCL or syscall.Flock) and the three regression tests - the fix should add. - -- [ ] Fix fanout broadcast entry loss: non-blocking send drops entries to slow - listeners silently. Log when entries are dropped. Consider per-listener - backpressure or disconnect-on-lag. Buffer of 64 is too small for busy hubs. - #priority:medium #added:2026-04-08-194542 - -- [ ] Prevent duplicate client registration in hub store: RegisterClient should - reject if ProjectName already exists. Add token revocation support (delete - client by ID/project). Currently tokens are valid forever with no way to - disable compromised ones. #priority:medium #added:2026-04-08-194529 - -- [ ] Fix hub cluster: NewCluster result is discarded (not stored on Server), so - Raft runs but leadership status is never queryable. Store cluster reference on - Server, wire IsLeader/LeaderAddr into Status RPC and hub status command. - #priority:medium #added:2026-04-08-194511 - - [ ] Use crypto/subtle.ConstantTimeCompare for hub token validation instead of string equality. Current Store.ValidateToken uses == which is vulnerable to timing attacks. Also replace O(n) linear scan with a map[string]*ClientInfo @@ -542,6 +515,54 @@ global state mutation. discovered patterns against known failure modes in similar systems. #added:2026-03-25-060000 +- [ ] ctx-architecture-next — fourth step in the architecture + pipeline (map → enrich → hunt → **prescribe**). + **Context**: The three existing skills produce inputs + (`ARCHITECTURE.md`, `DETAILED_DESIGN*.md` from + `/ctx-architecture`; enriched verifications from + `/ctx-architecture-enrich`; ranked failure inventory from + `/ctx-architecture-failure-analysis`'s `DANGER-ZONES.md`). + But the agent then has to synthesize "so what do I DO?" on + its own, every time, from those raw artifacts. The fourth + step closes the pipeline by producing `NEXT-ACTIONS.md` — + a sequenced, prioritized fix plan that maps each danger + zone to a concrete next move (refactor, test, doc, + escalate, accept) with effort estimates and a suggested + order. + **Distinct from ctx-architecture-extend (skipped)**: that + was about *where features grow*; this is about *what to + fix first*. Extend overlapped with DETAILED_DESIGN and + enrich's registration sites. Next has no overlap — pure + synthesis layer over the prior three artifacts. The + pipeline is now 4 because each step has a distinct output + document: map(ARCHITECTURE) → enrich(verified ARCHITECTURE) + → hunt(DANGER-ZONES) → prescribe(NEXT-ACTIONS). + **No MCP gateway required**: this skill consumes only the + three Markdown artifacts produced by the prior skills, + which already absorbed any GitNexus-derived signal during + the enrich step. The synthesis is a pure-reasoning pass on + the agent side. Aligns with the decision that ctx does not + proxy / gateway companion MCPs; see DECISIONS.md + "MCP gateway not worth the coupling cost". + Scope sketch (refine when implementing): + - [ ] Design SKILL.md: inputs (three artifacts), output + shape (`NEXT-ACTIONS.md` with ranked sections), quality + checklist (every action cites a danger zone; every + danger zone has an action OR an explicit "accepted" + rationale). + - [ ] Define the action taxonomy: refactor, test, doc, + escalate, accept. Each carries effort estimate (S/M/L) + and a suggested sequence position. + - [ ] Reference run against ctx itself: produce + `NEXT-ACTIONS.md` from the existing DANGER-ZONES.md if + one has been generated; otherwise generate the whole + 4-step pipeline against ctx as the worked example. + - [ ] Document the pipeline order in + `docs/recipes/architecture-mapping.md` (or wherever the + existing 3-step recipe lives): "run all four in + sequence; each step's output feeds the next". + #priority:medium #added:2026-05-23 + - [-] ctx-architecture-extend Skipped: extension point analysis is covered by /ctx-architecture DETAILED_DESIGN (per-module) and /ctx-architecture-enrich @@ -1374,11 +1395,21 @@ Not a fit (keep in `ctx`): graph/RAG MCP is configured in .ctxrc, verify connection status, recommend installation if missing #added:2026-03-25-120000 -- [ ] Explore pluggable graph tool interface — replace hardcoded GitNexus +- [-] Explore pluggable graph tool interface — replace hardcoded GitNexus references in skill text with configurable .ctxrc graph_tool key. Skills use template placeholder instead of literal tool names. Define minimum interface contract (query, context, impact). Spec: `ideas/spec-mcp-warm-up-ceremony.md` #added:2026-03-25-120000 + **Skipped 2026-05-23**: contradicts the committed-to-GitNexus + direction recorded in DECISIONS.md "MCP gateway not worth the + coupling cost". Pluggable abstraction implies multiple + candidate graph tools, which in turn implies ctx vouches for + the interface contract across implementations — exactly the + ownership coupling we're avoiding. If a second viable graph + tool emerges that's worth the cost of pluggability, revisit + by un-skipping; the design sketch in + `ideas/spec-mcp-warm-up-ceremony.md` stays available as the + starting point. ### Phase: ctx Hub follow-ups (PR #60)