Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 29 additions & 0 deletions .context/DECISIONS.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
<!-- INDEX:START -->
| Date | Decision |
|----|--------|
| 2026-05-23 | MCP gateway not worth the coupling cost; companion tools stay peer-MCP and remain not-vouched-for-by-ctx |
| 2026-05-23 | Keep `i18n.Fold` strict; add `i18n.MatchKey` as the separate diacritic-insensitive primitive |
| 2026-05-22 | OpenCode plugin: agent shell tool not anchored to project root under cwd-anchored |
| 2026-05-21 | Substrate vs. artifact placement: .context/ vs. project root |
Expand Down Expand Up @@ -149,6 +150,34 @@ For significant decisions:

-->

## [2026-05-23-020000] MCP gateway not worth the coupling cost; companion tools stay peer-MCP and remain not-vouched-for-by-ctx

**Status**: Accepted

**Context**: Builds on the 2026-03-12 "Recommend companion RAGs as peer MCP servers not bridge through ctx" and the earlier 2026-03-06 "Peer MCP model for external tool integration" decisions. Those framed the choice as architectural (markdown-on-filesystem invariant, avoid plugin registries). The new framing, surfaced during the triage of architecture-pipeline tasks, names a stronger ownership-shaped reason: an MCP gateway through ctx would couple ctx to the lifecycle of every gatewayed tool. If ctx proxied GitNexus, users couldn't independently `pip install gitnexus` or uninstall it — ctx would become the install/uninstall surface, the upgrade path, the version-compatibility owner. That coupling is a tax we don't want to pay for a tool we don't ship.

**Decision**: MCP gateway not worth the coupling cost; companion tools stay peer-MCP and remain not-vouched-for-by-ctx.

**Rationale**: Three independent considerations converge:

1. **Composition is already MCP's job.** Agents already compose multiple MCP servers. Adding a gateway through ctx duplicates the composition layer without adding capability — the agent could just talk to GitNexus directly. The peer model preserves that property.
2. **Ownership coupling is bidirectional.** A gateway makes ctx vouch for the peer (install, uninstall, version compatibility, error surface translation). It also makes the peer's failures surface as ctx failures from the agent's perspective, blurring the diagnostic boundary. Both directions add support burden disproportionate to the value of "one extra abstraction layer".
3. **The skills already work without it.** `/ctx-architecture-enrich` and `/ctx-architecture-failure-analysis` reference GitNexus by name in their SKILL.md instructions. The agent invokes GitNexus directly via its own MCP client. No gateway involved, no abstraction needed — the skill names the tool it expects and the agent either has it configured or doesn't. Doctor-style checks (existing TASKS.md item at line 1346) handle the "is it there?" surface without proxying.

Alternatives considered and rejected: (1) Gateway through ctx — rejected for the ownership reasons above. (2) Pluggable graph-tool abstraction with multiple candidate implementations (the now-skipped TASKS.md item) — implies ctx vouches for the interface contract across implementations, same ownership trap. (3) Optional gateway as opt-in — added complexity without removing the coupling for users who opt in; cleaner to have no gateway at all.

**Consequence**:

- **Pluggable graph tool interface task** (TASKS.md "Explore pluggable graph tool interface", `#added:2026-03-25-120000`) **skipped** as a direct consequence — pluggability without ownership is incoherent.
- **GitNexus stays named-by-convention** in skill text. SKILL.md instructions can reference `gitnexus.*` MCP tool names directly; agents either have the configuration or fail explicitly.
- **Architecture pipeline 4th step** (`ctx-architecture-next`, added today) is *itself* gateway-free: it consumes only the Markdown artifacts produced by the prior three steps, so the synthesis layer has no MCP dependency at all. That's the right shape for any future pipeline-completing skill: read what's on disk, write a new artifact.
- **Doctor / preflight checks** for companion-tool availability remain valid (TASKS.md line 1346, "Update `ctx doctor` to check for graph tool availability"). Checking that a peer exists is not the same as proxying through it.
- **The earlier 2026-03-12 peer-MCP decision is not superseded** — it's reinforced. This entry adds the ownership lens; the architectural reasoning from that entry still applies.

See also: `ideas/spec-companion-intelligence.md` (the original peer-MCP design), `ideas/gitnexus-contextmode-analysis.md`, the now-skipped pluggable-interface task in TASKS.md.

---

## [2026-05-23-001500] Keep `i18n.Fold` strict; add `i18n.MatchKey` as the separate diacritic-insensitive primitive

**Status**: Accepted
Expand Down
11 changes: 11 additions & 0 deletions .context/LEARNINGS.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ DO NOT UPDATE FOR:
<!-- INDEX:START -->
| Date | Learning |
|----|--------|
| 2026-05-23 | Closing a stale TASKS.md item often means writing the test, not the code — verify before assuming the work is undone |
| 2026-05-23 | Unicode block separation makes diacritic-stripping surgical — no per-script handling needed for Arabic/Indic/Hebrew/CJK |
| 2026-05-22 | vitest's mocked `execFile` fires callbacks synchronously; real Node defers to `process.nextTick` — closure-capture patterns can TDZ-trap under the mock |
| 2026-05-22 | Double-excluded tests rot compounding — re-enable cost = sum of all drift since last green, not just the original bug |
Expand Down Expand Up @@ -158,6 +159,16 @@ DO NOT UPDATE FOR:

---

## [2026-05-23-003000] Closing a stale TASKS.md item often means writing the test, not the code — verify before assuming the work is undone

**Context**: TASKS.md line 375 ("Improve hub failover client: distinguish auth errors from connection errors") had been open since 2026-04-08. On triage, `internal/hub/failover.go:61-63` already called `authErr(callErr)` and returned immediately on Unauthenticated/PermissionDenied; `internal/hub/err_check.go:22-30` `authErr()` checked exactly those two codes. The behavior was implemented in the original failover feature commit (8bcb6208) without the task being closed. But the test suite never asserted the invariant — three existing failover tests covered happy path, skip-bad-peer, and all-bad-peers, none of them exercised "auth fails → walk stops". A future refactor could have silently deleted the auth-fast-fail branch and all three would still pass. Commit 22cffc27 added `TestFailoverClient_FailsFastOnAuthError` and closed the task.

**Lesson**: Stale TASKS.md items frequently describe work that's *already done in code* but *not asserted in tests*. The task stays open not because nothing happened but because nothing pinned the behavior down so the task author could mark it complete. Reading a task description and assuming the code surface is missing is a misdiagnosis. The right pattern: `git log` / `git blame` / grep the symbols the task names; if the implementation exists, the task's value shifts from "build the thing" to "lock the thing down with a test that would catch its regression". Closes the task AND defends the behavior.

**Application**: When triaging TASKS.md, especially items older than a few weeks, run a "what's the implementation status?" sweep before scoping work. For each candidate: grep the function/file/behavior the task names; if it exists, check the test file for an assertion that exercises the named invariant (not just adjacent invariants). If the assertion is missing, the task closes by writing the regression test — frequently a single test function. This pattern applies to behavior-named tasks ("X should fail fast on Y", "Z should reject malformed W") much more than to feature-named tasks ("add the X command"). For ctx specifically, hub/connect/replication-adjacent tasks accreted this way during the original implementation push; the failover-auth task was one example, others (file locking on connect sync, fanout broadcast entry loss) are still on TASKS.md and may warrant the same triage.

---

## [2026-05-23-001000] Unicode block separation makes diacritic-stripping surgical — no per-script handling needed for Arabic/Indic/Hebrew/CJK

**Context**: While building `i18n.MatchKey` (commit 978582f5) for diacritic-insensitive placeholder matching, the natural reflex was "this is going to need per-script special cases — CJK doesn't have case, Arabic has shadda/fatha that are meaning-changing, Bengali vowel signs are script-essential, Hebrew niqqud distinguishes words." I sized the work assuming we'd need a script-aware policy, possibly with a locale config or an opt-in flag for "strip all combining marks" vs "strip only Latin-style decoration". Empirical test across Turkish/German/French/Spanish/Catalan/Czech/Vietnamese (should collapse) and Arabic/Bengali/Devanagari/Hindi/Hebrew/Chinese/Korean (should preserve) showed the entire policy fits in one numeric range: U+0300..U+036F.
Expand Down
97 changes: 74 additions & 23 deletions .context/TASKS.md
Original file line number Diff line number Diff line change
Expand Up @@ -372,30 +372,23 @@ TASK STATUS LABELS:

- [ ] Human: test `ctx init` on a fresh ubuntu install.

- [ ] Improve hub failover client: distinguish auth errors
- [x] Improve hub failover client: distinguish auth errors
(Unauthenticated/PermissionDenied) from connection errors. Fail fast on auth
failures instead of cycling through all peers with the same invalid token.
#priority:low #added:2026-04-08-194612

- [ ] Add file locking to ctx connect sync state to prevent concurrent sync
races. Two sync processes (hook + manual) can both load the same LastSequence,
process the same entries, and write duplicate content to .context/shared/.
#priority:medium #added:2026-04-08-194557

- [ ] Fix fanout broadcast entry loss: non-blocking send drops entries to slow
listeners silently. Log when entries are dropped. Consider per-listener
backpressure or disconnect-on-lag. Buffer of 64 is too small for busy hubs.
#priority:medium #added:2026-04-08-194542

- [ ] Prevent duplicate client registration in hub store: RegisterClient should
reject if ProjectName already exists. Add token revocation support (delete
client by ID/project). Currently tokens are valid forever with no way to
disable compromised ones. #priority:medium #added:2026-04-08-194529

- [ ] Fix hub cluster: NewCluster result is discarded (not stored on Server), so
Raft runs but leadership status is never queryable. Store cluster reference on
Server, wire IsLeader/LeaderAddr into Status RPC and hub status command.
#priority:medium #added:2026-04-08-194511
#priority:low #added:2026-04-08-194612 #completed:2026-05-23
Implementation already landed (commit 8bcb6208, the original failover
feature): `internal/hub/failover.go:61-63` calls `authErr(callErr)` and
returns immediately on auth errors; `internal/hub/err_check.go:22-30`
`authErr()` checks both `codes.Unauthenticated` and
`codes.PermissionDenied`. The task was open because no test specifically
asserted the auth-fast-fail path — the three existing failover tests
cover happy-path, skip-bad-peer, and all-bad-peers but not the
"stop walking on auth failure" invariant. Added
`TestFailoverClient_FailsFastOnAuthError`: seeds a bogus token, lists
two peers (real server first, unrouted port second), asserts the
returned gRPC code is Unauthenticated/PermissionDenied rather than
Unavailable — an Unavailable would prove the walk cycled past auth
into the unrouted second peer (the exact regression to catch).

- [ ] Use crypto/subtle.ConstantTimeCompare for hub token validation instead of
string equality. Current Store.ValidateToken uses == which is vulnerable to
Expand Down Expand Up @@ -522,6 +515,54 @@ global state mutation.
discovered patterns against known failure modes in similar
systems. #added:2026-03-25-060000

- [ ] ctx-architecture-next — fourth step in the architecture
pipeline (map → enrich → hunt → **prescribe**).
**Context**: The three existing skills produce inputs
(`ARCHITECTURE.md`, `DETAILED_DESIGN*.md` from
`/ctx-architecture`; enriched verifications from
`/ctx-architecture-enrich`; ranked failure inventory from
`/ctx-architecture-failure-analysis`'s `DANGER-ZONES.md`).
But the agent then has to synthesize "so what do I DO?" on
its own, every time, from those raw artifacts. The fourth
step closes the pipeline by producing `NEXT-ACTIONS.md` —
a sequenced, prioritized fix plan that maps each danger
zone to a concrete next move (refactor, test, doc,
escalate, accept) with effort estimates and a suggested
order.
**Distinct from ctx-architecture-extend (skipped)**: that
was about *where features grow*; this is about *what to
fix first*. Extend overlapped with DETAILED_DESIGN and
enrich's registration sites. Next has no overlap — pure
synthesis layer over the prior three artifacts. The
pipeline is now 4 because each step has a distinct output
document: map(ARCHITECTURE) → enrich(verified ARCHITECTURE)
→ hunt(DANGER-ZONES) → prescribe(NEXT-ACTIONS).
**No MCP gateway required**: this skill consumes only the
three Markdown artifacts produced by the prior skills,
which already absorbed any GitNexus-derived signal during
the enrich step. The synthesis is a pure-reasoning pass on
the agent side. Aligns with the decision that ctx does not
proxy / gateway companion MCPs; see DECISIONS.md
"MCP gateway not worth the coupling cost".
Scope sketch (refine when implementing):
- [ ] Design SKILL.md: inputs (three artifacts), output
shape (`NEXT-ACTIONS.md` with ranked sections), quality
checklist (every action cites a danger zone; every
danger zone has an action OR an explicit "accepted"
rationale).
- [ ] Define the action taxonomy: refactor, test, doc,
escalate, accept. Each carries effort estimate (S/M/L)
and a suggested sequence position.
- [ ] Reference run against ctx itself: produce
`NEXT-ACTIONS.md` from the existing DANGER-ZONES.md if
one has been generated; otherwise generate the whole
4-step pipeline against ctx as the worked example.
- [ ] Document the pipeline order in
`docs/recipes/architecture-mapping.md` (or wherever the
existing 3-step recipe lives): "run all four in
sequence; each step's output feeds the next".
#priority:medium #added:2026-05-23

- [-] ctx-architecture-extend
Skipped: extension point analysis is covered by /ctx-architecture
DETAILED_DESIGN (per-module) and /ctx-architecture-enrich
Expand Down Expand Up @@ -1354,11 +1395,21 @@ Not a fit (keep in `ctx`):
graph/RAG MCP is configured in .ctxrc, verify connection status, recommend
installation if missing #added:2026-03-25-120000

- [ ] Explore pluggable graph tool interface — replace hardcoded GitNexus
- [-] Explore pluggable graph tool interface — replace hardcoded GitNexus
references in skill text with configurable .ctxrc graph_tool key. Skills use
template placeholder instead of literal tool names. Define minimum interface
contract (query, context, impact). Spec:
`ideas/spec-mcp-warm-up-ceremony.md` #added:2026-03-25-120000
**Skipped 2026-05-23**: contradicts the committed-to-GitNexus
direction recorded in DECISIONS.md "MCP gateway not worth the
coupling cost". Pluggable abstraction implies multiple
candidate graph tools, which in turn implies ctx vouches for
the interface contract across implementations — exactly the
ownership coupling we're avoiding. If a second viable graph
tool emerges that's worth the cost of pluggability, revisit
by un-skipping; the design sketch in
`ideas/spec-mcp-warm-up-ceremony.md` stays available as the
starting point.

### Phase: ctx Hub follow-ups (PR #60)

Expand Down
43 changes: 43 additions & 0 deletions internal/hub/failover_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,9 @@ package hub

import (
"testing"

"google.golang.org/grpc/codes"
"google.golang.org/grpc/status"
)

// TestFailoverClient_FirstPeerWorks verifies that the
Expand Down Expand Up @@ -102,3 +105,43 @@ func TestFailoverClient_AllBad(t *testing.T) {
t.Fatal("expected error when all peers bad")
}
}

// TestFailoverClient_FailsFastOnAuthError verifies that an
// auth failure on the first reachable peer halts the
// failover walk: subsequent peers are not contacted, since
// the same token would fail there too.
//
// The reachable first peer is a real server that rejects
// the invalid bearer with codes.Unauthenticated. The
// second peer is `127.0.0.1:1` — an unrouted port that
// would surface a connection-class error (Unavailable) if
// the walk continued past the first auth failure. So an
// Unauthenticated return code proves the walk stopped at
// the first peer; an Unavailable return code would prove a
// regression (the walk cycled past auth and tried the
// second peer, where dial succeeded but the unrouted port
// produced a different error class).
func TestFailoverClient_FailsFastOnAuthError(t *testing.T) {
_, conn, _ := startTestServer(t)
addr := conn.Target()

_, foErr := newFailoverClient(
[]string{addr, "127.0.0.1:1"},
"bogus-token-that-the-server-will-reject",
)
if foErr == nil {
t.Fatal("expected auth error on first peer; got nil")
}
s, ok := status.FromError(foErr)
if !ok {
t.Fatalf("expected gRPC status error; got %T: %v", foErr, foErr)
}
if s.Code() != codes.Unauthenticated && s.Code() != codes.PermissionDenied {
t.Errorf(
"got code %s; want Unauthenticated or PermissionDenied "+
"(if Unavailable, the walk cycled past auth into the "+
"unrouted second peer — auth-fast-fail regression)",
s.Code(),
)
}
}
64 changes: 64 additions & 0 deletions internal/hub/store_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -145,6 +145,70 @@ func TestStoreRegisterAndValidate(t *testing.T) {
}
}

// TestStoreValidateToken_RejectsNearMissTokens pins the
// timing-attack-resistance contract of Store.ValidateToken.
//
// The implementation uses an O(1) map lookup on
// s.tokenIdx[bearerToken] followed by a defensive
// crypto/subtle.ConstantTimeCompare against the stored
// token. The CTC is technically redundant once the map
// lookup hits — Go map keys match exact-byte by
// definition — but it's the explicit signal of intent
// that would catch a future "simplification" PR
// collapsing both checks back to a single == or
// strings.HasPrefix.
//
// This test exercises the *behavior* that the CTC
// defends: no near-miss token (one byte off, prefix
// only, extra suffix bytes, case-shifted) ever
// validates. If a regression replaces the careful
// chain with a prefix matcher or a non-constant
// comparison, these cases start passing when they
// shouldn't.
func TestStoreValidateToken_RejectsNearMissTokens(t *testing.T) {
dir := t.TempDir()
s, err := NewStore(dir)
if err != nil {
t.Fatal(err)
}
const valid = "tok_abc123_with_some_length"
if regErr := s.RegisterClient(ClientInfo{
ID: "c1", ProjectName: "near-miss-proj", Token: valid,
}); regErr != nil {
t.Fatal(regErr)
}

// Sanity: the valid token still validates.
if s.ValidateToken(valid) == nil {
t.Fatal("valid token should validate; suite bug")
}

rejected := []struct {
name, token string
}{
{"empty", ""},
{"last byte changed", valid[:len(valid)-1] + "X"},
{"first byte changed", "X" + valid[1:]},
{"middle byte changed", valid[:len(valid)/2] + "X" + valid[len(valid)/2+1:]},
{"prefix only", valid[:len(valid)/2]},
{"extra suffix appended", valid + "X"},
{"case-shifted", "TOK_ABC123_WITH_SOME_LENGTH"},
{"whitespace-padded", " " + valid + " "},
{"all-X same length", "XXXXXXXXXXXXXXXXXXXXXXXXXXX"},
}
for _, tc := range rejected {
t.Run(tc.name, func(t *testing.T) {
if got := s.ValidateToken(tc.token); got != nil {
t.Errorf(
"ValidateToken(%q) = %+v; want nil "+
"(near-miss/partial-match must not validate)",
tc.token, got,
)
}
})
}
}

func TestStoreStats(t *testing.T) {
dir := t.TempDir()
s, err := NewStore(dir)
Expand Down
Loading