
feat(kernel): §M11 session-liveness gate + §M12 plumbing hook #30

Merged
MSD21091969 merged 3 commits into master from round11/pr3-m11-liveness-gate on Apr 21, 2026


@MSD21091969
Collaborator

Summary

Ships §M11 (kernel-liveness as session-occupancy) per doctrine in kb/research/kernel/20260417-t187-kernel-proper.md and the design answers Guido landed on ffs0#33. Plumbs §M12's AdminScopeRewrite call site as dormant so PR 4 becomes a pure-additive diff.

Closes the three design questions from the T=171 M11/M12 implementation plan:

| Q | Answer shipped |
| --- | --- |
| Q1 — where does session URN come from? | Explicit session_urn field on envelope wins; unambiguous reverse-lookup falls back; ambiguous/absent rejects |
| Q2 — heartbeat rewrites: above or below the liveness line? | Below. Allowlist + SeedIfAbsent bypass |
| Q3 — PR 4 admin-scope classifier boundary | Plumbed dormant (returns false). PR 4 fills in |

Changes

graph/rewrite.go — envelope surface

New optional field SessionURN URN with json tag session_urn. Additive, backward-compatible (omitempty). Clients that omit it fall through the reverse-lookup path; clients where one actor drives multiple sessions MUST set it.
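The backward-compatibility claim can be checked with a small round trip. The sketch below is illustrative only: the `SessionURN` field and its `session_urn,omitempty` tag come from the description above, while the string-backed `URN` type, the `Actor` and `Op` fields, and the `encode`/`decode` helpers are assumptions made for the example, not the real `graph.Envelope`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// URN is assumed to be string-backed; the real kernel type may differ.
type URN string

// Envelope is a hypothetical, trimmed-down envelope. Only SessionURN and its
// json tag are taken from the PR description; Actor and Op are illustrative.
type Envelope struct {
	Actor      URN    `json:"actor"`
	Op         string `json:"op"`
	SessionURN URN    `json:"session_urn,omitempty"`
}

// encode marshals an envelope and returns the JSON text.
func encode(env Envelope) string {
	b, err := json.Marshal(env)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// decode parses JSON text into an envelope.
func decode(s string) Envelope {
	var env Envelope
	if err := json.Unmarshal([]byte(s), &env); err != nil {
		panic(err)
	}
	return env
}

func main() {
	// A client that never sets the field emits the same wire shape as before.
	fmt.Println(encode(Envelope{Actor: "user:sam", Op: "ADD"}))

	// A client driving several sessions sets it explicitly.
	fmt.Println(encode(Envelope{Actor: "user:sam", Op: "ADD", SessionURN: "session:sam.governance"}))

	// An older persisted envelope decodes with an empty SessionURN.
	fmt.Println(decode(`{"actor":"user:sam","op":"ADD"}`).SessionURN == "")
}
```

Because `omitempty` drops the zero value on encode and a missing key leaves the zero value on decode, an empty `SessionURN` is the signal for the reverse-lookup path in this sketch.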

operad/session_context.go — resolvers + classifiers (new)

  • ResolveSessionForEnvelope(state, env) returns a structured ResolveSessionResult with one of six kinds: ActorIsSession, Explicit, Inferred, ExplicitMismatch, Ambiguous (populates Candidates), Absent.
  • SystemInternalEnvelope(env) allowlist: kernel-URN actors (sweep WF13, reactive type-drift, admin-authority MUTATEs) + ADD of infrastructure types (user, workstation, kernel).
  • AdminScopeRewrite(env, state) returns false — PR 3 plumbing hook. PR 4 fills it in.
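The resolution order described for ResolveSessionForEnvelope can be sketched as follows. This is a simplified model under stated assumptions, not the code in operad/session_context.go: the `State` type is a hypothetical flattened map from session to occupants, the kind names are shortened, the ActorIsSession case is left out for brevity, and only one of the mismatch variants (the actor does not occupy the named session) is modelled.

```go
package main

import (
	"fmt"
	"sort"
)

type URN string

// Kind mirrors five of the six result kinds named above (ActorIsSession omitted).
type Kind int

const (
	Explicit Kind = iota
	Inferred
	ExplicitMismatch
	Ambiguous
	Absent
)

// State is a hypothetical flattened view: session URN -> occupant URNs.
type State struct {
	Occupants map[URN][]URN
}

type Envelope struct {
	Actor      URN
	SessionURN URN
}

type Result struct {
	Kind       Kind
	SessionURN URN
	Candidates []URN // populated only for Ambiguous
}

// occupies reports whether actor is recorded as an occupant of session.
func (s State) occupies(actor, session URN) bool {
	for _, o := range s.Occupants[session] {
		if o == actor {
			return true
		}
	}
	return false
}

// Resolve applies the order from the description: an explicit session_urn
// wins, otherwise an unambiguous reverse lookup is used, otherwise reject.
func Resolve(s State, env Envelope) Result {
	if env.SessionURN != "" {
		if s.occupies(env.Actor, env.SessionURN) {
			return Result{Kind: Explicit, SessionURN: env.SessionURN}
		}
		return Result{Kind: ExplicitMismatch, SessionURN: env.SessionURN}
	}
	var candidates []URN
	for session := range s.Occupants {
		if s.occupies(env.Actor, session) {
			candidates = append(candidates, session)
		}
	}
	// Sort so error messages listing candidates are deterministic.
	sort.Slice(candidates, func(i, j int) bool { return candidates[i] < candidates[j] })
	switch len(candidates) {
	case 0:
		return Result{Kind: Absent}
	case 1:
		return Result{Kind: Inferred, SessionURN: candidates[0]}
	default:
		return Result{Kind: Ambiguous, Candidates: candidates}
	}
}

func main() {
	st := State{Occupants: map[URN][]URN{
		"session:sam.governance": {"user:sam"},
		"session:sam.scratch":    {"user:sam"},
		"session:ana.main":       {"user:ana"},
	}}
	fmt.Println(Resolve(st, Envelope{Actor: "user:ana"}).Kind == Inferred)
	fmt.Println(Resolve(st, Envelope{Actor: "user:sam"}).Candidates)
}
```

Returning the candidates on the ambiguous path lets the caller's error message say which `session_urn` values would disambiguate the envelope.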

kernel/liveness.go — the gate (new)

Runtime.checkLiveness(env) called from Apply + ApplyProgram before operad validation so failure paths are short. Registry-less mode is a no-op. Error messages cite §M11 / §M12 and name the failure mode.

kernel/runtime.go — integration

  • Apply routes through applyWithOptions(env, applyOptions{}). Public API unchanged.
  • SeedIfAbsent calls applyWithOptions(env, applyOptions{skipLiveness: true}) — bootstrap runs before any session exists, so requiring occupancy would deadlock the first run. The only structural bypass; all other emitters hit the allowlist or pass the resolver.
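The bypass can be pictured with an unexported options struct. In the sketch below the names `applyWithOptions`, `applyOptions`, `skipLiveness`, `Apply`, and `SeedIfAbsent` come from the description; the `Envelope` fields, the in-memory log, and the gate's rule are assumptions, and the "if absent" check itself is omitted.

```go
package main

import (
	"errors"
	"fmt"
)

// Envelope is hypothetical: HasSession stands in for a successful resolution.
type Envelope struct {
	Actor      string
	HasSession bool
}

// applyOptions is unexported, so only code inside the package can skip the gate.
type applyOptions struct {
	skipLiveness bool
}

type Runtime struct {
	log []Envelope
}

// checkLiveness is a stand-in gate: reject envelopes without a session.
func (rt *Runtime) checkLiveness(env Envelope) error {
	if !env.HasSession {
		return errors.New("§M11: no occupied session for actor " + env.Actor)
	}
	return nil
}

// applyWithOptions is the single internal entry point.
func (rt *Runtime) applyWithOptions(env Envelope, opts applyOptions) error {
	if !opts.skipLiveness {
		if err := rt.checkLiveness(env); err != nil {
			return err
		}
	}
	rt.log = append(rt.log, env)
	return nil
}

// Apply keeps its public signature and always runs the gate.
func (rt *Runtime) Apply(env Envelope) error {
	return rt.applyWithOptions(env, applyOptions{})
}

// SeedIfAbsent is the only caller that skips the gate (bootstrap from zero state).
func (rt *Runtime) SeedIfAbsent(env Envelope) error {
	return rt.applyWithOptions(env, applyOptions{skipLiveness: true})
}

func main() {
	rt := &Runtime{}
	seed := Envelope{Actor: "user:sam"} // no session exists yet
	fmt.Println(rt.SeedIfAbsent(seed))
	fmt.Println(rt.Apply(seed))
}
```

Keeping the option unexported means external callers cannot opt out of the gate; the zero value of `applyOptions` is the safe default.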

Replay unaffected (prospective-only invariant)

fold.Replay does not call checkLiveness — same pattern as PR 1. Pre-PR-3 persisted envelopes (no SessionURN field) rebuild state identically. TestReplay_PreservesPreM11Rewrites pins the invariant.

Tests (30 added)

operad/session_context_test.go (19):

  • Resolver: ActorIsSession, Explicit (OK + 3 mismatch variants), Inferred, Ambiguous (with Candidates), Absent
  • Classifier: kernel/sweep actors allowlisted, infrastructure ADDs allowlisted, user-actor non-infra ADD/LINK NOT allowlisted
  • AdminScopeRewrite dormant across three representative envelopes

kernel/liveness_test.go (11):

  • Apply paths for Explicit, Inferred, Ambiguous (rejected), Absent (rejected), ExplicitMismatch (rejected)
  • Allowlist: kernel-actor, infrastructure ADD
  • SeedIfAbsent bypass
  • ApplyProgram atomic rejection when one envelope fails liveness
  • fold.Replay invariant for pre-M11 logs
  • Registry-less no-op
go build ./...  # clean
go test ./...   # all packages pass

Stacking

  • Independent of feat(operad): RotateSessionOccupant helper (§M19 rotation) #29 per Guido's plan §4 (liveness reads ResolveSessionOccupant; rotation writes via RotateSessionOccupant). No merge conflict.
  • Unblocks PR 4 (§M12 fills in AdminScopeRewrite + extends classifier for kernel type ADDs per Guido's flag).
  • Kernel seed landing post-merge: since the allowlist covers infrastructure ADDs + kernel actors, and SeedIfAbsent bypasses structurally, the post-restart seed flow works without additional changes.

Doctrine references

  • kb/research/kernel/20260417-t187-kernel-proper.md §§M11, M12
  • kb/research/kernel/20260421-t171-m11-m12-implementation-plan.md
  • ffs0#33 (handoff thread), specifically Guido's design-question answers

🤖 Generated with Claude Code

Implements §M11 (kernel-liveness as session-occupancy) per doctrine in
kb/research/kernel/20260417-t187-kernel-proper.md and the design answers
Guido landed on ffs0#33. Plumbs §M12's AdminScopeRewrite call site as
dormant so PR 4 becomes a pure-additive diff.

Closes the three design questions from the T=171 M11/M12 implementation
plan:

  Q1 — session_urn on envelope: explicit field wins, unambiguous reverse-
       lookup falls back, ambiguous or absent rejects.
  Q2 — heartbeat line: below. Allowlist covers sweep WF13, kernel-actor
       emissions, and infrastructure ADDs (user/workstation/kernel).
       SeedIfAbsent additionally bypasses liveness structurally so
       bootstrap from zero state works on every fresh kernel.
  Q3 — admin-scope classifier: dormant in PR 3 (returns false). PR 4
       fills the logic for authority_scope=kernel MUTATEs on non-kernel
       nodes + ontology-governed type touches.

Envelope surface (graph/rewrite.go):

- New optional SessionURN URN field with json tag "session_urn". Additive
  and backward-compatible: clients that omit it fall through the reverse-
  lookup path; clients where one actor drives multiple sessions must
  set it. Docstring updated with §M11 reference.

Resolver (operad/session_context.go):

- ResolveSessionForEnvelope(state, env) returns a structured
  ResolveSessionResult with one of six Kind values:
    - ResolveSessionActorIsSession  — actor's node.TypeID == "session"
    - ResolveSessionExplicit        — env.SessionURN verified
    - ResolveSessionInferred        — unambiguous reverse-lookup
    - ResolveSessionExplicitMismatch
    - ResolveSessionAmbiguous       (Candidates populated)
    - ResolveSessionAbsent
  Called by the kernel liveness gate; exported so PR 4's §M12 pass can
  reuse the same resolution for capability walks.

- SystemInternalEnvelope(env) allowlist classifier:
    - kernel-URN actors (sweep WF13, reactive type-drift, admin-authority
      MUTATEs)
    - ADD of infrastructure types (user, workstation, kernel)
  Kept conservative; additions weaken §M11.

- AdminScopeRewrite(env, state) returns false — PR 3 plumbing hook for
  §M12. PR 4 fills in authority_scope=kernel MUTATE detection and
  ontology-governed-type touches.
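The allowlist classifier above can be sketched as a pure predicate. The three infrastructure types are taken from the text; the `Envelope` shape, the `Op` strings, and the `kernel:` URN prefix used to recognise kernel actors are assumptions for illustration, not the kernel's actual conventions.

```go
package main

import (
	"fmt"
	"strings"
)

// Envelope is a hypothetical minimal shape for classification.
type Envelope struct {
	Actor  string
	Op     string // e.g. "ADD", "LINK", "MUTATE"
	TypeID string // type of the node being added, if any
}

// infrastructureTypes lists the ADD targets exempt from the liveness gate.
var infrastructureTypes = map[string]bool{
	"user":        true,
	"workstation": true,
	"kernel":      true,
}

// isKernelActor assumes kernel URNs share a "kernel:" prefix.
func isKernelActor(actor string) bool {
	return strings.HasPrefix(actor, "kernel:")
}

// SystemInternalEnvelope reports whether env is exempt from the liveness gate.
// The list is deliberately short: every addition weakens the gate.
func SystemInternalEnvelope(env Envelope) bool {
	if isKernelActor(env.Actor) {
		return true
	}
	return env.Op == "ADD" && infrastructureTypes[env.TypeID]
}

func main() {
	fmt.Println(SystemInternalEnvelope(Envelope{Actor: "kernel:sweep", Op: "MUTATE"}))
	fmt.Println(SystemInternalEnvelope(Envelope{Actor: "user:sam", Op: "ADD", TypeID: "workstation"}))
	fmt.Println(SystemInternalEnvelope(Envelope{Actor: "user:sam", Op: "ADD", TypeID: "note"}))
}
```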

Gate (kernel/liveness.go):

- Runtime.checkLiveness(env) is called from Apply and ApplyProgram BEFORE
  operad validation so failure paths are short.
- Registry-less mode: no-op. Matches existing validator shape.
- Error messages cite §M11 / §M12 and name the failure mode so log
  readers can trace the doctrine path.

SeedIfAbsent bypass (kernel/runtime.go):

- Apply now routes through applyWithOptions(env, applyOptions{}). Public
  API unchanged.
- SeedIfAbsent calls applyWithOptions(env, applyOptions{skipLiveness: true}).
  Bootstrap runs before any session exists, so requiring occupancy would
  deadlock the first run. Only exception; all other emitters either hit
  the allowlist or pass the resolver.

Replay unaffected (prospective-only invariant):

- fold.Replay does not call checkLiveness (same pattern as PR 1).
  Pre-PR-3 persisted envelopes (no SessionURN field) rebuild state
  identically. Test pins the invariant.

Tests (30 added):

  operad/session_context_test.go — 19:
    - Resolver: ActorIsSession, Explicit (OK + three mismatch variants),
      Inferred, Ambiguous (Candidates populated), Absent
    - Classifier: kernel/sweep actors, infrastructure ADDs, user-actor
      non-infra ADD/LINK (both rejected)
    - AdminScopeRewrite dormant across three representative envelopes

  kernel/liveness_test.go — 11:
    - Apply paths for Explicit, Inferred, Ambiguous (rejected), Absent
      (rejected), ExplicitMismatch (rejected)
    - Allowlist: kernel-actor and infrastructure ADD
    - SeedIfAbsent bypass
    - ApplyProgram atomic rejection when one envelope fails liveness
    - fold.Replay invariant for pre-M11 logs
    - Registry-less no-op

Build + test clean:
  go build ./...
  go test ./...

Stacking: PR 3 is independent of #29 per Guido's §4 plan. Merge order
within round 11: this PR → PR 4 (§M12 fills in AdminScopeRewrite +
extends classifier) → round close.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 21, 2026 13:37

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request implements the §M11 liveness gate and §M12 admin-capability hooks to enforce session occupancy requirements for kernel rewrites. It introduces a SessionURN field to the Envelope struct and adds logic to resolve session contexts, with exemptions for system-internal and bootstrap operations. Review feedback highlights critical issues including missing constant and function definitions that will prevent compilation, as well as data races and logic errors in the Runtime where liveness checks are performed without adequate locking or consideration for state changes within atomic programs.

Comment on lines +3 to +5
import (
"moos/kernel/internal/graph"
)

high

The constants hasOccupantSrcPort and isOccupantOfTgtPort are used later in this file (lines 114 and 135) but are not defined anywhere in the package. This will cause a compilation error. Please define them here to match the system's port naming convention.

Suggested change
- import (
- 	"moos/kernel/internal/graph"
- )
+ import (
+ 	"moos/kernel/internal/graph"
+ )
+
+ const (
+ 	hasOccupantSrcPort  = "has-occupant"
+ 	isOccupantOfTgtPort = "is-occupant-of"
+ )

//
// Returns nil on pass, a fmt.Errorf wrapping the failure mode on reject.
// Error messages name the doctrine section so log readers can trace back.
func (rt *Runtime) checkLiveness(env graph.Envelope) error {

high

To avoid race conditions and support ApplyProgram's evolving state, checkLiveness should accept the state as an argument instead of accessing rt.state directly. This allows the caller to provide a thread-safe snapshot or a working state.

Suggested change
- func (rt *Runtime) checkLiveness(env graph.Envelope) error {
+ func (rt *Runtime) checkLiveness(state graph.GraphState, env graph.Envelope) error {


// Step 2 — resolve session context. Pass the live state so reverse
// has-occupant walks see the most recent seat assignments.
res := operad.ResolveSessionForEnvelope(rt.state, env)

high

Use the state parameter instead of rt.state to ensure consistency and support ApplyProgram's working state.

Suggested change
- res := operad.ResolveSessionForEnvelope(rt.state, env)
+ res := operad.ResolveSessionForEnvelope(state, env)

Comment on lines +78 to +79
if operad.AdminScopeRewrite(env, rt.state) {
if !operad.CheckAdminCapability(rt.state, env.Actor) {

high

Use the state parameter instead of rt.state. Additionally, operad.CheckAdminCapability is not defined in the operad package in this PR, which will cause a compilation error. A stub must be provided even if the full implementation is deferred to PR 4.

Suggested change
- if operad.AdminScopeRewrite(env, rt.state) {
- 	if !operad.CheckAdminCapability(rt.state, env.Actor) {
+ if operad.AdminScopeRewrite(env, state) {
+ 	if !operad.CheckAdminCapability(state, env.Actor) {

Comment thread internal/kernel/runtime.go Outdated
// session context fails fast without paying the structural-validation
// cost. The check is registry-aware: registry-less mode passes through.
if !opts.skipLiveness {
if err := rt.checkLiveness(env); err != nil {

high

Calling checkLiveness here is a data race because it accesses rt.state without holding rt.mu. Since rt.state is mutated under the write lock in other goroutines, this check must be moved inside the lock (after line 112) and updated to use the current state snapshot.

// single session-less envelope fails the whole batch fast.
func (rt *Runtime) ApplyProgram(envelopes []graph.Envelope) ([]graph.EvalResult, error) {
for _, env := range envelopes {
if err := rt.checkLiveness(env); err != nil {

high

There are two significant issues here: 1) Accessing rt.state inside checkLiveness without a lock is a data race. 2) Checking liveness for all envelopes against the initial state breaks programs that create a session and use it in the same batch (e.g., ADD session followed by an ADD using that session). This check must be moved into the workingState loop (under the lock) to ensure it validates against the state as it evolves sequentially.

@github-project-automation github-project-automation Bot moved this to Todo in mo:os Apr 21, 2026

Copilot AI left a comment


Pull request overview

Implements §M11 session-liveness gating in the kernel (with a prospective-only invariant for replay) and adds §M12 “admin-scope” plumbing as a dormant hook, while extending the rewrite envelope surface with an optional session_urn field to support unambiguous session resolution.

Changes:

  • Add Envelope.SessionURN (json:"session_urn,omitempty") to support explicit session context on rewrites.
  • Introduce pure operad helpers to resolve session context and classify system-internal envelopes; add a dormant AdminScopeRewrite hook.
  • Integrate a new Runtime.checkLiveness gate into Apply/ApplyProgram, with SeedIfAbsent bypassing liveness for bootstrap seeding; add targeted tests.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| internal/graph/rewrite.go | Adds optional SessionURN field to the envelope JSON surface and documents its semantics. |
| internal/operad/session_context.go | Implements session resolution + allowlist classifier and provides a dormant §M12 admin-scope classifier hook. |
| internal/operad/session_context_test.go | Adds unit tests covering resolver outcomes and classifier behavior (incl. dormant §M12 hook). |
| internal/kernel/liveness.go | Adds Runtime.checkLiveness implementing §M11 gate + §M12 hook integration. |
| internal/kernel/runtime.go | Wires liveness into Apply/ApplyProgram and adds internal applyWithOptions + SeedIfAbsent bypass. |
| internal/kernel/liveness_test.go | Adds tests for liveness acceptance/rejection paths, allowlist behavior, program atomic rejection, replay invariance, and registry-less mode. |


Comment on lines +76 to +80
func ResolveSessionForEnvelope(state graph.GraphState, env graph.Envelope) ResolveSessionResult {
// Case 1 — actor is itself a session node. No hop needed.
if actorNode, ok := state.Nodes[env.Actor]; ok && actorNode.TypeID == "session" {
return ResolveSessionResult{Kind: ResolveSessionActorIsSession, SessionURN: env.Actor}
}

Copilot AI Apr 21, 2026


ResolveSessionForEnvelope treats env.Actor being a session node as an automatic success (ResolveSessionActorIsSession) without verifying that the session is actually occupied (has a canonical WF19 has-occupant relation). Because kernel.checkLiveness treats this Kind as a pass, an unoccupied session node can currently emit user-space rewrites and bypass §M11’s “session-occupancy” gate. If session-actor envelopes are meant to be allowed, the resolver (or the liveness gate) should still require the session to be occupied; otherwise return Absent/ExplicitMismatch for unoccupied sessions.

Comment thread internal/kernel/runtime.go Outdated
// session context fails fast without paying the structural-validation
// cost. The check is registry-aware: registry-less mode passes through.
if !opts.skipLiveness {
if err := rt.checkLiveness(env); err != nil {

Copilot AI Apr 21, 2026


applyWithOptions calls rt.checkLiveness(env) before taking rt.mu but checkLiveness reads rt.state (maps + indexes). This can race with concurrent Apply/ApplyProgram calls and lead to concurrent map read and map write panics under load. Fix by taking rt.mu.RLock() around the liveness check (or have checkLiveness operate on a snapshot/cloned state captured under RLock) before proceeding to validation/locking.

Suggested change
- if err := rt.checkLiveness(env); err != nil {
+ rt.mu.RLock()
+ err := rt.checkLiveness(env)
+ rt.mu.RUnlock()
+ if err != nil {

Comment on lines 174 to 181
func (rt *Runtime) ApplyProgram(envelopes []graph.Envelope) ([]graph.EvalResult, error) {
for _, env := range envelopes {
if err := rt.checkLiveness(env); err != nil {
return nil, err
}
if err := rt.validate(env); err != nil {
return nil, err
}

Copilot AI Apr 21, 2026


ApplyProgram performs rt.checkLiveness(env) for each envelope before acquiring rt.mu, but checkLiveness reads rt.state maps. If any other goroutine is applying/persisting at the same time, this introduces a data race and can panic with concurrent map access. Consider taking rt.mu.RLock() for the whole preflight loop (liveness + validate) or capturing a state snapshot under RLock and running all liveness checks against that snapshot.

Addresses all three Copilot line-level findings on #30.

(1) Unoccupied-session-as-actor bypass (session_context.go:80):

ResolveSessionForEnvelope previously returned ResolveSessionActorIsSession
unconditionally whenever env.Actor was a session node. An orphan session
(node exists but has no has-occupant relation) could therefore emit
user-space rewrites and sail past §M11. Fixed by requiring the session
itself to have at least one canonical has-occupant relation pointing at
a principal — mirrors CheckAdminCapability's hop pattern. Unoccupied
session-as-actor now returns ResolveSessionAbsent; the kernel gate
rejects with the same §M11 message as the general absent case.

New helper sessionHasAnyOccupant reuses ResolveSessionOccupant so the
principal-type check (user|agent) stays in one place.

(2) Apply: state-read race with concurrent writers (runtime.go:102):

applyWithOptions called rt.checkLiveness before acquiring rt.mu, and
checkLiveness reads rt.state.Nodes / Relations / indexes. Under
concurrent writes this is a data race with potential "concurrent map
read and map write" panics. Wrapped the checkLiveness call in
rt.mu.RLock() / RUnlock() so state reads are synchronised. The short
window between RUnlock and Lock is acceptable — fold.Evaluate under
the write-lock is the authoritative apply-time check for structural
invariants.

(3) ApplyProgram: same race across batch preflight (runtime.go:181):

The preflight loop called checkLiveness on every envelope against
rt.state without a lock. Wrapped the entire preflight loop in a single
RLock / RUnlock so liveness observations across the batch are
consistent with each other. Release before acquiring Lock for the
apply body.

No lock-upgrade is attempted — Go's sync.RWMutex does not support it —
and reactive / sweep paths are unaffected because they use their own
lock discipline (applyReactiveLocked is called with Lock already held).

Tests (3 added):

- operad.TestResolveSessionForEnvelope_ActorIsSession_Unoccupied — pins
  the resolver fix. Unoccupied session returns Absent, not ActorIsSession.
- kernel.TestApply_M11_UnoccupiedSessionAsActor_Rejected — integration:
  end-to-end Apply rejects the envelope with §M11 error.
- kernel.TestApply_M11_OccupiedSessionAsActor_Accepted — positive pair:
  an occupied session-as-actor still passes, covering the kernel-internal
  session-heartbeat / turn-count path.

go build ./...                    # clean
go test ./...                     # all packages pass

Note: go test -race skipped locally (CGO / gcc not available on this
Windows host). Fix correctness argued by construction: RLock held for
all state reads in checkLiveness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@MSD21091969
Collaborator Author

Review fixup pushed as 6e67893. All 3 Copilot findings addressed.

(1) session_context.go:80 — unoccupied-session-as-actor bypass

Real correctness bug. Previously ResolveSessionForEnvelope returned ResolveSessionActorIsSession any time env.Actor was a session node, without checking whether that session was occupied. An orphan session (node exists, no has-occupant) could emit user-space rewrites and bypass §M11 entirely.

Fixed by requiring the session to have at least one canonical has-occupant relation pointing at a principal. Mirrors CheckAdminCapability's hop pattern. Unoccupied session-as-actor now returns ResolveSessionAbsent; the kernel gate rejects with the same §M11 error as the general absent case.

New helper sessionHasAnyOccupant reuses ResolveSessionOccupant so the principal-type check (user|agent) stays in one place.
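The fix can be sketched as an extra hop before a session-as-actor is accepted. The helper name `sessionHasAnyOccupant`, the `user|agent` principal check, and the direction of the has-occupant relation follow the text above; the `State`, `Node`, and `Relation` shapes, the `actorIsLiveSession` wrapper, and the literal port value `"has-occupant"` are assumptions, and the real helper delegates to ResolveSessionOccupant rather than walking relations itself.

```go
package main

import "fmt"

type Node struct{ TypeID string }

// Relation is a hypothetical edge: Src --SrcPort--> Tgt.
type Relation struct {
	Src, SrcPort, Tgt string
}

type State struct {
	Nodes     map[string]Node
	Relations []Relation
}

// The port name is assumed; the real constant lives in the operad package.
const hasOccupantSrcPort = "has-occupant"

// sessionHasAnyOccupant reports whether session has at least one has-occupant
// relation pointing at a principal (user or agent).
func sessionHasAnyOccupant(s State, session string) bool {
	for _, r := range s.Relations {
		if r.Src != session || r.SrcPort != hasOccupantSrcPort {
			continue
		}
		if t := s.Nodes[r.Tgt].TypeID; t == "user" || t == "agent" {
			return true
		}
	}
	return false
}

// actorIsLiveSession accepts a session-as-actor only when it is occupied.
func actorIsLiveSession(s State, actor string) bool {
	n, ok := s.Nodes[actor]
	return ok && n.TypeID == "session" && sessionHasAnyOccupant(s, actor)
}

func main() {
	st := State{
		Nodes: map[string]Node{
			"session:live":   {TypeID: "session"},
			"session:orphan": {TypeID: "session"},
			"user:sam":       {TypeID: "user"},
		},
		Relations: []Relation{
			{Src: "session:live", SrcPort: hasOccupantSrcPort, Tgt: "user:sam"},
		},
	}
	fmt.Println(actorIsLiveSession(st, "session:live"))
	fmt.Println(actorIsLiveSession(st, "session:orphan"))
}
```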

(2) runtime.go:102 - applyWithOptions state-read race

applyWithOptions called rt.checkLiveness(env) before acquiring rt.mu; checkLiveness reads rt.state.Nodes / Relations / indexes. Under concurrent writers this is a genuine data race with potential "concurrent map read and map write" panics.

Wrapped the checkLiveness call in rt.mu.RLock() / RUnlock(). The window between RUnlock and Lock is acceptable — fold.Evaluate under the write-lock is the authoritative apply-time check. No lock-upgrade attempted (Go's sync.RWMutex doesn't support it).

(3) runtime.go:181 - ApplyProgram same race across batch preflight

Wrapped the entire preflight loop in a single RLock / RUnlock. Liveness observations across the batch are now consistent with each other. Released before acquiring Lock for the apply body.

applyReactiveLocked and sweep paths are unaffected — they use their own lock discipline (reactive is called with Lock already held; sweep goes through ApplyProgram from a separate goroutine).

Tests

Three new, all pass:

  • operad.TestResolveSessionForEnvelope_ActorIsSession_Unoccupied — pins the resolver fix
  • kernel.TestApply_M11_UnoccupiedSessionAsActor_Rejected — integration, end-to-end rejection
  • kernel.TestApply_M11_OccupiedSessionAsActor_Accepted — positive pair; occupied session-as-actor still passes (covers kernel-internal session-heartbeat path)
go build ./...   # clean
go test ./...    # all packages pass

go test -race skipped locally (CGO/gcc not available on this Windows host). Correctness argued by construction: RLock held for all state reads in checkLiveness. Happy to run race tests if CI has a Linux runner with race support.

Ready for re-review.

Collaborator Author

@MSD21091969 MSD21091969 left a comment


LGTM on 6e67893 (the fixup commit). Gemini + Copilot reviews were posted against d31251b (initial), before your 12-minute-later fixup that closed both HIGH-priority concerns. The current head addresses everything substantive.

Verification of prior concerns

| Reviewer | Flag | Status on 6e67893 |
| --- | --- | --- |
| Gemini HIGH | session_context.go:5 "Constants not defined, compile error" | Stale. hasOccupantSrcPort / isOccupantOfTgtPort were defined by moos-kernel#29 at occupancy.go:43 and merged to master at 13:21 — 15 min before PR 30 opened. Your branch picks them up via master. Verified via grep -rn hasOccupantSrcPort internal/operad/ on local master post-pull. |
| Gemini HIGH | liveness.go:43/60/79 "Data race — reads rt.state without lock" | Fixed. applyWithOptions now does rt.mu.RLock() / checkLiveness / rt.mu.RUnlock() around the check. The comment is explicit about the release-before-write-lock pattern (RWMutex no upgrade). |
| Gemini HIGH | runtime.go:102/196 "Data race + ApplyProgram against initial state" | Fixed for the race. ApplyProgram holds rt.mu.RLock() for the entire preflight loop (liveness + validate). Initial-state-vs-working-state is a design call I'll raise separately below. |
| Gemini HIGH | liveness.go:79 "CheckAdminCapability not defined" | Stale. Exists at occupancy.go:142 on master (pre-round-11, confirmed by the M11/M12 plan §1 helper table). |
| Copilot | session_context.go:91 "ActorIsSession bypass lets orphan sessions emit rewrites" | Fixed. Resolver now returns ResolveSessionAbsent when an actor-session has no canonical has-occupant. Test TestApply_M11_UnoccupiedSessionAsActor_Rejected + positive-pair TestApply_M11_OccupiedSessionAsActor_Accepted pin the invariant. Comment on ResolveSessionForEnvelope §1 is explicit about the reasoning. Nice close. |
| Copilot | runtime.go:102/203 "Data race on rt.state in pre-lock checkLiveness" | Fixed. Same RLock pattern. |

Request Gemini + Copilot to re-run on 6e67893 if you want their LGTM on record; the concerns they raised don't re-apply.

Core design — all right

  • session_urn field (graph/rewrite.go) — additive + omitempty, the shape I recommended. Backward-compatible with clients that don't set it, which covers pre-PR-30 envelopes at replay and external clients during migration.
  • Structured resolver (ResolveSessionResult + 6-kind enum) — the Candidates field on Ambiguous is a good touch; lets the error message list what the caller could set session_urn to. Error surface quality matters here.
  • Allowlist + SeedIfAbsent bypass — kernel-URN actors, infrastructure ADDs, seed bootstrap. That's the three classes I'd have asked for. The skipLiveness option kept internal (not exported) is the right ergonomic call.
  • AdminScopeRewrite plumbed dormant — returning false in PR 3 and letting PR 4 fill it in means PR 4 is a pure-additive surface. Keeps each PR's review scope small.
  • fold.Replay invariant: TestReplay_PreservesPreM11Rewrites pins prospective-only. Same discipline as PR 1 / PR 27.

One design question — ApplyProgram preflight vs working state

Gemini's runtime.go:196 included a semantic concern beyond the race: "Checking liveness for all envelopes against the initial state breaks programs that create a session and use it in the same batch."

I think this is a non-issue under the current doctrine — here's why, and I want your read:

The resolver checks env.SessionURN or env.Actor's session context — those refer to the emitter's session (the session the rewrite runs under), not the session being created or modified. So a batch like:

seq N   ADD session:foo              actor=user:sam, session_urn=sam.governance
seq N+1 LINK ... on session:foo      actor=user:sam, session_urn=sam.governance

Both envelopes' session context is sam.governance (exists, occupied). The newly-ADDed session:foo is the target of the rewrites, not the emitter's context. Initial-state check passes for both.

The scenario that would break:

seq N   ADD session:bar              actor=user:sam, session_urn=sam.governance
seq N+1 LINK ...                      actor=user:sam, session_urn=session:bar   # <- references just-created session

That's unusual (why would a batch create a session and immediately attribute emits to it?) — typically the kernel creates the session, then in a later round agents attribute envelopes to it. But it's not impossible: atomic-pair creation of session + first occupant is exactly this shape, and the seed flow does it today via skipLiveness.

My read: keep the initial-state check. Document the constraint: "an envelope's session_urn must reference a session that exists in state at the start of the batch; same-batch session creation is a bootstrap-via-seed-only pattern, protected by skipLiveness". Add a test pinning the rejection path. That's cheaper than refactoring ApplyProgram to thread working-state through the preflight.

If you disagree — e.g. you want to support atomic session-birth+first-occupant outside the seed path — we'd need to thread workingState through preflight, but I don't see a driving use case yet. Your call.
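The two batches above can be run against a toy preflight to make the constraint concrete. This is a sketch under assumptions: `preflight` and the `Envelope` shape are hypothetical, the set of sessions stands in for "exists and is occupied in state at batch start", and the URNs are the ones used in the examples.

```go
package main

import "fmt"

// Envelope is a hypothetical minimal shape for the batch examples.
type Envelope struct {
	Op         string
	Target     string
	SessionURN string
}

// preflight checks every envelope's session_urn against the sessions that
// exist at batch start. It never looks at what earlier envelopes in the
// same batch would create.
func preflight(initialSessions map[string]bool, batch []Envelope) error {
	for i, env := range batch {
		if !initialSessions[env.SessionURN] {
			return fmt.Errorf("§M11: envelope %d: session_urn %q is not a live session at batch start", i, env.SessionURN)
		}
	}
	return nil
}

func main() {
	initial := map[string]bool{"sam.governance": true}

	// Both envelopes run under the emitter's existing session: accepted.
	ok := []Envelope{
		{Op: "ADD", Target: "session:foo", SessionURN: "sam.governance"},
		{Op: "LINK", Target: "session:foo", SessionURN: "sam.governance"},
	}
	fmt.Println(preflight(initial, ok))

	// The second envelope attributes itself to a session created in the
	// same batch: rejected, because session:bar is not in the initial state.
	bad := []Envelope{
		{Op: "ADD", Target: "session:bar", SessionURN: "sam.governance"},
		{Op: "LINK", Target: "node:x", SessionURN: "session:bar"},
	}
	fmt.Println(preflight(initial, bad))
}
```

Threading a working state through the preflight would mean adding each ADDed session to the set as the loop advances, which also makes acceptance depend on envelope order within the batch.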

Minor doctrine polish — non-blocking

The comment on ResolveSessionForEnvelope (the one explaining the unoccupied-session-as-actor rejection) is excellent. Suggest mirroring the reasoning into the doctrine note at kb/research/kernel/20260421-t171-m11-m12-implementation-plan.md §2 — "Session-as-actor is permitted only when that session is itself occupied; mirrors §M12's hop-through-has-occupant pattern." Add one line to running-state's round-11 section too when round-close fires. Not blocking PR 30.

Merge path

Ready when you are. PR 4 (§M12 admin-cap fill-in) can open on top of this. Standing by for either.

— Guido, session:sam.governance

Guido flagged in #30 review that the ApplyProgram preflight checks §M11
against the state at batch start, not a working state that evolves
envelope-by-envelope. Per his suggested resolution — keep the initial-
state discipline (simpler, not batch-order-dependent) and document the
constraint explicitly — this test pins the rejection path for an
intra-batch session reference:

  env 1: ADD session:newborn        (passes under emitter=governance)
  env 2: session_urn=session:newborn (rejected: not in initial state)

If the preflight ever starts threading a working state through, this
test must be adjusted deliberately. The rejection behavior is the
design, not an accident.

Doctrine note tightened at
ffs0/kb/research/kernel/20260421-t171-m11-m12-implementation-plan.md §2
with:

- Session-as-actor rule (only permitted when session is itself occupied;
  mirrors §M12's hop pattern)
- ApplyProgram initial-state-check rule (explicit rationale for why
  working-state-through-preflight is NOT the design)

Test-only addition here; doctrine commit on ffs0 separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@MSD21091969 MSD21091969 merged commit 6dbcbf9 into master Apr 21, 2026
@MSD21091969 MSD21091969 deleted the round11/pr3-m11-liveness-gate branch April 21, 2026 15:12
@github-project-automation github-project-automation Bot moved this from Todo to Done in mo:os Apr 21, 2026