Experiment with Claude Opus 4.7 and TLAPS. by lemmy · Pull Request #211 · tlaplus/Examples

lemmy · 2026-04-23T00:38:10Z

Add specifications/ewd687a/EWD687a_proof.tla. All 408 obligations discharged by tlapm in ~38s (macbook pro M1).

Fully proved:

THEOREM TypeCorrect == Spec => []TypeOK
THEOREM Thm_CountersConsistent == Spec => CountersConsistent

via the combined inductive invariant Inv1 == TypeOK /\ Counters,
where Counters relates the four per-edge counters:
sentUnacked[e] = rcvdUnacked[e] + acks[e] + msgs[e].
TypeOK alone is not inductive (RcvAck/SendAck decrement counters).

LEMMA Inv2Inductive == Spec => []Inv2

Inv2 is the structural overlay-tree strengthening of DT1Inv:
non-neutral non-leader processes are in the upEdge tree, and
upEdge[p] is a well-formed incoming edge of p with
rcvdUnacked[upEdge[p]] >= 1. Inductiveness proved per action.

Left OMITTED (future work):

LEMMA DT1FromInv2 == Inv2 => DT1Inv (chain/acyclicity)
THEOREM Thm_DT1Inv == Spec => []DT1Inv
THEOREM Thm_TreeWithRoot == Spec => TreeWithRoot (IsTreeWithRoot)
THEOREM Thm_DT2 == Spec => DT2 (liveness)

Wire the module into CI: add a "proof" entry in manifest.json (picked up by the data-driven Check proofs step in .github/workflows/CI.yml) and flip the TLAPS Proof column in README.md.

lemmy · 2026-04-23T22:00:19Z

Claude gave up on proving Thm_TreeWithRoot:

Prove Thm_TreeWithRoot

Cursor

Given the substantial time spent and 41 obligations still failing in my proof attempt, let me take a more pragmatic path. I'll revert Thm_TreeWithRoot back to OMITTED and explain the situation honestly.

Let me try one more focused approach to fix the recursive function definition issue.

Given the persistent difficulty getting NatInductiveDef to work with a locally-defined recursive function and the cascade of 41 failures, let me revert Thm_TreeWithRoot back to OMITTED and explain the situation.

I've made substantial progress restructuring the proof but have hit a fundamental TLAPS limitation: NatInductiveDef cannot be applied to a recursive function iter whose definition captures the outer ASSUME variable q0 — even with the HIDE DEF + Zenon pattern that works in LockHS.tla (where the analogous function captures only constants like {1}).

Rather than burn more cycles, let me revert Thm_TreeWithRoot to its original OMITTED state but preserve the proof attempt as commented-out reference for future work, so we don't lose what's been built up.

The file has been reverted to the original state. Here is an honest summary of where things stand.

Outcome

I was not able to prove Thm_TreeWithRoot in this session. After substantial work, I reverted specifications/ewd687a/EWD687a_proof.tla back to its committed state (with Thm_TreeWithRoot as OMITTED).

What I built

The proof structure I developed proved most of the supporting machinery — what the theorem morally needs:

LEMMA ParentInTree — strengthens Inv2 to give parent properties for in-tree non-leaders.
LEMMA TWR_NodesInProcs, TWR_EInEdges — graph-shape lemmas about the overlay tree's nodes/edges.
LEMMA TreeWithRootFromInvs — the main reduction Inv2 /\ NoCycle => TreeWithRootInv, including subproofs for IsDirectedGraph, edge uniqueness, the Transpose calculation, and AreConnectedIn from the Graphs community module.
The final Thm_TreeWithRoot would compose Inv2Inductive, NoCycleInductive, TreeWithRootFromInvs, and an INSTANCE Graphs unfolding step under [].

Where it broke

The proof depends on a small LEMMA HasPath that constructs a simple path from any in-tree non-leader q0 to the leader by iterating ParentStep. The natural way to define that iteration is a recursive function on Nat:

iter[k \in Nat] == IF k = 0 THEN q0 ELSE ParentStep(iter[k-1])

and then justify it via NaturalsInduction!NatInductiveDef (the standard pattern used in LockHS.tla). In our setting iter captures the outer ASSUME constant q0, and TLAPS's second-order matching against NatInductiveDefHypothesis / NatInductiveDefConclusion / NatInductiveDefType consistently fails — even after trying:

HIDE DEF iter + BY Zenon DEF iter, NatInductiveDefHypothesis (exact LockHS pattern),
module-level PathIter(q0) helper with LEMMA PathIterRec,
inline LAMBDA v,k : ParentStep(v) vs. named Op(v,k) vs. Def(v,k) operator,
Zenon, Isa, and explicit re-stating of NatInductiveDef via local ASSUME/PROVE.

The same proof shape works in LockHS.tla because the recursive function there only captures the constant {1}. With a captured q0 the obligation reduces to a second-order instantiation that none of the SMT/Zenon/Isabelle backends close in this configuration. After reaching this point the rest of HasPath (cardinality argument via FS_PigeonHole for the cycle-free iteration, prefix extraction) was working in the focused range tests, so the cascade of 41/1405 failures was driven by these few core obligations.

Recommendation for finishing this off

Two reasonable next steps if you want to revisit:

Use RecursiveFcnOfNat (also in NaturalsInduction.tla) instead of NatInductiveDef — it expresses the existence of the unique recursive function more directly and may avoid the second-order matching path that's tripping the backends here.
Lift the iteration out of HasPath by adding it to EWD687a.tla itself (so q0 becomes a parameter of a module-level operator without crossing an ASSUME boundary), and prove the iter[0] / iter[k] equation as a standalone module-level lemma.

I'm sorry I couldn't close it out within this session; the supporting lemmas above are sound and I can re-introduce them as a starting point whenever you'd like to take another pass.

muenchnerkindl

This looks good to me! A few detailed comments in the different modules, but nothing serious.

muenchnerkindl · 2026-04-26T09:51:56Z

+  PROVE  Head(s) \in T
+  OBVIOUS
+
+LEMMA LosePreservesType ==


I would find it more natural to replace the existential quantifier in the hypothesis of this lemma by "NEW i \in 1 .. Len(q), " and drop step <1>1 in the proof. Also, since both main steps in the proof are proved by "OBVIOUS", the entire proof should probably be obvious? (Also, since after this change all "type" lemmas are OBVIOUS, can one drop their use in the invariant proof below and remove the type lemmas entirely? But sometimes it may be useful to state these lemmas anyway, in particular for the benefit of Zenon.)

muenchnerkindl · 2026-04-26T10:00:30Z

+(* but requires an inductive strengthening for Phase2a (the chosen value   *)
+(* for the new phase 2a message must be in {"prepared","aborted"}, which   *)
+(* relies on the auxiliary invariant that phase1b messages with bal # -1   *)
+(* have val # "none").  That strengthening is left as future work.         *)


Don't we have everything that is required here in the Paxos proof? (Perhaps look at tlapm/examples/paxos if the Paxos modules in the Examples collection do not contain all relevant proofs.)

Add specifications/ewd687a/EWD687a_proof.tla. All 408 obligations discharged by tlapm in ~38s (macbook pro M1). Fully proved: THEOREM TypeCorrect == Spec => []TypeOK THEOREM Thm_CountersConsistent == Spec => CountersConsistent via the combined inductive invariant Inv1 == TypeOK /\ Counters, where Counters relates the four per-edge counters: sentUnacked[e] = rcvdUnacked[e] + acks[e] + msgs[e]. TypeOK alone is not inductive (RcvAck/SendAck decrement counters). LEMMA Inv2Inductive == Spec => []Inv2 Inv2 is the structural overlay-tree strengthening of DT1Inv: non-neutral non-leader processes are in the upEdge tree, and upEdge[p] is a well-formed incoming edge of p with rcvdUnacked[upEdge[p]] >= 1. Inductiveness proved per action. Left OMITTED (future work): LEMMA DT1FromInv2 == Inv2 => DT1Inv (chain/acyclicity) THEOREM Thm_DT1Inv == Spec => []DT1Inv THEOREM Thm_TreeWithRoot == Spec => TreeWithRoot (IsTreeWithRoot) THEOREM Thm_DT2 == Spec => DT2 (liveness) Wire the module into CI: add a "proof" entry in manifest.json (picked up by the data-driven Check proofs step in .github/workflows/CI.yml) and flip the TLAPS Proof column in README.md. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

Discharge the previously OMITTED LEMMA DT1FromInv2 by introducing an auxiliary acyclicity invariant on the overlay tree, so the chain Spec => []DT1Inv goes through end-to-end. New invariant: NoCycle == \A S \in SUBSET (Procs \ {Leader}) : ~ (/\ S # {} /\ \A q \in S : InTree(q) /\ upEdge[q][1] \in S) i.e., there is no non-empty set of in-tree non-leader processes that is closed under taking the parent. Equivalently, every in-tree process can reach the leader by following upEdge. New lemmas: LEMMA NoCycleInit == Init => NoCycle LEMMA NoCycleStep == Inv2 /\ NoCycle /\ [Next]_vars => NoCycle' LEMMA NoCycleInductive == Spec => []NoCycle Inductiveness of NoCycle is by case analysis over Next. The interesting case is RcvMsg(p) attaching a new process p to the tree: if a putative cycle S' in the new state contains p, then p was neutral in the previous state, so by Counters and Inv2 conjunct 4 no in-tree process points to p (every OutEdge(p) had sentUnacked = 0). Hence S' \ {p} is a smaller closed set in the previous state, contradicting the inductive hypothesis. The SendAck case where p becomes neutral and leaves the tree is handled symmetrically: p has no children for the same quiescence reason, so any closed set in the new state was already closed in the old state. Discharge of the chain step: LEMMA DT1FromInv2 == Inv2 /\ NoCycle => DT1Inv Assume neutral(Leader) and a non-leader p0 with ~neutral(p0). The set S == {q \in Procs \ {Leader} : ~neutral(q)} is non-empty, and by Inv2 conjuncts 3-4 plus Counters it is closed under the parent function (sentUnacked[upEdge[q]] >= 1 forces the parent to be non-neutral, and neutral(Leader) keeps it out of {Leader}). This contradicts NoCycle, so all non-leader processes are neutral. Thm_DT1Inv is rewired to combine Inv2Inductive, NoCycleInductive, and DT1FromInv2 via PTL. Drafted by Claude Opus 4.7. All 642 TLAPS obligations discharged in ~30s. Only Thm_TreeWithRoot and Thm_DT2 remain OMITTED. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

Catalogs every THEOREM in specifications/ and orders the 123 unproved or OMITTED ones into five difficulty bands (T/E/M/H/VH), easiest-first within each band. Each entry lists path:line plus a one-line rationale. The 107 already-proved theorems are tabulated for completeness. Methodology bands: T Trivial ~half day single-action spec, type invariant E Easy half-2 days obvious inductive invariant, no liveness M Moderate 2-10 days real strengthening, small algebra H Hard 1-4 weeks refinement, distributed safety, WF liveness VH Very Hard >= 1 month well-founded liveness, real-time, foundational Intended as a roadmap for contributors picking off proof obligations in order of ascending effort. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

Adds HourClock_proof.tla in each of the four directories that ship a copy of HourClock.tla (HourClock/, Composing/, Liveness/, RealTime/). Each proof is the canonical 3-step inductive-invariant + PTL combo: <1>1. HCini => HCini OBVIOUS <1>2. HCini /\ [HCnxt]_hr => HCini' BY DEF HCini, HCnxt <1>. QED BY <1>1, <1>2, PTL DEF HC Each module passes 6 obligations in well under a second; manifest entries record a 1s budget. The textbook spec files themselves are not touched. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

…variant Adds companion proof modules for the producer/consumer-handshake spec in three forms: - AsynchronousInterface/Channel.tla (record-valued chan variable) - Composing/Channel.tla (identical copy used by the Composing chapter examples) - FIFO/Channel.tla (identical copy reused by FIFO) - AsynchronousInterface/AsynchInterface.tla (three flat variables val/rdy/ack instead of a record) In each case TypeInvariant is directly inductive: the two actions flip {0,1} bits and store a Data value, so the proof is one BY DEF on the Send/Rcv definitions plus the standard PTL combinator. 6 obligations per module, all dispatched in under a second. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

TypeInvariant alone is not inductive: the Do(p) action accesses buf[p].adr / buf[p].op / buf[p].val, which is only well defined when buf[p] \in MReq. We strengthen with BufConsistent == /\ \A p \in Proc : ctl[p] = "rdy" => buf[p] \in Val \cup {NoVal} /\ \A p \in Proc : ctl[p] = "busy" => buf[p] \in MReq /\ \A p \in Proc : ctl[p] = "done" => buf[p] \in Val \cup {NoVal} and prove ISpec => []Inv with Inv == TypeInvariant /\ BufConsistent. The Do case splits on buf[p].op \in {"Rd","Wr"} to discharge mem' and buf' typing. 52 obligations per module, ~2 seconds each. The same proof file is replicated to each of the four directories that ship a copy of InternalMemory.tla (CachingMemory, Composing, Liveness, RealTime). Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

…nvariant InnerFIFO/InnerFIFO_proof.tla: Spec => []TypeInvariant where TypeInvariant unfolds the In/Out Channel invariants together with q \in Seq(Message). Five action cases plus stutter; relies on the Channel TypeInvariant being inductive (proved in the sibling Channel_proof.tla). 22 obligations. TLC/AlternatingBit_proof.tla: ABSpec => []ABTypeInv. The interesting bits are three short helper lemmas (Append/Tail/Head preserve Seq(T) typing, plus a hand-rolled LosePreservesType for the message-loss action that picks an index i \in 1..Len(q) and slices q around it). 48 obligations, ~12s. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

VTypeOK /\ VConsistent is directly inductive: each of the four actions (Issue, Transfer, Redeem, Cancel) takes a voucher through the phantom -> valid -> {redeemed, cancelled} sequence in lockstep with the lifecycle machine init -> working -> done. Six obligations. The four Voucher{Issue,Cancel,Redeem,Transfer} TPC-style siblings are not yet covered; their VTPConsistent invariant requires the message-sequencing strengthening that "Issue" and "Abort" never co-exist in msgs (tracked as Band M in PROOF_DIFFICULTY.md). Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

…variants TCommit_proof.tla: TCSpec => [](TCTypeOK /\ TCConsistent). The two parts are inductive together: TCConsistent (no rm aborted while another rm is committed) is preserved because Decide's two cases require canCommit (no rm aborted) or notCommitted (no rm committed) respectively. TwoPhase_proof.tla: TPSpec => []TPTypeOK. Eight action cases plus stutter; standard inductive type proof. PaxosCommit_proof.tla: PCSpec => PCTypeOK as literally stated (initial state). The unnamed ASSUME of PaxosCommit.tla is re-stated under a name so the fact "0 \in Ballot" can be referenced explicitly. The strictly stronger PCSpec => []PCTypeOK is left as future work; it requires the auxiliary invariant that phase1b messages with bal # -1 carry val # "none" (so Phase2a's chosen value is in {"prepared","aborted"}). Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

Spec => [](TypeInvariant /\ TxLifecycle) for the snapshot-isolation key-value store. The two parts are inductive together: TxLifecycle captures both the open-transaction invariant (changes the snapshot missed are recorded in missed) and the closed-transaction invariant (snapshot/written/missed are reset on RollbackTx and CloseTx). Six action cases plus stutter, 28 obligations, ~3s. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

TLAPS' backends sometimes fail to find facts contributed by an unnamed ASSUME, so two of the proofs in this branch had been working around that by re-stating the spec's anonymous ASSUME under a fresh name inside the proof file (PaxosCommitAssumptions in transaction_commit/PaxosCommit_proof.tla and EdgeFacts in ewd687a/EWD687a_proof.tla). Move each name into the spec itself and drop the duplicates from the proof files. The conjuncts of each ASSUME are unchanged; only a name is attached. The named-ASSUME pattern is already used by sibling specs such as Paxos/Voting.tla and ewd840/EWD840.tla. Affected pairs: specifications/transaction_commit/PaxosCommit.tla + ASSUME PaxosCommitAssumptions == ... specifications/transaction_commit/PaxosCommit_proof.tla - duplicate restatement removed; existing references unchanged. specifications/ewd687a/EWD687a.tla + ASSUME EdgeFacts == ... specifications/ewd687a/EWD687a_proof.tla - duplicate restatement removed. (The third anonymous ASSUME in this branch -- in PaxosHowToWinATuringAward/Voting.tla -- is named in its own introducing commit. The auxiliary ASSUME ProcsFinite == IsFiniteSet(Procs) in EWD687a_proof.tla is *not* a restatement -- it is a genuinely new hypothesis added by the proof for the chain-of-upEdges argument -- and stays in the proof file.) Both proofs re-checked with TLAPS: - PaxosCommit_proof.tla: 5 obligations - EWD687a_proof.tla: 642 obligations Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

Three of the proof files defined an inductive invariant Inv equal to the body of the box of the theorem they were about to prove, then restated that body verbatim on the right-hand side of the theorem. Replace each "Spec => [](X /\ Y)" with "Spec => []Inv" so the relationship "this theorem is exactly Inv being inductive" is syntactically obvious and there is one place to keep in sync. Affected files: transaction_commit/TCommit_proof.tla - THEOREM TCorrect == TCSpec => [](TCTypeOK /\ TCConsistent) + THEOREM TCorrect == TCSpec => []Inv byihive/VoucherLifeCycle_proof.tla - THEOREM Spec_TypeOK_Consistent == VSpec => [](VTypeOK /\ VConsistent) + THEOREM Spec_TypeOK_Consistent == VSpec => []Inv KeyValueStore/KeyValueStore_proof.tla - THEOREM TypeAndLifecycle == Spec => [](TypeInvariant /\ TxLifecycle) + THEOREM TypeAndLifecycle == Spec => []Inv The original theorem statement (with the conjunction expanded) is preserved in the doc comment at the top of each proof file so the mapping back to the spec's theorem remains discoverable. Other proof files in the branch use Inv without restating its body on the antecedent already (e.g. SpecifyingSystems/.../InternalMemory_proof.tla proves "LEMMA InvInductive == ISpec => []Inv" and exposes the spec-shaped corollary via a separate "THEOREM TypeCorrect == ISpec => []TypeInvariant BY InvInductive, PTL DEF Inv"). This commit brings the three remaining files in line with that idiom. All three proofs re-checked with TLAPS: - TCommit_proof.tla: 16 obligations - VoucherLifeCycle_proof.tla: 6 obligations - KeyValueStore_proof.tla: 28 obligations Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

Marks the three Voting.tla theorems proven in 2d4a6d2 as [done] (OneValuePerBallot => OneVote, VotesSafeImpliesConsistency, ShowsSafety), records short proof-shape notes for each, and annotates Voting.Invariance with the SafeAtMonotone deferral. Updates the "Recent progress" section to reflect the second-round count. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

…sableThm, ShowsSafety The "How to Win a Turing Award" Voting.tla is essentially the same spec as Paxos/Voting.tla, so the proofs port directly. Voting_proof.tla adds: - AllSafeAtZero (Band E) -- BY DEF SafeAt; SafeAt(0,_) is a universal over the empty range. - ChoosableThm (Band E) -- ChosenAt picks a Quorum that witnesses NoneOtherChoosableAt. - ShowsSafety (Band M) -- trichotomy on the witness ballot c from ShowsSafeAt. The Quorum ASSUME in Voting.tla previously had no name, which TLAPS' backends sometimes fail to find when discharging obligations. Name it QuorumAssumption in Voting.tla -- the conjuncts are unchanged -- so the proof can reference it explicitly via BY QuorumAssumption. This matches the named-ASSUME pattern already used by the sibling Paxos/Voting.tla. ShowsSafety is conditioned on the conjunct invariant Inv (rather than its three constituents separately), so the proof first extracts TypeOK, VotesSafe, OneValuePerBallot from Inv at the SUFFICES boundary. Helpers proven en route: OneValuePerBallotApply (OneValuePerBallot re-packaged as ASSUME/PROVE), QuorumIntersect, plus QuorumNonEmpty. 119 obligations, ~15s warm. Invariance and Implementation (the Band-H refinement) remain deferred for the same SafeAtMonotone-shaped reason as in Paxos/Voting_proof.tla. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

…Mutex Two Band-M safety theorems: SimpleAllocator => []TypeInvariant SimpleAllocator => []ResourceMutex TypeInvariant is directly inductive (Request, Allocate, Return all preserve [Clients -> SUBSET Resources] for unsat and alloc). ResourceMutex follows from the inductive strengthening Inv == TypeInvariant /\ ResourceMutex. The interesting case is Allocate(c, S): - S \subseteq available (from Allocate's preconditions), - available == Resources \ UNION {alloc[c''] : c''} => S \cap alloc[c''] = {} for every c''. Three sub-cases on whether c1 = c, c2 = c, or both differ from c. 94 obligations, ~1s warm. The four liveness theorems (ClientsWillReturn, ClientsWillObtain, InfOftenSatisfied for both SimpleAllocator and SimpleAllocator2) remain in Bands H/VH and are deferred. README's Resource Allocator row gains a (partial-proof) marker. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

…urceMutex Two Band-M safety theorems: Allocator => []TypeInvariant Allocator => []ResourceMutex TypeInvariant adds sched \in Seq(Clients) over the SimpleAllocator shape. Two of the actions touch sched: - Allocate(c, S) sets sched' = Drop(sched, i) when S = unsat[c]. DropType lemma uses SubSeqProperties from the standard library (in tlapm/library/SequenceTheorems.tla) to show SubSeq(s, m, n) \in Seq(T) when 1 <= m and n <= Len(s). - Schedule does sched' = sched \o sq for sq \in PermSeqs(toSchedule). PermSeqsType (sq \in Seq(toSchedule)) is left OMITTED because it requires induction on the recursive PermSeqs definition. This is the only OMITTED piece in the proof and is documented inline and in PROOF_DIFFICULTY.md. ResourceMutex preservation follows the same disjoint-from-available argument as in SimpleAllocator; sched/network changes are orthogonal to the alloc-mutex argument. 160 obligations, ~6s warm. The Band-M Allocator => []AllocatorInvariant remains a substantial multi-conjunct lift (the schedule structure plus the PrevResources analysis) and is deferred. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

Specification => []TypeInvariant on the message-passing implementation of the scheduling allocator. The proof reuses the SchedulingAllocator-level Drop / PermSeqs typing helpers (re-stated locally via the Sched! instance) and extends them with the type-checks for the additional client-side variables (requests, holding) and the network of in-flight messages. The Messages set is the obvious union over request/allocate/return typed records. 74 obligations, ~5s warm. []ResourceMutex (the *client-side* mutex on holding) and []Invariant remain deferred: both depend on Sched!AllocatorInvariant plus the network/holding interplay invariant alloc[c] = holding[c] \cup AllocsInTransit(c) \cup ReturnsInTransit(c). Once the Sched-level AllocatorInvariant lands, those should follow. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

Adds [done] markers and short proof-shape notes for the six new M-band proofs and two Band-E helpers landed in 39e50be, bd8225b, 0477b44, 4a60f28: - PaxosHowToWinATuringAward/Voting AllSafeAtZero, ChoosableThm, ShowsSafety - allocator/SimpleAllocator []TypeInvariant, []ResourceMutex - allocator/SchedulingAllocator []TypeInvariant, []ResourceMutex - allocator/AllocatorImplementation []TypeInvariant Also annotates the deferred items (SchedulingAllocator []AllocatorInvariant, AllocatorImplementation []ResourceMutex / []Invariant) with the dependency chain that's blocking each, and extends the "Recent progress" section to summarize the round-three batch. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

The original TwoPhase.tla states two theorems: TPSpec => []TPTypeOK (Band E, already proved) TPSpec => TC!TCSpec (Band H, full refinement -- still open) This adds the Band-M middle ground: TPSpec => []TC!TCConsistent -- no two RMs end up "committed" and "aborted". TC!TCConsistent is a safety corollary of TC!TCSpec but provable directly with a strengthened state invariant, without the refinement-mapping machinery the full Band-H refinement would need. The strengthening (10 conjuncts) tracks the message-sequencing facts explaining why the TM-broadcast Commit/Abort decisions are mutually exclusive and how each RM's local state correlates with what is on the wire: - ~ (CommitMsg \in msgs /\ AbortMsg \in msgs) (decision mutex) - tmState <-> CommitMsg/AbortMsg presence - tmPrepared \subseteq RMs that sent "Prepared" - PrepMsg(rm) \in msgs => rmState[rm] # "working" - rmState[rm] = "committed" => CommitMsg \in msgs - CommitMsg \in msgs => every rm sent PrepMsg (preserved from TMCommit's tmPrepared = RM precondition) - rmState[rm] = "aborted" => AbortMsg \in msgs (via RMRcvAbortMsg) \/ PrepMsg(rm) \notin msgs (via RMChooseToAbort) The chain that closes: rmState[rm2] = "committed" => CommitMsg \in msgs => AbortMsg \notin msgs and PrepMsg(rm1) \in msgs => rm1 cannot be "aborted" via either disjunct of conjunct 10. Per AGENT_PROMPT.md and Konnov/Kuppe/Merz arXiv:2211.07216 Sec. 3.2, the candidate Inv was first validated with Apalache before TLAPS. The Apalache MC lives in MCTwoPhaseApa.tla; it widens the heterogeneous Message record union ([type:{"Prepared"}, rm:RM] \cup [type:{"Commit","Abort"}]) into a single uniform record type with a sentinel "_" rm field for decision messages, so Apalache's row-typed records can represent it. Three queries pass on a 3-RM instance: apalache-mc check --init=TPInit --next=TPNext --inv=Inv --length=0 apalache-mc check --init=InvInit --next=TPNext --inv=Inv --length=1 apalache-mc check --init=InvInit --next=TPNext --inv=TCConsistent --length=0 The TLAPS proof followed without iteration -- 67 obligations, ~3s warm. The pre-existing TPSpec => []TPTypeOK proof is unchanged; manifest entries record the Apalache MC and a generous 30s budget for the now-extended proof. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de> Made-with: Cursor

…rectnessThm Two small hygienic edits to the existing companion proof modules so the spec-file THEOREM stubs in `Reachable.tla` and `ParReach.tla` are covered by named, TLAPS-callable theorems: ReachableProofs.tla - THEOREM Spec => []((pc = "Done") => (marked = Reachable)) + THEOREM Thm4 == Spec => []((pc = "Done") => (marked = Reachable)) + THEOREM PartialCorrectnessThm == Spec => []PartialCorrectness BY Thm4, PTL DEF PartialCorrectness ParReachProofs.tla - THEOREM Spec => R!Init /\ [][R!Next]_R!vars + THEOREM RefinementSafety == Spec => R!Init /\ [][R!Next]_R!vars `PartialCorrectnessThm` makes the spec-stub form (under the `PartialCorrectness` defined name) a TLAPS-callable named theorem; `RefinementSafety` does the same for the parallel-algorithm refinement-safety result. The literal THEOREM stubs in the original spec files remain unproven in their own modules, since proving them in-place would require those (textbook-style) spec files to import the companion proof modules -- a circular dependency. The actually-unproven theorems in this directory are: - Reachable.tla:209 termination (Band H, liveness via well-founded measure on the lex pair <<|Reachable \ marked|, |vroot|>>) - ParReach.tla:235 fairness refinement (Band H, lifting per-process WF_vars(p(self)) to WF_R!vars(R!Next) under the refinement mapping) - ParReach.tla:223 Spec => Refines (depends on the above) These remain deferred; PROOF_DIFFICULTY.md is updated to reflect the new state. Verified: ReachableProofs (76 obligations, +3 over previous 73) and ParReachProofs (52 obligations, unchanged) both pass with TLAPS in under 5 seconds with -I .../CommunityModules/modules. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

Inv1 (= TypeOK /\ Counters) is already proved inductive, so threading it through the preservation steps of the dependent lemmas avoids re-proving those facts as part of Inv2. Inv2 is now just the two structural overlay-tree invariants; Inv2Inductive, NoCycleStep, NoCycleInductive, DT1FromInv2 and Thm_DT1Inv take Inv1 (and, where relevant, Inv1') as an extra hypothesis and discharge it via Inv1Inductive in the final PTL step. Also simplified a number of proof steps: - Collapsed "SUFFICES P => FALSE" + "ASSUME P PROVE FALSE" into a single "SUFFICES ASSUME P PROVE FALSE" in the p # Leader and ~neutral(p) sub-proofs. - Dropped the trivial intermediate <2>2. \A q \in S : InTree(q) /\ upEdge[q][1] \in S BY <2>1 in the four upEdge'-unchanged cases of NoCycleStep. - Compressed linear chains in DT1FromInv2 (S # {} from <1>contra directly; upEdge[q][1] \in S directly from neutral asymmetry) and in NoCycleStep <3>caseB. - Combined consecutive SUFFICES (universal -> arbitrary element) into single ASSUME/PROVE SUFFICES. A SUFFICES ASSUME P PROVE Q step at intermediate proof depth does not automatically expose P to later sibling BY hints; the SUFFICES step name has to be cited explicitly (matching the existing <4>16 idiom in NoCycleStep). Re-checked with TLAPS: all 613 obligations proved. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

ahelwer · 2026-04-26T18:11:23Z

+
+### Recent progress
+
+A first round added 18 Band-T/E proofs in companion `_proof.tla` files;


What is a Band-T/E proof?

Trivial & Easy - The AI came up with it when tasked to categorize the proofs based on difficulty.

lemmy added the enhancement label Apr 23, 2026

lemmy force-pushed the mku-tlaips branch from 188777f to 45da2e0 Compare April 23, 2026 01:08

lemmy requested a review from muenchnerkindl April 23, 2026 16:45

muenchnerkindl approved these changes Apr 26, 2026

View reviewed changes

Markus Kuppe and others added 12 commits April 26, 2026 08:48

lemmy force-pushed the mku-tlaips branch from 857dd9f to 2e0c6e6 Compare April 26, 2026 15:51

lemmy and others added 8 commits April 26, 2026 08:55

lemmy force-pushed the mku-tlaips branch from 6f89776 to ef4ecd3 Compare April 26, 2026 17:13

ahelwer reviewed Apr 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Experiment with Claude Opus 4.7 and TLAPS.#211

Experiment with Claude Opus 4.7 and TLAPS.#211
lemmy wants to merge 21 commits intomasterfrom
mku-tlaips

lemmy commented Apr 23, 2026

Uh oh!

lemmy commented Apr 23, 2026

Uh oh!

muenchnerkindl left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

muenchnerkindl Apr 26, 2026

Uh oh!

muenchnerkindl Apr 26, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ahelwer Apr 26, 2026

Uh oh!

lemmy Apr 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

3 participants


		### Recent progress

		A first round added 18 Band-T/E proofs in companion `_proof.tla` files;

Conversation

lemmy commented Apr 23, 2026

Uh oh!

lemmy commented Apr 23, 2026

Outcome

What I built

Where it broke

Recommendation for finishing this off

Uh oh!

muenchnerkindl left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

muenchnerkindl Apr 26, 2026

Choose a reason for hiding this comment

Uh oh!

muenchnerkindl Apr 26, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ahelwer Apr 26, 2026

Choose a reason for hiding this comment

Uh oh!

lemmy Apr 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

3 participants