Skip to content

Experiment with Claude Opus 4.7 and TLAPS.#211

Draft
lemmy wants to merge 21 commits intomasterfrom
mku-tlaips
Draft

Experiment with Claude Opus 4.7 and TLAPS.#211
lemmy wants to merge 21 commits intomasterfrom
mku-tlaips

Conversation

@lemmy
Copy link
Copy Markdown
Member

@lemmy lemmy commented Apr 23, 2026

Add specifications/ewd687a/EWD687a_proof.tla. All 408 obligations discharged by tlapm in ~38s (macbook pro M1).

Fully proved:

THEOREM TypeCorrect == Spec => []TypeOK
THEOREM Thm_CountersConsistent == Spec => CountersConsistent

via the combined inductive invariant Inv1 == TypeOK /\ Counters,
where Counters relates the four per-edge counters:
sentUnacked[e] = rcvdUnacked[e] + acks[e] + msgs[e].
TypeOK alone is not inductive (RcvAck/SendAck decrement counters).

LEMMA Inv2Inductive == Spec => []Inv2

Inv2 is the structural overlay-tree strengthening of DT1Inv:
non-neutral non-leader processes are in the upEdge tree, and
upEdge[p] is a well-formed incoming edge of p with
rcvdUnacked[upEdge[p]] >= 1. Inductiveness proved per action.

Left OMITTED (future work):

LEMMA DT1FromInv2 == Inv2 => DT1Inv (chain/acyclicity)
THEOREM Thm_DT1Inv == Spec => []DT1Inv
THEOREM Thm_TreeWithRoot == Spec => TreeWithRoot (IsTreeWithRoot)
THEOREM Thm_DT2 == Spec => DT2 (liveness)

Wire the module into CI: add a "proof" entry in manifest.json (picked up by the data-driven Check proofs step in .github/workflows/CI.yml) and flip the TLAPS Proof column in README.md.

@lemmy
Copy link
Copy Markdown
Member Author

lemmy commented Apr 23, 2026

Claude gave up on proving Thm_TreeWithRoot:

Prove Thm_TreeWithRoot


Cursor

Given the substantial time spent and 41 obligations still failing in my proof attempt, let me take a more pragmatic path. I'll revert Thm_TreeWithRoot back to OMITTED and explain the situation honestly.

Let me try one more focused approach to fix the recursive function definition issue.

Given the persistent difficulty getting NatInductiveDef to work with a locally-defined recursive function and the cascade of 41 failures, let me revert Thm_TreeWithRoot back to OMITTED and explain the situation.

I've made substantial progress restructuring the proof but have hit a fundamental TLAPS limitation: NatInductiveDef cannot be applied to a recursive function iter whose definition captures the outer ASSUME variable q0 — even with the HIDE DEF + Zenon pattern that works in LockHS.tla (where the analogous function captures only constants like {1}).

Rather than burn more cycles, let me revert Thm_TreeWithRoot to its original OMITTED state but preserve the proof attempt as commented-out reference for future work, so we don't lose what's been built up.

The file has been reverted to the original state. Here is an honest summary of where things stand.

Outcome

I was not able to prove Thm_TreeWithRoot in this session. After substantial work, I reverted specifications/ewd687a/EWD687a_proof.tla back to its committed state (with Thm_TreeWithRoot as OMITTED).

What I built

The proof structure I developed proved most of the supporting machinery — what the theorem morally needs:

  • LEMMA ParentInTree — strengthens Inv2 to give parent properties for in-tree non-leaders.
  • LEMMA TWR_NodesInProcs, TWR_EInEdges — graph-shape lemmas about the overlay tree's nodes/edges.
  • LEMMA TreeWithRootFromInvs — the main reduction Inv2 /\ NoCycle => TreeWithRootInv, including subproofs for IsDirectedGraph, edge uniqueness, the Transpose calculation, and AreConnectedIn from the Graphs community module.
  • The final Thm_TreeWithRoot would compose Inv2Inductive, NoCycleInductive, TreeWithRootFromInvs, and an INSTANCE Graphs unfolding step under [].

Where it broke

The proof depends on a small LEMMA HasPath that constructs a simple path from any in-tree non-leader q0 to the leader by iterating ParentStep. The natural way to define that iteration is a recursive function on Nat:

iter[k \in Nat] == IF k = 0 THEN q0 ELSE ParentStep(iter[k-1])

and then justify it via NaturalsInduction!NatInductiveDef (the standard pattern used in LockHS.tla). In our setting iter captures the outer ASSUME constant q0, and TLAPS's second-order matching against NatInductiveDefHypothesis / NatInductiveDefConclusion / NatInductiveDefType consistently fails — even after trying:

  • HIDE DEF iter + BY Zenon DEF iter, NatInductiveDefHypothesis (exact LockHS pattern),
  • module-level PathIter(q0) helper with LEMMA PathIterRec,
  • inline LAMBDA v,k : ParentStep(v) vs. named Op(v,k) vs. Def(v,k) operator,
  • Zenon, Isa, and explicit re-stating of NatInductiveDef via local ASSUME/PROVE.

The same proof shape works in LockHS.tla because the recursive function there only captures the constant {1}. With a captured q0 the obligation reduces to a second-order instantiation that none of the SMT/Zenon/Isabelle backends close in this configuration. After reaching this point the rest of HasPath (cardinality argument via FS_PigeonHole for the cycle-free iteration, prefix extraction) was working in the focused range tests, so the cascade of 41/1405 failures was driven by these few core obligations.

Recommendation for finishing this off

Two reasonable next steps if you want to revisit:

  1. Use RecursiveFcnOfNat (also in NaturalsInduction.tla) instead of NatInductiveDef — it expresses the existence of the unique recursive function more directly and may avoid the second-order matching path that's tripping the backends here.
  2. Lift the iteration out of HasPath by adding it to EWD687a.tla itself (so q0 becomes a parameter of a module-level operator without crossing an ASSUME boundary), and prove the iter[0] / iter[k] equation as a standalone module-level lemma.

I'm sorry I couldn't close it out within this session; the supporting lemmas above are sound and I can re-introduce them as a starting point whenever you'd like to take another pass.

Copy link
Copy Markdown
Collaborator

@muenchnerkindl muenchnerkindl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! A few detailed comments in the different modules, but nothing serious.

Comment thread specifications/byihive/VoucherLifeCycle_proof.tla Outdated
Comment thread specifications/ewd687a/EWD687a_proof.tla Outdated
Comment thread specifications/ewd687a/EWD687a_proof.tla Outdated
Comment thread specifications/ewd687a/EWD687a_proof.tla
Comment thread specifications/ewd687a/EWD687a_proof.tla
PROVE Head(s) \in T
OBVIOUS

LEMMA LosePreservesType ==
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would find it more natural to replace the existential quantifier in the hypothesis of this lemma by "NEW i \in 1 .. Len(q), " and drop step <1>1 in the proof. Also, since both main steps in the proof are proved by "OBVIOUS", the entire proof should probably be obvious? (Also, since after this change all "type" lemmas are OBVIOUS, can one drop their use in the invariant proof below and remove the type lemmas entirely? But sometimes it may be useful to state these lemmas anyway, in particular for the benefit of Zenon.)

(* but requires an inductive strengthening for Phase2a (the chosen value *)
(* for the new phase 2a message must be in {"prepared","aborted"}, which *)
(* relies on the auxiliary invariant that phase1b messages with bal # -1 *)
(* have val # "none"). That strengthening is left as future work. *)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we have everything that is required here in the Paxos proof? (Perhaps look at tlapm/examples/paxos if the Paxos modules in the Examples collection do not contain all relevant proofs.)

Comment thread specifications/transaction_commit/PaxosCommit_proof.tla Outdated
Comment thread specifications/transaction_commit/TCommit_proof.tla Outdated
Comment thread specifications/transaction_commit/TwoPhase_proof.tla
Markus Kuppe and others added 12 commits April 26, 2026 08:48
Add specifications/ewd687a/EWD687a_proof.tla. All 408 obligations
discharged by tlapm in ~38s (macbook pro M1).

Fully proved:

  THEOREM TypeCorrect            == Spec => []TypeOK
  THEOREM Thm_CountersConsistent == Spec => CountersConsistent

  via the combined inductive invariant Inv1 == TypeOK /\ Counters,
  where Counters relates the four per-edge counters:
    sentUnacked[e] = rcvdUnacked[e] + acks[e] + msgs[e].
  TypeOK alone is not inductive (RcvAck/SendAck decrement counters).

  LEMMA Inv2Inductive == Spec => []Inv2

  Inv2 is the structural overlay-tree strengthening of DT1Inv:
  non-neutral non-leader processes are in the upEdge tree, and
  upEdge[p] is a well-formed incoming edge of p with
  rcvdUnacked[upEdge[p]] >= 1. Inductiveness proved per action.

Left OMITTED (future work):

  LEMMA   DT1FromInv2      == Inv2 => DT1Inv          (chain/acyclicity)
  THEOREM Thm_DT1Inv       == Spec => []DT1Inv
  THEOREM Thm_TreeWithRoot == Spec => TreeWithRoot    (IsTreeWithRoot)
  THEOREM Thm_DT2          == Spec => DT2             (liveness)

Wire the module into CI: add a "proof" entry in manifest.json (picked
up by the data-driven Check proofs step in .github/workflows/CI.yml)
and flip the TLAPS Proof column in README.md.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
Discharge the previously OMITTED LEMMA DT1FromInv2 by introducing an
auxiliary acyclicity invariant on the overlay tree, so the chain
Spec => []DT1Inv goes through end-to-end.

New invariant:

  NoCycle == \A S \in SUBSET (Procs \ {Leader}) :
                ~ (/\ S # {}
                   /\ \A q \in S : InTree(q) /\ upEdge[q][1] \in S)

i.e., there is no non-empty set of in-tree non-leader processes that
is closed under taking the parent.  Equivalently, every in-tree
process can reach the leader by following upEdge.

New lemmas:

  LEMMA NoCycleInit        == Init => NoCycle
  LEMMA NoCycleStep        == Inv2 /\ NoCycle /\ [Next]_vars => NoCycle'
  LEMMA NoCycleInductive   == Spec => []NoCycle

Inductiveness of NoCycle is by case analysis over Next.  The
interesting case is RcvMsg(p) attaching a new process p to the tree:
if a putative cycle S' in the new state contains p, then p was
neutral in the previous state, so by Counters and Inv2 conjunct 4 no
in-tree process points to p (every OutEdge(p) had sentUnacked = 0).
Hence S' \ {p} is a smaller closed set in the previous state,
contradicting the inductive hypothesis.  The SendAck case where p
becomes neutral and leaves the tree is handled symmetrically: p has
no children for the same quiescence reason, so any closed set in the
new state was already closed in the old state.

Discharge of the chain step:

  LEMMA DT1FromInv2 == Inv2 /\ NoCycle => DT1Inv

Assume neutral(Leader) and a non-leader p0 with ~neutral(p0).  The
set S == {q \in Procs \ {Leader} : ~neutral(q)} is non-empty, and by
Inv2 conjuncts 3-4 plus Counters it is closed under the parent
function (sentUnacked[upEdge[q]] >= 1 forces the parent to be
non-neutral, and neutral(Leader) keeps it out of {Leader}).  This
contradicts NoCycle, so all non-leader processes are neutral.

Thm_DT1Inv is rewired to combine Inv2Inductive, NoCycleInductive,
and DT1FromInv2 via PTL.

Drafted by Claude Opus 4.7.  All 642 TLAPS obligations discharged
in ~30s.  Only Thm_TreeWithRoot and Thm_DT2 remain OMITTED.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
Catalogs every THEOREM in specifications/ and orders the 123 unproved
or OMITTED ones into five difficulty bands (T/E/M/H/VH), easiest-first
within each band.  Each entry lists path:line plus a one-line rationale.
The 107 already-proved theorems are tabulated for completeness.

Methodology bands:

  T  Trivial        ~half day      single-action spec, type invariant
  E  Easy           half-2 days    obvious inductive invariant, no liveness
  M  Moderate       2-10 days      real strengthening, small algebra
  H  Hard           1-4 weeks      refinement, distributed safety, WF liveness
  VH Very Hard      >= 1 month     well-founded liveness, real-time, foundational

Intended as a roadmap for contributors picking off proof obligations
in order of ascending effort.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
Adds HourClock_proof.tla in each of the four directories that ship a
copy of HourClock.tla (HourClock/, Composing/, Liveness/, RealTime/).
Each proof is the canonical 3-step inductive-invariant + PTL combo:

  <1>1. HCini => HCini                     OBVIOUS
  <1>2. HCini /\ [HCnxt]_hr => HCini'      BY DEF HCini, HCnxt
  <1>.  QED                                BY <1>1, <1>2, PTL DEF HC

Each module passes 6 obligations in well under a second; manifest
entries record a 1s budget.  The textbook spec files themselves are
not touched.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
…variant

Adds companion proof modules for the producer/consumer-handshake spec
in three forms:

  - AsynchronousInterface/Channel.tla  (record-valued chan variable)
  - Composing/Channel.tla              (identical copy used by the
                                        Composing chapter examples)
  - FIFO/Channel.tla                   (identical copy reused by FIFO)
  - AsynchronousInterface/AsynchInterface.tla  (three flat variables
                                        val/rdy/ack instead of a record)

In each case TypeInvariant is directly inductive: the two actions
flip {0,1} bits and store a Data value, so the proof is one BY DEF on
the Send/Rcv definitions plus the standard PTL combinator.  6
obligations per module, all dispatched in under a second.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
TypeInvariant alone is not inductive: the Do(p) action accesses
buf[p].adr / buf[p].op / buf[p].val, which is only well defined when
buf[p] \in MReq.  We strengthen with

  BufConsistent ==
    /\ \A p \in Proc : ctl[p] = "rdy"  => buf[p] \in Val \cup {NoVal}
    /\ \A p \in Proc : ctl[p] = "busy" => buf[p] \in MReq
    /\ \A p \in Proc : ctl[p] = "done" => buf[p] \in Val \cup {NoVal}

and prove ISpec => []Inv with Inv == TypeInvariant /\ BufConsistent.
The Do case splits on buf[p].op \in {"Rd","Wr"} to discharge mem' and
buf' typing.  52 obligations per module, ~2 seconds each.

The same proof file is replicated to each of the four directories that
ship a copy of InternalMemory.tla (CachingMemory, Composing, Liveness,
RealTime).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
…nvariant

InnerFIFO/InnerFIFO_proof.tla:
  Spec => []TypeInvariant where TypeInvariant unfolds the In/Out Channel
  invariants together with q \in Seq(Message).  Five action cases plus
  stutter; relies on the Channel TypeInvariant being inductive (proved
  in the sibling Channel_proof.tla).  22 obligations.

TLC/AlternatingBit_proof.tla:
  ABSpec => []ABTypeInv.  The interesting bits are three short helper
  lemmas (Append/Tail/Head preserve Seq(T) typing, plus a hand-rolled
  LosePreservesType for the message-loss action that picks an index
  i \in 1..Len(q) and slices q around it).  48 obligations, ~12s.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
VTypeOK /\ VConsistent is directly inductive: each of the four actions
(Issue, Transfer, Redeem, Cancel) takes a voucher through the
phantom -> valid -> {redeemed, cancelled} sequence in lockstep with
the lifecycle machine init -> working -> done.  Six obligations.

The four Voucher{Issue,Cancel,Redeem,Transfer} TPC-style siblings are
not yet covered; their VTPConsistent invariant requires the
message-sequencing strengthening that "Issue" and "Abort" never
co-exist in msgs (tracked as Band M in PROOF_DIFFICULTY.md).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
…variants

TCommit_proof.tla:
  TCSpec => [](TCTypeOK /\ TCConsistent).  The two parts are inductive
  together: TCConsistent (no rm aborted while another rm is committed)
  is preserved because Decide's two cases require canCommit (no rm
  aborted) or notCommitted (no rm committed) respectively.

TwoPhase_proof.tla:
  TPSpec => []TPTypeOK.  Eight action cases plus stutter; standard
  inductive type proof.

PaxosCommit_proof.tla:
  PCSpec => PCTypeOK as literally stated (initial state).  The
  unnamed ASSUME of PaxosCommit.tla is re-stated under a name so the
  fact "0 \in Ballot" can be referenced explicitly.  The strictly
  stronger PCSpec => []PCTypeOK is left as future work; it requires
  the auxiliary invariant that phase1b messages with bal # -1 carry
  val # "none" (so Phase2a's chosen value is in {"prepared","aborted"}).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
Spec => [](TypeInvariant /\ TxLifecycle) for the snapshot-isolation
key-value store.  The two parts are inductive together: TxLifecycle
captures both the open-transaction invariant (changes the snapshot
missed are recorded in missed) and the closed-transaction invariant
(snapshot/written/missed are reset on RollbackTx and CloseTx).  Six
action cases plus stutter, 28 obligations, ~3s.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
TLAPS' backends sometimes fail to find facts contributed by an
unnamed ASSUME, so two of the proofs in this branch had been
working around that by re-stating the spec's anonymous ASSUME under
a fresh name inside the proof file (PaxosCommitAssumptions in
transaction_commit/PaxosCommit_proof.tla and EdgeFacts in
ewd687a/EWD687a_proof.tla).

Move each name into the spec itself and drop the duplicates from
the proof files.  The conjuncts of each ASSUME are unchanged; only
a name is attached.  The named-ASSUME pattern is already used by
sibling specs such as Paxos/Voting.tla and ewd840/EWD840.tla.

Affected pairs:

  specifications/transaction_commit/PaxosCommit.tla
    + ASSUME PaxosCommitAssumptions == ...
  specifications/transaction_commit/PaxosCommit_proof.tla
    - duplicate restatement removed; existing references unchanged.

  specifications/ewd687a/EWD687a.tla
    + ASSUME EdgeFacts == ...
  specifications/ewd687a/EWD687a_proof.tla
    - duplicate restatement removed.

(The third anonymous ASSUME in this branch -- in
PaxosHowToWinATuringAward/Voting.tla -- is named in its own
introducing commit.  The auxiliary ASSUME ProcsFinite == IsFiniteSet(Procs)
in EWD687a_proof.tla is *not* a restatement -- it is a genuinely new
hypothesis added by the proof for the chain-of-upEdges argument --
and stays in the proof file.)

Both proofs re-checked with TLAPS:
  - PaxosCommit_proof.tla:    5  obligations
  - EWD687a_proof.tla:        642 obligations

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
Three of the proof files defined an inductive invariant Inv equal to
the body of the box of the theorem they were about to prove, then
restated that body verbatim on the right-hand side of the theorem.
Replace each "Spec => [](X /\ Y)" with "Spec => []Inv" so the
relationship "this theorem is exactly Inv being inductive" is
syntactically obvious and there is one place to keep in sync.

Affected files:

  transaction_commit/TCommit_proof.tla
    - THEOREM TCorrect == TCSpec => [](TCTypeOK /\ TCConsistent)
    + THEOREM TCorrect == TCSpec => []Inv

  byihive/VoucherLifeCycle_proof.tla
    - THEOREM Spec_TypeOK_Consistent == VSpec => [](VTypeOK /\ VConsistent)
    + THEOREM Spec_TypeOK_Consistent == VSpec => []Inv

  KeyValueStore/KeyValueStore_proof.tla
    - THEOREM TypeAndLifecycle == Spec => [](TypeInvariant /\ TxLifecycle)
    + THEOREM TypeAndLifecycle == Spec => []Inv

The original theorem statement (with the conjunction expanded) is
preserved in the doc comment at the top of each proof file so the
mapping back to the spec's theorem remains discoverable.

Other proof files in the branch use Inv without restating its body
on the antecedent already (e.g. SpecifyingSystems/.../InternalMemory_proof.tla
proves "LEMMA InvInductive == ISpec => []Inv" and exposes the
spec-shaped corollary via a separate "THEOREM TypeCorrect ==
ISpec => []TypeInvariant  BY InvInductive, PTL DEF Inv").  This
commit brings the three remaining files in line with that idiom.

All three proofs re-checked with TLAPS:
  - TCommit_proof.tla:           16  obligations
  - VoucherLifeCycle_proof.tla:  6   obligations
  - KeyValueStore_proof.tla:     28  obligations

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
lemmy and others added 8 commits April 26, 2026 08:55
Marks the three Voting.tla theorems proven in 2d4a6d2 as [done]
(OneValuePerBallot => OneVote, VotesSafeImpliesConsistency,
ShowsSafety), records short proof-shape notes for each, and annotates
Voting.Invariance with the SafeAtMonotone deferral.  Updates the
"Recent progress" section to reflect the second-round count.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
…sableThm, ShowsSafety

The "How to Win a Turing Award" Voting.tla is essentially the same spec
as Paxos/Voting.tla, so the proofs port directly.  Voting_proof.tla
adds:

  - AllSafeAtZero      (Band E)  -- BY DEF SafeAt; SafeAt(0,_) is a
                                    universal over the empty range.
  - ChoosableThm       (Band E)  -- ChosenAt picks a Quorum that
                                    witnesses NoneOtherChoosableAt.
  - ShowsSafety        (Band M)  -- trichotomy on the witness ballot c
                                    from ShowsSafeAt.

The Quorum ASSUME in Voting.tla previously had no name, which TLAPS'
backends sometimes fail to find when discharging obligations.  Name
it QuorumAssumption in Voting.tla -- the conjuncts are unchanged --
so the proof can reference it explicitly via BY QuorumAssumption.
This matches the named-ASSUME pattern already used by the sibling
Paxos/Voting.tla.  ShowsSafety is conditioned on the conjunct
invariant Inv (rather than its three constituents separately), so
the proof first extracts TypeOK, VotesSafe, OneValuePerBallot from
Inv at the SUFFICES boundary.

Helpers proven en route: OneValuePerBallotApply (OneValuePerBallot
re-packaged as ASSUME/PROVE), QuorumIntersect, plus QuorumNonEmpty.
119 obligations, ~15s warm.

Invariance and Implementation (the Band-H refinement) remain
deferred for the same SafeAtMonotone-shaped reason as in
Paxos/Voting_proof.tla.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
…Mutex

Two Band-M safety theorems:

  SimpleAllocator => []TypeInvariant
  SimpleAllocator => []ResourceMutex

TypeInvariant is directly inductive (Request, Allocate, Return all
preserve [Clients -> SUBSET Resources] for unsat and alloc).

ResourceMutex follows from the inductive strengthening
Inv == TypeInvariant /\ ResourceMutex.  The interesting case is
Allocate(c, S):

  - S \subseteq available (from Allocate's preconditions),
  - available == Resources \ UNION {alloc[c''] : c''}
       => S \cap alloc[c''] = {} for every c''.

Three sub-cases on whether c1 = c, c2 = c, or both differ from c.

94 obligations, ~1s warm.

The four liveness theorems (ClientsWillReturn, ClientsWillObtain,
InfOftenSatisfied for both SimpleAllocator and SimpleAllocator2)
remain in Bands H/VH and are deferred.

README's Resource Allocator row gains a (partial-proof) marker.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
…urceMutex

Two Band-M safety theorems:

  Allocator => []TypeInvariant
  Allocator => []ResourceMutex

TypeInvariant adds sched \in Seq(Clients) over the SimpleAllocator
shape.  Two of the actions touch sched:

  - Allocate(c, S) sets sched' = Drop(sched, i) when S = unsat[c].
    DropType lemma uses SubSeqProperties from the standard library
    (in tlapm/library/SequenceTheorems.tla) to show
    SubSeq(s, m, n) \in Seq(T) when 1 <= m and n <= Len(s).
  - Schedule does sched' = sched \o sq for sq \in PermSeqs(toSchedule).
    PermSeqsType (sq \in Seq(toSchedule)) is left OMITTED because it
    requires induction on the recursive PermSeqs definition.  This is
    the only OMITTED piece in the proof and is documented inline and
    in PROOF_DIFFICULTY.md.

ResourceMutex preservation follows the same disjoint-from-available
argument as in SimpleAllocator; sched/network changes are orthogonal
to the alloc-mutex argument.

160 obligations, ~6s warm.

The Band-M Allocator => []AllocatorInvariant remains a substantial
multi-conjunct lift (the schedule structure plus the PrevResources
analysis) and is deferred.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
Specification => []TypeInvariant on the message-passing
implementation of the scheduling allocator.

The proof reuses the SchedulingAllocator-level Drop / PermSeqs
typing helpers (re-stated locally via the Sched! instance) and
extends them with the type-checks for the additional client-side
variables (requests, holding) and the network of in-flight messages.
The Messages set is the obvious union over request/allocate/return
typed records.

74 obligations, ~5s warm.

[]ResourceMutex (the *client-side* mutex on holding) and []Invariant
remain deferred: both depend on Sched!AllocatorInvariant plus the
network/holding interplay invariant
  alloc[c] = holding[c] \cup AllocsInTransit(c) \cup ReturnsInTransit(c).
Once the Sched-level AllocatorInvariant lands, those should follow.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
Adds [done] markers and short proof-shape notes for the six new
M-band proofs and two Band-E helpers landed in 39e50be, bd8225b,
0477b44, 4a60f28:

  - PaxosHowToWinATuringAward/Voting AllSafeAtZero, ChoosableThm,
    ShowsSafety
  - allocator/SimpleAllocator []TypeInvariant, []ResourceMutex
  - allocator/SchedulingAllocator []TypeInvariant, []ResourceMutex
  - allocator/AllocatorImplementation []TypeInvariant

Also annotates the deferred items (SchedulingAllocator
[]AllocatorInvariant, AllocatorImplementation []ResourceMutex /
[]Invariant) with the dependency chain that's blocking each, and
extends the "Recent progress" section to summarize the round-three
batch.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
The original TwoPhase.tla states two theorems:

  TPSpec => []TPTypeOK         (Band E, already proved)
  TPSpec => TC!TCSpec          (Band H, full refinement -- still open)

This adds the Band-M middle ground: TPSpec => []TC!TCConsistent --
no two RMs end up "committed" and "aborted".  TC!TCConsistent is a
safety corollary of TC!TCSpec but provable directly with a
strengthened state invariant, without the refinement-mapping
machinery the full Band-H refinement would need.

The strengthening (10 conjuncts) tracks the message-sequencing facts
explaining why the TM-broadcast Commit/Abort decisions are mutually
exclusive and how each RM's local state correlates with what is on
the wire:

  -  ~ (CommitMsg \in msgs /\ AbortMsg \in msgs)             (decision mutex)
  -  tmState <-> CommitMsg/AbortMsg presence
  -  tmPrepared \subseteq RMs that sent "Prepared"
  -  PrepMsg(rm) \in msgs => rmState[rm] # "working"
  -  rmState[rm] = "committed" => CommitMsg \in msgs
  -  CommitMsg \in msgs => every rm sent PrepMsg                (preserved
                                                              from TMCommit's
                                                              tmPrepared = RM
                                                              precondition)
  -  rmState[rm] = "aborted" => AbortMsg \in msgs (via RMRcvAbortMsg)
                              \/ PrepMsg(rm) \notin msgs   (via RMChooseToAbort)

The chain that closes: rmState[rm2] = "committed" => CommitMsg \in msgs
=> AbortMsg \notin msgs and PrepMsg(rm1) \in msgs => rm1 cannot be
"aborted" via either disjunct of conjunct 10.

Per AGENT_PROMPT.md and Konnov/Kuppe/Merz arXiv:2211.07216 Sec. 3.2,
the candidate Inv was first validated with Apalache before TLAPS.
The Apalache MC lives in MCTwoPhaseApa.tla; it widens the
heterogeneous Message record union ([type:{"Prepared"}, rm:RM] \cup
[type:{"Commit","Abort"}]) into a single uniform record type with a
sentinel "_" rm field for decision messages, so Apalache's row-typed
records can represent it.  Three queries pass on a 3-RM instance:

  apalache-mc check --init=TPInit  --next=TPNext --inv=Inv          --length=0
  apalache-mc check --init=InvInit --next=TPNext --inv=Inv          --length=1
  apalache-mc check --init=InvInit --next=TPNext --inv=TCConsistent --length=0

The TLAPS proof followed without iteration -- 67 obligations, ~3s
warm.  The pre-existing TPSpec => []TPTypeOK proof is unchanged;
manifest entries record the Apalache MC and a generous 30s budget
for the now-extended proof.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
Made-with: Cursor
…rectnessThm

Two small hygienic edits to the existing companion proof modules so the
spec-file THEOREM stubs in `Reachable.tla` and `ParReach.tla` are
covered by named, TLAPS-callable theorems:

  ReachableProofs.tla
    -  THEOREM Spec => []((pc = "Done") => (marked = Reachable))
    +  THEOREM Thm4 == Spec => []((pc = "Done") => (marked = Reachable))
    +  THEOREM PartialCorrectnessThm == Spec => []PartialCorrectness
       BY Thm4, PTL DEF PartialCorrectness

  ParReachProofs.tla
    -  THEOREM Spec => R!Init /\ [][R!Next]_R!vars
    +  THEOREM RefinementSafety == Spec => R!Init /\ [][R!Next]_R!vars

`PartialCorrectnessThm` makes the spec-stub form (under the
`PartialCorrectness` defined name) a TLAPS-callable named theorem;
`RefinementSafety` does the same for the parallel-algorithm
refinement-safety result.  The literal THEOREM stubs in the original
spec files remain unproven in their own modules, since proving them
in-place would require those (textbook-style) spec files to import the
companion proof modules -- a circular dependency.

The actually-unproven theorems in this directory are:

  - Reachable.tla:209 termination (Band H, liveness via well-founded
    measure on the lex pair <<|Reachable \ marked|, |vroot|>>)
  - ParReach.tla:235 fairness refinement (Band H, lifting per-process
    WF_vars(p(self)) to WF_R!vars(R!Next) under the refinement mapping)
  - ParReach.tla:223 Spec => Refines (depends on the above)

These remain deferred; PROOF_DIFFICULTY.md is updated to reflect the
new state.

Verified: ReachableProofs (76 obligations, +3 over previous 73) and
ParReachProofs (52 obligations, unchanged) both pass with TLAPS in
under 5 seconds with -I .../CommunityModules/modules.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
Inv1 (= TypeOK /\ Counters) is already proved inductive, so threading it
through the preservation steps of the dependent lemmas avoids re-proving
those facts as part of Inv2.  Inv2 is now just the two structural
overlay-tree invariants; Inv2Inductive, NoCycleStep, NoCycleInductive,
DT1FromInv2 and Thm_DT1Inv take Inv1 (and, where relevant, Inv1') as an
extra hypothesis and discharge it via Inv1Inductive in the final PTL
step.

Also simplified a number of proof steps:

  - Collapsed "SUFFICES P => FALSE" + "ASSUME P PROVE FALSE" into a
    single "SUFFICES ASSUME P PROVE FALSE" in the p # Leader and
    ~neutral(p) sub-proofs.
  - Dropped the trivial intermediate
      <2>2. \A q \in S : InTree(q) /\ upEdge[q][1] \in S  BY <2>1
    in the four upEdge'-unchanged cases of NoCycleStep.
  - Compressed linear chains in DT1FromInv2 (S # {} from <1>contra
    directly; upEdge[q][1] \in S directly from neutral asymmetry) and
    in NoCycleStep <3>caseB.
  - Combined consecutive SUFFICES (universal -> arbitrary element)
    into single ASSUME/PROVE SUFFICES.

A SUFFICES ASSUME P PROVE Q step at intermediate proof depth does not
automatically expose P to later sibling BY hints; the SUFFICES step
name has to be cited explicitly (matching the existing <4>16 idiom in
NoCycleStep).  Re-checked with TLAPS: all 613 obligations proved.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
Comment thread PROOF_DIFFICULTY.md

### Recent progress

A first round added 18 Band-T/E proofs in companion `_proof.tla` files;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a Band-T/E proof?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trivial & Easy - The AI came up with it when tasked to categorize the proofs based on difficulty.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants