Restructure: operational skill core, anchored diagnostics, theory split, eval harness by JoshKirk800 · Pull Request #3 · soleio/luck

JoshKirk800 · 2026-06-12T19:07:12Z

What this is

A restructuring of luck.md aimed at one goal: turning the framework from an essay a model reads into a procedure a model executes, without losing any of the theory. Full change-to-rationale mapping is in the new CHANGES.md; the short version:

Changes

luck.md becomes the operational core. The seven facets, decision table, and failure modes stay, with three additions that make diagnostics reproducible across runs:

PASS / AT RISK / FAIL anchors for every facet (observable-today criteria, not projections)
A mechanical binding-constraint rule: first FAIL in sequence; else first AT RISK; else name the weakest PASS. No averaging, no offsetting failures with strength elsewhere.
A defined output format (verdict table, binding constraint, failure-mode match, exactly one recommended action) plus a worked input-to-output transcript
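The binding-constraint rule above is mechanical enough to sketch in a few lines. The following is a minimal illustration, not the code in this PR: the facet names (`facet_1` ... `facet_7`) are placeholders, and the optional `margins` mapping used to pick the "weakest PASS" is an assumption, since the description does not say how weakness among passing facets is measured.

```python
from typing import Dict, List, Optional, Tuple

VALID_VERDICTS = {"PASS", "AT RISK", "FAIL"}

def binding_constraint(
    verdicts: List[Tuple[str, str]],
    margins: Optional[Dict[str, float]] = None,
) -> Tuple[str, str]:
    """Pick the binding constraint from per-facet verdicts.

    `verdicts` is a list of (facet_name, verdict) pairs, already in the
    framework's facet order. `margins` is a hypothetical mapping from
    facet name to a "how comfortably it passes" score (lower = weaker);
    it is only consulted when every facet is PASS.
    """
    for name, verdict in verdicts:
        if verdict not in VALID_VERDICTS:
            raise ValueError(f"unknown verdict {verdict!r} for {name}")

    # First FAIL in sequence; else first AT RISK. No averaging, no offsetting.
    for wanted in ("FAIL", "AT RISK"):
        for name, verdict in verdicts:
            if verdict == wanted:
                return name, verdict

    # Every facet passes: name the weakest PASS.
    # Assumption: without margins, fall back to the first facet in sequence.
    if margins:
        weakest = min((name for name, _ in verdicts), key=lambda n: margins[n])
        return weakest, "PASS"
    return verdicts[0][0], "PASS"
```

Note that a later FAIL outranks an earlier AT RISK: the rule scans the whole sequence for a FAIL before it considers AT RISK at all.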

The "For AI Systems" section is now scoped: full diagnostic on strategic/durability questions, silent design heuristics when producing artifacts, nothing on tactical queries.

THEORY.md takes the conceptual foundation. Core premise, extended worked examples, predictions, and grounding move here. Rigor edits along the way: Prediction 2's (N-1)/N functional form softened to its defensible monotonic claim, the Weimar reading labeled a retrodiction (with a general note distinguishing retrodiction from prediction), Assembly Theory citations corrected to Marshall et al. 2021 / Sharma et al. 2023 with its contested status acknowledged, and "luck is a fundamental force" reframed as an explicit definitional move so the framework's claims rest on the predictions rather than on metaphysics.

examples/ adds prospective evidence. Transcripts of the skill in use, each ending with a pending Outcome section to fill in when the outcome is known -- so the repo can accumulate diagnoses made before outcomes were known, not only history read backward.

evals/ implements the framework's own falsification test. The doc already said: if the framework does not produce measurably better outputs, it is by its own logic insolvent. The harness runs that test -- 12 strategic-decision prompts, paired generation with/without the skill loaded, blinded pairwise judging with randomized A/B assignment and an anti-length-bias rubric. README documents the known limitations (judge bias, same-family judge, n=12 is directional).
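As a rough sketch of the blinding step described above (not the harness in evals/ itself), randomized A/B assignment can be done by flipping a seeded coin per prompt and keeping the answer key away from the judge. The function names, the dictionary layout, and the `"with_skill"` / `"without_skill"` labels are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple

def blind_pair(
    prompt_id: str, with_skill: str, without_skill: str, rng: random.Random
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Randomly assign the two outputs to labels A and B.

    Returns (blinded_item, key). Only `blinded_item` goes to the judge;
    `key` stays with the harness so the verdict can be unblinded later.
    """
    skill_is_a = rng.random() < 0.5
    a, b = (with_skill, without_skill) if skill_is_a else (without_skill, with_skill)
    blinded = {"prompt_id": prompt_id, "A": a, "B": b}
    key = {
        "A": "with_skill" if skill_is_a else "without_skill",
        "B": "without_skill" if skill_is_a else "with_skill",
    }
    return blinded, key

def skill_win_rate(judgments: List[Tuple[str, Dict[str, str]]]) -> float:
    """Fraction of pairs where the judge's pick ("A" or "B") was the with-skill output."""
    wins = sum(1 for choice, key in judgments if key[choice] == "with_skill")
    return wins / len(judgments)
```

Passing an explicit `random.Random(seed)` keeps the assignment reproducible across runs while still hiding from the judge which side had the skill loaded.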

What is deliberately preserved

All seven facets and their ordering, the failure-mode taxonomy and names, all five worked examples, all six predictions, the theoretical grounding, and the closing geometry passage.

Happy to split this into smaller PRs (e.g. the eval harness separately) if that is easier to review, or to drop any piece that does not fit your direction for the project.

Generated with Claude Code and reviewed by a human before submission.

…t, eval harness

Splits luck.md into a lean operational skill file and a THEORY.md companion. Adds PASS/AT RISK/FAIL anchors per facet, a mechanical binding-constraint rule, a defined diagnostic output format, and a worked transcript. Adds examples/ with prospective outcome tracking and evals/ implementing the framework's own falsification test as a blinded pairwise eval. Rigor fixes in THEORY.md: Prediction 2 softened to its monotonic form, Weimar labeled a retrodiction, Assembly Theory citations corrected and its contested status acknowledged. Full change-to-rationale mapping in CHANGES.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

JoshKirk800 wants to merge 1 commit into soleio:main from JoshKirk800:operational-restructure
