ReproProof is a lightweight, portable agent skill for auditing research claims, evidence boundaries, and reproducibility readiness before paper submission, thesis defense, benchmark release, or artifact publication.
It helps researchers catch unsupported claims, missing experiment details, and reproducibility gaps before their paper or code goes public.
It is built around one practical question:
Are the claims in this research artifact actually supported by the reviewed evidence, and is the reproduction path clear enough for another researcher to inspect?
ReproProof is intentionally narrow. It does not write papers, simulate peer review, run experiments, or guarantee reproducibility. It produces a bounded audit based only on the materials provided.
Copy reproproof/ into your agent skill directory, then ask your agent:
Use ReproProof to audit this paper draft. Identify unsupported claims, missing reproduction details, and figure/table inconsistencies.
For a repository plus paper:
Use ReproProof to review this paper, README, configs, and result files as an artifact evaluation package.
ReproProof's main artifact is an Evidence Boundary Ledger:
| ID | Claim | Evidence found | Assessment | Risk |
|---|---|---|---|---|
| C1 | "Our method improves accuracy across all benchmark tasks." | Table 1 reports higher mean scores on four tasks. | Partially supported | High |
| C2 | "The method is computationally efficient." | No runtime, memory, or hardware comparison is provided. | Unsupported | High |
| C3 | "The ablation confirms the routing module is necessary." | Ablation table removes the routing module but reports one run only. | Partially supported | Medium |
See complete examples in examples/reproproof/ (paper-only-audit.md and artifact-package-audit.md).
ReproProof checks:
- Claim-evidence alignment
- Unsupported or overstated scientific claims
- Missing reproduction details
- Experimental protocol clarity
- Figure, table, caption, text, README, code, and config consistency
- Artifact evaluation and code release readiness
The core output is an Evidence Boundary Ledger: a claim-by-claim record of what the reviewed materials support, partially support, overstate, fail to support, contradict, or leave not assessable.
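As a rough illustration of the ledger's shape, the sketch below models one row as a small record and renders rows with the same column layout as the table above. All identifiers here (`Assessment`, `LedgerEntry`, `render_ledger`, `needs_attention`) are hypothetical and not part of ReproProof; the enum labels are one possible naming of the six outcomes listed above, and the "Low" risk level is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Assessment(Enum):
    """One possible naming of the six ledger outcomes (hypothetical labels)."""
    SUPPORTED = "Supported"
    PARTIALLY_SUPPORTED = "Partially supported"
    OVERSTATED = "Overstated"
    UNSUPPORTED = "Unsupported"
    CONTRADICTED = "Contradicted"
    NOT_ASSESSABLE = "Not assessable"


@dataclass(frozen=True)
class LedgerEntry:
    claim_id: str
    claim: str
    evidence: str
    assessment: Assessment
    risk: str  # "High", "Medium", or "Low" (assumed scale)


def render_ledger(entries):
    """Render entries as a Markdown table using the ledger's column layout."""
    lines = [
        "| ID | Claim | Evidence found | Assessment | Risk |",
        "|---|---|---|---|---|",
    ]
    for e in entries:
        lines.append(
            f'| {e.claim_id} | "{e.claim}" | {e.evidence} '
            f"| {e.assessment.value} | {e.risk} |"
        )
    return "\n".join(lines)


def needs_attention(entries):
    """Return the IDs of claims that are anything other than fully supported."""
    return [e.claim_id for e in entries if e.assessment is not Assessment.SUPPORTED]


# Two rows taken from the example ledger above.
ledger = [
    LedgerEntry("C1", "Our method improves accuracy across all benchmark tasks.",
                "Table 1 reports higher mean scores on four tasks.",
                Assessment.PARTIALLY_SUPPORTED, "High"),
    LedgerEntry("C2", "The method is computationally efficient.",
                "No runtime, memory, or hardware comparison is provided.",
                Assessment.UNSUPPORTED, "High"),
]

if __name__ == "__main__":
    print(render_ledger(ledger))
    print(needs_attention(ledger))
```

Keeping the assessment as an enum rather than free text makes it harder for a ledger to drift into ad hoc labels that blur the boundary between "unsupported" and "not assessable".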
ReproProof is intended for:
- Researchers preparing a paper submission
- PhD students checking a thesis or defense report
- Authors releasing benchmark results or research code
- Teams preparing artifact evaluation packages
- Reviewers or collaborators checking whether claims are evidence-bound
.
|-- reproproof/
|   |-- SKILL.md
|   |-- README.md
|   |-- references/
|   |   |-- audit-scope-routing.md
|   |   `-- reproducibility-checklist.md
|   `-- templates/
|       `-- audit-report.md
|-- docs/
|   |-- comparison.md
|   |-- launch-kit.md
|   `-- release-checklist.md
|-- examples/
|   `-- reproproof/
|       |-- artifact-package-audit.md
|       `-- paper-only-audit.md
|-- .github/
|   |-- ISSUE_TEMPLATE/
|   `-- workflows/
|-- scripts/
|   `-- validate_skills.py
|-- CITATION.cff
|-- CONTRIBUTING.md
|-- LICENSE
|-- SECURITY.md
`-- README.md
Copy the reproproof/ directory into any agent runtime or project that supports skill-style folders with a SKILL.md entry point.
skills/
`-- reproproof/
    |-- SKILL.md
    |-- references/
    `-- templates/
For Cursor-style project installation, copy reproproof/ to:
your-project/.cursor/skills/reproproof/
For personal installation, copy it to:
~/.cursor/skills/reproproof/
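The copy step can also be scripted. The following is a minimal sketch, assuming the skill folder is available locally; `install_skill` is a hypothetical helper, not something shipped with ReproProof, and the only check it performs is that a SKILL.md entry point exists.

```python
import shutil
from pathlib import Path


def install_skill(source: Path, skills_dir: Path) -> Path:
    """Copy a skill folder into skills_dir and return the installed path."""
    if not (source / "SKILL.md").is_file():
        raise FileNotFoundError(f"{source} has no SKILL.md entry point")
    target = skills_dir / source.name
    # dirs_exist_ok=True lets a re-run refresh an existing install (Python 3.8+).
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

For a personal Cursor-style install, the call would be `install_skill(Path("reproproof"), Path.home() / ".cursor" / "skills")`; other runtimes would need their own skills directory passed in.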
Example prompts:
Use ReproProof to audit this paper draft for reproducibility readiness.
Check whether the benchmark claims in this report are supported by the tables, logs, and README.
Review this repository and paper as an artifact evaluation package.
Quick audits include:
- Overall readiness
- Top risks
- Claim-evidence gaps
- Missing reproduction details
- Immediate fixes
Full audits follow reproproof/templates/audit-report.md and include:
- Executive summary
- Audit boundary
- Evidence Boundary Ledger
- Severity-ranked findings
- Reproducibility checklist
- Consistency checks
- Safer wording suggestions for overclaims
- Release readiness actions
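A quick way to confirm that a generated full audit covers every section listed above is to scan its headings. This is only a sketch: `missing_sections` is a hypothetical helper, and it assumes each section appears as a Markdown heading whose title matches the list item, which may not match how reproproof/templates/audit-report.md actually names them.

```python
import re

# Section titles copied from the full-audit list above.
FULL_AUDIT_SECTIONS = [
    "Executive summary",
    "Audit boundary",
    "Evidence Boundary Ledger",
    "Severity-ranked findings",
    "Reproducibility checklist",
    "Consistency checks",
    "Safer wording suggestions for overclaims",
    "Release readiness actions",
]


def missing_sections(report_text: str) -> list[str]:
    """Return the full-audit sections that have no matching Markdown heading."""
    # Assumption: sections are heading lines such as "## Audit boundary".
    headings = {
        match.group(1).lower()
        for match in re.finditer(
            r"^#{1,6}[ \t]+(.+?)[ \t]*$", report_text, re.MULTILINE
        )
    }
    return [s for s in FULL_AUDIT_SECTIONS if s.lower() not in headings]
```

An empty result means every listed section has a heading; it says nothing about whether the content under each heading is adequate.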
See examples/reproproof for sample outputs.
ReproProof is adjacent to larger academic-agent and reproducibility projects such as:
- sistm/AI4Reproducibility
- brycewang-stanford/Auto-Empirical-Research-Skills
- Academic research skill collections and artifact-evaluation templates
ReproProof's contribution is narrower: it is a final readiness pass that keeps claims tied to reviewed evidence and separates unavailable material from actual defects.
If ReproProof helps your paper, benchmark, or artifact release, consider starring the repository or citing it with the metadata in CITATION.cff.
Launch copy and community-post templates are available in docs/launch-kit.md.
Run the lightweight validation script:
python scripts/validate_skills.py
On Windows, if python resolves to the Microsoft Store launcher, use:
py scripts\validate_skills.py
MIT License. See LICENSE.