An MCP server that lets an AI agent gate its own work before it claims "done": deterministic checks, then an independent refute-first review, then a tamper-evident honest receipt.
Agents that grade their own homework ship low-quality output. agent-gate turns independent verification into tools an agent must actually pass: a fail-closed checklist and an append-only, hash-chained receipts ledger. It is Fleet Mode, my agent-orchestration doctrine, made into a runnable tool. Receipts over hype, enforced by the data structures.
🧩 One layer of a five-repo cost-governance stack for operating AI agents cost-efficiently; bow is the flagship that runs every layer in production.
agent: "done!" -> verify_gate(evidence) -> { passed: false, blocking: ["independent_refute_review", "no_secrets"] }
The expensive failures in agent systems are the silent ones: a model update degrades output, a change quietly breaks a workflow, an agent declares success while the work is wrong. The fix is not a smarter model. It is a gate the agent cannot talk its way past:
- Fail-closed. A check counts as satisfied only if it is explicitly true. Missing proof is not proof. (Mirrors a promotion gate, not an informal check.)
- Tamper-evident receipts. Every decision is recorded as `(decision, metric, value, verdict)` linked into a sha256 chain. Edit or delete any past receipt and `verify_chain()` returns false. The honest log is enforced by the structure, not by good intentions.
- Human-gated by default. "Any irreversible/outward act got human approval" is a required check. Agents draft, humans approve.
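The tamper-evidence property described above can be illustrated with a minimal, stdlib-only sketch. This is not agent-gate's actual ledger code: the `GENESIS` constant, the `prev` field, the JSON canonicalization, and the helper names are all assumptions made for illustration. Only the `(decision, metric, value, verdict)` fields, the `seq` and `hash` fields, and the use of sha256 come from this README.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first receipt


def receipt_hash(prev_hash: str, body: dict) -> str:
    """Hash the previous link together with a canonical encoding of the body."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, decision: str, metric: str, value: str, verdict: str) -> dict:
    """Append a receipt whose hash covers its own fields and the prior hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = {
        "seq": len(chain) + 1,
        "decision": decision,
        "metric": metric,
        "value": value,
        "verdict": verdict,
    }
    receipt = {**body, "prev": prev_hash, "hash": receipt_hash(prev_hash, body)}
    chain.append(receipt)
    return receipt


def verify_chain(chain: list) -> bool:
    """Recompute every link; an edited or removed earlier receipt breaks the chain."""
    prev_hash = GENESIS
    for expected_seq, receipt in enumerate(chain, start=1):
        body = {k: receipt[k] for k in ("seq", "decision", "metric", "value", "verdict")}
        if receipt["seq"] != expected_seq or receipt["prev"] != prev_hash:
            return False
        if receipt["hash"] != receipt_hash(prev_hash, body):
            return False
        prev_hash = receipt["hash"]
    return True


chain = []
append(chain, "ship v0.1", "tests", "pass", "shipped")
append(chain, "deploy", "approval", "granted", "approved")
intact_before = verify_chain(chain)

chain[0]["verdict"] = "rolled back"  # simulate someone rewriting history
intact_after = verify_chain(chain)
print(intact_before, intact_after)  # True False
```

Because each hash covers the previous one, changing any earlier receipt invalidates every link after it. Note that a bare chain like this sketch does not by itself detect truncation of the most recent receipts; catching that would need an externally stored head hash or similar anchor.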
| Tool | What it does |
|---|---|
| `gate_checklist(name="ship")` | Returns the checklist the agent must satisfy before claiming done. |
| `verify_gate(evidence, name="ship")` | Evaluates evidence fail-closed and returns `{passed, blocking}`. |
| `record_receipt(decision, metric, value, verdict)` | Appends an honest, hash-chained receipt; returns it. |
| `read_receipts()` | Returns every receipt plus whether the chain is intact. |
The default ship gate encodes Fleet Mode: deterministic_checks_pass, independent_refute_review, no_secrets, human_gated_if_irreversible, honest_receipt_logged.
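To make "fail-closed" concrete, here is a minimal sketch of how such an evaluation could work over the five checks listed above. It is an illustration under assumptions, not the code in `agent_gate/gate.py`: the `GateResult` dataclass, the standalone `evaluate` function, and the choice to require the literal value `True` (rejecting merely truthy values) are hypothetical.

```python
from dataclasses import dataclass, field

# Check names taken from the README's default ship gate.
SHIP_CHECKS = (
    "deterministic_checks_pass",
    "independent_refute_review",
    "no_secrets",
    "human_gated_if_irreversible",
    "honest_receipt_logged",
)


@dataclass
class GateResult:  # hypothetical result type for this sketch
    passed: bool
    blocking: list = field(default_factory=list)


def evaluate(evidence: dict, checks=SHIP_CHECKS) -> GateResult:
    """A check is satisfied only when its value is literally True."""
    # Missing keys, None, "yes", and 1 all count as unsatisfied.
    blocking = [name for name in checks if evidence.get(name) is not True]
    return GateResult(passed=not blocking, blocking=blocking)


result = evaluate({
    "deterministic_checks_pass": True,
    "independent_refute_review": "yes",  # truthy, but not True: still blocks
    "no_secrets": 1,                     # equals True, but is not True: blocks
    "human_gated_if_irreversible": True,
    # honest_receipt_logged is missing: blocks
})
print(result.passed, result.blocking)
# False ['independent_refute_review', 'no_secrets', 'honest_receipt_logged']
```

The design choice worth noticing is the direction of the default: the gate starts from "blocked" and each check must be positively proven, so a forgotten key can never produce a pass.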
pip install mcp-agent-gate   # or: pip install -e .  (from source)

Add it to your MCP client (Claude Desktop / Claude Code) config:
{
  "mcpServers": {
    "agent-gate": { "command": "python", "args": ["-m", "agent_gate.server"] }
  }
}

Now your agent can call `verify_gate(...)` before it tells you it is finished, and you get a tamper-evident trail of what it decided. Receipts persist to `~/.agent-gate/receipts.jsonl` (override with `AGENT_GATE_LEDGER`).
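As a rough illustration of the override described above, the ledger location could be resolved along these lines. The function name `resolve_ledger_path` is hypothetical, and the exact precedence and path handling inside agent-gate may differ; only the default path and the `AGENT_GATE_LEDGER` variable name come from this README.

```python
import os
from pathlib import Path

# Default location stated in the README.
DEFAULT_LEDGER = Path.home() / ".agent-gate" / "receipts.jsonl"


def resolve_ledger_path(environ=os.environ) -> Path:
    """Return the override from AGENT_GATE_LEDGER if set, else the default."""
    override = environ.get("AGENT_GATE_LEDGER")
    # An empty or unset variable falls back to the default path.
    return Path(override).expanduser() if override else DEFAULT_LEDGER


print(resolve_ledger_path({}))
print(resolve_ledger_path({"AGENT_GATE_LEDGER": "/tmp/audit/receipts.jsonl"}))
```

Pointing the variable at a per-project file is one way to keep separate audit trails for separate agents.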
from agent_gate.gate import DEFAULT_SHIP_GATE
from agent_gate.ledger import Ledger

res = DEFAULT_SHIP_GATE.evaluate({
    "deterministic_checks_pass": True,
    "independent_refute_review": True,
    "no_secrets": True,
    "human_gated_if_irreversible": True,
    # honest_receipt_logged missing -> fail-closed
})
print(res.passed, res.blocking)  # False ['honest_receipt_logged']

led = Ledger("receipts.jsonl")
led.append(decision="ship v0.1", metric="tests", value="pass", verdict="shipped")
print(led.verify_chain())  # True (until someone edits the log)

- Tested, stdlib-only core. `agent_gate/gate.py` (fail-closed checklist) and `agent_gate/ledger.py` (hash-chained receipts) are pure stdlib: fast to read, fast to trust. `agent_gate/server.py` is a thin MCP adapter over them (the one runtime dependency: `mcp`).
- Tests pass on Python 3.11-3.13 (see CI). The MCP tools are tested by calling them, not just importing.
pip install -e ".[dev]" && python -m pytest -q

Run it yourself: `PYTHONPATH=. python3 examples/demo.py`
------------------------------------------------------------
1. Agent claims done - but two checks are missing
------------------------------------------------------------
{
  "passed": false,
  "blocking": [
    "human_gated_if_irreversible",
    "honest_receipt_logged"
  ]
}
------------------------------------------------------------
2. Agent satisfies all five checks
------------------------------------------------------------
{
  "passed": true,
  "blocking": []
}
------------------------------------------------------------
3. Record a hash-chained receipt
------------------------------------------------------------
{
  "seq": 1,
  "decision": "ship v0.1",
  "verdict": "shipped",
  "hash": "015202a168512f15..."
}
{
  "seq": 2,
  "decision": "deploy",
  "verdict": "approved",
  "hash": "9533d304d4dd07e5..."
}
------------------------------------------------------------
4. Verify the chain - edit receipts.jsonl to see this flip to False
------------------------------------------------------------
chain_intact: True
agent-gate is about not shipping unverified work, so the repository holds itself to the same bar:
- Coverage-gated test matrix: `ci.yml` runs pytest on Python 3.11-3.13 and fails the build if line coverage drops below the threshold (currently 97% covered).
- CodeQL: static analysis (`security-extended`) runs on every push, PR, and weekly; findings surface in the Security tab.
- Pinned supply chain: every GitHub Action is pinned to a full commit SHA; Dependabot keeps those pins and the Python deps current.
- Branch protection: `main` requires the CI and CodeQL checks to pass before a merge.
- Disclosure policy: see SECURITY.md.
See CONTRIBUTING.md.
Built by Jeff Otterson (Jott2121). agent-gate operationalizes the gating discipline from bow (an autonomous all-Claude chief-of-staff agent) and the Fleet Mode doctrine. Siblings in the same line: rag-guard and agent-cost-attribution. MIT licensed.

