Domain-agnostic eval framework for AI applications. Measure retrieval quality, generation accuracy, and policy compliance across any vertical.
```bash
pip install synapt-eval
```

Or install from source:

```bash
pip install git+https://github.com/synapt-dev/eval.git
```

```python
import asyncio

from synapt_eval import Fixture, EvalResult, CategoryMetrics
from synapt_eval.adapters import RetrievalAdapter, RetrievalCandidate
from synapt_eval.scoring import precision_at_k, recall_at_k
from synapt_eval.report_card import compose_report_card, generate_markdown
class MyRetrieval(RetrievalAdapter):
    async def retrieve(self, query: str, k: int = 10) -> list[RetrievalCandidate]:
        # Connect your vector store here and return scored candidates
        return [RetrievalCandidate(id="doc1", score=0.95)]

# The adapter is async, so drive it with asyncio
candidates = asyncio.run(MyRetrieval().retrieve("example query", k=5))

# Run eval and generate report
results = [EvalResult(
    category="retrieval",
    metrics=CategoryMetrics(p_at_5=0.85, r_at_10=0.72, n=50),
)]
card = compose_report_card(results, run_id="my-first-eval")
print(generate_markdown(card))
```

See docs/quickstart.md for a complete walkthrough and examples/ for runnable code.
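The scoring helpers imported in the quickstart can also be called directly on ranked candidate lists. A minimal sketch; the argument shapes here are assumptions, so check `synapt_eval.scoring` for the actual signatures:

```python
from synapt_eval.scoring import precision_at_k, recall_at_k

# Assumed call shape: ids in rank order vs. a set of relevant ids
retrieved = ["doc1", "doc7", "doc3", "doc9", "doc2"]
relevant = {"doc1", "doc2", "doc4"}

p5 = precision_at_k(retrieved, relevant, k=5)  # 2 hits in the top 5 -> 0.4
r5 = recall_at_k(retrieved, relevant, k=5)     # 2 of 3 relevant found -> ~0.67
```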
synapt-eval separates the eval framework (scoring, review, reporting) from domain-specific adapters (your retrieval backend, your generation pipeline, your fixtures).
| Layer | Module | Purpose |
|---|---|---|
| Types | `synapt_eval.types` | Core data types (`Fixture`, `EvalResult`, `CategoryMetrics`) |
| Scoring | `synapt_eval.scoring` | Precision@K, Recall@K, Kendall's Tau |
| Adapters | `synapt_eval.adapters` | Customer-facing ABCs (Retrieval, Generation, Judge, Fixture); see the sketch below |
| Runner | `synapt_eval.runner` | Eval execution, orchestration, PR gate |
| Reviewer | `synapt_eval.reviewer` | Verdict framework, predicate chains, LLM judge bridge |
| Suggestion Engine | `synapt_eval.suggestion_engine` | Rule-based actionable recommendations |
| Report Card | `synapt_eval.report_card` | Markdown + JSON report generation |
| Trending | `synapt_eval.trending` | Self-hosted JSON history store + delta computation |
| CLI | `synapt_eval.cli` | Command-line viewer (`synapt-eval trending`) |
| Actions | `synapt_eval.actions` | GitHub Actions PR-gate adapter |
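Generation pipelines plug in the same way as retrieval. A minimal sketch, assuming the generation ABC is named `GenerationAdapter` and exposes a single async `generate` method (the Adapter API guide documents the real signature):

```python
from synapt_eval.adapters import GenerationAdapter  # class name assumed from the ABC list above

class MyGeneration(GenerationAdapter):
    async def generate(self, prompt: str) -> str:
        # Call your LLM or generation pipeline here
        return "stubbed answer"
```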
| Feature | Description |
|---|---|
| Scoring primitives | Precision@K, Recall@K, Kendall's Tau rank correlation |
| Adapter pattern | Plug in any retrieval/generation backend via ABCs |
| Reviewer SDK | Composable predicate chains + LLM judge integration |
| Suggestion engine | 10 baseline rules with a decorator pattern for custom rules (sketched below) |
| Report card | Markdown + JSON output with schema versioning |
| PR gate | Regression detection with configurable thresholds |
| Trending | Self-hosted history store with CLI viewer |
| GitHub Action | `uses: synapt-dev/eval@v0.1.0` for CI integration |
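A custom suggestion rule might look like the following. This is a sketch only: the `rule` decorator name and the rule signature below are assumptions, not the documented API; see the Suggestions guide for the real one.

```python
from synapt_eval.suggestion_engine import rule  # hypothetical decorator name

@rule(category="retrieval")
def flag_low_recall(metrics):
    # Assumed contract: return a recommendation string to fire, None to stay silent
    if metrics.r_at_10 is not None and metrics.r_at_10 < 0.5:
        return "Recall@10 below 0.5: consider widening the candidate pool"
    return None
```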
Add eval gating to your PR workflow:
```yaml
- name: Run eval
  run: python my_eval_script.py --output results.json

- name: PR Gate
  uses: synapt-dev/eval@v0.1.0
  with:
    results-path: results.json
    baseline-path: baseline.json
    threshold: "0.05"
    fail-on: error
```

The action posts a report card comment on the PR and fails the workflow on regressions. See docs/pr-gate.md for full configuration.
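The results file consumed by the gate could be produced along these lines. A sketch only: the `to_dict()` serialization hook and the on-disk schema are assumptions; docs/pr-gate.md documents the actual format.

```python
# my_eval_script.py (sketch of the "Run eval" step above)
import argparse
import json

from synapt_eval import EvalResult, CategoryMetrics
from synapt_eval.report_card import compose_report_card

parser = argparse.ArgumentParser()
parser.add_argument("--output", required=True)
args = parser.parse_args()

results = [EvalResult(
    category="retrieval",
    metrics=CategoryMetrics(p_at_5=0.85, r_at_10=0.72, n=50),
)]
card = compose_report_card(results, run_id="pr-gate-run")

# Assumption: the report card exposes a to_dict()-style hook for the JSON
# side of "Markdown + JSON report generation"; swap in the real emitter.
with open(args.output, "w") as f:
    json.dump(card.to_dict(), f)
```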
```bash
# View eval trending history
synapt-eval trending --path .synapt-eval/history --format text

# Output as markdown or JSON
synapt-eval trending --format markdown
synapt-eval trending --format json --limit 5
```

| Guide | Description |
|---|---|
| Quickstart | End-to-end retrieval eval in 60 lines |
| Adapter API | Writing custom adapters |
| Reviewer Framework | Custom reviewers + judge integration |
| PR Gate | GitHub Actions CI integration |
| Suggestions | Writing custom suggestion rules |
| Trending | Self-hosted trending CLI |
Runnable examples in `examples/`:

- `retrieval-eval` -- mock retrieval backend + fixtures + report card
- `generation-eval` -- mock generation pipeline + judge
- `full-pipeline` -- combined retrieval + generation + reviewer + suggestions
Want vertical-specific eval packs, a hosted dashboard, or SOC2 attestations? Visit synapt.dev for synapt-eval Pro.
MIT