11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,17 @@ This project follows [Semantic Versioning](https://semver.org/). From **v1.0.0**

## Unreleased

### Added

- **`flightdeck pricing check`** — reports **`flightdeck-bundled-*`** snapshot age vs **`--max-age-days`** (default **90**); **`--fail`** for CI. **`release diff`** / **`POST /v1/diff`** append **`pricing.warnings`** when bundled snapshots exceed the same age threshold.
- **`flightdeck.integrations.telemetry.configure_otel_tracing()`** — optional OTLP HTTP **`TracerProvider`** wiring when the **`telemetry`** extra is installed (see **`docs/sdk-integrations.md`**).
- **SDK:** **`flightdeck.sdk.http_common`** shared serializers and retry policy; parity tests keep sync/async clients aligned. **`pytest-cov`** no longer omits **`sdk/client.py`**.

### Changed

- **`[project.optional-dependencies] dev`:** **`ruff`** is **`>=0.15,<0.16`** (was an exact patch pin) so **`pip install`** / shared venvs can resolve alongside other tools; **`uv sync --frozen`** still follows **`uv.lock`**. **`docs/troubleshooting.md`** notes checking **`uv.lock`** for the resolved **`0.15.x`** wheel.
- **Docs / positioning:** README local-first and ICP copy; bundled pricing cadence, vendor pricing URLs in YAML comments, and **`docs/pricing-catalog.md`** / **ROADMAP** / **RELEASE_NOTES** staleness commitments.

## 1.2.0 - 2026-05-03

### Breaking
10 changes: 5 additions & 5 deletions README.md
@@ -2,7 +2,7 @@

**Ship AI agents safely with release diffs, runtime evidence, and policy gates.**

FlightDeck is **local-first** (CLI + SQLite + optional **`flightdeck serve`** UI): run evidence, pricing tables, and the ledger **stay on disk in your environment** by default—**no trace or billing payload is sent to FlightDeck as a vendor**. That posture matters for **regulated**, **air-gapped**, and **data-sovereignty** teams that cannot ship telemetry to a third-party SaaS observability backend. It is not an agent framework, prompt IDE, tracing dashboard, or gateway — it is where **what shipped**, **what ran**, **what it cost**, and **whether promote is allowed** are recorded and compared.
FlightDeck is **local-first** (CLI + SQLite + optional **`flightdeck serve`** UI): run evidence, pricing tables, and the ledger **stay on disk in your environment** by default—**no data leaves your infrastructure** for FlightDeck’s own product telemetry (there is no vendor backend that ingests your runs or tariffs). **No trace or billing payload is sent to FlightDeck as a vendor.** That posture matters for **regulated**, **air-gapped**, and **data-sovereignty** teams that cannot ship telemetry to a third-party SaaS observability backend. It is not an agent framework, prompt IDE, tracing dashboard, or gateway — it is where **what shipped**, **what ran**, **what it cost**, and **whether promote is allowed** are recorded and compared.

## In ~20 seconds

@@ -17,10 +17,10 @@ You ship a candidate whose **system prompt drifts by a handful of tokens**; unde

## Who should use this?

- **Primary buyer / ICP:** **Platform or ML engineering teams** (often **5–30** people) at **growth-stage** companies shipping **two or more** **LLM agents** to production—especially teams that already had a **cost** or **regression** incident from a **prompt** or **model** change and need a **governed** promote path.
- **Growth-stage ICP:** **Platform or ML engineering teams** (often **5–30** people) at companies shipping **two or more** **LLM agents** to production—especially after a **cost** or **regression** incident from a **prompt** or **model** change—who need a **fast, governed** promote path without standing up a hosted observability product.
- **Regulated / enterprise ICP:** **Platform and SRE teams** in **healthcare, fintech, and similar** environments where buying criteria center on **data residency**, **audit trails**, and **provable control** over evidence and pricing data—**local-first** defaults and optional self-hosted **`flightdeck serve`** match a compliance-led evaluation, not only a velocity-led one.
- Teams that **version agent builds** (prompts, tools, model pins) and need a **durable audit trail**.
- Engineers who want **one command** to answer “is this candidate safe to roll forward?” with **numbers**, not gut feel.
- **Healthcare, fintech, and enterprise** operators who **cannot** default to sending traces or cost data to a **hosted** observability vendor—**local-first** evidence and pricing imports are the default integration model.
- Anyone who has outgrown **ad hoc** folder diffs or **spreadsheet** promote checklists.

## How FlightDeck fits your stack
@@ -83,7 +83,7 @@ Not implemented yet:
- hosted control plane
- automated traffic routing
- tool-cost pricing
- OpenTelemetry import/export mapping (optional **`uv sync --extra telemetry`** or **`pip install 'flightdeck-ai[telemetry]'`** for future work)
- OpenTelemetry: optional **`telemetry`** extra installs OTLP-capable SDK packages; call **`flightdeck.integrations.telemetry.configure_otel_tracing()`** once to wire an OTLP span exporter to **your** collector (see **[docs/sdk-integrations.md](docs/sdk-integrations.md)**)

Shipped locally:

@@ -128,7 +128,7 @@ Or use the bash wrapper (Git Bash / WSL on Windows):
./scripts/smoke.sh
```

**Bundled pricing (default `init`):** **`flightdeck init`** migrates the ledger, imports **OpenAI**, **Anthropic**, and **Google** (Gemini-class) tables at **`pricing_version` `flightdeck-bundled-2026-05`**, and writes **`.flightdeck/pricing-catalog.yaml`** with **`pricing_catalog_path`** set in **`flightdeck.yaml`**. In **`release.yaml`**, set **`spec.pricing_reference`** to `{ provider: openai | anthropic | google, pricing_version: flightdeck-bundled-2026-05 }` to get **per-table** and **catalog** cost lines on diffs without authoring YAML. These rates are a **convenience snapshot**, not live vendor billing—**`flightdeck pricing import`** your own files for production. Use **`flightdeck init --no-bundled-pricing`** for an empty ledger.
**Bundled pricing (default `init`):** **`flightdeck init`** migrates the ledger, imports **OpenAI**, **Anthropic**, and **Google** (Gemini-class) tables at **`pricing_version` `flightdeck-bundled-2026-05`**, and writes **`.flightdeck/pricing-catalog.yaml`** with **`pricing_catalog_path`** set in **`flightdeck.yaml`**. In **`release.yaml`**, set **`spec.pricing_reference`** to `{ provider: openai | anthropic | google, pricing_version: flightdeck-bundled-2026-05 }` to get **per-table** and **catalog** cost lines on diffs without authoring YAML. These rates are a **convenience snapshot**, not live vendor billing—**`flightdeck pricing import`** your own files for production. Use **`flightdeck init --no-bundled-pricing`** for an empty ledger. Official list-pricing URLs are referenced in comments atop the bundled YAML under **`src/flightdeck/bundled_pricing/`**. **`flightdeck pricing check`** flags bundled snapshots older than **90 days** (use **`--fail`** in CI); **`release diff`** adds **`pricing.warnings`** for the same condition so cost lines do not go silently stale. **Release policy:** bundled tables are **refreshed with each minor release** when vendor public list pricing changes materially (see **[ROADMAP.md](ROADMAP.md)**).

Or walk through the **full quickstart** (policy + **two** custom tariffs for the **~31%** narrative—same flow CI runs):

6 changes: 6 additions & 0 deletions RELEASE_NOTES.md
@@ -4,6 +4,12 @@ High-level notes for **shipping FlightDeck**. Detailed history: **[CHANGELOG.md]

Narrative docs (including the CLI reference) are maintained on **[github.com/flightdeckdev/flightdeck](https://github.com/flightdeckdev/flightdeck)** `main`; this file and **`schemas/`** ship in minimal clones.

## Unreleased (in development)

- **Bundled pricing hygiene:** **`flightdeck pricing check`** reports **`flightdeck-bundled-*`** snapshot age vs **`--max-age-days`** (default **90**); **`--fail`** exits non-zero for CI. **`release diff`** / **`POST /v1/diff`** append **`pricing.warnings`** for the same staleness rule so cost signals do not go silently wrong. Bundled YAML gains vendor **official pricing** URL comments; docs and **ROADMAP** state a **minor-release refresh** cadence for the bundled snapshot when list prices move materially.
- **Contributor tooling:** **`[project.optional-dependencies] dev`** uses **`ruff>=0.15,<0.16`** (see **`CHANGELOG.md`**).
- **Telemetry extra:** optional **`flightdeck.integrations.telemetry.configure_otel_tracing()`** wires OTLP span export to **your** backend (see **`docs/sdk-integrations.md`**).

## v1.2.0 — Python 3.11+, protected ingest and reads, bundled pricing, Postgres, integrations

Minor release (see **[CHANGELOG.md](CHANGELOG.md)** for the full list).
1 change: 1 addition & 0 deletions ROADMAP.md
@@ -18,6 +18,7 @@ This document is **strategy and ordering**, not a second changelog. It goes from
- **Evidence ingestion:** `runs ingest` from JSONL/JSON arrays plus stable `POST /v1/events` (`schemas/v1/`); **`GET /v1/runs`**, **`runs list`**, optional **`trace_id`** filter, and **`runs export`** (JSONL) for operator forensics.
- **Local API + UI:** `flightdeck serve` routes and shipped web bundle under `src/flightdeck/server/static/`; surfaces summarized in **Web UI and operator experience** below.
- **SDK and tooling:** Python sync/async clients with retries/batching and `flightdeck-quickstart-verify`.
- **Bundled default pricing:** convenience **`flightdeck-bundled-YYYY-MM`** tables from **`flightdeck init`**; **refreshed on each minor release** when upstream public list pricing changes materially, with **`flightdeck pricing check`** / diff **`pricing.warnings`** guarding silent staleness (operators still **`pricing import`** for production truth).
- **Operator references:** CI examples, deploy/Compose guidance, Helm and fleet examples under `examples/`.

---
6 changes: 6 additions & 0 deletions docs/pricing-catalog.md
@@ -22,6 +22,12 @@ for the side you want priced. For **Gemini-class** models, use **`provider: goog
release runtime and pricing reference. For production accuracy, **`flightdeck pricing import`**
your own YAML (and optionally **`--replace`** with **`--reason`**).

Bundled table YAML in the wheel includes **comment links** to each provider’s official list-pricing page so you can spot-check rates between FlightDeck releases.

**Staleness guardrails:** list prices change often. Run **`flightdeck pricing check`** to see whether any **`flightdeck-bundled-*`** table in the ledger is older than **`--max-age-days`** (default **90**); pass **`--fail`** for CI. **`flightdeck release diff`** and **`POST /v1/diff`** add **`pricing.warnings`** when baseline or candidate **`pricing_version`** is a stale bundled snapshot so economics do not look authoritative after the snapshot has aged out.

**Maintainer cadence:** the bundled snapshot is **updated on each minor release** when vendor public list pricing changes materially (see **[ROADMAP.md](../ROADMAP.md)**). Operators in production should still treat **`flightdeck pricing import`** as the source of truth.
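
The staleness rule described above can be sketched in a few lines. This is a minimal illustration, assuming (as the helper module in this change does) that a `flightdeck-bundled-YYYY-MM` id is anchored to the first day of its labeled month. The names `snapshot_age_days` and `is_stale` are hypothetical and are not FlightDeck's API.

```python
from __future__ import annotations

import re
from datetime import date

# Pattern and default threshold mirror the values described in this change.
BUNDLED_RE = re.compile(r"^flightdeck-bundled-(\d{4})-(\d{2})$")
MAX_AGE_DAYS = 90


def snapshot_age_days(pricing_version: str, today: date) -> int | None:
    """Days since the first day of the labeled month, or None for non-bundled ids."""
    match = BUNDLED_RE.match(pricing_version.strip())
    if not match:
        return None
    year, month = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        return None
    return (today - date(year, month, 1)).days


def is_stale(pricing_version: str, today: date, max_age_days: int = MAX_AGE_DAYS) -> bool:
    """A snapshot is stale only when its age strictly exceeds the threshold."""
    age = snapshot_age_days(pricing_version, today)
    return age is not None and age > max_age_days


print(snapshot_age_days("flightdeck-bundled-2026-05", date(2026, 7, 30)))
print(is_stale("flightdeck-bundled-2026-05", date(2026, 7, 31)))
```

With the default threshold, a `2026-05` snapshot is exactly 90 days old on 2026-07-30 and first counts as stale on 2026-07-31. Ids that do not match the bundled pattern are never flagged.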

## Relationship to `pricing.prices`

On a diff, **`pricing.prices`** (when present) reflects **per-side imported tables** for the resolved baseline/candidate
22 changes: 22 additions & 0 deletions docs/sdk-integrations.md
@@ -26,10 +26,32 @@ product surface for orchestration.
| **`integrations-temporal`** | Install **`temporalio`** next to FlightDeck when your worker shares a venv |
| **`integrations-openai-agents`** | **`openai-agents`** for result-shape experiments |
| **`integrations-ci`** | Meta-extra for CI: LangChain + Temporal + OpenAI Agents resolution |
| **`telemetry`** | OpenTelemetry SDK + OTLP exporter packages; wire with **`flightdeck.integrations.telemetry.configure_otel_tracing()`** (see below) |
| **`all`** | Convenience bundle including **`telemetry`** |

There is **no** **`crewai`** extra on the distribution. Use **`crewai_bridge.run_event_from_crew_token_totals`**
with totals you collect from CrewAI (or install **`crewai`** only in your application environment).

## OpenTelemetry (`telemetry` extra)

Install **`flightdeck-ai[telemetry]`** (or **`uv sync --extra telemetry`**), then once per process:

```python
from flightdeck.integrations.telemetry import configure_otel_tracing

configure_otel_tracing()
```

This registers an OpenTelemetry **SDK** `TracerProvider` with an **OTLP HTTP** span exporter and
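batch processor. Set **`OTEL_EXPORTER_OTLP_ENDPOINT`** to your collector's base URL (for example
`http://127.0.0.1:4318`; the standard OTLP HTTP exporter appends `/v1/traces` to it), or set
**`OTEL_EXPORTER_OTLP_TRACES_ENDPOINT`** to the full traces URL, plus optional
**`OTEL_EXPORTER_OTLP_HEADERS`** / **`OTEL_SERVICE_NAME`**, as documented for
**`opentelemetry-exporter-otlp`**. Spans are sent to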
**your** collector, not to FlightDeck as a vendor. A second call is a no-op unless you pass
**`force=True`** (rebinds the provider—use sparingly in tests).

FlightDeck does not auto-instrument **`httpx`** or the Python SDK; create spans in your app or
attach upstream auto-instrumentation if you need request-level traces.

## Trust boundaries

Anyone who can reach **`POST /v1/events`** can append ledger rows. Keep **`flightdeck serve`**
4 changes: 3 additions & 1 deletion docs/sdk.md
@@ -2,7 +2,9 @@

`flightdeck.sdk` is a thin HTTP client for emitting runtime evidence and triggering release
actions against a running `flightdeck serve` instance. It ships with the same SemVer as the
CLI; see [RELEASE_NOTES.md](../RELEASE_NOTES.md) for stability expectations.
CLI; see [RELEASE_NOTES.md](../RELEASE_NOTES.md) for stability expectations. Internally,
**`flightdeck.sdk.http_common`** holds shared URL/header helpers, JSON/query serializers, and
retry loops so **`FlightdeckClient`** (sync) and **`AsyncFlightdeckClient`** stay wire-identical.
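
As an illustration of why sharing serializers and retry policy keeps the two clients aligned, here is a rough sketch of the pattern. Every name in it (`encode_body`, `backoff_delays`, `send_sync`, `send_async`, the `transport` callables) is hypothetical and chosen for this example; it does not show the actual contents of `flightdeck.sdk.http_common`.

```python
from __future__ import annotations

import asyncio
import json
import time
from typing import Awaitable, Callable, Iterator


def encode_body(payload: dict) -> bytes:
    """Serialize once, deterministically, so both clients emit the same bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def backoff_delays(attempts: int, base: float = 0.1) -> Iterator[float]:
    """Delay before each retry: base, 2*base, 4*base, ... (attempts - 1 values)."""
    for i in range(attempts - 1):
        yield base * (2**i)


def send_sync(
    transport: Callable[[bytes], int], payload: dict, attempts: int = 3, base: float = 0.1
) -> int:
    """Sync path: retry on 5xx using the shared delay schedule."""
    body = encode_body(payload)
    status = transport(body)
    for delay in backoff_delays(attempts, base):
        if status < 500:
            break
        time.sleep(delay)
        status = transport(body)
    return status


async def send_async(
    transport: Callable[[bytes], Awaitable[int]],
    payload: dict,
    attempts: int = 3,
    base: float = 0.1,
) -> int:
    """Async path: same body and schedule; only the waiting primitive differs."""
    body = encode_body(payload)
    status = await transport(body)
    for delay in backoff_delays(attempts, base):
        if status < 500:
            break
        await asyncio.sleep(delay)
        status = await transport(body)
    return status
```

Because the bytes and the delay schedule come from the same helpers, a parity test only needs to record what each transport received and compare the two recordings.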

For most workflows the CLI is sufficient. Use the SDK when you need to:

4 changes: 2 additions & 2 deletions docs/troubleshooting.md
@@ -68,8 +68,8 @@ Set `FLIGHTDECK_USE_SYSTEM_TEMP=1` to force pytest to use the OS default path ag
Run `uv run python -m ruff check --fix src tests` to apply auto-fixable issues. For
remaining errors, read the rule code (e.g. `E501`, `F401`) in the output and fix manually.

Check what ruff version CI uses: `uv run python -m ruff --version` (must match `ruff==0.15.12`
pinned in `pyproject.toml [project.optional-dependencies] dev`).
Check what ruff version CI uses: `uv run python -m ruff --version` (CI resolves **`ruff>=0.15,<0.16`**
from `pyproject.toml [project.optional-dependencies] dev`; see **`uv.lock`** for the exact wheel).
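
To confirm that a locally resolved ruff falls inside the `>=0.15,<0.16` range, one option is to compare the major/minor pair reported by `ruff --version`. The sketch below is stdlib-only and assumes the output contains a plain `X.Y.Z` token such as `ruff 0.15.12`; the helper name `ruff_in_dev_range` is hypothetical.

```python
def ruff_in_dev_range(version_output: str) -> bool:
    """True if the reported ruff version satisfies >=0.15,<0.16."""
    # Pick the first token that starts with a digit, e.g. "0.15.12" from "ruff 0.15.12".
    version = next(tok for tok in version_output.split() if tok[0].isdigit())
    major, minor = (int(part) for part in version.split(".")[:2])
    # For plain X.Y.Z versions this matches the specifier >=0.15,<0.16.
    return (0, 15) <= (major, minor) < (0, 16)


print(ruff_in_dev_range("ruff 0.15.12"))
print(ruff_in_dev_range("ruff 0.16.0"))
```

For anything beyond plain release versions (pre-releases, local builds), a proper version-specifier parser is the safer choice.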

---

4 changes: 1 addition & 3 deletions pyproject.toml
@@ -51,7 +51,7 @@ all = [
dev = [
"pytest>=7.0",
"pytest-cov>=4.0",
"ruff==0.15.12",
"ruff>=0.15,<0.16",
]
postgres = [
"psycopg[binary]>=3.2",
@@ -98,8 +98,6 @@ source = ["src/flightdeck"]
omit = [
"src/flightdeck/quickstart_smoke.py",
"src/flightdeck/integrations/*",
# Thin HTTP wrapper; core policy/diff/storage coverage is the governance bar.
"src/flightdeck/sdk/client.py",
]

[tool.coverage.report]
1 change: 1 addition & 0 deletions src/flightdeck/bundled_pricing/anthropic.yaml
@@ -1,6 +1,7 @@
# FlightDeck bundled snapshot — illustrative public list prices, not live vendor APIs.
# For production accuracy, run: flightdeck pricing import <your_authoritative.yaml>
# Snapshot id: flightdeck-bundled-2026-05 (see README / docs/pricing-catalog.md).
# Official vendor list pricing (verify before trusting this file): https://www.anthropic.com/pricing
provider: anthropic
pricing_version: flightdeck-bundled-2026-05
entries:
2 changes: 2 additions & 0 deletions src/flightdeck/bundled_pricing/catalog.yaml
@@ -1,5 +1,7 @@
# Bundled PricingCatalog — maps bundled pricing tables to one comparable slot per tier.
# api_version / kind per schemas/v1/pricing_catalog.schema.json
# Compare bundled rows with vendor pages: https://openai.com/api/pricing/
# https://www.anthropic.com/pricing https://ai.google.dev/gemini-api/docs/pricing
api_version: v1
kind: PricingCatalog
catalog_version: flightdeck-bundled-2026-05
1 change: 1 addition & 0 deletions src/flightdeck/bundled_pricing/google.yaml
@@ -2,6 +2,7 @@
# Provider key "google" is the supported convention for Gemini-class models in release.yaml.
# For production accuracy, run: flightdeck pricing import <your_authoritative.yaml>
# Snapshot id: flightdeck-bundled-2026-05 (see README / docs/pricing-catalog.md).
# Official vendor list pricing (verify before trusting this file): https://ai.google.dev/gemini-api/docs/pricing
provider: google
pricing_version: flightdeck-bundled-2026-05
entries:
1 change: 1 addition & 0 deletions src/flightdeck/bundled_pricing/openai.yaml
@@ -1,6 +1,7 @@
# FlightDeck bundled snapshot — illustrative public list prices, not live vendor APIs.
# For production accuracy, run: flightdeck pricing import <your_authoritative.yaml>
# Snapshot id: flightdeck-bundled-2026-05 (see README / docs/pricing-catalog.md).
# Official vendor list pricing (verify before trusting this file): https://openai.com/api/pricing/
provider: openai
pricing_version: flightdeck-bundled-2026-05
entries:
65 changes: 65 additions & 0 deletions src/flightdeck/bundled_pricing_age.py
@@ -0,0 +1,65 @@
"""Age and staleness helpers for ``flightdeck-bundled-YYYY-MM`` pricing snapshots."""

from __future__ import annotations

import re
from datetime import date, datetime, timezone

# Bundled snapshot ids use the first day of the labeled month as the freshness anchor.
BUNDLED_PRICING_VERSION_RE = re.compile(r"^flightdeck-bundled-(\d{4})-(\d{2})$")

# Default max age before CLI / diff warn (days since anchor).
DEFAULT_BUNDLED_PRICING_MAX_AGE_DAYS = 90


def pricing_stale_check_date() -> date:
    """UTC date used for staleness checks (patch in tests)."""
    return datetime.now(timezone.utc).date()


def is_flightdeck_bundled_pricing_version(pricing_version: str) -> bool:
    return bundled_pricing_anchor_date(pricing_version) is not None


def bundled_pricing_anchor_date(pricing_version: str) -> date | None:
    m = BUNDLED_PRICING_VERSION_RE.match(pricing_version.strip())
    if not m:
        return None
    year, month = int(m.group(1)), int(m.group(2))
    if not (1 <= month <= 12):
        return None
    return date(year, month, 1)


def bundled_pricing_age_days(pricing_version: str, *, today: date) -> int | None:
    anchor = bundled_pricing_anchor_date(pricing_version)
    if anchor is None:
        return None
    return (today - anchor).days


def bundled_pricing_stale_warning(
    pricing_version: str,
    *,
    today: date | None = None,
    max_age_days: int = DEFAULT_BUNDLED_PRICING_MAX_AGE_DAYS,
    role: str | None = None,
) -> str | None:
    """
    Return a human-readable warning if this bundled snapshot is older than ``max_age_days``.

    ``role`` is optional ("baseline" / "candidate") for diff copy.
    """
    anchor = bundled_pricing_anchor_date(pricing_version)
    if anchor is None:
        return None
    day = today if today is not None else pricing_stale_check_date()
    age = (day - anchor).days
    if age <= max_age_days:
        return None
    prefix = f"{role} " if role else ""
    return (
        f"{prefix}pricing_version {pricing_version!r} is a FlightDeck bundled snapshot from "
        f"{anchor.isoformat()} (~{age} days old). List prices drift; run `flightdeck pricing import` "
        f"with authoritative YAML or upgrade to a newer `flightdeck-ai` minor for refreshed bundled tables."
    )