Skip to content

cost: memoize CostAnalyzer._read_records with mtime invalidation#152

Open
Antawari wants to merge 1 commit into
mainfrom
ishtar/cost-analyzer-memoize-ledger
Open

cost: memoize CostAnalyzer._read_records with mtime invalidation#152
Antawari wants to merge 1 commit into
mainfrom
ishtar/cost-analyzer-memoize-ledger

Conversation

@Antawari
Copy link
Copy Markdown
Contributor

Summary

Every CostAnalyzer public method (cumulative_cost, session_cost, agent_costs, model_costs, all_sessions, all_records) started with self._read_records() — which opened the file, iterated every line, ran json.loads, AND Pydantic-validated each record into either DispatchRecord or PipelineRecord. The default cost_summary callback (bonfire cost with no subcommand) calls both cumulative_cost() AND all_sessions() — two full ledger passes per command. On a long-lived operator's 100k+ line ledger that's a wall-clock pause every invocation.

Fix

src/bonfire/cost/analyzer.py

  • Added per-instance _cache + _cache_mtime_ns attributes.
  • _read_records() now checks the ledger's current mtime against the cached value; on match, returns the cached tuple as-is (identity-equal). On mismatch (or first call), re-reads + repopulates the cache.
  • Missing-file case handled via _MTIME_MISSING = -1 sentinel — distinct from any valid os.stat().st_mtime_ns, so the cache invalidates correctly when the file appears after a prior missing-file read.
  • New _current_ledger_mtime_ns() helper isolates the stat call + missing-file handling in one place.
  • Updated class docstring to acknowledge the memoization contract.

Tests

tests/unit/test_cost_analyzer_memoization.py (new · Knight RED) — 8 tests across 4 classes:

Class Test What it pins
TestReadRecordsMemoization Back-to-back calls return same tuple instance Cache-hit identity (the strongest assertion)
TestReadRecordsMemoization Cache-hit returns identical record content Sanity guard
TestCacheInvalidatesOnMtimeChange Ledger modification triggers re-read Uses os.utime to force mtime advance
TestCacheInvalidatesOnMtimeChange No file change → cache holds Identity check
TestCacheAcrossPublicMethods cumulative_cost + all_sessions share one read Verified via Path.open monkeypatch counter (pre-fix: 2 opens; post-fix: 1)
TestCacheAcrossPublicMethods All 6 public methods consistent across the cache Result-correctness regression guard
TestCacheEmptyLedgerHandling Missing-file → empty result + cached identity Edge case
TestCacheEmptyLedgerHandling File created after first read → cache invalidates Mtime-from-missing-to-real transition

TDD verification

  • Knight RED state confirmed against pre-fix code: 5 fail / 3 pass (the 3 passing tests are correctness regression guards that hold pre-fix because the no-cache path is "always fresh").
  • Warrior GREEN state confirmed after fix: 8/8 pass in 0.81s.
  • Full cost-package regression: 83 passed (test_cost_analyzer.py 894-line suite + cost_cli + cost_consumer + cost_models).
  • ruff check + ruff format --check on changed files: clean.

Out of scope (filed for follow-up PR)

Acceptance criterion #2 — skipping Pydantic per-record validation for read-only aggregations (raw dict access for cumulative_cost / agent_costs / model_costs). The memoization alone delivers the headline "one read per CLI invocation" performance win; the Pydantic-bypass is a secondary optimization with broader code-shape impact (the aggregation methods would need a parallel raw-dict path). Filed for a separate PR.

Test plan

  • pytest tests/unit/test_cost_analyzer_memoization.py — confirm 8/8 green.
  • pytest tests/unit/test_cost_analyzer.py — confirm full existing suite still green.
  • (Optional manual) time bonfire cost against a synthetic 100k-line ledger before/after; expect cost_summary to halve its wall-clock pause.

🤖 Generated with Claude Code

Every CostAnalyzer public method (cumulative_cost, session_cost,
agent_costs, model_costs, all_sessions, all_records) started with
self._read_records(), which opened the file, iterated every line, ran
json.loads, AND Pydantic-validated each record. The default cost_summary
callback (`bonfire cost` with no subcommand) calls cumulative_cost() AND
all_sessions() — two full ledger passes per command. On a long-lived
operator's 100k+ line ledger this is a wall-clock pause every invocation.

## Fix

src/bonfire/cost/analyzer.py:

  - Added per-instance _cache + _cache_mtime_ns attributes.
  - _read_records() now checks the ledger's current mtime against the
    cached value; on match, returns the cached tuple as-is (identity-
    equal). On mismatch (or first call), re-reads + repopulates the cache.
  - Missing-file case handled via _MTIME_MISSING (-1) sentinel — distinct
    from any valid os.stat().st_mtime_ns, so the cache invalidates
    correctly when the file appears after a prior missing-file read.
  - _current_ledger_mtime_ns() helper isolates the stat call + missing-
    file handling in one place.
  - Updated class docstring to acknowledge the memoization contract.

## Tests

tests/unit/test_cost_analyzer_memoization.py (new · Knight RED):

  Eight tests across four classes:
  - TestReadRecordsMemoization (2):
      * Back-to-back calls return the EXACT SAME tuple (identity check —
        the strongest cache-hit assertion).
      * Cache-hit returns identical record content (sanity guard).
  - TestCacheInvalidatesOnMtimeChange (2):
      * Ledger modification triggers cache miss + re-read (uses
        os.utime to force mtime advance for filesystems with second-
        resolution timestamps).
      * No file change → cache holds (identity check).
  - TestCacheAcrossPublicMethods (2):
      * cumulative_cost() + all_sessions() share one underlying read
        (verified via Path.open monkeypatch counter — pre-fix opens=2,
        post-fix opens=1).
      * All six public methods called in sequence return consistent
        results (regression guard).
  - TestCacheEmptyLedgerHandling (2):
      * Missing-file → empty result + cached identity.
      * File created after first read → cache invalidates and second
        read picks up new state.

## Out of scope (acceptance criterion #2 — deferred to follow-up PR)

Skipping Pydantic per-record validation for read-only aggregations (raw
dict access for cumulative_cost / agent_costs / model_costs). The
memoization alone delivers the headline "one read per CLI invocation"
performance win; the Pydantic-bypass is a secondary optimization with
broader code-shape impact (the aggregation methods would need a parallel
raw-dict path). Filed for a separate PR.

## Verification

  pytest tests/unit/test_cost_analyzer_memoization.py
    8 passed (Knight RED → GREEN verified)

  pytest tests/unit/test_cost_analyzer.py + cost_cli + cost_consumer + cost_models
    83 passed (full cost-package regression)

  ruff check + format on changed files: clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant