fix(events): JSONL canonical, SQLite projection, reconcile-on-init, doctor --reconcile #164

Open
Gradata wants to merge 1 commit into pr/cleanup-and-tests-2026-05-02 from pr/dualwrite-atomicity-2026-05-02

@Gradata Gradata commented May 2, 2026

Summary

Fixes the council's #1 production blocker. The dual-write to events.jsonl and system.db had no two-phase commit, so a crash mid-write could leave the brain in a silent split-brain state with no recovery path.

This PR makes JSONL the canonical source of truth, SQLite the idempotent projection, and adds reconciliation on every Brain() open plus a gradata doctor --reconcile operator escape hatch.
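
For reviewers who want the ordering spelled out, here is a minimal sketch of the JSONL-first write path. It is not the code in _events.py: the function names (`append_event`, `project_event`, `open_projection`), the `events(id, body)` schema, and the UUID event id are all assumptions made for illustration.

```python
import json
import os
import sqlite3
import uuid
from pathlib import Path


def open_projection(db_path: str) -> sqlite3.Connection:
    """Open SQLite and create the (hypothetical) projection table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (id TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def project_event(conn: sqlite3.Connection, event_id: str, line: str) -> None:
    """Idempotent projection: re-inserting the same id is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO events (id, body) VALUES (?, ?)",
        (event_id, line),
    )
    conn.commit()


def append_event(jsonl_path: Path, conn: sqlite3.Connection, payload: dict) -> str:
    """Append to JSONL and fsync first, then project the row into SQLite."""
    event_id = str(uuid.uuid4())  # assumed id scheme, not taken from the PR
    line = json.dumps({"id": event_id, **payload}, sort_keys=True)

    # Step 1: durable append to the canonical log.
    with open(jsonl_path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
        f.flush()
        os.fsync(f.fileno())

    # Step 2: projection. A crash before this point leaves the event in
    # JSONL only, which a later reconcile pass can replay.
    project_event(conn, event_id, line)
    return event_id
```

Because the projection is keyed on the event id, replaying a line that already landed in SQLite changes nothing, which is what makes reconcile safe to run on every open.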

Stacks on #163 — please merge that one first.

Changes

  • _events.py: JSONL append + fsync FIRST, SQLite INSERT is the projection. New reconcile_jsonl_to_sqlite() replays missing rows.
  • brain.py: Brain.__init__ runs JSONL → SQLite reconcile after migrations. New public observe(text, kind=) API.
  • cli.py + _doctor.py: new gradata doctor --reconcile reports drift count and heals it.
  • tests/test_dualwrite_atomicity.py: 6 path-agnostic public-API tests covering happy path, kill-9 mid-batch, reconcile replay, idempotency, doctor CLI, and concurrent writers.

Test plan

  • pytest tests/test_dualwrite_atomicity.py: 6 passed.
  • Full focused regression on changed surface: 42 passed.
  • Non-integration suite (excluding socket-bound daemon/plugin tests blocked by sandbox): 4130 passed, 4 skipped.
  • pyright src/: 0 errors, 27 warnings (unchanged baseline).

Layering check

_events.py is Layer 0; Brain.__init__ (Layer 2) calls into it. No upward imports introduced.

Risk

  • Reconcile-on-init runs on every Brain open. ~50-200ms on a 100k-event brain at first reopen; watermark is incremental so subsequent opens are O(drift), not O(total).
  • Concurrent writers serialize via JSONL append + advisory lock — throughput trade-off accepted for correctness.
  • Property: jsonl_count >= sqlite_count is now an invariant. Reverse drift is impossible.
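
To illustrate why subsequent opens can be O(drift), here is one possible shape for a watermarked replay. This is a sketch under assumptions, not the body of reconcile_jsonl_to_sqlite(): it assumes the watermark is a byte offset into events.jsonl stored in a hypothetical `meta` table, and that each event carries an `id` field.

```python
import json
import sqlite3
from pathlib import Path


def reconcile(jsonl_path: Path, conn: sqlite3.Connection) -> int:
    """Replay JSONL lines past the stored byte offset; return rows inserted."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (id TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT value FROM meta WHERE key = 'watermark'").fetchone()
    offset = row[0] if row else 0

    replayed = 0
    if not jsonl_path.exists():
        return replayed

    with open(jsonl_path, "rb") as f:
        f.seek(offset)  # skip everything already projected
        for raw in f:
            if not raw.endswith(b"\n"):
                # Torn trailing line from an interrupted append: stop here
                # and leave the watermark in front of it.
                break
            event = json.loads(raw)
            cur = conn.execute(
                "INSERT OR IGNORE INTO events (id, body) VALUES (?, ?)",
                (event["id"], raw.decode("utf-8").rstrip("\n")),
            )
            replayed += cur.rowcount  # 1 if inserted, 0 if already present
            offset += len(raw)

    conn.execute(
        "INSERT OR REPLACE INTO meta (key, value) VALUES ('watermark', ?)",
        (offset,),
    )
    conn.commit()
    return replayed
```

In this sketch a second call with no new JSONL lines reads nothing past the watermark and inserts nothing, and rows only ever flow from JSONL into SQLite, which is the direction the invariant relies on.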

Council references

  • council_2026-05-02T11-59-00.md (v4 RISK class, all 7 lenses through fallback chain)
  • council_2026-05-02T12-24-08.md (PR sequencing — split + tests-as-spec first)

…octor --reconcile

Council v4 (council_2026-05-02T11-59-00.md) ranked dual-write atomicity
the #1 production blocker. Crash mid-write between events.jsonl append
and SQLite INSERT could leave the brain in silent split-brain state
with no recovery path.

What
- src/gradata/_events.py
  - JSONL is the canonical source of truth. Append + fsync FIRST,
    SQLite INSERT is now an idempotent projection derived from JSONL.
  - Added reconcile_jsonl_to_sqlite() that scans JSONL past the
    SQLite watermark and replays missing rows.
  - Single SQLite projection helper used by both the live write path
    and the retain orchestrator.
  - Env-gated crash-window delay for deterministic kill-9 testing
    only (no production effect).
- src/gradata/brain.py
  - Brain.__init__ runs JSONL → SQLite reconciliation after migrations.
  - Brain() resolves BRAIN_DIR / cwd when no explicit path is supplied.
  - observe(text, kind="correction") public event API used by the
    PR2 spec.
- src/gradata/cli.py + src/gradata/_doctor.py
  - New `gradata doctor --reconcile`: scans for drift, reports the
    count, replays missing JSONL rows into SQLite, exits non-zero on
    inconsistency that can't be healed.
- tests/test_dualwrite_atomicity.py
  - Path-agnostic public-API tests covering: happy path, kill-9 mid
    batch (JSONL must lead SQLite, never trail), reconcile replay,
    idempotency, doctor CLI drift report, concurrent-writer JSONL
    line integrity.
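
A rough sketch of the doctor flow described above, for readers who want to see the drift count and exit code together. The names `count_drift` and `doctor_reconcile` and the `events(id, body)` table are hypothetical, and treating "SQLite rows with no JSONL source" as the unhealable case is an assumption, since the commit does not define it.

```python
import json
import sqlite3
from pathlib import Path


def count_drift(jsonl_path: Path, conn: sqlite3.Connection) -> tuple[int, int]:
    """Return (missing_in_sqlite, orphaned_in_sqlite) by comparing event ids."""
    jsonl_ids = set()
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                jsonl_ids.add(json.loads(line)["id"])
    sqlite_ids = {row[0] for row in conn.execute("SELECT id FROM events")}
    return len(jsonl_ids - sqlite_ids), len(sqlite_ids - jsonl_ids)


def doctor_reconcile(jsonl_path: Path, conn: sqlite3.Connection) -> int:
    """Report drift, replay what JSONL can supply, and return an exit code."""
    missing, orphaned = count_drift(jsonl_path, conn)
    print(f"drift: {missing} event(s) in JSONL but not in SQLite")

    # Heal forward drift by replaying every JSONL line idempotently.
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                conn.execute(
                    "INSERT OR IGNORE INTO events (id, body) VALUES (?, ?)",
                    (json.loads(line)["id"], line.strip()),
                )
    conn.commit()

    if orphaned:
        # Assumed unhealable case: replay cannot explain these rows.
        print(f"unhealable: {orphaned} SQLite row(s) have no JSONL source")
        return 1
    return 0
```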

Why
- Before: dual-write claimed atomic in CLAUDE.md, no two-phase commit,
  no recovery. Crash → silent data loss or duplicate-on-replay.
- After: JSONL is the log, SQLite is the projection. Every reopen
  reconciles. doctor --reconcile is the operator escape hatch.
  Property: jsonl_count >= sqlite_count, always.

Test plan
- pytest tests/test_dualwrite_atomicity.py — 6 passed.
- Full focused regression on changed surface — 42 passed.
- Non-integration suite (excluding socket-bound daemon/plugin tests
  blocked by sandbox) — 4130 passed, 4 skipped.
- pyright src/ — 0 errors, 27 warnings (unchanged baseline).

Layering check
- _events.py is Layer 0. Brain.__init__ in Layer 2 calls into it.
  No upward imports introduced.

Risk
- Reconcile-on-init runs on every Brain open. For a brain with
  100k events this adds ~50ms-200ms one-time at startup. Watermark
  is incremental so subsequent opens are O(drift) not O(total).
- Concurrent writers serialize via JSONL append + advisory lock.
  Throughput trade-off is acceptable for correctness.
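
As an illustration of serialized appends, here is a minimal POSIX-only sketch using `fcntl.flock`. The commit does not say which advisory-lock mechanism is used, so `flock` and the helper name `locked_append` are assumptions.

```python
import fcntl
import json
import os
from pathlib import Path


def locked_append(jsonl_path: Path, payload: dict) -> None:
    """Append one JSON line while holding an exclusive advisory lock."""
    line = json.dumps(payload, sort_keys=True) + "\n"
    with open(jsonl_path, "a", encoding="utf-8") as f:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)
        try:
            f.write(line)
            f.flush()
            os.fsync(f.fileno())
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)
```

Each writer waits for the lock and then pays for its own fsync, which is where the throughput cost comes from.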

Council references
- council_2026-05-02T11-59-00.md (v4 RISK class, all 7 lenses)
- council_2026-05-02T12-24-08.md (PR sequencing — TDD-first)

Stacks on #163.

coderabbitai Bot commented May 2, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.