Skip to content

Refactor Solver/Integrator flow to use dataclasses, combine camera/aligned position estimates into ImuDeadReckoning#429

Merged
brickbots merged 18 commits into
mainfrom
refactor/pointing-estimate-dataclasses
May 30, 2026
Merged

Refactor Solver/Integrator flow to use dataclasses, combine camera/aligned position estimates into ImuDeadReckoning#429
brickbots merged 18 commits into
mainfrom
refactor/pointing-estimate-dataclasses

Conversation

@brickbots
Copy link
Copy Markdown
Owner

@brickbots brickbots commented May 23, 2026

Summary

Reworks the Positioning data path (solver.pyintegrator.py → consumers) off the legacy solved dict and onto typed dataclasses, then sharpens the boundaries between the solver, the integrator, and the IMU dead-reckoner. The change lands in three layers:

Reviewers

There is a lot of stuff here, but the main areas for input are the shape of the dataclasses in PiFinder/types/positioning.py and the flow of data from solver.py->integrator.py.

1. Dataclass migration — kill the solved dict (220a6484)

End-to-end adoption of the dataclasses in PiFinder/types/positioning.py:

  • PointingEstimate replaces the solved dict everywhere. Pointing is modelled as a 2 × 2 matrix — two axes (camera, aligned) × two states (solve, estimate) — reached as pointing.<axis>.<state>.<RA|Dec|Roll>.
  • The integrator owns the long-lived PointingEstimate; its solve cells survive failed plate-solves so IMU dead-reckoning keeps producing aligned estimates.
  • shared_state.solution() / set_solution() carry PointingEstimate and default to an empty instance, so all ~15 consumers (UI, server, pos_server, nearby, …) read through the access shape without None checks.
  • Alignment queues carry typed commands/responses (AlignOnRaDec / AlignCancel / ReloadSqmCalibration, AlignedResult), dispatched via isinstance().
  • Rename solve_pixeltarget_pixel across code, default_config.json, and shared_state. No config-migration logic — users re-align to write the new key.
  • Deletes dead code (solver_main.py, get_initialized_solved_dict(), the legacy dict bridge methods).
  • Adds matched_centroids / matched_stars to PointingEstimate so the SQM calibration UI can replay SQM from cached published solutions.

2. Collapse the two-axis IDR into one dual-axis instance (9a2ee904)

  • ImuDeadReckoning now handles both axes in one instance. solve(camera, aligned, q_x2imu) captures the static q_cam2aligned alongside the drifting q_eq2x; predict() composes the camera prediction with q_cam2aligned and returns both as a (camera, aligned) tuple. reset() clears both.
  • integrator.py drops the idr_camera / idr_aligned pair for a single idr.
  • The IDR stays a pure math primitive: RaDecRoll in, RaDecRoll out — no import from PiFinder.types.positioning.
  • imu_dead_reckoning_legacy.py is retained in-tree as the equivalence reference until we're comfortable removing it in a follow-up.

3. Split the solver→integrator message into a SolveResult DTO (913e5d08)

The migration in (1) had the solver build the same PointingEstimate the integrator owns and publishes — forcing it to set estimate == solve cells and emit a hollow PointingEstimate on failure. This layer separates the wire message from the published aggregate:

  • New SolveResult = SuccessfulSolve | FailedSolve, a typed DTO on solver_queue. The integrator dispatches via isinstance() and is the sole builder/owner of PointingEstimate.
  • SuccessfulSolve carries flat per-axis Pointings (no solve/estimate split — the solver never IMU-progresses) plus a single solve_time; the integrator fans these into both cells and assigns solve_time / cam_solve_time.
  • The inline failed-solve branch is promoted to _apply_failed_solve so tests exercise the real integrator path.
  • Rationale captured in ADR-0003; vocabulary added to positioning/CONTEXT.md.

Also in this layer: silence a spurious invalid value encountered in isnan RuntimeWarningnp.isnan() on the IDR's NaN-quaternion sentinel tripped numpy's FP-invalid flag. Replaced with a per-component math.isnan helper (_quat_has_nan), clearing all 25 unit-test warnings.

Docs (fb92dcd0 + this work)

docs/ax/positioning.md and docs/ax/positioning/CONTEXT.md updated to the new vocabulary; ADR-0003 added. The canonical glossary (CONTEXT.md) and the vocabulary decision (ADR-0001) remain authoritative for terms.

Test plan

  • nox -s lint type_hints smoke_tests unit_tests — all green (214 unit tests pass; 219 unit + smoke), 0 warnings.
  • Run on hardware: confirm dead-reckoning continues across failed solves and the aligned-axis offset is preserved under IMU motion.
  • Run alignment flow: confirm the IDR picks up the alignment offset on the next solve and pointing.aligned.estimate reflects it.

Additional Notes for reviewers

  • imu_dead_reckoning_legacy.py is intentionally retained as the equivalence-test reference; removal is a planned follow-up.
  • A stale handoff.md scratch doc that had slipped into the branch was removed in this PR.

🤖 Generated with Claude Code

brickbots and others added 3 commits May 22, 2026 21:15
End-to-end adoption of the dataclasses in PiFinder/types/positioning.py:

- Solver builds a fresh PointingEstimate per attempt and pushes to
  solver_queue. Alignment queue carries AlignOnRaDec / AlignCancel /
  ReloadSqmCalibration commands and AlignedResult responses, dispatched
  via isinstance().
- Integrator owns the long-lived PointingEstimate. solve cells survive
  failed plate-solves so IMU dead-reckoning keeps producing aligned
  estimates.
- shared_state.solution() / set_solution() carry PointingEstimate;
  default to an empty PointingEstimate() so consumers can always call
  .has_pointing() without a None check.
- All ~15 consumers (UI, server, pos_server, nearby, etc.) read via the
  PointingEstimate access shape (pointing.aligned.estimate.RA, etc.).
- solve_pixel -> target_pixel rename: code, default_config.json,
  shared_state.target_pixel() / set_target_pixel(), Config key. No
  user-config migration logic; users re-align to write the new key.
- Delete solver_main.py (dead code), get_initialized_solved_dict(),
  to_legacy_dict() / from_legacy_dict() bridge methods, and the
  "PROPOSAL ONLY" framing in types/positioning.py.
- Add matched_centroids / matched_stars fields to PointingEstimate so
  the SQM calibration UI can replay SQM calculations from cached
  published solutions.
- Pull the simplified ImuDeadReckoning (and legacy companion + tests)
  from idr_tests so the integrator's IDR call surface matches.
- 21 new unit tests in test_pointing_estimate.py covering dataclass
  basics, solver builders, integrator merge semantics, failed-solve
  anchor preservation, alignment dispatch, picklability, deep-copy
  isolation.
- CONTEXT.md cleaned of stale migration notes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ImuDeadReckoning now handles both camera and aligned axes in one
instance. solve() takes (camera, aligned, q_x2imu) and captures
q_cam2aligned alongside q_eq2x; predict() composes the camera
prediction with q_cam2aligned and returns both as a tuple.

The integrator drops the idr_camera/idr_aligned pair for a single
idr. The IDR remains a math primitive (RaDecRoll in, tuple of
RaDecRoll out) and does not import from PiFinder.types.positioning.

Equivalence tests now parametrize over (screen_direction,
alignment_offset) and check both predicted.camera == legacy.cam and
predicted.aligned == legacy.scope for identity and real-offset cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@brickbots brickbots changed the title Collapse two-axis IDR into a single dual-axis instance Refactor Solver/Integrator paths to use dataclasses, fold aligned position estimate into ImuDeadReckoning May 24, 2026
brickbots and others added 4 commits May 23, 2026 19:01
The solver was building the full PointingEstimate that the integrator
owns and publishes, forcing it to set estimate==solve cells and emit a
hollow PointingEstimate on failure. Separate the wire message from the
published aggregate:

- Add SolveResult = SuccessfulSolve | FailedSolve, a typed DTO on
  solver_queue. The integrator dispatches via isinstance() and remains
  the sole builder/owner of PointingEstimate.
- SuccessfulSolve carries flat per-axis Pointings (no solve/estimate
  split, since the solver never IMU-progresses) plus a single
  solve_time; the integrator fans these into both cells and assigns
  solve_time/cam_solve_time.
- Promote the inline failed-solve branch to _apply_failed_solve so the
  test exercises the real integrator path instead of a copy.
- Document the split in positioning/CONTEXT.md, positioning.md, and
  new ADR-0003.

Also silence a spurious "invalid value encountered in isnan"
RuntimeWarning: np.isnan() on the IDR's NaN-quaternion sentinel trips
numpy's FP-invalid flag. Replace with a per-component math.isnan helper
(_quat_has_nan), clearing all 25 unit-test warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
handoff.md was a session-handoff note for the (now-completed) IDR
dual-axis collapse; it's WIP scratch that shouldn't ship.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tetra3 is a git submodule (PiFinder/tetra3) and is imported as a
top-level `tetra3` via the dev-only symlink python/tetra3 ->
PiFinder/tetra3/tetra3. That symlink was untracked, so nox.yml CI —
which checks out the submodule (submodules: true) but never created the
symlink — failed to `import tetra3`, breaking the unit tests that import
PiFinder.solver (test_pointing_estimate.py).

Commit the symlink so it is restored on checkout. The submodule provides
the real package for it to point at, so the unit suite runs against real
tetra3 with no per-test stubbing. Drop the now-redundant manual `ln -s`
from web-integration-tests.yml (it would otherwise fail with "File
exists").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@brickbots brickbots changed the title Refactor Solver/Integrator paths to use dataclasses, fold aligned position estimate into ImuDeadReckoning Refactor Solver/Integrator flow to use dataclasses, combine camera/aligned position estimates into ImuDeadReckoning May 24, 2026
@brickbots
Copy link
Copy Markdown
Owner Author

@TakKanekoGit Here's a partially simplified ImuDeadReckoning class + a whole lot of refactoring around it to make the solver->integrator flow much more understandable. Hopefully this overall simplification helps everyone, but it's a ton to review.

I'm pretty sure about everything but the math in ImuDeadReckoning - Looking at the changes is a bit hard to follow, but the class is pretty small now. I think I preserved the camera center to alignment point logic/math properly and apply it correctly, but I'm still very much not across the quaternion transform maths.

What should be happening is that all the estimation is being done from a camera_center basis (the camera pointing), the delta between the camera and aligned solves is computed every time there is a new image solve, and the aligned estimate is derived from the camera estimate not calculated from scratch which should help with the missing roll from Tetra3 for the target_pixel solution.

Whew! I'm going to test this under the stars to verify this all, but I think it should work 🤞

Copy link
Copy Markdown
Contributor

@TakKanekoGit TakKanekoGit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a big change! I've just started the review but must dash off. I'll hopefully have time tomorrow (sorry about that).

Good luck with the star testing! Have you tested it in demo mode? I often managed to catch errors with ImuDeadReckoning() using the demo mode.

Comment thread python/PiFinder/integrator.py Outdated
@brickbots
Copy link
Copy Markdown
Owner Author

@mrosseel This is a pretty big change in the data models flowing through the solver->integrator and then used by the rest of the system. It will definitely impact you telemetry work, so wanted to flag you here so you could have a heads up and provide any input. Hoping to merge this soon so avoid creating more potential conflicts with new code 👍

brickbots and others added 4 commits May 24, 2026 18:48
The merge of main brought in the UI module smoke harness (#438), whose
"warm" fixture built the legacy `solved` dict via
`get_initialized_solved_dict` and published it with `set_solution`. This
branch's refactor deleted that helper and changed `set_solution` to take
a `PointingEstimate`, so the harness no longer imported.

- Rewrite the warm fixture to build a `PointingEstimate` by hand: a fresh
  camera plate-solve with both axes' solve+estimate cells populated
  (estimate == solve, no IMU progression), aligned == camera (no
  alignment offset), imu_anchor None, and SolveDiagnostics. Mirrors what
  the integrator produces from a SuccessfulSolve.
- Swap the import for the positioning dataclasses; reword the docstring's
  "solved fixture" to canonical vocabulary.
- docs/ax/ui.md §9.2: correct the stale note that solution() returns
  None by default (it now defaults to an empty PointingEstimate()).

Validated: pytest -m integration tests/test_ui_modules.py -> 191 passed,
2 skipped (the intentional UIAlign sweep skips).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
solve_state is, by the glossary's own definition, exactly
solution().has_pointing() -- and the integrator kept the two in sync by
hand at three sites (set_solve_state True/True/False paired with
set_solution). That dual-write is a drift hazard: a new publish path that
forgets the partner call leaves the flag stale.

Derive solve_state inside set_solution (= v.has_pointing()) and drop the
three manual set_solve_state calls in the integrator. solve_state stays a
bare bool so the UI can keep polling it cheaply without round-tripping the
whole PointingEstimate across the manager proxy every frame -- it's now an
enforced cache rather than a hand-maintained parallel.

Behavioral note: the SuccessfulSolve branch previously flipped solve_state
True eagerly, before the freshness gate; now the gated publish (step 5)
sets it. A successful solve that fails the gate is a stale/duplicate
solve_time (already deduped by the solver via exposure_end), so in practice
the flag still tracks every real solve -- and no longer claims "solved" on
a dropped duplicate.

Docs (positioning.md / CONTEXT.md) and the UI harness warm fixture updated
to the single-writer model.

Addresses TakKanekoGit's review note on PR #429.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
set_solution now assigns v.has_pointing() (typed bool) to __solve_state,
which mypy had inferred as None-typed from its `= None` init -> assignment
error. The old set_solve_state took an untyped (Any) value, so this never
surfaced. Annotate the field as Optional[bool], matching its documented
tri-state (None = no solve yet / True / False).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@brickbots
Copy link
Copy Markdown
Owner Author

Just tested this briefly under the stars... It seems to work just like the current code in main, which is nice.

Both this branch and main are just AWESOME. Fast, smooth chart updates, works regardless of what weird angle I put the PiFinder in, just amazing work @TakKanekoGit

I did find one bug in this version that I'll fix up. It sometimes reverts to saying 'no solve' even though it's had one and was updating it with the IMU. I'm sure it's something simple leaking through, maybe related to the derivation of the solve_state when the solution is updated 🤷‍♂️

Copy link
Copy Markdown
Contributor

@TakKanekoGit TakKanekoGit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @brickbots. This looks fantastic. Very nice. The algorithm looks all fine. I've added a few very minor comments.

Comment thread python/PiFinder/integrator.py
Comment thread python/PiFinder/integrator.py Outdated
Comment thread python/PiFinder/integrator.py Outdated
Comment thread python/PiFinder/types/positioning.py
Comment thread python/PiFinder/integrator.py Outdated
brickbots and others added 3 commits May 25, 2026 19:12
Clarify what the published timing field means: estimate_time is the
measurement epoch of the data behind the current estimate (camera frame
exposure_end, or IMU sample timestamp), not the integrator's publish
time. "solve" is now reserved for plate-solve everywhere; the current
(possibly IMU-progressed) value is the estimate.

- PointingEstimate.solve_time -> estimate_time; drop cam_solve_time
  (value-identical to last_solve_success under epoch semantics).
  cam_active -> is_camera_solve(); "time since solve" -> last_solve_success.
- SuccessfulSolve drops its redundant solve_time; the integrator promotes
  last_solve_success (the frame's exposure_end) to estimate_time.
- Migrate the shared_state.imu() dict to the ImuSample dataclass, which
  gains a `timestamp` (sample epoch, stamped in the IMU process) and a
  to_dict() for the web API; drop the unused move_start/move_end.
- IMU dead-reckoning now stamps estimate_time = imu.timestamp.

Glossary (CONTEXT.md) + docs/adr/0004-pointing-estimate-timing.md.
Addresses TakKanekoGit's review notes on PR #429.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes the bug observed on PR #429 where the display sometimes reverted
to "no solve" after having solved and while tracking with the IMU.

A FailedSolve cleared pointing.aligned.estimate and the integrator
published that immediately (to feed auto-exposure), dropping
solve_state to False via has_pointing(). The IMU advance was meant to
re-fill the cells, but it only fires above the motion deadband -- so a
solve failing while held steady left the system stuck at "no solve"
until the next success or a larger nudge.

_apply_failed_solve now preserves the estimate cells (and estimate_time):
once anchored, the last IMU-progressed pointing stays published and the
IMU keeps advancing it. "No solve" is reserved for the genuinely
unanchored state (before the first successful solve). Auto-exposure is
unaffected -- the unconditional failed-solve publish stays, now carrying
the preserved pointing.

Flips the failed-solve test to assert the cells survive + has_pointing()
holds. Glossary + docs/adr/0005-failed-solve-preserves-estimate.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tak's review note on PR #429: the RaDecRoll->Pointing assignment in the
integrator was clumsy (per-field .get(deg=True) + cast(float, ...)).

Add Pointing.from_radecroll() as the inverse of the existing
Pointing.as_radecroll(), completing the degrees<->radians bridge pair.
It lives on Pointing (not RaDecRoll) so the dependency only runs
positioning -> coordinates, never back, and reads the radian fields
directly (always floats), so the casts disappear. _advance_with_imu's
two 4-line blocks collapse to two assignments; cast import dropped.

Keeps the deliberate split: Pointing/degrees is the published data
model; RaDecRoll/radians stays confined to the dead-reckoning math
(rejecting the "use RaDecRoll everywhere" alternative). Glossary updated
with both bridge directions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@brickbots
Copy link
Copy Markdown
Owner Author

@TakKanekoGit Thank you for the second round of review. I've committed all my updates so please take a look and resolve or add new comments to the conversations.

I've not tested the "no solve" bug fix under the stars yet. I'm on a work trip until Friday so that will have to wait for me, but anyone who wants pull this branch and test is very welcome!

@mrosseel
Copy link
Copy Markdown
Collaborator

mrosseel commented May 26, 2026

@mrosseel This is a pretty big change in the data models flowing through the solver->integrator and then used by the rest of the system. It will definitely impact you telemetry work, so wanted to flag you here so you could have a heads up and provide any input. Hoping to merge this soon so avoid creating more potential conflicts with new code 👍

probably best if I wait to see what comes out of it and then merge with telemetry branch, which should be up to date with main as of now.

@mrosseel
Copy link
Copy Markdown
Collaborator

mrosseel commented May 26, 2026 via email

@TakKanekoGit
Copy link
Copy Markdown
Contributor

@mrosseel This is a pretty big change in the data models flowing through the solver->integrator and then used by the rest of the system. It will definitely impact you telemetry work, so wanted to flag you here so you could have a heads up and provide any input. Hoping to merge this soon so avoid creating more potential conflicts with new code 👍

probably best if I wait to see what comes out of it and then merge with telemetry branch, which should be up to date with main as of now.

@mrosseel -- Just to let you know, I'm working on changes to imu_pi.py to use gyro measurements directly which will probably conflict with your telemetry PR. I'll probably need your help to fix the merge conflicts when we get to it.

Comment thread python/PiFinder/pointing_model/imu_dead_reckoning.py
brickbots and others added 2 commits May 30, 2026 10:25
The dataclass migration (220a648) switched shared_state.solution() from
the legacy `solved` dict to a PointingEstimate, but /api/solution and
/api/visible_stars kept reading it dict-style (sol["RA"], sol.get("FOV"),
sol["camera_solve"], dict(sol)). On the new dataclass those raise
TypeError/AttributeError, so both endpoints crashed on every call with a
live solve. /api/status emitted a repr string for "solution" instead of
an object (json.dumps default=str masked the break). No test covered
these routes, so the suite stayed green.

- Add _solution_to_dict() / _pointing_to_dict() in the web-API layer
  (keeping legacy-dict shape out of the domain types, per the PR's
  design): they re-emit the legacy `solved` dict keys so external
  clients (OpenClaw) keep a stable wire contract. solve_time keeps its
  old key name (estimate_time under the hood, ADR-0004); cam_solve_time
  is re-derived from last_solve_success; the unserializable imu_quat is
  dropped (no endpoint read it).
- /api/status, /api/solution: serialize via _solution_to_dict; replace
  the always-false `if not sol` (a dataclass is always truthy) with
  `not sol.has_pointing()`.
- /api/visible_stars: read FOV from sol.diagnostics.FOV; select source
  from pointing.camera.solve (camera) vs pointing.aligned.estimate
  (aligned), preserving the old camera_solve-vs-solution semantics.
- Add tests/test_api_extensions.py locking in the serialization mapping
  (aligned->top-level RA/Dec, camera.solve->camera_solve, empty estimate
  serializes cleanly, JSON-serializable).

nox lint + 232 unit/smoke green; mypy clean on the file.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…imate-dataclasses

# Conflicts:
#	python/PiFinder/ui/align.py
#	python/PiFinder/ui/chart.py
@brickbots brickbots merged commit 2b1efa1 into main May 30, 2026
1 check passed
mrosseel added a commit to mrosseel/PiFinder that referenced this pull request Jun 1, 2026
Port telemetry record/replay onto the refactored Solver/Integrator flow
(upstream brickbots#429): SolveResult DTOs on solver_queue, integrator-owned
PointingEstimate, ImuSample on shared_state.

- integrator.py: take upstream's dataclass integrator, re-add telemetry
  hooks. Replay now converts recorded events back into SolveResult /
  ImuSample messages and feeds them through the same _apply_successful_solve
  / _apply_failed_solve / _advance_with_imu paths as live data (the old
  branch duplicated integrator logic inside TelemetryManager).
- telemetry.py: recorder consumes SolveResult/ImuSample; player produces
  them. JSONL field names unchanged, so existing recordings stay replayable.
- types/positioning.py: ImuSample gains gyro/accel fields for raw-sensor
  telemetry; imu_pi.py populates them.
- Drop pointing.py and solved.py: superseded by types/positioning.py and
  the upstream integrator (incl. the legacy roll-by-mount-type logic that
  upstream replaced with chart-side handling).
- solver.py: take upstream as-is (get_initialized_solved_dict is gone).
- Tests ported to the new types; drift tests now exercise the real
  integrator apply/advance functions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants