Refactor Solver/Integrator flow to use dataclasses, combine camera/aligned position estimates into ImuDeadReckoning#429
Conversation
End-to-end adoption of the dataclasses in PiFinder/types/positioning.py: - Solver builds a fresh PointingEstimate per attempt and pushes to solver_queue. Alignment queue carries AlignOnRaDec / AlignCancel / ReloadSqmCalibration commands and AlignedResult responses, dispatched via isinstance(). - Integrator owns the long-lived PointingEstimate. solve cells survive failed plate-solves so IMU dead-reckoning keeps producing aligned estimates. - shared_state.solution() / set_solution() carry PointingEstimate; default to an empty PointingEstimate() so consumers can always call .has_pointing() without a None check. - All ~15 consumers (UI, server, pos_server, nearby, etc.) read via the PointingEstimate access shape (pointing.aligned.estimate.RA, etc.). - solve_pixel -> target_pixel rename: code, default_config.json, shared_state.target_pixel() / set_target_pixel(), Config key. No user-config migration logic; users re-align to write the new key. - Delete solver_main.py (dead code), get_initialized_solved_dict(), to_legacy_dict() / from_legacy_dict() bridge methods, and the "PROPOSAL ONLY" framing in types/positioning.py. - Add matched_centroids / matched_stars fields to PointingEstimate so the SQM calibration UI can replay SQM calculations from cached published solutions. - Pull the simplified ImuDeadReckoning (and legacy companion + tests) from idr_tests so the integrator's IDR call surface matches. - 21 new unit tests in test_pointing_estimate.py covering dataclass basics, solver builders, integrator merge semantics, failed-solve anchor preservation, alignment dispatch, picklability, deep-copy isolation. - CONTEXT.md cleaned of stale migration notes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ImuDeadReckoning now handles both camera and aligned axes in one instance. solve() takes (camera, aligned, q_x2imu) and captures q_cam2aligned alongside q_eq2x; predict() composes the camera prediction with q_cam2aligned and returns both as a tuple. The integrator drops the idr_camera/idr_aligned pair for a single idr. The IDR remains a math primitive (RaDecRoll in, tuple of RaDecRoll out) and does not import from PiFinder.types.positioning. Equivalence tests now parametrize over (screen_direction, alignment_offset) and check both predicted.camera == legacy.cam and predicted.aligned == legacy.scope for identity and real-offset cases. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The solver was building the full PointingEstimate that the integrator owns and publishes, forcing it to set estimate==solve cells and emit a hollow PointingEstimate on failure. Separate the wire message from the published aggregate: - Add SolveResult = SuccessfulSolve | FailedSolve, a typed DTO on solver_queue. The integrator dispatches via isinstance() and remains the sole builder/owner of PointingEstimate. - SuccessfulSolve carries flat per-axis Pointings (no solve/estimate split, since the solver never IMU-progresses) plus a single solve_time; the integrator fans these into both cells and assigns solve_time/cam_solve_time. - Promote the inline failed-solve branch to _apply_failed_solve so the test exercises the real integrator path instead of a copy. - Document the split in positioning/CONTEXT.md, positioning.md, and new ADR-0003. Also silence a spurious "invalid value encountered in isnan" RuntimeWarning: np.isnan() on the IDR's NaN-quaternion sentinel trips numpy's FP-invalid flag. Replace with a per-component math.isnan helper (_quat_has_nan), clearing all 25 unit-test warnings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
handoff.md was a session-handoff note for the (now-completed) IDR dual-axis collapse; it's WIP scratch that shouldn't ship. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tetra3 is a git submodule (PiFinder/tetra3) and is imported as a top-level `tetra3` via the dev-only symlink python/tetra3 -> PiFinder/tetra3/tetra3. That symlink was untracked, so nox.yml CI — which checks out the submodule (submodules: true) but never created the symlink — failed to `import tetra3`, breaking the unit tests that import PiFinder.solver (test_pointing_estimate.py). Commit the symlink so it is restored on checkout. The submodule provides the real package for it to point at, so the unit suite runs against real tetra3 with no per-test stubbing. Drop the now-redundant manual `ln -s` from web-integration-tests.yml (it would otherwise fail with "File exists"). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@TakKanekoGit Here's a partially simplified ImuDeadReckoning class + a whole lot of refactoring around it to make the solver->integrator flow much more understandable. Hopefully this overall simplification helps everyone, but it's a ton to review. I'm pretty sure about everything but the math in ImuDeadReckoning - Looking at the changes is a bit hard to follow, but the class is pretty small now. I think I preserved the camera center to alignment point logic/math properly and apply it correctly, but I'm still very much not across the quaternion transform maths. What should be happening is that all the estimation is being done from a camera_center basis (the Whew! I'm going to test this under the stars to verify this all, but I think it should work 🤞 |
TakKanekoGit
left a comment
There was a problem hiding this comment.
It's a big change! I've just started the review but must dash off. I'll hopefully have time tomorrow (sorry about that).
Good luck with the star testing! Have you tested it in demo mode? I often managed to catch errors with ImuDeadReckoning() using the demo mode.
|
@mrosseel This is a pretty big change in the data models flowing through the solver->integrator and then used by the rest of the system. It will definitely impact you telemetry work, so wanted to flag you here so you could have a heads up and provide any input. Hoping to merge this soon so avoid creating more potential conflicts with new code 👍 |
…imate-dataclasses
The merge of main brought in the UI module smoke harness (#438), whose "warm" fixture built the legacy `solved` dict via `get_initialized_solved_dict` and published it with `set_solution`. This branch's refactor deleted that helper and changed `set_solution` to take a `PointingEstimate`, so the harness no longer imported. - Rewrite the warm fixture to build a `PointingEstimate` by hand: a fresh camera plate-solve with both axes' solve+estimate cells populated (estimate == solve, no IMU progression), aligned == camera (no alignment offset), imu_anchor None, and SolveDiagnostics. Mirrors what the integrator produces from a SuccessfulSolve. - Swap the import for the positioning dataclasses; reword the docstring's "solved fixture" to canonical vocabulary. - docs/ax/ui.md §9.2: correct the stale note that solution() returns None by default (it now defaults to an empty PointingEstimate()). Validated: pytest -m integration tests/test_ui_modules.py -> 191 passed, 2 skipped (the intentional UIAlign sweep skips). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
solve_state is, by the glossary's own definition, exactly solution().has_pointing() -- and the integrator kept the two in sync by hand at three sites (set_solve_state True/True/False paired with set_solution). That dual-write is a drift hazard: a new publish path that forgets the partner call leaves the flag stale. Derive solve_state inside set_solution (= v.has_pointing()) and drop the three manual set_solve_state calls in the integrator. solve_state stays a bare bool so the UI can keep polling it cheaply without round-tripping the whole PointingEstimate across the manager proxy every frame -- it's now an enforced cache rather than a hand-maintained parallel. Behavioral note: the SuccessfulSolve branch previously flipped solve_state True eagerly, before the freshness gate; now the gated publish (step 5) sets it. A successful solve that fails the gate is a stale/duplicate solve_time (already deduped by the solver via exposure_end), so in practice the flag still tracks every real solve -- and no longer claims "solved" on a dropped duplicate. Docs (positioning.md / CONTEXT.md) and the UI harness warm fixture updated to the single-writer model. Addresses TakKanekoGit's review note on PR #429. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
set_solution now assigns v.has_pointing() (typed bool) to __solve_state, which mypy had inferred as None-typed from its `= None` init -> assignment error. The old set_solve_state took an untyped (Any) value, so this never surfaced. Annotate the field as Optional[bool], matching its documented tri-state (None = no solve yet / True / False). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Just tested this briefly under the stars... It seems to work just like the current code in main, which is nice. Both this branch and main are just AWESOME. Fast, smooth chart updates, works regardless of what weird angle I put the PiFinder in, just amazing work @TakKanekoGit I did find one bug in this version that I'll fix up. It sometimes reverts to saying 'no solve' even though it's had one and was updating it with the IMU. I'm sure it's something simple leaking through, maybe related to the derivation of the solve_state when the solution is updated 🤷♂️ |
TakKanekoGit
left a comment
There was a problem hiding this comment.
Hi @brickbots. This looks fantastic. Very nice. The algorithm looks all fine. I've added a few very minor comments.
Clarify what the published timing field means: estimate_time is the measurement epoch of the data behind the current estimate (camera frame exposure_end, or IMU sample timestamp), not the integrator's publish time. "solve" is now reserved for plate-solve everywhere; the current (possibly IMU-progressed) value is the estimate. - PointingEstimate.solve_time -> estimate_time; drop cam_solve_time (value-identical to last_solve_success under epoch semantics). cam_active -> is_camera_solve(); "time since solve" -> last_solve_success. - SuccessfulSolve drops its redundant solve_time; the integrator promotes last_solve_success (the frame's exposure_end) to estimate_time. - Migrate the shared_state.imu() dict to the ImuSample dataclass, which gains a `timestamp` (sample epoch, stamped in the IMU process) and a to_dict() for the web API; drop the unused move_start/move_end. - IMU dead-reckoning now stamps estimate_time = imu.timestamp. Glossary (CONTEXT.md) + docs/adr/0004-pointing-estimate-timing.md. Addresses TakKanekoGit's review notes on PR #429. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes the bug observed on PR #429 where the display sometimes reverted to "no solve" after having solved and while tracking with the IMU. A FailedSolve cleared pointing.aligned.estimate and the integrator published that immediately (to feed auto-exposure), dropping solve_state to False via has_pointing(). The IMU advance was meant to re-fill the cells, but it only fires above the motion deadband -- so a solve failing while held steady left the system stuck at "no solve" until the next success or a larger nudge. _apply_failed_solve now preserves the estimate cells (and estimate_time): once anchored, the last IMU-progressed pointing stays published and the IMU keeps advancing it. "No solve" is reserved for the genuinely unanchored state (before the first successful solve). Auto-exposure is unaffected -- the unconditional failed-solve publish stays, now carrying the preserved pointing. Flips the failed-solve test to assert the cells survive + has_pointing() holds. Glossary + docs/adr/0005-failed-solve-preserves-estimate.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tak's review note on PR #429: the RaDecRoll->Pointing assignment in the integrator was clumsy (per-field .get(deg=True) + cast(float, ...)). Add Pointing.from_radecroll() as the inverse of the existing Pointing.as_radecroll(), completing the degrees<->radians bridge pair. It lives on Pointing (not RaDecRoll) so the dependency only runs positioning -> coordinates, never back, and reads the radian fields directly (always floats), so the casts disappear. _advance_with_imu's two 4-line blocks collapse to two assignments; cast import dropped. Keeps the deliberate split: Pointing/degrees is the published data model; RaDecRoll/radians stays confined to the dead-reckoning math (rejecting the "use RaDecRoll everywhere" alternative). Glossary updated with both bridge directions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@TakKanekoGit Thank you for the second round of review. I've committed all my updates so please take a look and resolve or add new comments to the conversations. I've not tested the "no solve" bug fix under the stars yet. I'm on a work trip until Friday so that will have to wait for me, but anyone who wants pull this branch and test is very welcome! |
probably best if I wait to see what comes out of it and then merge with telemetry branch, which should be up to date with main as of now. |
|
Hi, telemetry branch has the fix and should be up to date with main
…On Tue, 26 May 2026, 19:37 Tak Kaneko, ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In python/PiFinder/integrator.py
<#429 (comment)>:
>
- # Set up dead-reckoning tracking by the IMU:
- imu_dead_reckoning = ImuDeadReckoning(cfg.get_option("screen_direction"))
- # imu_dead_reckoning.set_cam2scope_alignment(q_scope2cam) # TODO: Enable when q_scope2cam is available from alignment
+ # Single IMU dead-reckoner handling both axes. Seeded with the
+ # (camera, aligned) pair at each successful plate-solve.
+ idr = ImuDeadReckoning(screen_direction)
Hi @brickbots <https://github.com/brickbots> -- I have a function to
estimate this (q_imu2cam) but I paused it until the issue with the
invalid solves are fixed and I can record some new telemetry data (perhaps
this PR might even fix it?). @mrosseel <https://github.com/mrosseel> has
the details on the root cause. Hopefully, we'll be able to include it with
the next release.
—
Reply to this email directly, view it on GitHub
<#429?email_source=notifications&email_token=AALFXXX4ZXLLOJ7HGQMLP3D44XI7FA5CNFSNUABKM5UWIORPF5TWS5BNNB2WEL2QOVWGYUTFOF2WK43UKJSXM2LFO4XTIMZWGYYDQMZVGA22M4TFMFZW63VHNVSW45DJN5XKKZLWMVXHJLDGN5XXIZLSL5RWY2LDNM#discussion_r3305707765>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AALFXXVFAOIL7F47IIV3HX344XI7FAVCNFSM6AAAAACZKB5HXCVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHM2DGNRWGA4DGNJQGU>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@mrosseel -- Just to let you know, I'm working on changes to |
The dataclass migration (220a648) switched shared_state.solution() from the legacy `solved` dict to a PointingEstimate, but /api/solution and /api/visible_stars kept reading it dict-style (sol["RA"], sol.get("FOV"), sol["camera_solve"], dict(sol)). On the new dataclass those raise TypeError/AttributeError, so both endpoints crashed on every call with a live solve. /api/status emitted a repr string for "solution" instead of an object (json.dumps default=str masked the break). No test covered these routes, so the suite stayed green. - Add _solution_to_dict() / _pointing_to_dict() in the web-API layer (keeping legacy-dict shape out of the domain types, per the PR's design): they re-emit the legacy `solved` dict keys so external clients (OpenClaw) keep a stable wire contract. solve_time keeps its old key name (estimate_time under the hood, ADR-0004); cam_solve_time is re-derived from last_solve_success; the unserializable imu_quat is dropped (no endpoint read it). - /api/status, /api/solution: serialize via _solution_to_dict; replace the always-false `if not sol` (a dataclass is always truthy) with `not sol.has_pointing()`. - /api/visible_stars: read FOV from sol.diagnostics.FOV; select source from pointing.camera.solve (camera) vs pointing.aligned.estimate (aligned), preserving the old camera_solve-vs-solution semantics. - Add tests/test_api_extensions.py locking in the serialization mapping (aligned->top-level RA/Dec, camera.solve->camera_solve, empty estimate serializes cleanly, JSON-serializable). nox lint + 232 unit/smoke green; mypy clean on the file. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…imate-dataclasses # Conflicts: # python/PiFinder/ui/align.py # python/PiFinder/ui/chart.py
Port telemetry record/replay onto the refactored Solver/Integrator flow (upstream brickbots#429): SolveResult DTOs on solver_queue, integrator-owned PointingEstimate, ImuSample on shared_state. - integrator.py: take upstream's dataclass integrator, re-add telemetry hooks. Replay now converts recorded events back into SolveResult / ImuSample messages and feeds them through the same _apply_successful_solve / _apply_failed_solve / _advance_with_imu paths as live data (the old branch duplicated integrator logic inside TelemetryManager). - telemetry.py: recorder consumes SolveResult/ImuSample; player produces them. JSONL field names unchanged, so existing recordings stay replayable. - types/positioning.py: ImuSample gains gyro/accel fields for raw-sensor telemetry; imu_pi.py populates them. - Drop pointing.py and solved.py: superseded by types/positioning.py and the upstream integrator (incl. the legacy roll-by-mount-type logic that upstream replaced with chart-side handling). - solver.py: take upstream as-is (get_initialized_solved_dict is gone). - Tests ported to the new types; drift tests now exercise the real integrator apply/advance functions.
Summary
Reworks the Positioning data path (
solver.py→integrator.py→ consumers) off the legacysolveddict and onto typed dataclasses, then sharpens the boundaries between the solver, the integrator, and the IMU dead-reckoner. The change lands in three layers:Reviewers
There is a lot of stuff here, but the main areas for input are the shape of the dataclasses in
PiFinder/types/positioning.pyand the flow of data from solver.py->integrator.py.1. Dataclass migration — kill the
solveddict (220a6484)End-to-end adoption of the dataclasses in
PiFinder/types/positioning.py:PointingEstimatereplaces thesolveddict everywhere. Pointing is modelled as a 2 × 2 matrix — two axes (camera,aligned) × two states (solve,estimate) — reached aspointing.<axis>.<state>.<RA|Dec|Roll>.PointingEstimate; itssolvecells survive failed plate-solves so IMU dead-reckoning keeps producing aligned estimates.shared_state.solution()/set_solution()carryPointingEstimateand default to an empty instance, so all ~15 consumers (UI,server,pos_server,nearby, …) read through the access shape withoutNonechecks.AlignOnRaDec/AlignCancel/ReloadSqmCalibration,AlignedResult), dispatched viaisinstance().solve_pixel→target_pixelacross code,default_config.json, andshared_state. No config-migration logic — users re-align to write the new key.solver_main.py,get_initialized_solved_dict(), the legacy dict bridge methods).matched_centroids/matched_starstoPointingEstimateso the SQM calibration UI can replay SQM from cached published solutions.2. Collapse the two-axis IDR into one dual-axis instance (
9a2ee904)ImuDeadReckoningnow handles both axes in one instance.solve(camera, aligned, q_x2imu)captures the staticq_cam2alignedalongside the driftingq_eq2x;predict()composes the camera prediction withq_cam2alignedand returns both as a(camera, aligned)tuple.reset()clears both.integrator.pydrops theidr_camera/idr_alignedpair for a singleidr.RaDecRollin,RaDecRollout — no import fromPiFinder.types.positioning.imu_dead_reckoning_legacy.pyis retained in-tree as the equivalence reference until we're comfortable removing it in a follow-up.3. Split the solver→integrator message into a
SolveResultDTO (913e5d08)The migration in (1) had the solver build the same
PointingEstimatethe integrator owns and publishes — forcing it to setestimate == solvecells and emit a hollowPointingEstimateon failure. This layer separates the wire message from the published aggregate:SolveResult=SuccessfulSolve|FailedSolve, a typed DTO onsolver_queue. The integrator dispatches viaisinstance()and is the sole builder/owner ofPointingEstimate.SuccessfulSolvecarries flat per-axisPointings (nosolve/estimatesplit — the solver never IMU-progresses) plus a singlesolve_time; the integrator fans these into both cells and assignssolve_time/cam_solve_time._apply_failed_solveso tests exercise the real integrator path.positioning/CONTEXT.md.Also in this layer: silence a spurious
invalid value encountered in isnanRuntimeWarning—np.isnan()on the IDR's NaN-quaternion sentinel tripped numpy's FP-invalid flag. Replaced with a per-componentmath.isnanhelper (_quat_has_nan), clearing all 25 unit-test warnings.Docs (
fb92dcd0+ this work)docs/ax/positioning.mdanddocs/ax/positioning/CONTEXT.mdupdated to the new vocabulary; ADR-0003 added. The canonical glossary (CONTEXT.md) and the vocabulary decision (ADR-0001) remain authoritative for terms.Test plan
nox -s lint type_hints smoke_tests unit_tests— all green (214 unit tests pass; 219 unit + smoke), 0 warnings.pointing.aligned.estimatereflects it.Additional Notes for reviewers
imu_dead_reckoning_legacy.pyis intentionally retained as the equivalence-test reference; removal is a planned follow-up.handoff.mdscratch doc that had slipped into the branch was removed in this PR.🤖 Generated with Claude Code