Skip to content

Fixes backend asset state invalidation + provides a skip#6150

Merged
AntoineRichard merged 22 commits into
isaac-sim:developfrom
AntoineRichard:antoiner/feat/cleaned-up_asset_invalidation_timestamps
Jun 16, 2026
Merged

Fixes backend asset state invalidation + provides a skip#6150
AntoineRichard merged 22 commits into
isaac-sim:developfrom
AntoineRichard:antoiner/feat/cleaned-up_asset_invalidation_timestamps

Conversation

@AntoineRichard

@AntoineRichard AntoineRichard commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator

Description

Fix stale cached pose and velocity state after backend asset state writers push data into simulation.

This updates the Newton, PhysX, and OV PhysX backend asset writers to invalidate the affected cached
data through shared data-class reset helpers.

Fixes # (issue): N/A

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Screenshots

N/A

Validation

  • ./isaaclab.sh -f (all hooks passed except check-git-lfs-pointers, which failed because git-lfs is not installed locally)
  • SKIP=check-git-lfs-pointers ./isaaclab.sh -f

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have added a changelog fragment under source/<pkg>/changelog.d/ for every touched package (do not edit CHANGELOG.rst or bump extension.toml — CI handles that)
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

Add cross-backend tests that verify asset writers invalidate cached pose and velocity data. Include changelog fragments and formatter cleanups for the touched backend files.
@github-actions github-actions Bot added the isaac-lab Related to Isaac Lab team label Jun 11, 2026
@AntoineRichard AntoineRichard marked this pull request as draft June 11, 2026 16:55
Address review feedback on the cache-invalidation cleanup:

- Add from_link/from_com flags to the Newton reset_pose/reset_velocity
  helpers so writing a com pose or link velocity no longer clobbers the
  freshly-written buffer, matching the PhysX and OVPhysX behavior.
- Delete a leftover invalidate_fk call in the Newton rigid object
  collection link-velocity writer that double-fired and ran even when
  skip_forward was set.
- Add the skip_forward parameter to the abstract base writers so the
  contract matches the concrete backend implementations.
- Reword the from_link/from_com docstrings across all backends to state
  what the flag actually gates, and add kinematic-chain rationale
  comments to the OVPhysX helpers.
@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes stale cached pose and velocity state after backend asset state writers push data into simulation, and adds skip_forward: bool = False to every write method across all three physics backends (Newton, OVPhysX, PhysX) and their abstract base classes so callers can batch multiple writes before a single cache invalidation.

  • Scattered per-field timestamp = -1.0 assignments in each write method are replaced by centralised reset_pose() / reset_velocity() helpers on each data class; a new _ensure_fk_fresh() guard on pose/velocity properties (and backend FK calls) ensures the cached value is only recomputed when it is actually stale.
  • OVPhysX write_joint_velocity_to_sim_* now explicitly zeroes the finite-difference joint-acceleration buffer before calling reset_velocity(), preventing spurious acceleration spikes after velocity resets.
  • Newton kernels.py removes now-unused out-parameters from four body-pose/velocity kernels; the test harness patches NewtonManager.forward as a no-op for the Newton collection interface fixture where no live simulation state exists.

Confidence Score: 3/5

Safe to merge after addressing the body_com_pose_w staleness bug in Newton RigidObjectData.

The invalidation pattern is correct and consistently applied across all three backends and all asset types. One confirmed P1 regression in Newton rigid_object_data.py: body_com_pose_w uses a once-cached ProxyArray that is never re-evaluated after reset_pose() invalidates _root_com_pose_w.timestamp — the corresponding collection data class uses a proper TimestampedBuffer check and calls _ensure_fk_fresh() transitively. All other changes look correct.

source/isaaclab_newton/isaaclab_newton/assets/rigid_object/rigid_object_data.py — body_com_pose_w property does not call _ensure_fk_fresh() or re-trigger root_com_pose_w recompute after a write.

Important Files Changed

Filename Overview
source/isaaclab/isaaclab/assets/articulation/base_articulation.py Added skip_forward: bool = False to all abstract write method signatures; no logic changes.
source/isaaclab/isaaclab/assets/rigid_object/base_rigid_object.py Added skip_forward parameter to all abstract write method signatures; no logic changes.
source/isaaclab_newton/isaaclab_newton/assets/articulation/articulation_data.py Added reset_pose() / reset_velocity() helpers and _ensure_fk_fresh() for Newton; consistent invalidation with SimulationManager.invalidate_fk() and _fk_timestamp guards.
source/isaaclab_newton/isaaclab_newton/assets/articulation/articulation.py All write methods delegate cache invalidation to data.reset_pose() / reset_velocity() under skip_forward guard; skip_forward properly propagated through mask→index delegate chains.
source/isaaclab_newton/isaaclab_newton/assets/kernels.py Removed now-unused body_link_state_w / body_state_w / body_com_state_w out-parameters from four kernels; state-patching logic deleted in favour of timestamp invalidation.
source/isaaclab_newton/isaaclab_newton/assets/rigid_object/rigid_object_data.py Added _ensure_fk_fresh(), reset_pose(), reset_velocity(); body_link_pose_w and body_com_vel_w now guard on FK freshness before returning cached data — but body_com_pose_w is missing the same guard.
source/isaaclab_newton/isaaclab_newton/assets/rigid_object_collection/rigid_object_collection_data.py Added _fk_timestamp init, _ensure_fk_fresh(), reset_pose(), reset_velocity(), and FK timestamp advance in update(); consistent with other Newton data classes.
source/isaaclab_ovphysx/isaaclab_ovphysx/assets/articulation/articulation_data.py Added _fk_timestamp, _ensure_fk_fresh() via update_articulations_kinematic(), reset_pose/velocity helpers; write_joint_velocity explicitly zeros _joint_acc before reset_velocity to prevent FD acceleration spikes.
source/isaaclab_ovphysx/isaaclab_ovphysx/assets/articulation/articulation.py All write methods delegate to reset_pose/reset_velocity under skip_forward guard; write_joint_position drops _joint_acc invalidation (intentional — position write doesn't change velocities).
source/isaaclab_physx/isaaclab_physx/assets/articulation/articulation.py All write methods delegate to reset_pose/reset_velocity; write_joint_state_to_sim_mask now delegates to write_joint_state_to_sim_index (avoiding double invalidation).
source/isaaclab_physx/isaaclab_physx/assets/articulation/articulation_data.py Added _fk_timestamp, _ensure_fk_fresh(), reset_pose(), reset_velocity() consistent with OVPhysX counterpart.
source/isaaclab/test/assets/test_rigid_object_collection_iface.py Patches NewtonManager.forward as a no-op in the collection_iface fixture to prevent failures when _ensure_fk_fresh() fires with no live simulation state.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant Asset as Asset (Articulation / RigidObject)
    participant Data as AssetData
    participant Sim as Physics Sim (PhysX / Newton / OVPhysX)

    Note over Caller,Sim: Normal write (skip_forward=False)
    Caller->>Asset: write_root_link_pose_to_sim_index(pose)
    Asset->>Sim: kernel / binding write
    Asset->>Data: "reset_pose(from_link=True)"
    Data->>Data: "invalidate _body_*_pose_w, _*_state_w, _fk_timestamp=-1"
    Data->>Sim: invalidate_fk() [Newton only]
    Caller->>Asset: body_link_pose_w (read)
    Asset->>Data: _ensure_fk_fresh()
    Data->>Sim: forward() / update_articulations_kinematic()
    Data-->>Asset: fresh cached pose
    Asset-->>Caller: pose tensor

    Note over Caller,Sim: Batched write (skip_forward=True)
    Caller->>Asset: "write_root_link_pose_to_sim_index(pose, skip_forward=True)"
    Asset->>Sim: kernel / binding write
    Note right of Asset: NO reset_pose() called
    Caller->>Asset: "write_joint_position_to_sim_index(pos, skip_forward=True)"
    Asset->>Sim: kernel / binding write
    Note right of Caller: Caller manually flushes
    Caller->>Data: reset_pose()
    Caller->>Data: reset_velocity()
    Caller->>Asset: body_link_pose_w (read)
    Asset->>Data: _ensure_fk_fresh()
    Data->>Sim: forward()
    Data-->>Asset: fresh cached pose
    Asset-->>Caller: pose tensor
Loading

Comments Outside Diff (1)

  1. source/isaaclab_newton/isaaclab_newton/assets/rigid_object/rigid_object_data.py, line 380-389 (link)

    P1 body_com_pose_w bypasses FK freshness after a write

    body_com_pose_w caches a ProxyArray wrapping _root_com_pose_w.data exactly once (when _body_com_pose_w_ta is None) and returns it directly on every subsequent call — no timestamp guard, no _ensure_fk_fresh() call. After a pose write triggers reset_pose() (which sets _root_com_pose_w.timestamp = -1.0), reading body_com_pose_w returns the stale pre-write buffer because neither FK nor the root_com_pose_w lazy-recompute is triggered.

    By contrast, rigid_object_collection_data.py's body_com_pose_w uses a proper TimestampedBuffer check and accesses self.body_link_pose_w.warp (which calls _ensure_fk_fresh()) inside the recompute block — so the collection variant works correctly. The _ensure_fk_fresh() docstring also explicitly lists body_com_pose_w as a property that should be refreshed after a write.

    A minimal fix is to add _ensure_fk_fresh() and an unconditional _ = self.root_com_pose_w access at the top of body_com_pose_w so the TimestampedBuffer recompute fires before the cached ProxyArray is returned.

Reviews (2): Last reviewed commit: "Stub Newton forward() in collection ifac..." | Re-trigger Greptile

Comment thread source/isaaclab_newton/isaaclab_newton/assets/rigid_object/rigid_object.py Outdated
The joint-position writers stopped invalidating the body-frame com pose
and the finite-differenced joint acceleration when their inline
invalidations were replaced by the shared reset helpers, which do not
cover those two buffers. Reading joint_acc or the body com pose after a
joint-position write therefore returned stale values.

Invalidate both buffers in the joint-position writers, unconditionally,
mirroring how the joint-velocity writers already invalidate joint_acc so
the write_joint_state path stays covered.

Also remove the AST-based regression test, which only checked that the
reset helpers were called and could not catch this missed invalidation.
Address review feedback on the cache-invalidation cleanup:

- Forward skip_forward through the Newton and PhysX rigid object
  collection mask writers, which previously dropped it when delegating
  to the index writers. The Newton com/link mask writers also stopped
  double-invalidating by letting the index writer own the reset.
- Unify the skip_forward docstring across the abstract base and all
  three backends to state that it skips both the post-write cache
  invalidation and the forward pass.
- Reword the OVPhysX joint-position invalidation comment so the
  "mirrors the joint-velocity writer" note applies only to the joint
  acceleration, and fix the reset_pose docstring to say world-frame
  body poses rather than body-frame poses.
Reference the base classes via their public isaaclab.assets paths
instead of private submodule paths, matching how they are cross-
referenced elsewhere in the docs.
…ward

The "solver reset (Kamino)" note was copied from the Newton backend into
the PhysX writers, but Kamino is a Newton-only solver. Keep the accurate
FK-recomputation note and drop the misleading Kamino reference.

Also document the skip_forward parameter on the Newton root-pose writers
whose Args sections were missing it.
_body_com_pose_b is the center-of-mass offset in each body's link
frame, a static inertial property that joint motion cannot change, so
invalidating it after a joint-position write is a no-op. Newton and
PhysX do not invalidate it; drop it from OVPhysX for consistency and
keep only the joint-acceleration invalidation, which is genuinely
derived from joint velocity.
Move the joint-acceleration invalidation in the OVPhysX joint writers
under the skip_forward check so it is consistent with the rest of the
post-write invalidation: skip_forward now uniformly controls whether
the caller wants refreshed data after the write. The joint-state
aggregate writer, which delegates with skip_forward=True, invalidates
the joint acceleration itself.
Replace the getattr-over-name-strings loops in the OVPhysX data-class
invalidation helpers with explicit per-buffer timestamp resets, matching
the Newton/PhysX style. All these buffers are eagerly allocated, so the
getattr default and None guard were dead code that would have silently
skipped a buffer on a future typo or rename.
Adopt the PhysX _fk_timestamp / _ensure_fk_fresh mechanism in the
OVPhysX articulation data class instead of the ad-hoc, pose-only
update_articulations_kinematic call. A single _fk_timestamp tracks
whether forward kinematics is current: it is kept in sync after each
sim step, set to -1.0 by reset_pose/reset_velocity to force a refresh,
and checked by _ensure_fk_fresh before FK-dependent reads.

This also fixes a latent staleness gap: previously only body_link_pose_w
refreshed FK, so reading body velocities after a manual joint write
could return stale kinematics. body_com_vel_w now refreshes too,
matching PhysX.
Reword the skip_forward parameter docs to describe the effect accurately
for every asset type: it skips refreshing cached data after the write.
The previous "forward pass" phrasing did not apply to rigid objects,
which have no forward kinematics. Also document the caller's obligation
to invalidate stale cached data before reading it back when the flag is
set, so the deferred-invalidation contract is explicit.
@AntoineRichard AntoineRichard changed the title Fix backend asset state invalidation Fix backend asset state invalidation + provide a skip Jun 12, 2026
@AntoineRichard AntoineRichard changed the title Fix backend asset state invalidation + provide a skip Fixes backend asset state invalidation + provides a skip Jun 12, 2026
Make the OVPhysX joint acceleration lazily timestamp-gated like the PhysX
and Newton backends: recompute it on read when stale and reset the
finite-difference baseline (zero the cache, sync previous velocity) on a
joint-velocity write, instead of the inert timestamp invalidation the
joint-position and joint-velocity writers used to apply. Position writers
no longer touch joint_acc, since a position write does not change the
joint velocities it is differenced from.

Initialize _fk_timestamp in the Newton rigid object collection data and
keep it in sync in update(), matching the articulation and rigid object
data classes; without the initializer the first FK-dependent access
raised AttributeError.

Also standardize the PhysX rigid object and collection reset helpers on
the explicit per-buffer invalidation form, widen the Newton rigid object
reset-helper env_ids hint to accept torch tensors, and clean up
docstrings/comments across backends (skip_forward wording, PhysX
forward-kinematics notes, stale finite-difference references).
The Newton rigid object collection's body_link_pose_w goes through
_ensure_fk_fresh() -> NewtonManager.forward(), which runs eval_fk against
a live simulation state. The mocked interface fixture has no such state,
so the pose writers (which read body_link_pose_w) raised AttributeError
on _state_0 once a prior reset_pose() invalidated the FK timestamp.

Stub forward() to a no-op for the Newton backend during the test body;
the mock view already supplies the cached pose data directly.
@AntoineRichard AntoineRichard marked this pull request as ready for review June 12, 2026 12:52
@AntoineRichard AntoineRichard requested a review from hujc7 June 12, 2026 14:03
Comment on lines +162 to +163
if self._root_state_w is not None:
self._root_state_w.timestamp = -1.0

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as this pattern is very common, wondering if we can create a helper function for individual reset. Or one takes a list input and reset them all. One of the benefit might be to avoid typos, e.g. checking on property but actually resetting a different one.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially I had a getattr / setattr path, but in the end you need to write the string anyway, I'm not sure it makes that big of a difference, it's a bit more compact for sure, but then it's a bunch of getattr / setattr not sure it's better.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added isaaclab.utils.buffers.reset_timestamps(buffers) — it takes a list and invalidates each non-None buffer in one call. Every backend's _reset_pose/_reset_velocity now uses it, so each buffer is named exactly once at the call site and the repeated if self._x is not None: self._x.timestamp = -1.0 blocks (and the check-one-reset-another typo risk you flagged) are gone.

physx_instance.update_articulations_kinematic()
self._fk_timestamp = self._sim_timestamp

def reset_pose(self, from_link: bool = True) -> None:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the correspondence is consistent, reset_pose -> from_link, reset_v -> from_from. But I am wondering if it be generalized one lever further, e.g. from: ResetSource = ResetSource.LINK, or from: ResetSource = "link", so it's more consistent. Maybe a little too much.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I felt like a boolean check would have been faster that's why I went with the from_link. I'll double check for homogeneity.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping the from_link/from_com booleans — there are only the two states and they are documented per-method, so a ResetSource enum would add indirection without much benefit here. (Agreed it would be a little too much.)

self._fk_timestamp = self._sim_timestamp

def reset_pose(
self, env_ids: wp.array | torch.Tensor | None = None, env_mask: wp.array | None = None, from_link: bool = True

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? Seems an outlier.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, at least that's what can flow inside of this.

*,
root_pose: torch.Tensor | wp.array,
env_ids: Sequence[int] | torch.Tensor | wp.array | None = None,
skip_forward: bool = False,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Curious on this. The usage is commonly not skip_forward, would e.g. "force_forward", or something positive also serve the purpose and avoids double negation? At the end, would skipping be more common?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed Genesis here, they have skip_forward flag set to false by default. But that's fair point.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to keep skip_forward as-is. It mirrors Genesis's rigid solver convention exactly — RigidSolver.set_qpos / set_base_links_pos / set_base_links_quat take skip_forward: bool = False and gate the forward-kinematics pass on if not skip_forward: (rigid_solver.py). Our write_*_to_sim methods sit at that same level, and the flag is now propagated uniformly across all three backends plus the base classes, so a rename would diverge from that precedent and churn the whole write surface. Defaulting to False keeps "forward runs unless explicitly skipped" as the safe default.

self._body_link_state_w.timestamp = -1.0
if self._body_com_state_w is not None:
self._body_com_state_w.timestamp = -1.0
# Why do we have _fk_timestamp and invalidate_fk? _fk_timestamp is on the data-side,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe consider adding a # NOTE: here and make it more declarative, e.g. "the reason to have....". I was thinking it might be a todo when first reading it.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworded to a declarative # NOTE: block in both _reset_pose and _reset_velocity.

# since we do finite differencing.
self.body_com_acc_w

def reset_pose(self, from_link: bool = True) -> None:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the function signature differs and there's no base class declaration, I think those could be declared as _private_function to indicate they are internal.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, thanks for catching the missing base class implementations.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — renamed reset_pose/reset_velocity to _reset_pose/_reset_velocity across all three backends and declared them as abstract methods on BaseArticulationData/BaseRigidObjectData/BaseRigidObjectCollectionData, so they are internal but keep a discoverable contract.

SimulationManager.forward()
self._fk_timestamp = self._sim_timestamp

def reset_pose(

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is some positive/negative test needed to catch the change from the old behavior?

@AntoineRichard AntoineRichard Jun 15, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's mostly a clean-up, it does not change the behavior except for the velocities, the pose were already covered in the standard tests. I'll see if we can add the velocities.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the velocity coverage: a new test_write_root_state_functions_data_consistency for the articulation, plus priming in the rigid-object test_write_state_functions_data_consistency. The existing test did not actually exercise the invalidation — the derived caches recompute on first access unless they are primed at the current timestamp before the write — so I added that priming. Verified both tests fail if the reset_velocity invalidation is removed. The body_com_pose_w staleness greptile flagged is fixed and covered as well.

Newton RigidObjectData.body_com_pose_w cached a reshaped view of
root_com_pose_w on first access and returned it unchanged thereafter,
without re-deriving root_com_pose_w or refreshing forward kinematics.
After a link-frame pose write invalidated root_com_pose_w, reads of
body_com_pose_w returned the pre-write buffer.

Touch root_com_pose_w and refresh forward kinematics on each access,
mirroring the sibling body_link_pose_w and the collection data class.

Add regression coverage that writes root state without an intervening
sim step and asserts the body- and cross-frame caches reflect the
write. The velocity checks prime the derived caches before the write
so they genuinely exercise the invalidation.
Rename reset_pose/reset_velocity to _reset_pose/_reset_velocity across
the Newton, PhysX, and OV PhysX articulation, rigid-object, and
rigid-object-collection data classes and their state writers, and
declare them as abstract methods on the base data classes so the
invalidation contract is discoverable.

Add isaaclab.utils.buffers.reset_timestamps to invalidate a list of
timestamped buffers in one call, and use it in every reset helper so
each buffer is named exactly once at the call site, removing the
repeated None-checks and the associated check-one-reset-another typo
risk.
The new abstract _reset_pose/_reset_velocity methods on the base asset
data classes left MockArticulationData, MockRigidObjectData, and
MockRigidObjectCollectionData with unimplemented abstract methods, so
they could no longer be instantiated and every mock-based test failed
at construction.

Add no-op implementations on the three mock data classes, matching
their existing no-op update().
Make env_ids/env_mask keyword-only on the Newton _reset_pose and
_reset_velocity overrides so from_link/from_com stay first like the
abstract base signature, keeping the overrides Liskov-substitutable for
a base-typed reference (all call sites already pass these by keyword).

Document the newly-required abstract _reset_pose/_reset_velocity hooks
on the base asset data classes as a breaking change, with migration
guidance for custom simulation-backend subclasses.
…on_timestamps

Resolve the sole conflict in the Newton rigid_object_collection writer
by taking develop's cleaned-up comment (drops the stale newton#3071
TODO); the workaround code is unchanged on both sides.
@AntoineRichard AntoineRichard merged commit 01099bb into isaac-sim:develop Jun 16, 2026
37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

isaac-lab Related to Isaac Lab team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants