Skip to content

OMPE-96088: Fix cartpole camera anomalies for newton + rtx variants#6084

Open
jmart-nv wants to merge 3 commits into
isaac-sim:developfrom
jmart-nv:jmart/cartpole-rtx
Open

OMPE-96088: Fix cartpole camera anomalies for newton + rtx variants#6084
jmart-nv wants to merge 3 commits into
isaac-sim:developfrom
jmart-nv:jmart/cartpole-rtx

Conversation

@jmart-nv

@jmart-nv jmart-nv commented Jun 9, 2026

Copy link
Copy Markdown

Description

Recent benchmarks have shown a large convergence gap for the cartpole camera env on combos that use newton physics and non-rgb RTX variants. This is because newton physics does not provide implicit damping, and thus requires temporal data - either implicit (TAA artifacts in RTX rgb) or explicit (frame stacking in newton+warp.)

Since the non-rgb variants of RTX do not provide implicit temporal data via TAA, it is necessary to enable frame stacking so the CNN can infer velocity frame-to-frame. This allows the problematic combos to converge to the same high success_rate as their analogous baselines: newton+RTX depth/albedo now out-perform the same physx combos. Newton+RTX simple shading converges comparably to physx - still relatively low, but this will require a deeper investigation into simple shading if we wish to fix it.

Performance shows a marginal FPS overhead and twice as much memory usage - within expectations for this change.

Newton+RTX SR (fs1) SR (fs2) FPS (fs1) FPS (fs2) VRAM (fs1) VRAM (fs2)
depth 0.01 0.96 27.9k 26.9k 9.4 GB 14.3 GB
albedo 0.06 0.95 24.8k 21.8k 18.7 GB 33.6 GB
simple_shading 0.03 0.89 24.5k 21.8k 18.7 GB 34.1 GB

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Screenshots

convergence_framestack_fix_kit6001

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation (N/A - frame stacking already documented)
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

…ariants that don't provide implicit temporal info.
@github-actions github-actions Bot added the isaac-lab Related to Isaac Lab team label Jun 9, 2026

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 IsaacLab Review Bot — PR #6084

OMPE-96088: Fix cartpole camera anomalies for newton + rtx variants

Summary

Well-motivated fix that correctly extends the frame-stacking heuristic to cover Newton+RTX AOVs lacking DLSS temporal accumulation. The three-tier resolver logic (non-Newton → 1, Newton+Warp → 2, Newton+RTX per-AOV) is clean and the code is well-documented. Tests comprehensively cover the new truth table.


Findings

🟡 Warning — semantic_segmentation under Newton+RTX silently defaults to no stacking

semantic_segmentation produces discrete class labels and almost certainly bypasses DLSS temporal accumulation (it is not a photorealistic signal that benefits from temporal AA). Under Newton+RTX, the current logic returns frame_stack=1 since "semantic_segmentation".startswith(("depth", "albedo", "simple_shading")) is False.

If this is intentional (e.g., segmentation observations are not used for velocity inference or the env is only used for non-RL purposes), please add a brief code comment noting the exclusion. If unintentional, it would cause the same convergence failure this PR fixes for the other AOVs.

File: source/isaaclab_tasks/isaaclab_tasks/core/cartpole/cartpole_direct_camera_env.py, line 61


🟡 Warning — Unused SIMPLE_SHADING_TYPES constant creates maintenance confusion

The module defines SIMPLE_SHADING_TYPES (line 25) as a set of exact strings, but _resolve_frame_stack_default uses startswith("simple_shading") instead. This is a mismatch that may confuse future contributors: they might update the set thinking it controls stacking, while the actual logic is the prefix match. Consider either:

  • Using the constant in the resolver: any(data_type.startswith(p) for p in _DLSS_BYPASS_PREFIXES), or
  • Adding a comment near SIMPLE_SHADING_TYPES clarifying it is not used for frame-stack resolution.

File: source/isaaclab_tasks/isaaclab_tasks/core/cartpole/cartpole_direct_camera_env.py, lines 25–28 vs 61


🔵 Suggestion — Future-proof with a constant for DLSS-bypass prefixes

The startswith(("depth", "albedo", "simple_shading")) pattern is compact but adding new non-DLSS AOVs requires editing logic inside the method. Consider extracting to a module constant:

_NO_TEMPORAL_CUE_PREFIXES = ("depth", "albedo", "simple_shading")

This gives future contributors a single, documented place to register new AOV prefixes that bypass DLSS, reducing the risk of silent convergence failures when new AOVs are added.


🔵 Suggestion — Make test_newton_with_default_renderer precondition explicit

The existing test test_newton_with_default_renderer now implicitly depends on the default preset's data type being "rgb" to reach the correct code path. Adding an explicit assertion (matching the style of the new tests) would prevent false confidence if the default changes:

def test_newton_with_default_renderer(self):
    cfg = _resolve("newton_mjwarp")
    assert isinstance(cfg.sim.physics, NewtonCfg)
    assert isinstance(cfg.tiled_camera.renderer_cfg, IsaacRtxRendererCfg)
    assert cfg.tiled_camera.data_types == ["rgb"]  # <-- make precondition explicit
    assert _policy_default(cfg) == 1

File: source/isaaclab_tasks/test/core/test_cartpole_camera_presets_frame_stack.py, line ~79


🟢 Note — No end-to-end test for Newton+RTX+non-rgb combo

The TestEnvConstructionEndToEnd parametrized cases cover Newton+Warp and PhysX but don't exercise the new Newton+RTX+depth (or similar) path through actual env construction. The cfg-level tests in TestFrameStackTruthTable give good coverage of the resolver logic, but if wiring between the resolver and the env class regresses specifically for RTX, it wouldn't be caught. Low priority since the resolver is already tested in isolation.


Verdict

Approve with minor suggestions. The fix is correct, well-scoped, and thoroughly tested at the config level. The primary actionable item is confirming whether semantic_segmentation under Newton+RTX intentionally stays at frame_stack=1. The other suggestions are non-blocking maintainability improvements.


🤖 Generated by IsaacLab Review Bot | Commit: db5ecb5

@greptile-apps

greptile-apps Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes convergence failures in the cartpole camera environment under Newton physics combined with non-rgb RTX AOVs (depth, albedo, simple_shading). Those AOVs bypass DLSS temporal accumulation, so the policy had no temporal cue to infer velocity from — the same gap that already required explicit frame stacking for the Newton+Warp combo.

  • Capability-flag abstraction: adds provides_implicit_damping() to PhysicsCfg and provides_temporal_camera_data(data_type) to RendererCfg, with each backend subclass overriding as appropriate. The resolver in CartpoleCameraEnv._resolve_frame_stack_default now consults these flags instead of hard-coding type names.
  • Frame stacking extended: Newton + RTX + non-rgb AOVs now auto-resolve to frame_stack=2, matching Newton+Warp; Newton + RTX + rgb remains at 1 since DLSS already supplies the temporal cue.
  • Test coverage: five new truth-table tests cover the per-AOV split, plus an end-to-end Newton+RTX+depth case verifying the full obs-pipeline shape.

Confidence Score: 5/5

Safe to merge — the logic is correct, test coverage is comprehensive, and the refactor is backward-compatible for all existing backend combinations.

The capability-flag pattern is clean and each override is straightforward. All edge cases (PhysX implicit damping short-circuits stacking regardless of AOV; Newton+RTX+rgb keeps single-frame; Newton+Warp still returns 2) are explicitly tested. No pre-existing behavior changes for non-Newton backends.

No files require special attention.

Important Files Changed

Filename Overview
source/isaaclab_tasks/isaaclab_tasks/core/cartpole/cartpole_direct_camera_env.py Refactors _resolve_frame_stack_default from hardcoded Newton+Warp type-checks to polymorphic capability flags; logic is correct and all tested paths verified
source/isaaclab_tasks/test/core/test_cartpole_camera_presets_frame_stack.py Adds 5 new truth-table tests (Newton+RTX per AOV) and one end-to-end Newton+RTX+depth case; good coverage of the new stacking logic
source/isaaclab/isaaclab/physics/physics_manager_cfg.py Adds provides_implicit_damping() capability method to base PhysicsCfg with a safe default of True
source/isaaclab/isaaclab/renderers/renderer_cfg.py Adds provides_temporal_camera_data(data_type) capability method to base RendererCfg with a safe default of False
source/isaaclab_newton/isaaclab_newton/physics/newton_manager_cfg.py Correctly overrides provides_implicit_damping() to False for Newton's symplectic integrator
source/isaaclab_newton/isaaclab_newton/renderers/newton_warp_renderer_cfg.py Correctly overrides provides_temporal_camera_data() to False for all AOVs (pure rasterizer)
source/isaaclab_physx/isaaclab_physx/renderers/isaac_rtx_renderer_cfg.py Correctly overrides provides_temporal_camera_data() to True only for rgb/rgba (DLSS path), False for all other AOVs
source/isaaclab_ov/isaaclab_ov/renderers/ovrtx_renderer_cfg.py Mirrors IsaacRtxRendererCfg override: True for rgb/rgba, False otherwise

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[_resolve_frame_stack_default] --> B{physics_cfg is None\nor provides_implicit_damping?}
    B -- Yes --> R1[return 1\nno stacking needed]
    B -- No\nNewton --> C[get renderer_cfg\nand data_type]
    C --> D{renderer_cfg.provides_temporal_camera_data\ndata_type?}
    D -- True\nIsaacRTX/OVRTX + rgb/rgba --> R1
    D -- False\nWarp any AOV\nRTX + depth/albedo/simple_shading/seg --> R2[return 2\nframe stacking enabled]
Loading

Reviews (2): Last reviewed commit: "Address PR review: drop dead SIMPLE_SHAD..." | Re-trigger Greptile

Comment thread source/isaaclab_tasks/isaaclab_tasks/core/cartpole/cartpole_direct_camera_env.py Outdated
Comment thread source/isaaclab_tasks/isaaclab_tasks/core/cartpole/cartpole_direct_camera_env.py Outdated
Newton has no implicit damping, so the policy needs a temporal cue to infer velocity.
Returns ``2`` when the render carries none, else ``1``:

- Non-Newton (PhysX / OV-PhysX): damping suffices -> ``1``.

@pbarejko pbarejko Jun 9, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have some problem in the design:

return 2 if data_type.startswith(("depth", "albedo", "simple_shading")) else 1

Those are kind of details of the output of the renderer, or its configuration that we re-define it in yet another place. At some point this could become a maintenance nightmare.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored this: the physics and renderer cfgs now report their own capabilities and requirements, so cartpole doesn't need to hard-code a list of combos that need frame stacking, it can infer that info from the new cfg APIs.

@jmart-nv jmart-nv marked this pull request as draft June 10, 2026 19:49
@jmart-nv

Copy link
Copy Markdown
Author

Moved back to draft: still investigating the simple shading issues (found the correct lighting settings, running another round of tests with those) and refactoring some of the framestack logic

@jmart-nv jmart-nv marked this pull request as ready for review June 12, 2026 15:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

isaac-lab Related to Isaac Lab team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants