OMPE-96088: Fix cartpole camera anomalies for newton + rtx variants#6084
OMPE-96088: Fix cartpole camera anomalies for newton + rtx variants#6084jmart-nv wants to merge 3 commits into
Conversation
…ariants that don't provide implicit temporal info.
There was a problem hiding this comment.
🤖 IsaacLab Review Bot — PR #6084
OMPE-96088: Fix cartpole camera anomalies for newton + rtx variants
Summary
Well-motivated fix that correctly extends the frame-stacking heuristic to cover Newton+RTX AOVs lacking DLSS temporal accumulation. The three-tier resolver logic (non-Newton → 1, Newton+Warp → 2, Newton+RTX per-AOV) is clean and the code is well-documented. Tests comprehensively cover the new truth table.
Findings
🟡 Warning — semantic_segmentation under Newton+RTX silently defaults to no stacking
semantic_segmentation produces discrete class labels and almost certainly bypasses DLSS temporal accumulation (it is not a photorealistic signal that benefits from temporal AA). Under Newton+RTX, the current logic returns frame_stack=1 since "semantic_segmentation".startswith(("depth", "albedo", "simple_shading")) is False.
If this is intentional (e.g., segmentation observations are not used for velocity inference or the env is only used for non-RL purposes), please add a brief code comment noting the exclusion. If unintentional, it would cause the same convergence failure this PR fixes for the other AOVs.
File: source/isaaclab_tasks/isaaclab_tasks/core/cartpole/cartpole_direct_camera_env.py, line 61
🟡 Warning — Unused SIMPLE_SHADING_TYPES constant creates maintenance confusion
The module defines SIMPLE_SHADING_TYPES (line 25) as a set of exact strings, but _resolve_frame_stack_default uses startswith("simple_shading") instead. This is a mismatch that may confuse future contributors: they might update the set thinking it controls stacking, while the actual logic is the prefix match. Consider either:
- Using the constant in the resolver:
any(data_type.startswith(p) for p in _DLSS_BYPASS_PREFIXES), or - Adding a comment near
SIMPLE_SHADING_TYPESclarifying it is not used for frame-stack resolution.
File: source/isaaclab_tasks/isaaclab_tasks/core/cartpole/cartpole_direct_camera_env.py, lines 25–28 vs 61
🔵 Suggestion — Future-proof with a constant for DLSS-bypass prefixes
The startswith(("depth", "albedo", "simple_shading")) pattern is compact but adding new non-DLSS AOVs requires editing logic inside the method. Consider extracting to a module constant:
_NO_TEMPORAL_CUE_PREFIXES = ("depth", "albedo", "simple_shading")This gives future contributors a single, documented place to register new AOV prefixes that bypass DLSS, reducing the risk of silent convergence failures when new AOVs are added.
🔵 Suggestion — Make test_newton_with_default_renderer precondition explicit
The existing test test_newton_with_default_renderer now implicitly depends on the default preset's data type being "rgb" to reach the correct code path. Adding an explicit assertion (matching the style of the new tests) would prevent false confidence if the default changes:
def test_newton_with_default_renderer(self):
cfg = _resolve("newton_mjwarp")
assert isinstance(cfg.sim.physics, NewtonCfg)
assert isinstance(cfg.tiled_camera.renderer_cfg, IsaacRtxRendererCfg)
assert cfg.tiled_camera.data_types == ["rgb"] # <-- make precondition explicit
assert _policy_default(cfg) == 1File: source/isaaclab_tasks/test/core/test_cartpole_camera_presets_frame_stack.py, line ~79
🟢 Note — No end-to-end test for Newton+RTX+non-rgb combo
The TestEnvConstructionEndToEnd parametrized cases cover Newton+Warp and PhysX but don't exercise the new Newton+RTX+depth (or similar) path through actual env construction. The cfg-level tests in TestFrameStackTruthTable give good coverage of the resolver logic, but if wiring between the resolver and the env class regresses specifically for RTX, it wouldn't be caught. Low priority since the resolver is already tested in isolation.
Verdict
✅ Approve with minor suggestions. The fix is correct, well-scoped, and thoroughly tested at the config level. The primary actionable item is confirming whether semantic_segmentation under Newton+RTX intentionally stays at frame_stack=1. The other suggestions are non-blocking maintainability improvements.
🤖 Generated by IsaacLab Review Bot | Commit: db5ecb5
Greptile SummaryThis PR fixes convergence failures in the cartpole camera environment under Newton physics combined with non-
Confidence Score: 5/5Safe to merge — the logic is correct, test coverage is comprehensive, and the refactor is backward-compatible for all existing backend combinations. The capability-flag pattern is clean and each override is straightforward. All edge cases (PhysX implicit damping short-circuits stacking regardless of AOV; Newton+RTX+rgb keeps single-frame; Newton+Warp still returns 2) are explicitly tested. No pre-existing behavior changes for non-Newton backends. No files require special attention. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[_resolve_frame_stack_default] --> B{physics_cfg is None\nor provides_implicit_damping?}
B -- Yes --> R1[return 1\nno stacking needed]
B -- No\nNewton --> C[get renderer_cfg\nand data_type]
C --> D{renderer_cfg.provides_temporal_camera_data\ndata_type?}
D -- True\nIsaacRTX/OVRTX + rgb/rgba --> R1
D -- False\nWarp any AOV\nRTX + depth/albedo/simple_shading/seg --> R2[return 2\nframe stacking enabled]
Reviews (2): Last reviewed commit: "Address PR review: drop dead SIMPLE_SHAD..." | Re-trigger Greptile |
| Newton has no implicit damping, so the policy needs a temporal cue to infer velocity. | ||
| Returns ``2`` when the render carries none, else ``1``: | ||
|
|
||
| - Non-Newton (PhysX / OV-PhysX): damping suffices -> ``1``. |
There was a problem hiding this comment.
I think we have some problem in the design:
return 2 if data_type.startswith(("depth", "albedo", "simple_shading")) else 1
Those are kind of details of the output of the renderer, or its configuration that we re-define it in yet another place. At some point this could become a maintenance nightmare.
There was a problem hiding this comment.
Refactored this: the physics and renderer cfgs now report their own capabilities and requirements, so cartpole doesn't need to hard-code a list of combos that need frame stacking, it can infer that info from the new cfg APIs.
|
Moved back to draft: still investigating the simple shading issues (found the correct lighting settings, running another round of tests with those) and refactoring some of the framestack logic |
…recondition, add Newton+RTX+depth E2E case
Description
Recent benchmarks have shown a large convergence gap for the cartpole camera env on combos that use newton physics and non-rgb RTX variants. This is because newton physics does not provide implicit damping, and thus requires temporal data - either implicit (TAA artifacts in RTX rgb) or explicit (frame stacking in newton+warp.)
Since the non-rgb variants of RTX do not provide implicit temporal data via TAA, it is necessary to enable frame stacking so the CNN can infer velocity frame-to-frame. This allows the problematic combos to converge to the same high success_rate as their analogous baselines: newton+RTX depth/albedo now out-perform the same physx combos. Newton+RTX simple shading converges comparably to physx - still relatively low, but this will require a deeper investigation into simple shading if we wish to fix it.
Performance shows a marginal FPS overhead and twice as much memory usage - within expectations for this change.
Type of change
Screenshots
Checklist
pre-commitchecks with./isaaclab.sh --formatconfig/extension.tomlfileCONTRIBUTORS.mdor my name already exists there