fix(glcm): default co-occurrence offset to 1 and exclude background#356

Open
darkclad wants to merge 1 commit into
PolusAI:mainfrom
darkclad:main-glcm-defaults-and-background

@darkclad darkclad commented Jul 1, 2026


Two production-path defects that the hard-coded phantom tests missed, plus a guard for a degenerate case the fixes expose:

  • GLCM_OFFSET was zero-initialised when settings are compiled from the production path, so dx=dy=0 -> every pixel co-occurs with itself -> CONTRAST=0 for any image. Fix: default GLCM_OFFSET=1 (IBSI delta=1) in env_features.cpp (plus GLCM_GREYDEPTH/GLCM_NUMANG).
  • The MATLAB grey-binning path maps off-ROI background (0) to level 1; the old zero-skip guard only ran on the IBSI path, so background inside a concave ROI's bounding box flooded the co-occurrence matrix and diluted CONTRAST. Fix: skip on the ORIGINAL intensity (imR==0) in all binning paths (glcm.cpp).
  • Soft-NAN guards on CORRELATION/INFOMEAS1 for the degenerate single-grey-level case that a correctly masked tiny ROI can produce (glcm.cpp/.h, glcm_nontriv.cpp).

Refresh the 25 background-sensitive phantom goldens in test_glcm.h (the background-insensitive keys are unchanged), add test_glcm_bugs.h with a production-path offset-default regression guard (registered in test_all.cc), and add GLCM contrast/background Python tests validated against PyRadiomics/MIRP.

darkclad commented Jul 1, 2026

Author

Justification — golden value changes for main-glcm-defaults-and-background

Branch: main-glcm-defaults-and-background (split from main-feature-validation, based on main)
Source fix: glcm.cpp, glcm.h, glcm_nontriv.cpp, env_features.cpp
Test files touched: tests/test_glcm.h (phantom goldens), tests/test_glcm_bugs.h (new),
tests/test_all.cc (registration), tests/python/test_feature_bugs.py (2 tests)

The bugs fixed

  1. Co-occurrence offset defaulted to 0 (env_features.cpp). When the GLCM settings are compiled
    from the production path, GLCM_OFFSET was zero-initialised, so dx = dy = 0: every pixel
    co-occurs with itself → a purely diagonal matrix → CONTRAST = 0 for any image. The hard-coded
    C++ phantom tests set offset = 1 explicitly, so they never exercised the default and missed this.
    Fix: default GLCM_OFFSET = 1 (IBSI delta = 1), plus GLCM_GREYDEPTH/GLCM_NUMANG.
  2. Background counted in the MATLAB binning path (glcm.cpp calculateCoocMatAtAngle). The
    non-IBSI (MATLAB) grey-binning maps off-ROI background (original intensity 0) to level 1, and the
    old guard only skipped zeros on the IBSI path. So background inside a concave ROI's bounding box
    was counted as a real grey tone and flooded the matrix with spurious diagonal mass (diluting
    CONTRAST). Fix: skip on the original intensity (imR.yx(...) == 0) in all binning paths, so
    every path agrees with the IBSI path and external oracles.
  3. Soft-NAN guards for CORRELATION (f_corr) and INFOMEAS1 (f_info_meas_corr1): once background
    is correctly excluded, a tiny/uniform ROI can leave a single grey level → zero-variance marginal →
    0/0 = NaN. Return the soft-NAN sentinel instead.
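
For reviewers who want to see bugs (1) and (2) in isolation, here is a minimal pure-Python sketch. It is not the Nyxus implementation: the helper names (cooc_matrix, contrast), the tiny test images, and the rule that a pair is dropped when either pixel is background are assumptions made for illustration. Only the behaviour it demonstrates (offset 0 yields a diagonal matrix, and background binned to level 1 dilutes contrast) comes from the description above.

```python
def cooc_matrix(img, dx, dy, levels, skip_background=True):
    """Asymmetric co-occurrence counts for displacement (dx, dy).

    img: list of rows of integer grey levels in 1..levels, where 0 marks
    off-ROI background (an assumption for this sketch).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(img), len(img[0])
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if not (0 <= y2 < rows and 0 <= x2 < cols):
                continue
            a, b = img[y][x], img[y2][x2]
            if skip_background and (a == 0 or b == 0):
                continue  # fixed behaviour: test the ORIGINAL intensity
            # Pre-fix behaviour on the non-IBSI path: background lands in level 1.
            a, b = max(a, 1), max(b, 1)
            counts[a - 1][b - 1] += 1
    return counts


def contrast(counts):
    """Sum of (i - j)^2 * p(i, j) over the normalised matrix."""
    total = sum(map(sum, counts))
    if total == 0:
        return 0.0
    n = len(counts)
    return sum((i - j) ** 2 * counts[i][j]
               for i in range(n) for j in range(n)) / total


# Bug (1): with offset 0 every pixel pairs with itself, so contrast vanishes.
ramp = [[1, 2, 3, 4]] * 3
offset0 = contrast(cooc_matrix(ramp, 0, 0, levels=4))
offset1 = contrast(cooc_matrix(ramp, 1, 0, levels=4))

# Bug (2): zeros inside the bounding box of a concave ROI.
concave = [[1, 4, 1],
           [4, 0, 0],
           [1, 0, 0]]
masked = contrast(cooc_matrix(concave, 1, 0, levels=4))
diluted = contrast(cooc_matrix(concave, 1, 0, levels=4, skip_background=False))
```

On these inputs offset0 is zero while offset1 is positive, and diluted comes out lower than masked because the background pairs add mass on and near the diagonal.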

tests/test_glcm.h golden changes (the 25 refreshed phantom keys)

These are Nyxus-convention regression values for the digital phantom (100 grey levels, offset 1,
Nyxus's asymmetric cooc matrix) — not an external oracle for the phantom itself. They were refreshed
because the phantom's slices z2–z4 contain masked-out (background) pixels, and fix (2) now excludes
those pixels, so the co-occurrence matrix — and every statistic derived from it — changes on those
slices.

  • Ground truth: the values are the corrected code's deterministic output on the phantom, taken
    verbatim from the oracle-validated main-feature-validation branch. The C++ test
    (test_glcm_feature) passes again with them; its agrees_gt(..., 100.) check is only a loose
    regression bound.
  • Internal consistency: the keys that are insensitive to the background exclusion (GLCM_CONTRAST,
    GLCM_CLUTEND, GLCM_SUMVARIANCE, GLCM_IDN, GLCM_IDMN) were kept unchanged at full
    precision. Only the 25 background-sensitive keys moved. GLCM_CORRELATION and GLCM_INFOMEAS1 are
    the keys that hit the soft-NAN (= 0) guard, on the single degenerate slice.
  • The map/function were renamed (glcm_values, glcm_truth_key) as part of this GLCM test's own
    refresh; this is GLCM-local, not the cross-cutting table-merge refactor (which was dropped).
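
The soft-NAN guard on GLCM_CORRELATION mentioned above can be pictured with the following sketch. It is an illustration under assumptions, not the code in glcm.cpp: the function name correlation and the constant SOFT_NAN are hypothetical, and the sentinel value 0 is taken from the "soft-NAN(=0)" remark in this comment.

```python
import math

SOFT_NAN = 0.0  # hypothetical name; value follows the "soft-NAN(=0)" remark


def correlation(counts):
    """Correlation of a co-occurrence count matrix, guarded against 0/0."""
    total = sum(map(sum, counts))
    if total == 0:
        return SOFT_NAN
    n = len(counts)
    p = [[v / total for v in row] for row in counts]
    px = [sum(p[i]) for i in range(n)]                       # row marginal
    py = [sum(p[i][j] for i in range(n)) for j in range(n)]  # column marginal
    mux = sum((i + 1) * px[i] for i in range(n))
    muy = sum((j + 1) * py[j] for j in range(n))
    varx = sum(((i + 1) - mux) ** 2 * px[i] for i in range(n))
    vary = sum(((j + 1) - muy) ** 2 * py[j] for j in range(n))
    denom = math.sqrt(varx * vary)
    if denom == 0.0:
        # A single grey level gives a zero-variance marginal: return the
        # sentinel instead of evaluating 0/0.
        return SOFT_NAN
    cov = sum(((i + 1) - mux) * ((j + 1) - muy) * p[i][j]
              for i in range(n) for j in range(n))
    return cov / denom


single_level = correlation([[0, 0], [0, 5]])  # only grey level 2 is present
two_levels = correlation([[1, 0], [0, 1]])    # perfectly correlated pairs
```

The guard tests the denominator rather than the input size, so it triggers for any ROI whose surviving pixels share one grey level, whatever the ROI's area.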

External-oracle proof lives in the Python tests

The phantom goldens are regression anchors; the oracle validation is in
tests/python/test_feature_bugs.py, which runs the production featurize() path:

  • test_glcm_contrast_nonzero_by_default — a horizontal intensity ramp must have 0° contrast > 0 and
    90° contrast == 0. Directly catches bug (1) (offset=0 → all-zero contrast).
  • test_glcm_background_not_counted — on the canonical concave ROI quantized to levels 1..64 (identity
    binning), GLCM_CONTRAST_AVE must match the transparent numpy reference / MIRP / PyRadiomics
    (≈133.9). Pre-fix this ROI gave ≈99 (background-diluted). Catches bug (2).
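
The property behind test_glcm_contrast_nonzero_by_default can be checked without Nyxus using a small pair-based reference. This is a sketch, not the test in test_feature_bugs.py: directional_contrast is a hypothetical helper, it does not call featurize(), and it relies on the fact that GLCM contrast equals the mean squared grey-level difference over the counted pixel pairs. The sign convention chosen for the 90° displacement is also an assumption; it does not affect the result on this image.

```python
def directional_contrast(img, dx, dy):
    """Mean squared grey-level difference over all in-bounds (dx, dy) pairs."""
    rows, cols = len(img), len(img[0])
    squared = []
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                squared.append((img[y][x] - img[y2][x2]) ** 2)
    return sum(squared) / len(squared) if squared else 0.0


# Horizontal ramp: intensity grows along x and is constant along y.
ramp = [[x + 1 for x in range(8)] for _ in range(8)]

c0 = directional_contrast(ramp, 1, 0)    # 0 degrees: neighbour along x
c90 = directional_contrast(ramp, 0, -1)  # 90 degrees: neighbour along y
```

Here c0 is positive because horizontally adjacent pixels differ by one level, and c90 is zero because vertically adjacent pixels are identical.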

test_glcm_bugs.h (new) + test_all.cc

test_glcm_bug_offset_default_is_one() is a C++ regression guard that builds GLCM through the
production settings path (not the hard-coded phantom), asserting the co-occurrence distance
defaults to 1 — the exact gap that let bug (1) ship. Registered in test_all.cc as
TEST_GLCM_BUG_OFFSET_DEFAULT (only the #include and the TEST(...) block were added; the file's
pre-existing mixed line endings were left byte-for-byte untouched — no churn).

CI

Full suite green on this branch: C++ runAllTests 683/683 passed; pytest tests/python/
50 passed, 1 skipped (48 baseline + the 2 GLCM tests above).

@darkclad darkclad force-pushed the main-glcm-defaults-and-background branch from 7c1ede4 to 9f0f459 Compare July 2, 2026 15:28