(corrected) Add MedNeXt-L + SkeletonRecall trainer: addresses #191 merger failure mode, validated on 7-cube re-score with official Kaggle topology metric by ciscoriordan · Pull Request #975 · ScrollPrize/villa

ciscoriordan · 2026-05-24T04:31:41Z

Adds a MedNeXt-L + SkeletonRecall surface trainer for the compressed and highly curved regions where the ResEnc-L surface model loses recall (#191). This re-pitches #925 with the evaluation corrected: scored on the official Kaggle Vesuvius Surface Detection metric (0.30 TopoScore + 0.35 SurfaceDice@τ + 0.35 VOI) rather than binary IoU, on all 7 held-out S1 cubes at threshold 0.5.

| mean over 7 S1 cubes | d058 | MedNeXt-L SkelRec |
|---|---|---|
| Kaggle score | 0.3996 | 0.4397 |
| SurfaceDice@τ=2 | 0.508 | 0.580 |
| VOI | 0.567 | 0.667 |
| TopoScore | 0.084 | 0.022 |
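
For readers unfamiliar with the metric, the weighting quoted in the description can be sketched as a plain weighted sum. This is only an illustration: the function name and the example inputs are made up, each component is assumed to already be a score in [0, 1], and the official Kaggle implementation may normalise or aggregate the components differently, so this is not expected to reproduce the table to the last digit.

```python
# Weights as quoted in the PR description (0.30 / 0.35 / 0.35).
WEIGHTS = {"topo": 0.30, "surface_dice": 0.35, "voi": 0.35}


def combined_score(topo: float, surface_dice: float, voi: float) -> float:
    """Weighted sum of three component scores (hypothetical helper).

    Each component is assumed to be a "higher is better" score in [0, 1].
    """
    components = {"topo": topo, "surface_dice": surface_dice, "voi": voi}
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * value for name, value in components.items())


# Illustrative inputs only, not values from this PR.
example = combined_score(topo=0.5, surface_dice=0.6, voi=0.7)
```

Because the topology term carries the smallest weight, a model can lose ground on TopoScore and still come out ahead overall if it gains enough on the other two components.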

MedNeXt-L (MN below) wins on all 7 cubes. d058 leads only on TopoScore, because MN's smoother predictions add spurious handles and cavities against the binary GT, but the SurfaceDice and VOI gains outweigh that. Following ensemble-not-swap, the recommended deployment is the voxel-wise max(d058, MN), which scores 0.4437 and beat the mean and every weighted blend in a 6-op fusion sweep. Per-cube numbers, the fusion sweep, and the TopoScore sub-investigation are in the supporting repo.
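
A minimal sketch of the voxel-wise max fusion, next to the mean for comparison. It assumes both models emit foreground probabilities on the same grid and that fusion happens before thresholding at 0.5; the function names and the toy 2x2 arrays are illustrative and not taken from the supporting repo.

```python
import numpy as np


def fuse_max(p_a: np.ndarray, p_b: np.ndarray) -> np.ndarray:
    """Voxel-wise maximum of two probability maps of identical shape."""
    if p_a.shape != p_b.shape:
        raise ValueError(f"shape mismatch: {p_a.shape} vs {p_b.shape}")
    return np.maximum(p_a, p_b)


def fuse_mean(p_a: np.ndarray, p_b: np.ndarray) -> np.ndarray:
    """Voxel-wise arithmetic mean, one of the alternatives a sweep might try."""
    if p_a.shape != p_b.shape:
        raise ValueError(f"shape mismatch: {p_a.shape} vs {p_b.shape}")
    return 0.5 * (p_a + p_b)


def binarize(prob: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Boolean surface mask at the given probability threshold."""
    return prob >= threshold


# Toy probability maps standing in for the d058 and MN outputs.
d058 = np.array([[0.9, 0.2], [0.4, 0.1]])
mn = np.array([[0.3, 0.7], [0.4, 0.2]])

fused = fuse_max(d058, mn)
mask_max = binarize(fused)
mask_mean = binarize(fuse_mean(d058, mn))
```

In the toy example the voxel where only one model is confident (0.2 vs 0.7) survives thresholding under max but not under mean, which is the recall-preserving behaviour an ensemble-not-swap deployment is after.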

Downstream check: feeding each surface into the v4 ink detector on the same crops, MN matches the hand-segmented bruniss surface to within 3-5 AUC points, while d058's argmax heightmap extraction gives sub-random ink.

The trainer subclasses nnUNetTrainerSkeletonRecall and only swaps in MedNeXt-L (kernel5, with a kernel3 VRAM fallback) from the optional [mednext] extra, lazy-imported so nnU-Net trainer discovery does not require it. It builds the deep-supervision heads unconditionally and toggles do_ds for the output mode, so a trained checkpoint strict-loads under nnUNetv2_predict. Pretrained weights are on HF (kernel5_skelrec_dataset059_ep33).

maxliebscher · 2026-05-25T07:32:08Z

This re-pitch is much easier to evaluate than the earlier PR; the Kaggle-metric framing and the downstream ink AUC sanity check are the right kind of evidence for #191.

I found four small integration/doc things that would be worth tightening before merge:

1. The new trainer imports nnunet_mednext at module import time, while [mednext] is optional. nnU-Net trainer discovery/import paths can touch trainer modules even when a user is not trying to build this MedNeXt trainer, so this would be safer as a lazy import inside build_network_architecture() with a clear pip install -e ".[mednext]" error if the extra is missing.
2. This class inherits the default set_deep_supervision_enabled(), which toggles mod.decoder.deep_supervision. MedNeXt uses do_ds, so validation/training toggles may not switch the output mode correctly unless this trainer overrides the toggle for MedNeXt modules.
3. The new optional extra is declared as mednext = ["mednextv1"]. Upstream MIC-DKFZ/MedNeXt currently documents installation via cloning the repository and running pip install -e ., so could you confirm this extra resolves in the intended install environment, or switch the extra to the correct package/source spec?
4. Could you update the newly added README section and trainer module docstring to match the corrected 7-cube official Kaggle metric in this PR body, or trim the metric details from repo docs and point readers to the supporting writeup? Right now those landed-doc additions still describe the older 5-cube high_compressed IoU / overall_macro IoU benchmark, which looks like superseded #925-era evidence.

I locally checked that a small patch for those first two integration risks is applicable to this PR head and that focused stubbed tests pass without installing MedNeXt. A full import/build/forward smoke with pip install -e ".[mednext]" would still be useful before merge because this PR depends on an external architecture package and bypasses the default nnU-Net architecture builder.
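
The lazy-import pattern from the first point can be sketched as follows. This is an illustration rather than the code in the PR: the helper name is hypothetical, and the only details taken from the thread are the nnunet_mednext module name and the pip install -e ".[mednext]" hint.

```python
import importlib
from types import ModuleType


def import_optional_backend(module_name: str = "nnunet_mednext") -> ModuleType:
    """Import an optional dependency at call time, not at module import time.

    Hypothetical helper: a trainer would call this from inside
    build_network_architecture(), so merely importing the trainer module
    (as trainer discovery does) never requires the extra to be installed.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with an actionable message and keep the original cause.
        raise ImportError(
            f"'{module_name}' is needed to build this architecture but is not "
            'installed. Install the optional extra with: pip install -e ".[mednext]"'
        ) from exc


def build_network_architecture_sketch(module_name: str = "nnunet_mednext") -> ModuleType:
    """Stand-in for the trainer hook; the real one would construct the network."""
    backend = import_optional_backend(module_name)
    return backend
```

Keeping the import inside the build hook means the failure, if any, surfaces only when someone actually selects this trainer, and it arrives with the install command attached.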

- Lazy-import mednextv1 inside build_network_architecture so nnU-Net trainer discovery doesn't break when the optional extra is absent; raise an actionable ImportError pointing at `pip install -e ".[mednext]"`.
- Override set_deep_supervision_enabled to toggle MedNeXt's `do_ds` instead of the default `decoder.deep_supervision` (MedNeXt has no decoder attribute).
- Point the `mednext` extra at the upstream git source; mednextv1 isn't on PyPI.
- Update README + docstring to the corrected 7-cube official Kaggle metric (0.3996 -> 0.4397, +10.0%) and drop the superseded 5-cube high_compressed IoU.
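
The deep-supervision override could look roughly like the sketch below. It uses plain Python stand-ins instead of real torch modules so it runs without any extra installed; the class names are made up, and the unwrapping of `.module` (DistributedDataParallel) and `._orig_mod` (torch.compile) is an assumption about how the network may be wrapped, not a copy of the PR's code.

```python
class FakeMedNeXt:
    """Stand-in for a MedNeXt network: has do_ds, has no decoder attribute."""

    def __init__(self) -> None:
        self.do_ds = True


class FakeCompiledWrapper:
    """Stand-in for a torch.compile wrapper, which keeps the model in _orig_mod."""

    def __init__(self, mod) -> None:
        self._orig_mod = mod


def unwrap(network):
    """Peel off DDP (.module) and torch.compile (._orig_mod) wrappers if present."""
    mod = getattr(network, "module", network)
    return getattr(mod, "_orig_mod", mod)


def set_deep_supervision_enabled(network, enabled: bool) -> None:
    """Toggle deep supervision on whichever attribute the network exposes."""
    mod = unwrap(network)
    if hasattr(mod, "do_ds"):
        # MedNeXt-style switch.
        mod.do_ds = bool(enabled)
    elif hasattr(getattr(mod, "decoder", None), "deep_supervision"):
        # Default nnU-Net-style switch on the decoder.
        mod.decoder.deep_supervision = bool(enabled)
    else:
        raise AttributeError("network exposes neither do_ds nor decoder.deep_supervision")


net = FakeMedNeXt()
set_deep_supervision_enabled(FakeCompiledWrapper(net), False)
```

Raising when neither attribute exists is a deliberate choice in this sketch: a silent no-op would leave the network in the wrong output mode during validation without any visible error.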

vercel · 2026-05-25T17:40:53Z

@ciscoriordan is attempting to deploy a commit to the scroll Team on Vercel.

A member of the Team first needs to authorize it.

Adds nnUNetTrainerSkeletonRecall_MedNeXtL_kernel5 (plus a kernel3 fallback in the same file) that extends the existing nnUNetTrainerSkeletonRecall and only overrides build_network_architecture to swap the default ResEnc U-Net for MedNeXt-L. Loss, transforms, data loaders, and train/validation steps are inherited unchanged.

Motivation: issue ScrollPrize#191 ("Surface and Fiber Predictions in Compressed or Highly Curved areas"). On a held-out 5-cube benchmark over PHerc Paris 1 (S1) and PHerc 1667 (S4), this trainer scores +0.267 absolute high_compressed IoU (+66% relative) over the d058 ResEnc-L production model; overall_macro IoU 0.534 -> 0.685.

MedNeXt is consumed as the upstream pip package mednextv1 (https://github.com/MIC-DKFZ/MedNeXt) and added as an optional dependency so users not using this trainer see no change in install footprint: `pip install -e ".[mednext]"`

README updated with a short section pointing to the issue, the benchmark comment, the HF weights, and the kernel3 fallback.

ciscoriordan · 2026-05-31T23:00:53Z

All four are addressed: the nnunet_mednext import is now lazy inside build_network_architecture, there's a do_ds deep-supervision override, the [mednext] extra points at the MIC-DKFZ git source, and the docs lead with the corrected 7-cube Kaggle metric. The pip install -e ".[mednext]" install plus a MedNeXt-L kernel5 build/forward both check out, and I rebased onto current main.

giorgioangel · 2026-06-02T13:58:43Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dfcaa15286

…or inference nnUNetv2_predict rebuilds the network with enable_deep_supervision=False and strict-loads a checkpoint trained with deep supervision. MedNeXt only registers its out_1..out_4 heads when constructed with deep_supervision=True, so the inference build was missing those weights and load_state_dict failed on unexpected out_* keys. Construct the heads unconditionally in a shared _build_mednext helper and use do_ds to select single- vs deep-supervision output, so both the kernel5 and kernel3 checkpoints load for inference.
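
The failure and the fix can be reproduced with a toy model of strict checkpoint loading. Everything here is a stand-in: TinyNet is not MedNeXt, the state dict is a plain dictionary, and the `out_0` name for the main head is an assumption (the commit message only names `out_1..out_4`).

```python
class TinyNet:
    """Toy network that mimics conditional vs unconditional head registration."""

    def __init__(self, do_ds: bool, always_build_heads: bool) -> None:
        self.do_ds = do_ds
        self.params = {"out_0.weight": 0.0}  # main head (name assumed)
        if do_ds or always_build_heads:
            for i in range(1, 5):  # auxiliary deep-supervision heads
                self.params[f"out_{i}.weight"] = 0.0

    def state_dict(self) -> dict:
        return dict(self.params)

    def load_state_dict(self, state: dict, strict: bool = True) -> None:
        missing = sorted(set(self.params) - set(state))
        unexpected = sorted(set(state) - set(self.params))
        if strict and (missing or unexpected):
            raise RuntimeError(f"missing={missing}, unexpected={unexpected}")
        for key in self.params:
            if key in state:
                self.params[key] = state[key]

    def forward(self):
        """Return all head names when do_ds is on, otherwise only the main head."""
        heads = sorted(k.split(".")[0] for k in self.params)
        return heads if self.do_ds else heads[0]


# Checkpoint produced by training with deep supervision enabled.
checkpoint = TinyNet(do_ds=True, always_build_heads=False).state_dict()

# Old behaviour: inference build has no aux heads, so strict loading fails.
old_inference = TinyNet(do_ds=False, always_build_heads=False)
try:
    old_inference.load_state_dict(checkpoint)
    old_load_failed = False
except RuntimeError:
    old_load_failed = True

# Fixed behaviour: heads always exist, do_ds only selects the output mode.
new_inference = TinyNet(do_ds=False, always_build_heads=True)
new_inference.load_state_dict(checkpoint)
single_output = new_inference.forward()
```

The cost of the fix is a few unused head parameters at inference time; in exchange the training and inference builds share one set of state-dict keys.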

ciscoriordan requested a review from giorgioangel as a code owner May 24, 2026 04:31

ciscoriordan mentioned this pull request May 24, 2026

Surface and Fiber Predictions in Compressed or Highly Curved areas #191

Open

SuperOptimizer closed this May 24, 2026

SuperOptimizer reopened this May 24, 2026

ciscoriordan force-pushed the mednext-l-skeletonrecall-trainer branch from 4e5705a to 3c336f2 Compare May 30, 2026 00:13

ciscoriordan force-pushed the mednext-l-skeletonrecall-trainer branch from 3c336f2 to dfcaa15 Compare May 30, 2026 01:05

chatgpt-codex-connector Bot reviewed Jun 2, 2026

Comment thread ...nunetv2/training/nnUNetTrainer/variants/loss/nnUNetTrainerSkeletonRecall_MedNeXtL_kernel5.py Outdated

ciscoriordan wants to merge 2 commits into ScrollPrize:main from ciscoriordan:mednext-l-skeletonrecall-trainer

4 participants