Lower ASLR entropy so ASan can map its shadow region #270

ms609 merged 4 commits into `main` from `fix-asan-aslr-entropy` on May 18, 2026

@ms609 ms609 commented May 18, 2026

Summary

The `AddressSanitizer tests` job has been failing on every push for several weeks (last 5+ `ASan.yml` runs on main) with:

```
==PID==Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING.
```

This is google/sanitizers#856 / llvm/llvm-project#85780. The ubuntu-24.04 runner kernel uses `vm.mmap_rnd_bits=32` for ASLR, which exceeds what ASan's shadow memory layout can accommodate, so the sanitizer aborts at startup.

The canonical fix is `sudo sysctl -w vm.mmap_rnd_bits=28` before any ASan-instrumented binary runs. This PR adds that step right after `actions/checkout` in `.github/workflows/ASan.yml`.

Why only `tests` fails today while `examples` / `vignettes` pass: `memcheck/tests.R` calls `testthat::test_local()`, which loads more shared libraries before ASan allocates its shadow than the smaller scripts do, pushing the address-space layout past what default-entropy ASLR allows. The sysctl applies to all three matrix configurations, so the fix is robust to future drift.
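
A minimal sketch of the sysctl step described above, written as shell that a workflow `run:` block could call. The helper names `entropy_too_high` and `lower_mmap_entropy` are hypothetical and do not come from `ASan.yml`; the ceiling of 28 is the value quoted in this PR.

```shell
# Sketch only: lower mmap ASLR entropy when it exceeds the value ASan tolerates.
# 28 is the workaround value quoted in this PR; the function names are hypothetical.
ASAN_MAX_RND_BITS=28

# Succeeds (exit status 0) when the given entropy value is above the ceiling.
entropy_too_high() {
  [ "$1" -gt "$ASAN_MAX_RND_BITS" ]
}

# Reads the live kernel value and lowers it if needed. Requires sudo.
lower_mmap_entropy() {
  current=$(sysctl -n vm.mmap_rnd_bits) || return 1
  if entropy_too_high "$current"; then
    sudo sysctl -w "vm.mmap_rnd_bits=$ASAN_MAX_RND_BITS"
  else
    echo "vm.mmap_rnd_bits is already $current; leaving it unchanged"
  fi
}
```

Calling `lower_mmap_entropy` once, before the first instrumented binary starts, should be enough. Checking the current value first keeps the step a no-op on runners whose kernel already uses a lower setting.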

Test plan

  • Watch the `gcc-ASAN` workflow on this PR; `AddressSanitizer tests` should now pass (along with the already-green examples/vignettes subjobs).
  • No package-code changes; package tests / R CMD check unaffected.

🤖 Generated with Claude Code

ms609 added 4 commits May 18, 2026 13:29
The "AddressSanitizer tests" job has been failing on every push for
several weeks with:

  ==PID==Shadow memory range interleaves with an existing memory
  mapping. ASan cannot proceed correctly. ABORTING.

This is google/sanitizers#856 / llvm/llvm-project#85780. The
ubuntu-24.04 runner ships a kernel with vm.mmap_rnd_bits=32; ASan's
shadow memory layout cannot accommodate that much ASLR entropy and
aborts at startup. Setting vm.mmap_rnd_bits=28 (the canonical
workaround) gives ASan a clean shadow region.

Only "AddressSanitizer tests" is affected on this repo today, because
testthat::test_local() loads more shared libraries before ASan
allocates its shadow than the examples/vignettes subjobs do. Applying
the sysctl unconditionally to all three configurations is harmless
and avoids future drift.

Setting vm.mmap_rnd_bits=28 alone was applied successfully (logs
confirm 'vm.mmap_rnd_bits = 28') but ASan still aborted with the
ELF_ET_DYN_BASE diagnostic - the kernel's load address for PIE
binaries on ubuntu-24.04 collides with ASan's shadow regardless of
mmap entropy. Disable system-wide ASLR for the runner VM to give
ASan a stable, low-address layout.

The previous attempt set sysctl vm.mmap_rnd_bits=28 and
kernel.randomize_va_space=0, both confirmed applied on the runner.
ASan still aborted with the ELF_ET_DYN_BASE diagnostic, indicating
the failure is not about ASLR entropy.
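
One way to confirm that both settings took effect is to read them back from `/proc/sys` rather than rely on the `sysctl -w` echo. This is a sketch under that assumption; `sysctl_path` and `sysctl_is` are hypothetical helper names, not part of the workflow.

```shell
# Map a sysctl key such as vm.mmap_rnd_bits to its file under /proc/sys.
sysctl_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Succeed only when the live value of key $1 equals the expected value $2.
# An unreadable or missing key counts as a mismatch.
sysctl_is() {
  [ "$(cat "$(sysctl_path "$1")" 2>/dev/null)" = "$2" ]
}
```

A step could then run `sysctl_is vm.mmap_rnd_bits 28 && sysctl_is kernel.randomize_va_space 0` and fail early if either value was not applied.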

Log analysis showed the abort happens AFTER 'library(TreeTools)'
succeeds (the 'Creating a generic function' messages from TreeTools
loading print, then ASan aborts). That timing rules out a parent-R
startup failure: libasan must be initialising late, when R dlopen()s
the package's instrumented .so.

The 'Initialize ASan configuration' step had

    export LD_PRELOAD=$(gcc -print-file-name=libasan.so)

inside a subshell, so the variable did not persist to the Rscript
step. Without LD_PRELOAD, libasan is only pulled in via the package's
.so dependency, by which point R's heap, libR.so, and (for the tests
subjob) testthat's full dependency stack already occupy the address
range ASan needs for its shadow region. examples and vignettes
happened to leave enough room; tests did not.
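
The subshell behaviour described above can be reproduced in isolation. The sketch below uses a made-up variable, `DEMO_PRELOAD`, and a placeholder path so that it has no effect on the dynamic loader; it only illustrates the scoping rule, not the workflow's actual step.

```shell
# Demonstration: an export made inside a subshell is lost when the subshell exits.
# DEMO_PRELOAD and the path are placeholders, not the real LD_PRELOAD value.
unset DEMO_PRELOAD

( export DEMO_PRELOAD=/hypothetical/libasan.so )   # subshell: the change is discarded
after_subshell=${DEMO_PRELOAD-unset}

export DEMO_PRELOAD=/hypothetical/libasan.so       # current shell: the change persists
after_export=${DEMO_PRELOAD-unset}

echo "after subshell: $after_subshell"
echo "after export:   $after_export"
```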

Set LD_PRELOAD inline on the Rscript command so libasan loads at
process startup, before any other shared object. Keep the
mmap_rnd_bits sysctl as defence in depth; drop the more invasive
randomize_va_space=0 now that the root cause is addressed.
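
A minimal sketch of the inline form, assuming a small wrapper function. `run_with_asan_preload` and the `ASAN_LIB` override are hypothetical names introduced for illustration; the `gcc -print-file-name=libasan.so` lookup is the one quoted in this commit message.

```shell
# Run a command with libasan preloaded for that command only.
# ASAN_LIB is a hypothetical override; by default the path is asked of gcc.
run_with_asan_preload() {
  asan_lib=${ASAN_LIB:-$(gcc -print-file-name=libasan.so)}
  # The assignment prefix scopes LD_PRELOAD to this one command,
  # so libasan is loaded at process startup and nothing leaks into later steps.
  LD_PRELOAD="$asan_lib" "$@"
}
```

Usage would look like `run_with_asan_preload Rscript memcheck/tests.R`. Because the variable is set on the command itself, it does not depend on an earlier step's environment surviving.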

The LD_PRELOAD fix in the previous commit moved ASan past the
shadow-memory abort, but leak detection now fires on:
- libcrypto.so.3 / libssl.so.3 / libxml2.so.2 one-time init memory
- xml2 R package allocations from HTML parsing
- /usr/bin/sed (a 1-byte allocation in sed itself, called by some
  test-discovery step)

None of these are TreeTools bugs and none are actionable from this
codebase. CRAN's own gcc-ASAN setup runs with detect_leaks=0 for the
same reason - LSan in an R session pulls in too much third-party
state to be useful. We retain valgrind for leak coverage.
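
Leak detection is controlled through the `ASAN_OPTIONS` environment variable, a colon-separated list of `key=value` flags. The sketch below appends `detect_leaks=0` without discarding flags that are already set; `with_option` is a hypothetical helper, and how the workflow actually sets the variable is not shown in this PR.

```shell
# Append one key=value flag to a colon-separated sanitizer option string.
# with_option is a hypothetical helper name.
with_option() {
  if [ -n "$1" ]; then
    printf '%s:%s\n' "$1" "$2"
  else
    printf '%s\n' "$2"
  fi
}

# Disable LeakSanitizer while keeping any options that were already present.
ASAN_OPTIONS=$(with_option "${ASAN_OPTIONS:-}" detect_leaks=0)
export ASAN_OPTIONS
```
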
@ms609 ms609 merged commit b1335ce into main May 18, 2026
6 checks passed