Fix GH-8739: opcache_reset()/invalidate() races via epoch-based reclamation#21778
Open
superdav42 wants to merge 1 commit intophp:PHP-8.4from
Open
Fix GH-8739: opcache_reset()/invalidate() races via epoch-based reclamation#21778superdav42 wants to merge 1 commit intophp:PHP-8.4from
superdav42 wants to merge 1 commit intophp:PHP-8.4from
Conversation
…clamation opcache_reset() and opcache_invalidate() destroy shared memory (hash table memset, interned strings restore, SHM watermark reset) while concurrent reader threads/processes still hold pointers into that memory. This causes zend_mm_heap corrupted crashes under concurrent load in both ZTS (FrankenPHP, parallel) and FPM configurations. A prior fix attempt (PR php#14803) wrapped cache lookups in zend_shared_alloc_lock(), which was rejected by Dmitry Stogov because readers must remain lock-free. This patch introduces epoch-based reclamation (EBR), a proven lock-free pattern also used by RCU in the Linux kernel and by Crossbeam in Rust: - Readers publish their epoch on request start (one atomic store) and clear it on request end (one atomic store). No locks acquired. - Writers (opcache_reset path) increment the global epoch and defer the actual hash/interned-string cleanup until all pre-epoch readers have completed. The deferred reset completes at the next request boundary after the drain is complete. - The existing immediate-cleanup fast path is retained for the case when accel_is_inactive() reports no active readers. Additional safety: - When the per-slot pool (256 slots) is exhausted, readers fall back to an aggregate overflow counter that unconditionally blocks deferred reclamation while non-zero. This preserves safety for installations with more concurrent readers than slots. - UNASSIGNED vs OVERFLOW sentinels distinguish "slot not yet attempted" from "allocation failed", avoiding per-request atomic contention on the slot-next counter once slots are exhausted. - A full memory barrier after zend_accel_hash_clean()'s memset ensures readers on weak-memory architectures see the zeroed table before any subsequent insert. - A defensive ZCG(locked) check in accel_try_complete_deferred_reset() prevents the ZEND_ASSERT(!ZCG(locked)) failure that would otherwise fire if the completer is re-entered while the SHM lock is held. Fixes phpGH-8739 Fixes phpGH-14471 Fixes phpGH-18517 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
048c11e to
e110a90
Compare
superdav42
added a commit
to superdav42/static-php-cli
that referenced
this pull request
Apr 16, 2026
Bundles the EBR reset-safety patch from upstream PR #21778 as a source-extract hook so tarball builds pick it up automatically. The patch fixes zend_mm_heap corruption crashes triggered by opcache_reset()/opcache_invalidate() racing concurrent readers under ZTS (FrankenPHP) and FPM. - src/globals/patch/php84_opcache_ebr_reset_safety.patch: verbatim from PR #21778 commit e110a901 (NEWS hunk stripped for release tarball compatibility) - SourcePatcher::patchPhpOpcacheEbrResetSafety gated on 80400 <= ver < 80500 with an idempotency probe against accel_try_complete_deferred_reset Remove once the fix is merged and tagged into an 8.4.x release.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the long-standing race in
opcache_reset()/opcache_invalidate()that causeszend_mm_heap corruptedcrashes under concurrent load in both ZTS (FrankenPHP, parallel) and FPM configurations.The writer destructively mutates
ZCSG(hash)(memset) and the interned strings buffer while readers may still hold pointers into that memory. The fix uses epoch-based reclamation (EBR) — a lock-free pattern used by RCU in the Linux kernel and by Crossbeam in Rust — to defer the destructive cleanup until all pre-reset readers have completed their requests.Why EBR (and not PR #14803's locking)
@dstogov rejected the prior attempt (#14803) because it took
zend_shared_alloc_lock()on every cache lookup:This patch satisfies that constraint: readers remain lock-free. The per-request cost is two atomic stores to cache-line-padded slots (no false sharing). Writers wait for readers to drain instead of the other way around.
Design
Readers (
accel_activate,accel_post_deactivate):ZCSG(current_epoch)to this thread's slot.ACCEL_EPOCH_INACTIVEto this thread's slot.Writers (the deferred-restart path in
accel_activatewhenaccel_is_inactive()is false):drain_epoch = current_epoch, incrementcurrent_epoch(new readers publish a higher epoch).accelerator_enabled = false,reset_deferred = true.zend_accel_hash_clean()oraccel_interned_strings_restore_state()yet.Drain completion (called from both
accel_activateandaccel_post_deactivate):min(slot[*].epoch) > drain_epochand no overflow readers are active, take the SHM lock and run the deferred cleanup.Safety under slot exhaustion
The per-slot pool is 256 slots × 64 cache-padded bytes = 16 KiB of SHM. Installations with more than 256 concurrent threads/processes (large FPM pools) hit the overflow path: the aggregate
ZCSG(epoch_overflow_active)counter is incremented instead, and deferred reclamation is held back until it reaches zero. This preserves correctness — overflow readers never disappear silently.Test results
All tests pass with the change applied (
--enable-debug --enable-zts --enable-opcache):make test(28 extensions built)Concurrent dev-server test: ~250 parallel requests with mixed
opcache_reset()calls — server remained responsive, no crashes, no heap corruption.Regression tests added
ext/opcache/tests/gh8739_reset_epoch_safety.phpt— cached class remains accessible after resetext/opcache/tests/gh8739_reset_multiple.phpt— multiple resets don't crashext/opcache/tests/gh8739_invalidate_epoch_safety.phpt— invalidate doesn't crash while holding cached pointerFixes
opcache_reset()crash under high load in FPM)Test plan
class T {} new T; opcache_reset();underab -c 24 -n 200000) — should no longer crash