Skip to content

Harden rl.php endpoint check and surface actionable diagnostics#51

Merged
jjroelofs merged 3 commits into
1.xfrom
feature/50-harden-endpoint-check
May 7, 2026
Merged

Harden rl.php endpoint check and surface actionable diagnostics#51
jjroelofs merged 3 commits into
1.xfrom
feature/50-harden-endpoint-check

Conversation

@jjroelofs
Copy link
Copy Markdown
Contributor

Summary

Implements all hardening items from #50 in a single change.

The rl.php accessibility check (EndpointChecker) produced false negatives behind reverse proxies that terminate TLS — a common Drupal multisite setup — and false positives when its probe followed a redirect chain that ended at an unrelated 200. The hook_requirements() warning collapsed every failure into a generic "rl.php is not accessible," leaving operators with no idea what to fix.

This PR turns the warning into a precise, actionable diagnostic and removes the most common false-negative path.

What changed

EndpointChecker

  • Validate the response body is pong, not just status 200. Catches redirect chains that land on a different host returning HTML.
  • Trust X-Forwarded-Proto for the probe URL even when $settings['reverse_proxy'] isn't set. Safe for a same-site self-probe.
  • After a public-URL failure, retry against http://127.0.0.1[:port] with the original Host header. A loopback success tells you the file is being served correctly and the failure is in the proxy / scheme / DNS chain — not in Apache or PHP.
  • Return a structured result (status, detail, code, public_url, loopback_url, redirects) so the requirements report can describe the actual failure mode.
  • Keep follow-redirects on (legit setups may HTTP→HTTPS upgrade or canonical-host redirect) but enable track_redirects so the body-mismatch diagnostic can name the redirect target. Use strict mode so POST survives 301/302 hops.
  • Asymmetric cache TTLs: 1 h on success, 5 min on failure so a fixed misconfig clears without manual cache rebuild.
  • Narrow the swallow-all \Exception fallback to cURL errno 6/7 (COULDNT_RESOLVE_HOST / COULDNT_CONNECT). SSL handshake errors and timeouts now surface instead of being silently treated as success.

rl_requirements()
One branch per status with a description that names the specific failure (redirect target, HTTP code, body sample, exception detail) and points at the right config knob.

Public API

Preserved. The existing EndpointChecker::isAccessible(): bool stays as a thin wrapper over the new EndpointChecker::getResult(): array. Existing callers (e.g. dxpr_builder_page_attachments()) keep working unchanged.

State-cache compatibility

The cached state shape changed (accessibleresult). Old entries are detected and refreshed gracefully — no surprise TRUE or FALSE from a stale cache. Worst case: one extra check on the first request after upgrade.

A note on disabling redirects

I considered hard-disabling redirect-following so the probe could fail loudly on any 3xx, but that would false-fail legit setups (canonical-host redirects, http→https upgrade). Body-validation alone catches the bad cases (today's failure: redirect to wrong host returning HTML 200) while still tolerating sites whose canonical URL is reached via a 301. The track_redirects option gives the diagnostic enough info ("body wasn't pong, followed redirect to X") without rejecting valid redirect chains.

Test plan

  • php -l clean on changed files
  • phpcs --standard=Drupal clean
  • phpcs --standard=DrupalPractice clean
  • CI: drupal-lint, drupal-check, drush-e2e
  • Manual: install module on a Drupal 11 site behind nginx → Apache; visit /admin/reports/status. Confirm:
    • With $settings['reverse_proxy'] set: rl.php is accessible
    • With $settings['reverse_proxy'] unset and X-Forwarded-Proto present: still rl.php is accessible ✓ (new XFP trust)
    • Force a misconfig (point nginx redirect to wrong host): warning reads "rl.php probe was redirected to an unexpected target …" with the redirect URL, not the generic message.
    • Force a non-200 (rename rl.php temporarily): warning reads "rl.php returned HTTP 404" with the probed URL.
    • Force a body change (have the file echo something else): warning reads "rl.php returned unexpected content" with the body sample.

Possible follow-ups (not in this PR)

  • Unit tests for probe() classification (no tests/ directory exists yet — adding one is its own change).
  • Optional strict mode with redirects fully disabled, exposed as a config flag for operators who want the strict probe behavior.

Closes #50

Jur de Vries added 2 commits May 7, 2026 08:05
The rl.php accessibility check produced false negatives in setups behind
a reverse proxy that terminates TLS (a common Drupal multisite topology)
and false positives when the probe followed a redirect chain that ended
at an unrelated 200. The hook_requirements() warning collapsed all
failure modes into a generic "rl.php is not accessible," giving operators
no clue what to fix.

Changes to EndpointChecker:
- Validate the response body is "pong", not just status 200. Catches
  redirect chains that land on a different host returning HTML.
- Trust X-Forwarded-Proto for the probe scheme even when
  $settings['reverse_proxy'] isn't set. Safe for a same-site self-probe
  and resolves the most common false negative.
- After a public-URL failure, retry on http://127.0.0.1[:port] with the
  original Host header. A loopback success means the file is being
  served correctly and the failure is in the proxy / scheme / DNS path
  to the public hostname — not in Apache or PHP.
- Return a structured result (status / detail / code / public_url /
  loopback_url / redirects) so hook_requirements() can describe the
  exact failure mode and point operators at the right fix.
- Follow redirects (Guzzle default) but enable track_redirects so the
  body-mismatch diagnostic can name the redirect target. Use strict mode
  so the POST method survives 301/302 hops.
- Asymmetric cache TTLs: 1h on success, 5m on failure, so a fixed
  misconfig clears without a manual cache rebuild.
- Narrow the swallow-all \Exception fallback to cURL errno 6/7
  (couldn't-resolve / couldn't-connect). SSL handshake errors and
  timeouts now surface instead of being silently treated as success.

Public API preserved: EndpointChecker::isAccessible(): bool stays as a
thin wrapper over the new EndpointChecker::getResult(): array.

rl.install hook_requirements() switched to getResult() with one branch
per status. Each branch produces a description that names the specific
failure (redirect target, HTTP code, body sample, exception detail) and
links to the relevant configuration knob.

Closes #50
When X-Forwarded-Proto rewrote the probe scheme to https, buildPublicUrl()
still pulled the port from $request->getHttpHost(), which carries the
local backend port (e.g. 8080 behind nginx → Apache). The result was
URLs like https://example.com:8080/... — the public TLS terminator
isn't listening on 8080, so the probe times out (cURL error 28) even
though the public URL is fine.

Now: when XFP is present, build the host from getHost() and use
X-Forwarded-Port (or the default port for the rewritten scheme) instead
of the local request port. Verified end-to-end on a multisite behind
nginx → Apache: probe URL is now https://<host>/.../rl.php, returns
200 pong, isAccessible() = true.
@jjroelofs
Copy link
Copy Markdown
Contributor Author

Thanks for the thorough work here. The core fix (body validation, structured results, asymmetric cache TTLs) is solid. One thing to fix before merge:

STATUS_REDIRECTED is overloaded -- When the loopback probe succeeds but the public probe fails, check() always returns STATUS_REDIRECTED regardless of the actual public failure. If the public probe failed with a TLS handshake error or timeout, the status report says "rl.php probe was redirected to an unexpected target" and advises setting $settings['reverse_proxy'], which is wrong advice for those cases. Either add a dedicated status (e.g. STATUS_PROXY_MISMATCH) for the "loopback OK, public failed" scenario, or pass through the original public probe status and attach the loopback success as supplementary context.

Two smaller items that can be follow-ups:

  1. cURL errno 6/7 returns STATUS_OK before the loopback probe runs -- this early return means the loopback fallback never fires for DNS/connect failures, which undercuts its diagnostic value. Consider letting those fall through to the loopback probe.
  2. Trusting X-Forwarded-Proto without reverse_proxy -- the justification holds for a self-probe, but worth an inline comment making clear this is a deliberate exception, not a general pattern.

…oopback on connect errors

- check(): preserve the actual public probe status (REDIRECTED / HTTP_ERROR /
  BODY_MISMATCH / CONNECTION_ERROR) instead of collapsing every "loopback OK,
  public failed" outcome into STATUS_REDIRECTED. The redirected branch was
  printing the wrong "set reverse_proxy" advice for TLS errors and timeouts.
  Loopback success now attaches a `loopback_ok` flag for hook_requirements()
  to surface as supplementary context.
- check(): when public failed at the network layer (CONNECTION_ERROR) AND
  loopback succeeds, upgrade to STATUS_OK. This is the Docker / split-horizon
  DNS case — browsers reach the public URL fine; only the server can't.
- probe(): drop the errno-6/7 short-circuit. Letting DNS / TCP errors fall
  through to check() so loopback gets a chance gives a stronger signal
  (positive evidence the local server works) than the previous "file exists
  on disk, assume good" heuristic.
- buildPublicUrl(): tighten the docblock to make clear the X-Forwarded-Proto
  trust is a self-probe-only exception, not a pattern to copy.
- rl.install: each failure branch now uses `loopback_ok` to add a "the local
  server is fine" hint when applicable, and the REDIRECTED description names
  the actual redirect target.

Reviewer: github.com//pull/51#issuecomment-4395414433
@jjroelofs
Copy link
Copy Markdown
Contributor Author

Addressed all three points in 60d9e38:

1. STATUS_REDIRECTED overload — went with your option (b): check() now preserves the actual public probe status (REDIRECTED / HTTP_ERROR / BODY_MISMATCH / CONNECTION_ERROR) and attaches loopback_ok=TRUE + loopback_url as supplementary context when loopback works. Each hook_requirements() branch surfaces that hint as "the local server is fine, fix the proxy / scheme / redirect rule" alongside the precise diagnosis. The STATUS_REDIRECTED description now also names the actual redirect target instead of giving generic reverse-proxy advice.

2. errno 6/7 short-circuit — removed. probe() now reports STATUS_CONNECTION_ERROR for any transport-layer failure and lets check() decide. When public is CONNECTION_ERROR AND loopback is OK, check() upgrades to STATUS_OK — that's the Docker / split-horizon-DNS case where browsers can reach the public URL but the server can't reach itself. Stronger signal than the previous "file exists on disk, assume good" heuristic.

3. XFP-trust docblock — added a short note that this is a deliberate self-probe-only exception, not a pattern to copy elsewhere.

Validated end-to-end on the real multisite where this all started: production-realistic request (XFP=https) → STATUS_OK; broken-redirect simulation → no longer mis-routes to the reverse_proxy advice. Local lint clean (phpcs --standard=Drupal and DrupalPractice); CI workflow re-running on the new commit.

@jjroelofs jjroelofs merged commit c7fb540 into 1.x May 7, 2026
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

EndpointChecker: harden rl.php accessibility check and improve diagnostic messaging

1 participant