perf: skip H2 for full-tunnel batch requests (#1040)
Full-tunnel batches already coalesce N ops into one HTTP request, so H2 stream multiplexing adds no benefit — there's nothing to multiplex. Worse, H2 introduces measurable regressions:

- Long-poll batches complete at 16–17s (`LONGPOLL_DEADLINE` + latency) instead of timing out at 10s on H1. Each idle poll holds an Apps Script execution slot 60% longer, reducing available concurrency.
- The NonRetryable error path (`RequestSent::Maybe`) silently drops batches with no retry — data loss the H1 path doesn't have.
- `POOL_MIN_H2_FALLBACK` trims the H1 pool from 8 to 2 once H2 lands, starving tunnel batches that still need the H1 pool.

A/B tested on Pixel 6 Pro (30 batch samples each):

| Metric | H2 (stock v1.9.20) | H1 (this PR) | v1.9.14 |
|------------------|--------------------|--------------|---------|
| 16–17s batches | 8–10/30 | 0/30 | 0/30 |
| 10s timeouts | 0 | 4/30 | 5/30 |
| Active RTTs | 1.4–2.4s | 1.3–2.2s | 1.4–2.3s |

Changes:

- `tunnel_batch_request_to`: skip `h2_relay_request`, go straight to H1 pool `acquire()`. Removes the H2 try/fallback/NonRetryable block.
- `run_pool_refill`: always maintain `POOL_MIN` (8) connections. The H2-era `POOL_MIN_H2_FALLBACK` (2) trim starved tunnel batches; with tunnel traffic on H1 the pool must stay at full capacity.

H2 multiplexing remains active for relay mode (non-full) where it genuinely helps — each browser request is a separate HTTP call that benefits from stream multiplexing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
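The `run_pool_refill` change can be sketched as a pure sizing decision. This is a minimal illustration, not the actual implementation: `POOL_MIN` and `POOL_MIN_H2_FALLBACK` mirror the constants named in the PR, while the `pool_deficit_*` helpers and the `h2_active` flag are hypothetical.

```rust
// Minimal sketch of the refill sizing before and after this PR.
// The `pool_deficit_*` helpers are illustrative, not the real
// `run_pool_refill` internals.
const POOL_MIN: usize = 8;
const POOL_MIN_H2_FALLBACK: usize = 2;

// Old behavior: once an H2 connection lands, the H1 refill target drops
// to 2, starving tunnel batches that still run over H1.
fn pool_deficit_old(current: usize, h2_active: bool) -> usize {
    let target = if h2_active { POOL_MIN_H2_FALLBACK } else { POOL_MIN };
    target.saturating_sub(current)
}

// New behavior: tunnel traffic is on H1, so always refill to POOL_MIN.
fn pool_deficit_new(current: usize) -> usize {
    POOL_MIN.saturating_sub(current)
}

fn main() {
    // With H2 connected and only 2 pooled H1 connections, the old logic
    // opens nothing, while the new logic tops the pool back up to 8.
    assert_eq!(pool_deficit_old(2, true), 0);
    assert_eq!(pool_deficit_new(2), 6);
    println!("old deficit: {}, new deficit: {}",
             pool_deficit_old(2, true), pool_deficit_new(2));
}
```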
Reviewed via Anthropic Claude. Verified locally:
The architectural reasoning is sound: tunnel batches are already-coalesced single HTTP requests where H2 stream multiplexing has no value (nothing to multiplex). H2 multiplexing genuinely shines in relay mode (one HTTP call per browser request, dozens in parallel) — and you've explicitly kept H2 there. The A/B data on Pixel 6 Pro is concrete (8–10/30 stalls → 0/30).

Two things I want to verify before merging:

1. Interaction with PR #1029 (just merged in v1.9.20): that PR was @rezaisrad's fix for the 3-week #924 cluster — same
2. Cross-check vs #962 (r0ar's data): that issue's controlled A/B showed h2 enabled was strictly better than

Plan: leaving open for 3–5 days of community testing. Specifically asking testers:
If 2–3 testers confirm "feels like v1.9.14 again" without surprise regressions, I'll squash-merge as v1.9.21. Excellent perf work — the empirical bisect-into-H2-overhead is the right framing for these kinds of low-level transport regressions.

[reply via Anthropic Claude | reviewed by @therealaleph]
Bumps Cargo.toml v1.9.20 → v1.9.21 and ships the changelog. Headline: PR #1040 (@yyoyoian-pixel) skips H2 multiplexing for tunnel_batch_request_to (Full-mode batch path). Tunnel batches already coalesce N ops into one HTTP request, so H2 stream multiplexing has nothing to multiplex there; the H2 path was actively hurting via long-poll stalls + silent batch drops + pool starvation. H2 remains active for relay mode (apps_script), where r0ar's #962 A/B data confirmed it's strictly better. A/B on Pixel 6 Pro: 0/30 vs 8-10/30 long-poll stalls. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolve conflict in src/domain_fronter.rs (tunnel_batch_request_with_timeout). main (therealaleph#1040) intentionally removed the h2 fast path from tunnel batches because coalesced batches see no h2-multiplexing benefit. The branch's contribution — threading caller-supplied `batch_timeout` (so pipelined batches honor `max(configured, PIPELINED_BATCH_TIMEOUT_FLOOR)`) — is preserved on the h1 header-read path, which is the only path now. Build: clean on both crates. Tests: 217/217 client lib, 57/57 tunnel-node.
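The preserved timeout threading can be sketched as a pure function. The constant name matches the commit message; its 10-second value here is an assumption for illustration, not the crate's actual floor:

```rust
use std::time::Duration;

// Assumed value for illustration only; the real
// PIPELINED_BATCH_TIMEOUT_FLOOR may differ.
const PIPELINED_BATCH_TIMEOUT_FLOOR: Duration = Duration::from_secs(10);

// Pipelined batches honor max(configured, PIPELINED_BATCH_TIMEOUT_FLOOR),
// now applied on the H1 header-read path, the only path left.
fn effective_batch_timeout(configured: Duration) -> Duration {
    configured.max(PIPELINED_BATCH_TIMEOUT_FLOOR)
}

fn main() {
    // A caller-supplied 4s timeout is raised to the floor; 30s passes through.
    assert_eq!(effective_batch_timeout(Duration::from_secs(4)),
               Duration::from_secs(10));
    assert_eq!(effective_batch_timeout(Duration::from_secs(30)),
               Duration::from_secs(30));
    println!("ok");
}
```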
Completes [#1040](#1040) (v1.9.21). #1040 skipped H2 for `tunnel_batch_request_to` but missed `tunnel_request` — the single-op path used for plain `connect` ops. The 16–17s long-poll stalls persisted on full-tunnel sessions that go through the single-op path; this PR closes that gap.

Same fix shape: remove the H2 try/fallback/NonRetryable block from `tunnel_request`, go straight to H1 pool `acquire()`. H2 remains active for relay-mode paths (`do_relay_once_with`, exit-node `relay()`).

## All h2_relay_request call sites audited

| Call site | Function | Mode | H2 skipped? |
|---|---|---|---|
| `do_relay_once_with` | relay | Relay | No (correct — relay benefits from H2) |
| `relay()` exit-node | relay | Relay | No (correct) |
| `tunnel_request` | tunnel single op | Full tunnel | **YES — this PR** |
| `tunnel_batch_request_to` | tunnel batch | Full tunnel | Yes (PR #1040) |
| `tunnel_batch_request_with_timeout` | tunnel batch | Full tunnel | Yes (PR #1040) |

No other full-tunnel paths use H2 after this fix.

## Verified locally on top of v1.9.21

- `cargo test --lib --release`: 209/209 ✅
- `cargo build --release --features ui --bin mhrv-rs-ui`: clean ✅

Reviewed via Anthropic Claude.

Co-Authored-By: yyoyoian-pixel <noreply@github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
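The audit above boils down to one rule: H2 only where there are parallel requests to multiplex. A condensed sketch, with the caveat that the real code has no single dispatcher (each listed call site makes this choice inline, and `Mode` is a hypothetical enum):

```rust
// Illustrative dispatcher condensing the call-site audit. Not the real
// code shape; each call site decides inline.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    // do_relay_once_with, exit-node relay(): one HTTP call per browser
    // request, many in parallel, so stream multiplexing pays off.
    Relay,
    // tunnel_request, tunnel_batch_request_to,
    // tunnel_batch_request_with_timeout: ops already coalesced into a
    // single HTTP request, so there is nothing to multiplex.
    FullTunnel,
}

fn use_h2(mode: Mode) -> bool {
    matches!(mode, Mode::Relay)
}

fn main() {
    assert!(use_h2(Mode::Relay));
    assert!(!use_h2(Mode::FullTunnel));
    println!("relay: H2, full tunnel: H1");
}
```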
…ops (#1041) Bumps Cargo.toml v1.9.21 → v1.9.22. Ships @yyoyoian-pixel's PR #1041 which completes #1040 — v1.9.21 skipped H2 for tunnel_batch_request_to but missed tunnel_request (single-op connect path). 5/5 h2_relay_request call sites now audited; all full-tunnel paths use H1, relay paths keep H2. 209 lib tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary
Full-tunnel batches already coalesce N ops into one HTTP request — H2 stream multiplexing has nothing to multiplex. This PR skips the H2 path for `tunnel_batch_request_to` and keeps the H1 pool at full capacity.

H2 multiplexing remains active for relay mode (non-full), where each browser request is a separate HTTP call that genuinely benefits from stream multiplexing.
## Problem

H2 for tunnel batches introduced three regressions vs v1.9.14:

- Long-poll batches complete at 16–17s (`LONGPOLL_DEADLINE` + latency) instead of timing out at 10s on H1. Each idle poll holds an Apps Script execution slot 60% longer, reducing available concurrency.
- `RequestSent::Maybe` failures drop the entire batch with no retry — a failure mode H1 doesn't have.
- `POOL_MIN_H2_FALLBACK` trims the H1 pool from 8→2 once H2 connects, but tunnel batches still use H1 and need the full pool.

## A/B test results (Pixel 6 Pro, 30 batch samples each)

| Metric | H2 (stock v1.9.20) | H1 (this PR) | v1.9.14 |
|------------------|--------------------|--------------|---------|
| 16–17s batches | 8–10/30 | 0/30 | 0/30 |
| 10s timeouts | 0 | 4/30 | 5/30 |
| Active RTTs | 1.4–2.4s | 1.3–2.2s | 1.4–2.3s |
This PR restores v1.9.14 tunnel performance while keeping all v1.9.15+ improvements (H2 for relay, zero-copy mux, block DoH/QUIC, TLS pool tuning).
## Changes

- `tunnel_batch_request_to`: remove H2 try/fallback/NonRetryable block, go straight to H1 pool `acquire()` (-54 lines)
- `run_pool_refill`: always maintain `POOL_MIN` (8). Remove the `POOL_MIN_H2_FALLBACK` (2) trim that starved tunnel batches (-12 lines)

## Test plan

- `cargo check` clean

🤖 Generated with Claude Code