AE filtered/whitelist arm captures 0 for the flagged pod (live filtered-write), while unfiltered + the computed proxy show it should keep it
Symptom (6a28bdc0, 2026-06-10, AE authenticated + scanning)
This was the first fully authenticated volproof run. The EVERYTHING (unfiltered) arm captured real data, but the AE (filtered/whitelist) arm wrote 0 rows:
log4shell_all (EVERYTHING/unfiltered) http 1011 rows / 17 pods ✅
log4shell_ae (filtered/whitelist) http 0 rows / 0 pods ❌
AE filtered scans log:
streaming.TableScanner: query completed mode=whitelist pods=2 rows=0 table=http_events
The filtered query completes (not DeadlineExceeded) but returns 0 rows for its 2-pod whitelist.
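To catch this symptom automatically, a small check over the scanner log could flag whitelist-mode queries that complete with zero rows. The sketch below assumes the line format is exactly the key=value layout quoted above; the function names are hypothetical and not part of the AE codebase.

```python
import re
from typing import Optional

# Matches space-separated key=value tokens, as seen in the quoted log line.
# Assumption: values contain no whitespace; key order is not relied on.
KV = re.compile(r"(\w+)=(\S+)")

def parse_query_completed(line: str) -> Optional[dict]:
    """Return the key=value fields of a 'query completed' line, or None."""
    if "query completed" not in line:
        return None
    _, _, tail = line.partition("query completed")
    return dict(KV.findall(tail))

def is_empty_whitelist_scan(line: str) -> bool:
    """True when a whitelist-mode query finished but returned no rows."""
    fields = parse_query_completed(line)
    if not fields:
        return False
    return fields.get("mode") == "whitelist" and fields.get("rows") == "0"

line = ("streaming.TableScanner: query completed "
        "mode=whitelist pods=2 rows=0 table=http_events")
print(is_empty_whitelist_scan(line))
```

Running this over the AE log for a run would separate "query timed out" from "query completed but the whitelist matched nothing", which is the distinction this issue hinges on.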
Proof the data exists + should be kept
Computed proxy = EVERYTHING capture ∩ dx active-set (http_events.pod IN concat(aa.namespace,'/',aa.pod)):
- The dx-flagged backend produced 161 http rows / 60,443 uncompressed / 5,114 compressed in-window and IS in adaptive_attribution (R0001 fired: exfil chain cut/tr/base32/getent).
So the adaptive policy SHOULD keep 161 http rows (84.1% row / 94.6% compressed reduction vs ALL). The live filtered-write emitted none of them.
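The proxy computation and the row-reduction figure can be reproduced with a few lines. This is a sketch only: the pod names are placeholders, and the 850 "other" rows are simply 1011 minus 161, not a per-pod breakdown from the run.

```python
# Computed proxy: keep only EVERYTHING-arm rows whose pod is in the dx active-set.
# Only the counts (161 kept out of 1011) come from run 6a28bdc0; names are made up.

def proxy_keep(rows, active_set):
    """Rows from the unfiltered capture whose 'pod' is in the active-set."""
    return [r for r in rows if r["pod"] in active_set]

def reduction_pct(kept, total):
    """Percent of rows dropped relative to the unfiltered arm."""
    return 100.0 * (1 - kept / total)

# Synthetic stand-in for the capture: 161 rows from the flagged pod, 850 from others.
rows = [{"pod": "ns/flagged-backend"}] * 161 + [{"pod": "ns/benign"}] * 850
active_set = {"ns/flagged-backend"}  # mirrors concat(aa.namespace, '/', aa.pod)

kept = proxy_keep(rows, active_set)
print(len(kept), round(reduction_pct(len(kept), len(rows)), 1))  # 161 84.1
```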
Root cause (whitelist membership/retention, NOT the regex)
- The pod-name format matches: http_events.pod = ns/pod (from px.upid_to_pod_name), activeset.Key.Render() = ns/pod. The proxy join confirms they match, so the scanner.go regex_match('^(ns/pod|…)$', df.pod) is not the problem.
- The problem is that the live whitelist did not contain the http-producing flagged backend during its capture window. The AE whitelist was pods=2 (benign), and the flagged backend (adaptive_attribution last_seen mid-window) was not retained in it. By post-run the whitelist had aged to 2 non-http pods.
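The distinction between "regex is wrong" and "whitelist is missing the pod" can be illustrated with an anchored-alternation filter. This is not the scanner.go implementation (which is Go and evaluates regex_match inside the query); Python's re module stands in here, and the helper names and pod names are hypothetical.

```python
import re

def whitelist_pattern(keys):
    """Build an anchored alternation like '^(ns/a|ns/b)$' from ns/pod keys."""
    # re.escape guards against regex metacharacters such as '.' in pod names.
    return "^(" + "|".join(re.escape(k) for k in sorted(keys)) + ")$"

def filter_rows(rows, keys):
    """Keep rows whose 'pod' exactly matches one of the whitelist keys."""
    if not keys:  # an empty whitelist keeps nothing
        return []
    pat = re.compile(whitelist_pattern(keys))
    return [r for r in rows if pat.match(r["pod"])]

rows = [{"pod": "prod/backend"}, {"pod": "prod/backend-2"}, {"pod": "kube/dns"}]

# Whitelist contains the flagged pod: its rows are kept (exact match only).
print(len(filter_rows(rows, {"prod/backend"})))

# Whitelist holds two other pods: the same correct regex logic returns 0 rows.
print(len(filter_rows(rows, {"other/a", "other/b"})))
```

The second call mirrors the observed behaviour: a well-formed pattern over a 2-pod whitelist that simply does not include the http-producing pod.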
Why this regressed (didn't happen on f518)
On f518 the backend was warm/long-running → stably in the active-set → AE arm captured (vc1 http=142). The per-arm backend bounce added for the stateful-exploit fix (k8sstormcenter/bob#140) makes the flagged pod transient: it's bounced at arm start, flagged (R0001) only mid-window, and isn't propagated into / retained in the AE filtering whitelist long enough to capture. Relates to entlein/dx#62 (active-set not refreshed/retained).
What to fix
- dx→AE push: ensure an R0001-flagged pod is pushed into the AE filtering whitelist promptly on detection and retained for at least the capture window (TTL ≥ the AE stream window; don't age out a still-active flagged pod). (No StartExport/aeclient log lines were observed during the run; verify the control-push path actually fires for R0001.)
- AE FilterUpdater: confirm the live whitelist reflects adaptive_attribution (it lagged to 2 pods missing the flagged backend).
- Verify with a fresh-but-warmed pod: flagged pod present in mode=whitelist AND query completed … rows>0.
- Secondary: DeadlineExceeded on heavy tables (conn_stats/amqp) at ADAPTIVE_STREAM_WINDOW_SEC=20; raise the window for filtered mode (few pods).
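The retention rule in the first item (TTL ≥ stream window, refresh while the pod is still active) could look roughly like the following. This is a minimal sketch under stated assumptions: the Whitelist class, its method names, and the 60-second TTL are hypothetical and do not describe the actual FilterUpdater or dx push code.

```python
import time

class Whitelist:
    """TTL-based whitelist sketch: entries expire ttl_sec after their last refresh."""

    def __init__(self, ttl_sec, stream_window_sec):
        # Enforce the constraint proposed above: TTL >= AE stream window.
        if ttl_sec < stream_window_sec:
            raise ValueError("ttl_sec must be >= stream_window_sec")
        self.ttl_sec = ttl_sec
        self._expiry = {}  # ns/pod -> expiry timestamp

    def push(self, key, now=None):
        """Add or refresh a flagged pod; call on detection and on each later sighting."""
        now = time.monotonic() if now is None else now
        self._expiry[key] = now + self.ttl_sec

    def active(self, now=None):
        """Drop expired entries and return the keys still retained."""
        now = time.monotonic() if now is None else now
        self._expiry = {k: t for k, t in self._expiry.items() if t > now}
        return set(self._expiry)

# Hypothetical timeline (seconds): pod flagged mid-window, seen again later.
wl = Whitelist(ttl_sec=60, stream_window_sec=20)
wl.push("ns/flagged-backend", now=10)   # R0001 fires
wl.push("ns/flagged-backend", now=50)   # still active, so the expiry is extended
print(wl.active(now=100))               # retained: expiry is 50 + 60 = 110
print(wl.active(now=120))               # aged out only after it goes quiet
```

Passing explicit `now` values keeps the example deterministic; in a live path the default `time.monotonic()` would be used instead.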
Interim
The valid ALL-vs-AE reduction number is obtainable via the computed proxy (EVERYTHING ∩ active-set) until the live filtered-write retains flagged pods. http: 84.1% rows / 94.6% compressed; dns ~100% (benign-only); combined 98.4% compressed (62×).
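For reference, the "N×" figure follows from the percent reduction as 1 / (1 − reduction). A quick check, using only the percentages quoted above (the function name is illustrative):

```python
def reduction_factor(pct_reduction):
    """Convert a percent reduction into an 'N times smaller' factor."""
    return 1.0 / (1.0 - pct_reduction / 100.0)

print(round(reduction_factor(98.4), 1))  # 62.5
```

This is consistent with the 62× quoted; the exact value depends on the unrounded byte counts.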