http, ui: reduces service observability noise#284

Merged
lePereT merged 2 commits into migration-base from http-moreshhh
Jun 23, 2026
@lePereT lePereT commented Jun 19, 2026

What type of PR is this? (check all applicable)

  • Bug Fix
  • Code Refactor
  • Test

Description

Reduces HTTP and UI observability noise while keeping useful operator visibility.

Previously the HTTP service emitted observability metrics on every exchange during normal request traffic, which made the monitor difficult to use when callers such as the switch driver were polling regularly. After quieting HTTP, the monitor still showed excessive UI status chatter because obs/v1/ui/metric/status was republished whenever only a timestamp changed or a dependency update was replayed.

This PR makes HTTP and UI observability quiet by default:

  • Replaces noisy per-exchange obs/v1/http/metric/<id>/stats publication with quieter obs/v1/http/metric/<id>/status.

  • Keeps cap/http/<id>/state/stats for programmatic HTTP state.

  • Adds HTTP observability config:

    • status_interval_s
    • request_trace
    • success_events
    • failure_rate_limit_s
  • Suppresses repeated equivalent HTTP failure logs within the configured rate-limit window.

  • Emits sparse HTTP failure summaries with a suppressed count.

  • Always emits a single request_recovered event after a prior HTTP failure, even when success_events = false.

  • Clears HTTP failure suppression windows after recovery so a later failure is visible as a fresh failure.

  • Keeps HTTP per-request debug tracing opt-in only.

  • Adds UI lifecycle status gating for obs/v1/ui/metric/status.

  • Publishes UI status only when the semantic lifecycle state changes or a slow heartbeat interval elapses.

  • Ignores volatile UI status fields for gating, including at, ts, run_id and dependency updated_at.

  • Adds UI observability config:

    • status_interval_s
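For reviewers who want a concrete picture of the new config surface, here is a minimal Python sketch. It is illustrative only and is not the code in this PR. The field names are the ones listed above; the class names `HttpObsConfig` and `UiObsConfig`, the numeric defaults, and the validation rules are assumptions made for the example.

```python
from dataclasses import dataclass


def _require_positive(name: str, value: object) -> None:
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)) or value <= 0:
        raise ValueError(f"{name} must be a positive number, got {value!r}")


def _require_bool(name: str, value: object) -> None:
    if not isinstance(value, bool):
        raise ValueError(f"{name} must be a boolean, got {value!r}")


@dataclass(frozen=True)
class HttpObsConfig:
    """Hypothetical container for the HTTP observability options."""

    status_interval_s: float = 60.0     # assumed default, not taken from the PR
    request_trace: bool = False         # per-request tracing stays opt-in
    success_events: bool = False        # quiet by default
    failure_rate_limit_s: float = 30.0  # assumed default, not taken from the PR

    def __post_init__(self) -> None:
        _require_positive("status_interval_s", self.status_interval_s)
        _require_positive("failure_rate_limit_s", self.failure_rate_limit_s)
        _require_bool("request_trace", self.request_trace)
        _require_bool("success_events", self.success_events)


@dataclass(frozen=True)
class UiObsConfig:
    """Hypothetical container for the UI observability options."""

    status_interval_s: float = 60.0  # assumed default, not taken from the PR

    def __post_init__(self) -> None:
        _require_positive("status_interval_s", self.status_interval_s)
```

Validating at construction time means a bad interval is rejected when the config is loaded rather than when the first status tick fires.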

Manual test

  • yes

Manual test description

Verified through the monitor path that HTTP no longer emits high-frequency successful exchange metrics by default, while still reporting meaningful failure and recovery transitions.

Also verified from Big Box monitor logs that remaining monitor noise was dominated by repeated UI status publications, then added UI semantic status gating to reduce timestamp-only and dependency replay chatter.

Added tests?

  • yes

Added HTTP unit coverage for:

  • default observability configuration;
  • observability config validation;
  • first failure logging;
  • repeated failure suppression;
  • recovery logging when success_events = false;
  • clearing suppression state after recovery;
  • fresh failure visibility after recovery.
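The failure and recovery behaviour these tests cover can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the implementation in this PR: the `FailureGate` class, the string failure key, and the `request_failed` and `request_ok` event names are hypothetical. Only `request_recovered`, `success_events`, and `failure_rate_limit_s` come from the description.

```python
from typing import Dict, Optional, Tuple


class FailureGate:
    """Rate-limits equivalent failure logs and reports recovery once."""

    def __init__(self, failure_rate_limit_s: float, success_events: bool = False) -> None:
        self.failure_rate_limit_s = failure_rate_limit_s
        self.success_events = success_events
        # failure key -> (time of last emitted log, failures suppressed since then)
        self._windows: Dict[str, Tuple[float, int]] = {}
        self._failing = False

    def on_failure(self, key: str, now: float) -> Optional[dict]:
        """Return an event to log, or None if this failure is suppressed."""
        self._failing = True
        window = self._windows.get(key)
        if window is None:
            # First failure for this key is always visible.
            self._windows[key] = (now, 0)
            return {"event": "request_failed", "key": key, "suppressed": 0}
        last_emit, suppressed = window
        if now - last_emit < self.failure_rate_limit_s:
            # Still inside the window: count it, log nothing.
            self._windows[key] = (last_emit, suppressed + 1)
            return None
        # Window elapsed: emit a sparse summary carrying the suppressed count.
        self._windows[key] = (now, 0)
        return {"event": "request_failed", "key": key, "suppressed": suppressed}

    def on_success(self) -> Optional[dict]:
        """Return a recovery or success event, or None when staying quiet."""
        if self._failing:
            # Recovery is reported once regardless of success_events, and the
            # suppression windows are cleared so a later failure looks fresh.
            self._failing = False
            self._windows.clear()
            return {"event": "request_recovered"}
        if self.success_events:
            return {"event": "request_ok"}
        return None
```

Passing `now` in explicitly, rather than reading a clock inside the gate, keeps the suppression window deterministic under unit test.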

Added UI unit coverage for:

  • observability status interval configuration;
  • lifecycle status publication gated by semantic change;
  • ignoring volatile dependency timestamps when computing the UI status key.
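As a rough illustration of the semantic gating these tests exercise, the sketch below computes a status key with volatile fields removed and publishes only on a key change or an elapsed heartbeat. It is a Python approximation, not the code in this PR. The `status_key` and `StatusGate` names and the payload shape are assumptions, and it strips `updated_at` at every nesting level for simplicity, whereas the PR describes ignoring it on dependencies.

```python
import json
from typing import Any, Optional

# Fields the description lists as volatile for gating purposes.
VOLATILE_FIELDS = frozenset({"at", "ts", "run_id", "updated_at"})


def _strip_volatile(value: Any) -> Any:
    # Recursively drop volatile fields from nested dicts and lists.
    if isinstance(value, dict):
        return {k: _strip_volatile(v) for k, v in value.items() if k not in VOLATILE_FIELDS}
    if isinstance(value, list):
        return [_strip_volatile(v) for v in value]
    return value


def status_key(status: Any) -> str:
    """Stable string key that ignores volatile fields and dict ordering."""
    return json.dumps(_strip_volatile(status), sort_keys=True)


class StatusGate:
    """Decides whether a UI status payload is worth publishing."""

    def __init__(self, status_interval_s: float) -> None:
        self.status_interval_s = status_interval_s
        self._last_key: Optional[str] = None
        self._last_publish: Optional[float] = None

    def should_publish(self, status: Any, now: float) -> bool:
        key = status_key(status)
        changed = key != self._last_key
        heartbeat_due = (
            self._last_publish is None
            or now - self._last_publish >= self.status_interval_s
        )
        if changed or heartbeat_due:
            self._last_key = key
            self._last_publish = now
            return True
        return False
```

With this shape, a dependency replay that only bumps `updated_at` produces the same key as before and is dropped, while a real lifecycle change goes out immediately.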

Added to documentation?

  • no documentation needed

@lePereT lePereT requested a review from rslater-cs June 19, 2026 02:20
@lePereT lePereT marked this pull request as ready for review June 19, 2026 02:20
@lePereT lePereT changed the title http: reduces HTTP service observability noise http, ui: reduces service observability noise Jun 23, 2026
@lePereT lePereT merged commit 7b96d83 into migration-base Jun 23, 2026