feat(policy): unified failure accrual and response-penalty load biasing #15374
Route objects do not hold effective failure-accrual configuration. Accrual is scoped to the parent Service or EgressNetwork. The route admission webhook still rejected accrual annotations with invalid values on routes, implying the setting was meaningful when nothing reads it. Stop validating it. This is an upstream-visible behavior change: a route object carrying an invalid accrual annotation, which apply-time validation used to reject, is now admitted silently.
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Add control-plane support for two opt-in outbound-policy features served to proxies. Both require a data plane built against linkerd2-proxy-api v0.20.0. A Service that sets none of the new annotations serializes to a wire policy identical to the prior release. The control plane validates every annotation and rejects any value the proxy would reject wholesale, so one bad input never invalidates the whole client policy.
Response-penalty load biasing steers peak-EWMA load balancing away from endpoints that return failures: HTTP 429, 503, every other 5xx, and the gRPC failure trailer codes. An operator enables it per Service with the penalize-failures annotation, and the proxy then uses a PenaltyPeakEwma load estimator. Two annotations tune that estimator. The load-biaser-penalty annotation sets the penalty weight, default 5s, and load-biaser-max-retry-after caps how long the estimator honors a Retry-After hint, default 300s. Both defaults match what the proxy used before, so an unset Service keeps the prior wire. The penalty decay has no annotation, since the proxy folds it into its single RTT EWMA. The separate honor-retry-after annotation lets a tripped endpoint's probe schedule respect a server Retry-After or gRPC pushback hint. That schedule stays bounded by the breaker's own backoff maximum.
Unified failure accrual adds a breaker that trips on either a run of consecutive failures or a low success ratio measured over a trailing window, selected with the value unified on the existing failure-accrual key. That window is a ring of fixed-duration buckets rather than an exponential decay. The consecutive mode keeps its prior behavior, and the new success-rate parameters take the alpha annotation prefix to mark the surface experimental.
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
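The unified breaker described in this commit message can be pictured with a minimal sketch. It is not the proxy's implementation: the `Bucket` and `UnifiedBreaker` types, their fields, and the explicit `rotate` call (standing in for a timer that fires once per bucket duration) are all hypothetical names chosen for illustration. The sketch only assumes what the text states: the breaker trips on a run of consecutive failures or on a success ratio below a threshold over a ring of fixed-duration buckets, and a value of 0 disables either dimension.

```rust
use std::collections::VecDeque;

/// Hypothetical per-bucket counters; not the proxy's actual types.
#[derive(Clone, Copy, Default)]
struct Bucket {
    successes: u32,
    failures: u32,
}

/// Illustrative breaker: trips on a run of consecutive failures OR on a
/// low success ratio measured over a ring of fixed-duration buckets.
struct UnifiedBreaker {
    max_consecutive: u32, // 0 disables the consecutive dimension
    threshold: f64,       // 0.0 disables the success-rate dimension
    min_requests: u32,    // ratio is ignored below this many requests
    consecutive: u32,
    ring: VecDeque<Bucket>,
}

impl UnifiedBreaker {
    fn new(max_consecutive: u32, threshold: f64, min_requests: u32, buckets: usize) -> Self {
        assert!(buckets > 0, "the ring needs at least one bucket");
        Self {
            max_consecutive,
            threshold,
            min_requests,
            consecutive: 0,
            ring: std::iter::repeat(Bucket::default()).take(buckets).collect(),
        }
    }

    /// Assumed to be called once per bucket duration: drop the oldest
    /// bucket and open a fresh one, so old outcomes age out of the window.
    fn rotate(&mut self) {
        self.ring.pop_front();
        self.ring.push_back(Bucket::default());
    }

    /// Record one response outcome; returns true if the breaker should trip.
    fn record(&mut self, ok: bool) -> bool {
        let current = self.ring.back_mut().expect("ring is never empty");
        if ok {
            current.successes += 1;
            self.consecutive = 0;
        } else {
            current.failures += 1;
            self.consecutive += 1;
        }
        self.tripped()
    }

    fn tripped(&self) -> bool {
        let by_run = self.max_consecutive > 0 && self.consecutive >= self.max_consecutive;
        let (ok, failed) = self
            .ring
            .iter()
            .fold((0u32, 0u32), |(s, f), b| (s + b.successes, f + b.failures));
        let total = ok + failed;
        let by_ratio = self.threshold > 0.0
            && total > 0
            && total >= self.min_requests
            && (ok as f64) / (total as f64) < self.threshold;
        by_run || by_ratio
    }
}
```

Unlike an exponential decay, the ring forgets a failure completely once its bucket rotates out, which is the property the commit message contrasts against.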
Add integration coverage for the unified and consecutive accrual modes, the penalize-failures and honor-retry-after annotations, the parent-scoped balancer inheritance in both directions, and the mode-conflict and inert-configuration diagnostics.
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
f981bfe to 0c5c2e8
Rebased to fix clippy and add minor fixes.
adleong left a comment
Some non-blocking suggestions and questions, but otherwise this looks good.
There are some slight annotation name differences between this and what's in linkerd/website#2126 but I'll update that docs PR to match what's here.
    jitter_ratio: backoff.jitter,
    respect_retry_after_hint: false,
    }),
    backoff: Some(convert_backoff(backoff, honor_retry_after)),
the consecutive failures accrual doesn't support honor_retry_after so we might as well pass false here.
    .keys()
    .any(|k| k.starts_with(success_rate_key!("")))
    {
    tracing::warn!(
If there is any invalidly configured Service in the cluster, I believe that this warning will be logged in the policy controller every time the service is updated and every time it is reindexed. This could mean a steady stream of regular repeated warnings for as long as the invalid service exists. This may be more verbose than we want.
This comment also applies to all warnings in the parsing code.
Yes, I agree it will be very noisy. Addressed in 1af90b0 (still using warn level for the rejection/drop cases).
    /// parser every boolean control-plane annotation already passes through.
    /// An unrecognized value is rejected. A typo surfaces rather than silently
    /// flipping the feature.
    fn parse_balancer_toggle(
This is great that we're matching the behavior here with the bool parsing that already happens in the go controller. This isn't balancer specific and I expect we'd want to re-use this function for any boolean annotation parsing we add in the future.
Renamed it to parse_bool_annotation in 5202e3d.
    "linkerd-policy-controller-k8s-api",
    "linkerd2-proxy-api",
    "maplit",
    "prost-types",
Why does this PR require adding a dependency?
It's adding a test dependency for policy-test because PenaltyPeakEwma exposes prost_types::Duration directly. There's nothing new, because this dependency already exists elsewhere (for runtime, in policy-controller/grpc), and this is just adding the edge policy-test -> test_depends-on -> prost-types.
A small clean-up we can do is to declare the version requirement in the root Cargo.toml and refer to the workspace version from both policy-test and policy-controller, but I thought of doing that in a separate PR.
The helper accepts the same boolean tokens as the Go controller's strconv.ParseBool and reads whatever annotation key the caller passes. This drops the balancer framing to use a more generic, reusable name.
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
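A rough sketch of a helper with this behavior follows. Only the name `parse_bool_annotation` comes from this thread. The signature, the `BTreeMap` annotation type, and the error type are assumptions made for illustration and may differ from the PR's code. The accepted tokens are the ones Go's `strconv.ParseBool` documents: `1`, `t`, `T`, `TRUE`, `true`, `True` and `0`, `f`, `F`, `FALSE`, `false`, `False`.

```rust
use std::collections::BTreeMap;

/// Illustrative only: the signature and error type are assumptions.
///
/// Returns Ok(None) when the key is absent, Ok(Some(b)) for a token that
/// Go's strconv.ParseBool accepts, and Err for any other value so that a
/// typo surfaces instead of silently flipping the feature.
fn parse_bool_annotation(
    annotations: &BTreeMap<String, String>,
    key: &str,
) -> Result<Option<bool>, String> {
    let Some(raw) = annotations.get(key) else {
        return Ok(None);
    };
    // Exact matches only: like strconv.ParseBool, no trimming and no
    // other spellings such as "yes" or "TrUe".
    match raw.as_str() {
        "1" | "t" | "T" | "TRUE" | "true" | "True" => Ok(Some(true)),
        "0" | "f" | "F" | "FALSE" | "false" | "False" => Ok(Some(false)),
        other => Err(format!("invalid boolean {other:?} for annotation {key}")),
    }
}
```

Because the key is a parameter, the same function can serve any boolean annotation, which is the reuse the review comment asked for.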
Use debug level for redundant or no-op annotations to avoid log spam. Keep warn level for the values that get rejected and dropped.
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
This adds control-plane support for two opt-in outbound-policy features served to proxies: unified failure accrual and response-penalty load biasing. Both are configured through annotations on a Service (and, for failure accrual, an EgressNetwork), and both require a data plane built against linkerd2-proxy-api v0.20.0.
A Service that sets none of the new annotations serializes to a wire policy identical to the prior release, so nothing changes for operators who do not opt in.
Unified failure accrual
Linkerd already supports consecutive-failure accrual via balancer.linkerd.io/failure-accrual: consecutive. This PR adds a second mode, unified, that trips a breaker on either a run of consecutive failures or a low success ratio measured over a trailing window. The window is a ring of fixed-duration buckets.
Unified mode runs both breaker policies. The consecutive dimension stays active at its default of 7 even when only success-rate parameters are set. To run success-rate-only breaking, set failure-accrual-consecutive-max-failures: 0.
Response-penalty load biasing
When enabled with penalize-failures, peak-EWMA load balancing steers traffic away from endpoints that return failures (HTTP 429, 503, every other 5xx, and the gRPC failure trailer codes). The proxy uses a PenaltyPeakEwma load estimator that two annotations tune. This applies to Service backends only. EgressNetwork uses a forwarding backend with no balancer, so the load-biaser annotations have no effect there.
Annotations
| Annotation | Description | Default |
|---|---|---|
| balancer.linkerd.io/failure-accrual | … consecutive; this PR adds unified. | |
| balancer.linkerd.io/failure-accrual-consecutive-max-failures | … 0 disables the consecutive dimension. | 7 |
| balancer.linkerd.io/failure-accrual-consecutive-max-penalty | | 60s |
| balancer.linkerd.io/failure-accrual-consecutive-min-penalty | | 1s |
| balancer.linkerd.io/failure-accrual-consecutive-jitter-ratio | | 0.5 |
| balancer.alpha.linkerd.io/failure-accrual-success-rate-threshold | … unified mode. 0 disables the success-rate dimension. | 0.8 |
| balancer.alpha.linkerd.io/failure-accrual-success-rate-window | | 10s |
| balancer.alpha.linkerd.io/failure-accrual-success-rate-min-requests | | 5 |
| balancer.alpha.linkerd.io/failure-accrual-honor-retry-after | … Retry-After or gRPC pushback hint. Stays bounded by the breaker backoff maximum. | false |
| balancer.alpha.linkerd.io/penalize-failures | … (PenaltyPeakEwma). | false |
| balancer.alpha.linkerd.io/load-biaser-penalty | | 5s |
| balancer.alpha.linkerd.io/load-biaser-max-retry-after | … Retry-After hint. | 300s |
Annotation stability
The new experimental surface (success-rate parameters, penalty load biasing, and the retry-after honoring toggle) uses the balancer.alpha.linkerd.io/ prefix. The inherited consecutive-failure knobs keep the stable balancer.linkerd.io/ prefix, and the unified value extends the existing stable failure-accrual key.
Backwards compatibility
The control plane validates every annotation and rejects any value the proxy would reject, so one bad input never invalidates the whole client policy. With no new annotations set, the emitted outbound-policy proto is identical to the prior release. The new defaults (penalty 5s, max-retry-after 300s, success-rate window 10s, and so on) match what the proxy used before, so a feature that is enabled but not tuned runs with the values the proxy already used.
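To make the load-biaser defaults concrete, here is a small sketch. The `LoadBiaserParams` struct and the `capped_retry_after` function are hypothetical names, not the controller's or the proxy's types. The values 5s and 300s come from this description. The clamping rule (a hint longer than the cap is honored only up to the cap) is an assumed reading of "caps how long the estimator honors a Retry-After hint".

```rust
use std::time::Duration;

/// Hypothetical container for the two load-biaser tunables.
#[derive(Debug, Clone, PartialEq)]
struct LoadBiaserParams {
    penalty: Duration,
    max_retry_after: Duration,
}

impl Default for LoadBiaserParams {
    /// Defaults as listed in the PR description: penalty 5s, cap 300s.
    fn default() -> Self {
        Self {
            penalty: Duration::from_secs(5),
            max_retry_after: Duration::from_secs(300),
        }
    }
}

/// Assumed semantics: the server's Retry-After hint is honored, but never
/// for longer than the configured cap.
fn capped_retry_after(hint: Duration, params: &LoadBiaserParams) -> Duration {
    hint.min(params.max_retry_after)
}
```

Under these assumptions, a server that sends a 15-minute Retry-After would influence the estimator for at most 300 seconds unless the operator raises load-biaser-max-retry-after.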
Validation and failure handling
The new annotations are scoped to Services (and, for failure accrual, EgressNetworks). Malformed values are logged and the affected fields fall back to their defaults, with one exception: a malformed accrual sub-value drops the whole accrual configuration for that Service rather than only the single field.
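The all-or-nothing handling of accrual sub-values can be sketched as follows. This is an illustration under stated assumptions, not the controller's parser: the `Accrual` struct, the `field` and `parse_accrual` helpers, and the range check on the threshold are hypothetical. It covers only three of the numeric annotations and leaves out the mode key and the duration-valued ones. The annotation keys and the defaults (7, 0.8, 5) are taken from the table in this description.

```rust
use std::collections::BTreeMap;

const MAX_FAILURES: &str = "balancer.linkerd.io/failure-accrual-consecutive-max-failures";
const THRESHOLD: &str = "balancer.alpha.linkerd.io/failure-accrual-success-rate-threshold";
const MIN_REQUESTS: &str = "balancer.alpha.linkerd.io/failure-accrual-success-rate-min-requests";

/// Hypothetical parsed form of three numeric accrual parameters.
#[derive(Debug, PartialEq)]
struct Accrual {
    max_failures: u32,
    threshold: f64,
    min_requests: u32,
}

/// Parse one optional field: absent => default, malformed => None.
fn field<T: std::str::FromStr>(
    ann: &BTreeMap<String, String>,
    key: &str,
    default: T,
) -> Option<T> {
    match ann.get(key) {
        None => Some(default),
        Some(raw) => raw.parse().ok(),
    }
}

/// Returns None when any sub-value is malformed, so the whole accrual
/// configuration is dropped rather than only the bad field.
fn parse_accrual(ann: &BTreeMap<String, String>) -> Option<Accrual> {
    let accrual = Accrual {
        max_failures: field(ann, MAX_FAILURES, 7)?,
        threshold: field(ann, THRESHOLD, 0.8)?,
        min_requests: field(ann, MIN_REQUESTS, 5)?,
    };
    // Assumed range check: a success ratio outside [0, 1] (or NaN) is
    // treated as malformed.
    (0.0..=1.0).contains(&accrual.threshold).then_some(accrual)
}
```

One plausible reason for dropping the whole configuration is that the parameters are interdependent, so a half-applied set could yield a breaker the operator never intended. The source does not state the motivation, so treat that as a guess.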