
feat: add playbook labels to all alerts #310

Merged
christophrichtersap merged 2 commits into main from feat/add-playbook-labels on May 28, 2026


@christophrichtersap commented May 28, 2026

Summary

  • Adds a `playbook:` label to all 10 alerts (7 in operator.yaml, 3 in eviction.yaml)
  • Labels point to existing playbook pages on operations.global.cloud.sap

Alerts updated

| Alert | Playbook slug |
|---|---|
| HypervisorOnboardingStuck | hypervisoronboardingstuck |
| HypervisorEvictionStuck | hypervisorevictionstuck |
| HypervisorEvictedTooLong | hypervisorevictedtoolong |
| HypervisorTraitSyncFailed | hypervisortraitsyncfailed |
| HypervisorAggregateSyncFailed | hypervisoraggregatesyncfailed |
| HypervisorOperatorReconcileErrors | hypervisorreconcileerrors |
| HypervisorOperatorDown | hypervisoroperatordown |
| EvictionFailed | evictionfailed |
| EvictionMigrationFailing | evictionmigrationfailing |
| EvictionOutstandingRamHigh | evictionoutstandingram |
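
For orientation, a rule carrying the new label might look roughly like the sketch below. The three label values are the ones quoted in the review excerpt for operator.yaml. The group name, `expr`, `for` duration, and annotation text are hypothetical placeholders, not the chart's actual rule.

```yaml
# Sketch only. Everything marked "hypothetical" is an assumption for illustration.
groups:
  - name: hypervisor-operator.alerts            # hypothetical group name
    rules:
      - alert: HypervisorOnboardingStuck
        expr: hypothetical_onboarding_stuck > 0  # hypothetical expression
        for: 30m                                 # hypothetical duration
        labels:
          severity: warning
          type: hypervisor_operator
          # Path is "docs/compute/kvm/playbooks/" followed by the slug from the table.
          playbook: docs/compute/kvm/playbooks/hypervisoronboardingstuck
        annotations:
          summary: Hypervisor onboarding appears stuck  # hypothetical text
```

Two slugs are not simply the lowercased alert name (`hypervisorreconcileerrors` and `evictionoutstandingram`), so the label value is best copied from the table rather than derived from the alert name.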

Ref: PlusOne/CPE-KVM#236

Test plan

  • No changes to alert expressions or thresholds — labels only
  • Deploy to QA and verify playbook links render in Alertmanager/Slack notifications
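
One way the QA check could be exercised is with an Alertmanager receiver that turns the label into a link. The fragment below is a sketch under stated assumptions, not the configuration used by this project: the receiver name, channel, secret file path, and especially the way the base URL is joined with the label value are all hypothetical.

```yaml
# Illustrative Alertmanager receiver; not taken from this PR.
receivers:
  - name: kvm-slack                              # hypothetical receiver name
    slack_configs:
      - channel: '#kvm-alerts'                   # hypothetical channel
        api_url_file: /etc/alertmanager/slack_url  # hypothetical secret path
        title: '{{ .CommonLabels.alertname }}'
        actions:
          - type: button
            text: Playbook
            # Assumes the label value is a path relative to the operations site.
            url: 'https://operations.global.cloud.sap/{{ .CommonLabels.playbook }}'
```

`.CommonLabels` only contains labels shared by every alert in a notification group, so a template like this would need alerts grouped in a way that keeps one `playbook` value per group (for example, grouping by `alertname`).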

Summary by CodeRabbit

  • New Features
    • Eviction and hypervisor operator alerts now include associated playbook references for streamlined incident response and troubleshooting guidance.


Every alert now has a `playbook:` label pointing to the corresponding
page on operations.global.cloud.sap. This enables one-click navigation
from alert notifications to triage procedures.

Ref: PlusOne/CPE-KVM#236

coderabbitai Bot commented May 28, 2026

Warning

Review limit reached

@christophrichtersap, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 49 minutes and 24 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d8aaa863-f23e-4003-87b7-7278d3120508

📥 Commits

Reviewing files that changed from the base of the PR and between 32d1851 and 99359bb.

📒 Files selected for processing (2)
  • charts/openstack-hypervisor-operator/templates/servicemonitor.yaml
  • charts/openstack-hypervisor-operator/values.yaml
📝 Walkthrough


This PR adds playbook references to ten alert rules across the hypervisor operator alerting configuration. Three eviction-lifecycle alerts and seven operator-status alerts each gain a playbook label pointing to corresponding KVM operation runbooks.

Changes

Alert playbook references

| Layer / File(s) | Summary |
|---|---|
| Eviction lifecycle alerts playbook references (charts/openstack-hypervisor-operator/alerts/eviction.yaml) | EvictionFailed, EvictionMigrationFailing, and EvictionOutstandingRamHigh each receive a playbook field referencing KVM eviction runbook paths. |
| Hypervisor operator alerts playbook references (charts/openstack-hypervisor-operator/alerts/operator.yaml) | Seven operator alerts (HypervisorOnboardingStuck, HypervisorEvictionStuck, HypervisorEvictedTooLong, HypervisorTraitSyncFailed, HypervisorAggregateSyncFailed, HypervisorOperatorReconcileErrors, HypervisorOperatorDown) each receive a playbook field referencing KVM operation runbook paths. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

Suggested reviewers

  • fwiesel

Poem

🐰 With whiskers twitching, the rabbit did roam,
Through alerts and playbooks, the operator's home,
Each rule now points to its handbook of care,
A field full of references, helpful and fair! 📚✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main change: adding playbook labels to all alerts across the two YAML files. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@charts/openstack-hypervisor-operator/alerts/operator.yaml`:
- Line 14: operator.yaml contains broken playbook links pointing at
docs/compute/kvm/playbooks/* (e.g., hypervisoronboardingstuck,
hypervisorevictionstuck, hypervisorevictedtoolong, hypervisortraitsyncfailed,
hypervisoraggregatesyncfailed, hypervisorreconcileerrors,
hypervisoroperatordown) that do not exist; either create corresponding
documentation pages at docs/compute/kvm/playbooks/<playbook-name> (preferably
markdown files with the playbook content and frontmatter) or update the paths in
charts/openstack-hypervisor-operator/alerts/operator.yaml (and any other files
referencing the same paths such as alerts/eviction.yaml) to point to the correct
existing doc location so all listed playbook links resolve.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 17e342e5-fb44-4b02-94a6-360994427244

📥 Commits

Reviewing files that changed from the base of the PR and between 65bc78f and 32d1851.

📒 Files selected for processing (2)
  • charts/openstack-hypervisor-operator/alerts/eviction.yaml
  • charts/openstack-hypervisor-operator/alerts/operator.yaml

labels:
  severity: warning
  type: hypervisor_operator
  playbook: docs/compute/kvm/playbooks/hypervisoronboardingstuck

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Verify operator playbook documentation exists
# Expected: Find references to these playbook paths in the repository or documentation

echo "Searching for operator playbook path references..."
rg -i "hypervisoronboardingstuck|hypervisorevictionstuck|hypervisorevictedtoolong|hypervisortraitsyncfailed|hypervisoraggregatesyncfailed|hypervisorreconcileerrors|hypervisoroperatordown" \
  --type md --type yaml -g '!alerts/' -C2

echo -e "\nSearching for playbook documentation structure..."
fd -t f -e md -i hypervisor | head -20

Repository: cobaltcore-dev/openstack-hypervisor-operator

Length of output: 676


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "1) Search for exact referenced playbook paths..."
rg -n --hidden --no-ignore-vcs \
  -S "docs/compute/kvm/playbooks/(hypervisoronboardingstuck|hypervisorevictionstuck|hypervisorevictedtoolong|hypervisortraitsyncfailed|hypervisoraggregatesyncfailed|hypervisorreconcileerrors|hypervisoroperatordown)" \
  . || true

echo
echo "2) List any files whose path includes docs/compute/kvm/playbooks/"
fd -t f -i --type f "hypervisoronboardingstuck" . || true
fd -t f -i --type f "hypervisorevictionstuck" . || true
fd -t f -i --type f "hypervisorevictedtoolong" . || true
fd -t f -i --type f "hypervisortraitsyncfailed" . || true
fd -t f -i --type f "hypervisoraggregatesyncfailed" . || true
fd -t f -i --type f "hypervisorreconcileerrors" . || true
fd -t f -i --type f "hypervisoroperatordown" . || true

echo
echo "3) Check if docs/compute/kvm/playbooks exists and show a sample listing..."
if [ -d "docs/compute/kvm/playbooks" ]; then
  ls -la "docs/compute/kvm/playbooks" | head -200
else
  echo "Directory not found: docs/compute/kvm/playbooks"
fi

echo
echo "4) Search for 'playbooks/' directory references in charts docs..."
rg -n "compute/kvm/playbooks/" -S . || true

Repository: cobaltcore-dev/openstack-hypervisor-operator

Length of output: 2620


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "1) Search for exact referenced playbook paths..."
rg -n --hidden --no-ignore-vcs \
  -S "docs/compute/kvm/playbooks/(hypervisoronboardingstuck|hypervisorevictionstuck|hypervisorevictedtoolong|hypervisortraitsyncfailed|hypervisoraggregatesyncfailed|hypervisorreconcileerrors|hypervisoroperatordown)" \
  . || true

echo
echo "2) Locate any files by playbook name substring (any location, case-insensitive)..."
fd -t f -i "hypervisoronboardingstuck" .
fd -t f -i "hypervisorevictionstuck" .
fd -t f -i "hypervisorevictedtoolong" .
fd -t f -i "hypervisortraitsyncfailed" .
fd -t f -i "hypervisoraggregatesyncfailed" .
fd -t f -i "hypervisorreconcileerrors" .
fd -t f -i "hypervisoroperatordown" .

echo
echo "3) Check if docs/compute/kvm/playbooks exists and show a sample listing..."
if [ -d "docs/compute/kvm/playbooks" ]; then
  ls -la "docs/compute/kvm/playbooks" | head -200
else
  echo "Directory not found: docs/compute/kvm/playbooks"
fi

echo
echo "4) Search for playbooks/ directory references..."
rg -n "compute/kvm/playbooks/" -S . || true

Repository: cobaltcore-dev/openstack-hypervisor-operator

Length of output: 2620


Fix broken playbook doc references (docs/compute/kvm/playbooks/*) in operator.yaml

charts/openstack-hypervisor-operator/alerts/operator.yaml adds playbook links for:

  • hypervisoronboardingstuck, hypervisorevictionstuck, hypervisorevictedtoolong, hypervisortraitsyncfailed, hypervisoraggregatesyncfailed, hypervisorreconcileerrors, hypervisoroperatordown

But docs/compute/kvm/playbooks/ does not exist anywhere in this repo (only these references are present, including in alerts/eviction.yaml). Ensure the corresponding playbook docs/pages are added at docs/compute/kvm/playbooks/ (or update the paths to the correct location) before merging.


Without a ServiceMonitor, Prometheus cannot scrape the operator pod.
This means HypervisorOperatorDown and HypervisorOperatorReconcileErrors
alerts are blind (no up metric, no reconcile_errors_total).

Enabled by default with 60s scrape interval.
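
As a rough illustration of what the commit describes, a ServiceMonitor with a 60s scrape interval could look like the sketch below. Only the resource kind and the interval come from the commit message. The metadata name, selector labels, and port name are hypothetical, and the real template in charts/openstack-hypervisor-operator/templates/servicemonitor.yaml may differ.

```yaml
# Sketch only. Names, labels, and the port are assumptions for illustration.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: openstack-hypervisor-operator            # hypothetical name
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: openstack-hypervisor-operator  # hypothetical label
  endpoints:
    - port: metrics                              # hypothetical Service port name
      interval: 60s                              # scrape interval stated in the commit
```

A ServiceMonitor selects a Service rather than the pod directly, so a setup like this also assumes a Service exposing the operator's metrics port with matching labels.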
@notandy notandy self-requested a review May 28, 2026 12:39
@christophrichtersap christophrichtersap merged commit 2676216 into main May 28, 2026
7 of 8 checks passed