CNTRLPLANE-3351: enable CPU resource request overrides for Azure self-managed e2e #78644

Open
bryan-cox wants to merge 1 commit into openshift:main from bryan-cox:CNTRLPLANE-3351

bryan-cox commented Apr 30, 2026

What this PR does / why we need it:

Sets E2E_RESOURCE_REQUEST_OVERRIDES=1 for the e2e-azure-self-managed job to activate CPU request overrides on control plane pods. Without the overrides, the scheduler can over-pack hosted clusters onto management nodes, which leads to CPU starvation and e2e test timeouts.

The env var is declared in the hypershift-azure-run-e2e-self-managed step ref and consumed by the Go test code in openshift/hypershift (see openshift/hypershift#8385).
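
For reviewers unfamiliar with the pattern, here is a minimal Go sketch of how test code could gate overrides on this variable. It is not the code from openshift/hypershift#8385: the helper names, the annotation prefix `example.test/cpu-request-override`, the `deployment.container` key format, the `cpu=<value>` value format, and the rule that only the exact value "1" enables the feature are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// Name of the variable set by this PR.
const overridesEnvVar = "E2E_RESOURCE_REQUEST_OVERRIDES"

// Hypothetical annotation prefix, used only for this sketch.
const annotationPrefix = "example.test/cpu-request-override"

// overridesEnabled reports whether the flag is set to "1".
// Treating only "1" as enabled is an assumption of this sketch.
func overridesEnabled(getenv func(string) string) bool {
	return getenv(overridesEnvVar) == "1"
}

// cpuOverrideAnnotations builds one annotation per target, keyed by
// "<prefix>/<deployment>.<container>", or returns nil when disabled.
func cpuOverrideAnnotations(enabled bool, cpuByTarget map[string]string) map[string]string {
	if !enabled {
		return nil
	}
	out := make(map[string]string, len(cpuByTarget))
	for target, cpu := range cpuByTarget {
		out[annotationPrefix+"/"+target] = "cpu=" + cpu
	}
	return out
}

func main() {
	// Example target and CPU value are placeholders, not real tuning data.
	targets := map[string]string{"kube-apiserver.kube-apiserver": "500m"}
	annotations := cpuOverrideAnnotations(overridesEnabled(os.Getenv), targets)
	fmt.Println(len(annotations), "override annotation(s) would be applied")
}
```

Passing the lookup function in, rather than calling os.Getenv inside the helper, keeps the gate testable without mutating the process environment.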

Which issue(s) this PR fixes:

Fixes CNTRLPLANE-3351

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • Bug Fixes
    • Improved Azure test configuration to prevent scheduler resource over-packing on management nodes.
    • Added configurable resource request override controls for test environments, enabling better CPU resource allocation and more reliable test execution.

…-managed e2e

Set E2E_RESOURCE_REQUEST_OVERRIDES=1 for the e2e-azure-self-managed job
to activate CPU request overrides on control plane pods. This prevents
the scheduler from over-packing hosted clusters onto management nodes,
which causes CPU starvation and e2e test timeouts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
openshift-ci-robot added the jira/valid-reference label Apr 30, 2026

openshift-ci-robot commented Apr 30, 2026

@bryan-cox: This pull request references CNTRLPLANE-3351 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "5.0.0" version, but no target version was set.

coderabbitai Bot commented Apr 30, 2026

Walkthrough

This change introduces a new environment variable (E2E_RESOURCE_REQUEST_OVERRIDES) to the HyperShift Azure E2E testing pipeline. The variable enables CPU resource request override annotations on HostedClusters during testing to prevent scheduler over-packing on management nodes. It is declared with a default empty value and activated in the main test job configuration.
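
The interplay between the empty default and the per-job value can be sketched in Go as follows. This is a simplified model, not ci-operator's implementation: the `envParam` type and `resolveEnv` function are hypothetical, and the only behavior assumed is that a value set on a job replaces the default declared by the step.

```go
package main

import "fmt"

// envParam is a simplified, hypothetical stand-in for a parameter
// declared by a step: a name plus a default value.
type envParam struct {
	Name    string
	Default string
}

// resolveEnv returns the values a step would see under the assumption
// that a job-level value, when present, replaces the declared default.
func resolveEnv(params []envParam, jobEnv map[string]string) map[string]string {
	resolved := make(map[string]string, len(params))
	for _, p := range params {
		resolved[p.Name] = p.Default
		if v, ok := jobEnv[p.Name]; ok {
			resolved[p.Name] = v
		}
	}
	return resolved
}

func main() {
	const name = "E2E_RESOURCE_REQUEST_OVERRIDES"
	params := []envParam{{Name: name, Default: ""}}

	// A job that does not set the variable keeps the empty default.
	unset := resolveEnv(params, nil)
	// A job that sets it to "1", as this PR does for e2e-azure-self-managed.
	enabled := resolveEnv(params, map[string]string{name: "1"})

	fmt.Printf("job without override: %q\n", unset[name])
	fmt.Printf("job with override: %q\n", enabled[name])
}
```

Under this model, jobs that reuse the step without setting the variable keep the empty default, so the override behavior stays off for them.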

Changes

Cohort: HyperShift Azure E2E Configuration
File(s): ci-operator/config/openshift/hypershift/openshift-hypershift-main.yaml, ci-operator/step-registry/hypershift/azure/run-e2e-self-managed/hypershift-azure-run-e2e-self-managed-ref.yaml
Summary: Introduces new environment variable E2E_RESOURCE_REQUEST_OVERRIDES with default value "" and enables it ("1") in the e2e-azure-self-managed test job to apply resource request overrides to E2E HostedClusters.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 12
✅ Passed checks (12 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title clearly and specifically describes the main change: enabling CPU resource request overrides for Azure self-managed e2e testing, which matches the core objective of the PR.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names | ✅ Passed | PR modifies only YAML configuration files for CI/CD infrastructure with no Ginkgo test code or test name declarations present, placing changes outside the scope of this stability check.
Test Structure And Quality | ✅ Passed | This PR contains only YAML configuration files with no Ginkgo test code to review.
Microshift Test Compatibility | ✅ Passed | This PR only modifies CI configuration YAML files to set an environment variable; no new Ginkgo e2e tests are added.
Single Node Openshift (Sno) Test Compatibility | ✅ Passed | This PR does not add any new Ginkgo e2e tests. The changes are limited to CI/CD configuration files that enable an environment variable for an existing test job.
Topology-Aware Scheduling Compatibility | ✅ Passed | PR modifies only CI/CD configuration files (ci-operator test job definitions) that define environment variables for E2E testing, not deployment manifests, operator code, or controllers.
Ote Binary Stdout Contract | ✅ Passed | PR contains only YAML CI/CD configuration changes with no executable source code modifications that would violate OTE Binary Stdout Contract.
Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR contains only CI/CD YAML configuration changes, no new Ginkgo e2e test code with IPv4 or connectivity assumptions.
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.

openshift-ci Bot commented Apr 30, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bryan-cox

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci Bot added the approved label Apr 30, 2026
openshift-ci Bot requested review from csrwng and devguyio April 30, 2026 14:11
openshift-merge-bot commented

[REHEARSALNOTIFIER]
@bryan-cox: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

Test name | Repo | Type | Reason
pull-ci-openshift-hypershift-main-e2e-azure-self-managed | openshift/hypershift | presubmit | Ci-operator config changed
pull-ci-openshift-hypershift-release-5.1-e2e-azure-self-managed | openshift/hypershift | presubmit | Registry content changed
pull-ci-openshift-hypershift-release-5.0-e2e-azure-self-managed | openshift/hypershift | presubmit | Registry content changed
pull-ci-openshift-hypershift-release-4.23-e2e-azure-self-managed | openshift/hypershift | presubmit | Registry content changed
pull-ci-openshift-hypershift-release-4.22-e2e-azure-self-managed | openshift/hypershift | presubmit | Registry content changed
pull-ci-openshift-priv-hypershift-main-e2e-azure-self-managed | openshift-priv/hypershift | presubmit | Registry content changed
pull-ci-openshift-priv-hypershift-release-5.1-e2e-azure-self-managed | openshift-priv/hypershift | presubmit | Registry content changed
pull-ci-openshift-priv-hypershift-release-5.0-e2e-azure-self-managed | openshift-priv/hypershift | presubmit | Registry content changed
pull-ci-openshift-priv-hypershift-release-4.23-e2e-azure-self-managed | openshift-priv/hypershift | presubmit | Registry content changed
pull-ci-openshift-priv-hypershift-release-4.22-e2e-azure-self-managed | openshift-priv/hypershift | presubmit | Registry content changed

Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals.

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

openshift-ci Bot commented Apr 30, 2026

@bryan-cox: all tests passed!

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
