OADP-5777: Require region in BSL validation when s3Url is set, fix nil config skipping region auto-detection#2109
Conversation
When a custom s3Url is configured, the BSL points to a non-AWS S3-compatible service (e.g. IBM Cloud Object Storage, MinIO) where region auto-discovery via AWS HeadBucket API is not valid. Previously, region was only enforced when s3ForcePathStyle was true or AWS discovery failed. This allowed validation to pass with a missing region if a same-named bucket happened to exist on AWS, causing Velero to fail at runtime. Fixes: openshift#2108 Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughRequire explicit AWS region for BackupStorageLocation when a custom Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 11 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Pull request overview
This PR enhances BSL (BackupStorageLocation) validation to require an explicit region parameter when a custom s3Url is configured for AWS-provider BSLs. This prevents scenarios where the operator incorrectly attempts AWS region auto-discovery for non-AWS S3-compatible services (e.g., IBM Cloud Object Storage, MinIO), which can lead to validation passing with incorrect configuration but Velero failing at runtime.
Changes:
- Added validation logic to require
regionwhens3Urlis set in AWS BSL configuration - Updated error message to be more direct ("is required" vs "not automatically discoverable")
- Added comprehensive test cases for the new validation rule
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| internal/controller/bsl.go | Added check for s3Url in region validation logic and updated error message |
| internal/controller/bsl_test.go | Added three test cases covering s3Url with/without region and combination with s3ForcePathStyle |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
/retest |
1 similar comment
|
/retest |
|
/retest required |
|
/test all |
|
/lgtm |
|
/override ci/prow/4.21-e2e-test-aws ci/prow/4.21-e2e-test-hcp-aws ci/prow/4.21-e2e-test-kubevirt-aws we dont have 4.21 tests defined so retest wont do anything there. |
|
/test ci/prow/4.23-e2e-test-aws |
|
@kaovilai: Overrode contexts on behalf of kaovilai: ci/prow/4.21-e2e-test-aws, ci/prow/4.21-e2e-test-hcp-aws, ci/prow/4.21-e2e-test-kubevirt-aws DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/test 4.23-e2e-test-aws |
…configuration Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
shubham-pampattiwar
left a comment
There was a problem hiding this comment.
This is the right fix, but it tightens validation in a way that could break existing DPA configs on operator upgrade. Users with s3Url but no region who happened to have a discoverable bucket name will go from "working" to "validation error" after upgrading. Worth adding a line in the commit message or PR description acknowledging this as an intentional breaking change. No code change needed, just want to make sure QE covers the upgrade path.
Split the generic "region is required" error into three distinct messages based on the trigger condition: s3Url configured, s3ForcePathStyle enabled, or region auto-discovery failure. This gives users actionable feedback about why region is needed. Update E2E test expectation to match the s3ForcePathStyle path. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
|
added logging to help test https://redhat.atlassian.net/browse/OADP-5777 |
|
/cherry-pick oadp-1.6 |
|
@kaovilai: once the present PR merges, I will cherry-pick it on top of DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/common/common_test.go (1)
584-585: Capture and assert the returned error in the test call.At Line 584, checking the returned error will make this test fail fast if
UpdateBackupStorageLocationstarts returning non-nil errors in future changes.Suggested test hardening
- UpdateBackupStorageLocation(bslCopy, *bslSpecCopy, logr.Discard()) + err := UpdateBackupStorageLocation(bslCopy, *bslSpecCopy, logr.Discard()) + if err != nil { + t.Fatalf("UpdateBackupStorageLocation() returned error: %v", err) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/common/common_test.go` around lines 584 - 585, The test currently calls UpdateBackupStorageLocation(bslCopy, *bslSpecCopy, logr.Discard()) without checking its return; modify the test to capture the returned error from UpdateBackupStorageLocation (e.g., err := UpdateBackupStorageLocation(...)) and add an assertion that err is nil (using t.Fatalf/t.Errorf or your test helper) so the test fails immediately if UpdateBackupStorageLocation returns a non-nil error in the future.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/common/common_test.go`:
- Around line 584-585: The test currently calls
UpdateBackupStorageLocation(bslCopy, *bslSpecCopy, logr.Discard()) without
checking its return; modify the test to capture the returned error from
UpdateBackupStorageLocation (e.g., err := UpdateBackupStorageLocation(...)) and
add an assertion that err is nil (using t.Fatalf/t.Errorf or your test helper)
so the test fails immediately if UpdateBackupStorageLocation returns a non-nil
error in the future.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 827e9c2d-7c5a-4178-97fc-124b52664ce6
📒 Files selected for processing (3)
internal/controller/bsl.gopkg/common/common.gopkg/common/common_test.go
|
/retest |
When DPA backupLocations has no config section, bslSpec.Config is nil, causing UpdateBackupStorageLocation to skip all AWS-specific logic including region auto-detection and checksumAlgorithm defaults. This is the root cause of OADP-5777 still reproducing after PR openshift#1740. Initialize config map when nil so AWS logic runs regardless of whether user specifies a config section. Add test coverage for nil config with both discoverable and undiscoverable buckets. Fixes: https://issues.redhat.com/browse/OADP-5777 Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
|
@kaovilai: This pull request references OADP-5777 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/common/common.go (1)
352-354:⚠️ Potential issue | 🟡 MinorHandle empty
s3Urlas “not configured” in auto-detect gating.Right now the branch checks only key existence. If
s3Urlis present but empty, region auto-detection is skipped unintentionally.Proposed fix
- if _, hasS3Url := bslSpec.Config["s3Url"]; !hasS3Url { + s3Url, hasS3Url := bslSpec.Config["s3Url"] + if !hasS3Url || strings.TrimSpace(s3Url) == "" { if _, hasRegion := bslSpec.Config["region"]; !hasRegion { if bslSpec.ObjectStorage != nil && bslSpec.ObjectStorage.Bucket != "" { // Attempt to auto-detect the bucket's region // AWS Security confirmed that GetBucketRegion works with anonymous credentials // for both public and private buckets (Engagement ID: CACenGS4Mha_KeJ=e3jBSLD6rPZ2iNtfuJUv9QJViaCOt7GVNDg) detectedRegion, err := aws.GetBucketRegion(bslSpec.ObjectStorage.Bucket) if err != nil { logger.Error(err, "Failed to auto-detect AWS bucket region", "bucket", bslSpec.ObjectStorage.Bucket) } else if detectedRegion != "" { logger.Info("Auto-detected AWS bucket region", "bucket", bslSpec.ObjectStorage.Bucket, "region", detectedRegion) bslSpec.Config["region"] = detectedRegion } } } }Also applies to: 358-363
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/common/common.go` around lines 352 - 354, The gating currently treats any present "s3Url" key as configured even if its value is an empty string; update the checks around bslSpec.Config["s3Url"] (and the similar block at the other occurrence) to treat empty string as not configured by checking both existence and that the value is non-empty (e.g., ensure val, ok := bslSpec.Config["s3Url"]; ok && val != "" before skipping region autodetect), so the subsequent region auto-detection using bslSpec.Config["region"] and bslSpec.ObjectStorage.Bucket runs when s3Url is absent or empty.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@pkg/common/common.go`:
- Around line 352-354: The gating currently treats any present "s3Url" key as
configured even if its value is an empty string; update the checks around
bslSpec.Config["s3Url"] (and the similar block at the other occurrence) to treat
empty string as not configured by checking both existence and that the value is
non-empty (e.g., ensure val, ok := bslSpec.Config["s3Url"]; ok && val != ""
before skipping region autodetect), so the subsequent region auto-detection
using bslSpec.Config["region"] and bslSpec.ObjectStorage.Bucket runs when s3Url
is absent or empty.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 0ea6f568-d7f4-4b9c-8d7e-486fcfeaa778
📒 Files selected for processing (2)
pkg/common/common.gopkg/common/common_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/common/common_test.go
Update TestDPAReconciler_updateBSLFromSpec to expect checksumAlgorithm in config when bslSpec.Config starts nil, matching the new behavior. Add mock for GetBucketRegionFunc to prevent real AWS calls in test. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
|
/override ci/prow/4.23-e2e-test-aws |
|
@kaovilai: Overrode contexts on behalf of kaovilai: ci/prow/4.23-e2e-test-aws DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Reuse previously auto-detected region from existing BSL instead of calling AWS GetBucketRegion on every reconcile. Without this, each reconcile reads DPA spec (no region) → auto-detects → updates BSL → triggers new reconcile in a loop. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
|
/retest Pod scheduling timeout. |
|
@kaovilai: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
weshayutin
left a comment
There was a problem hiding this comment.
/LGTM
override 4.23 e2e
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kaovilai, shubham-pampattiwar, weshayutin The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
36d616e
into
openshift:oadp-dev
|
@kaovilai: new pull request created: #2182 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Why the changes were made
This PR addresses two related issues with AWS BSL region handling:
1. Fix nil config skipping region auto-detection (OADP-5777)
Root cause: When DPA
backupLocationshas noconfig:section,bslSpec.Configis nil. TheUpdateBackupStorageLocationfunction guarded all AWS-specific logic (region auto-detection, checksumAlgorithm defaults) behindif bslSpec.Config != nil, so when config was omitted entirely, none of that logic ran.This is why PR #1740's auto-detection fix didn't work in practice — the nil config guard skipped it. Verified by reproducing Tareq's exact scenario (DPA with AWS provider, no config section, no region) on a live cluster.
Fix: Initialize
bslSpec.Configto an empty map when nil, so AWS logic runs regardless of whether user specifies a config section.2. Require region when s3Url is set
When a custom
s3Urlis configured, the BSL points to a non-AWS S3-compatible service (e.g. IBM Cloud Object Storage, MinIO) where region auto-discovery via AWS HeadBucket API is not valid. Previously, region was only enforced whens3ForcePathStylewas true or AWS discovery failed.Fix: Validate that region is specified at DPA validation time when
s3Urlis set, with context-specific error messages.3. Add logging to region auto-detection
Added
logr.Loggerparameter toUpdateBackupStorageLocationso auto-detection success/failure is visible in operator logs instead of silently falling through.Fixes: https://issues.redhat.com/browse/OADP-5777
Fixes: #2108
How to test the changes made
Unit tests
Manual verification (nil config fix)
Auto-detected AWS bucket region {"bucket": "...", "region": "..."}config.regionandconfig.checksumAlgorithmsetVerified on cluster
tkaovila-sts-oadpregion: us-east-1andchecksumAlgorithm: ""auto-populatedNote
Responses generated with Claude
Summary by CodeRabbit