Add support for using regex patterns in findSuite#3456
Add support for using regex patterns in findSuite#3456oharan2 wants to merge 2 commits intoopenshift:mainfrom
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
WalkthroughWhen a suite lookup misses the cache/DB, the loader now checks importability (exact names or regex). If importable, it creates and saves a new Suite row, logs the result, caches the ID, and returns it; otherwise it caches/returns nil as before. Changes
Sequence Diagram(s)sequenceDiagram
participant Loader as ProwLoader
participant Cache as Cache
participant DB as Database
participant Logger as Logger
Loader->>Cache: lookup suite by name
Cache-->>Loader: miss
Loader->>DB: query suite by name
DB-->>Loader: not found
Loader->>DB: IsTestSuiteImportable(name)?
DB-->>Loader: true / false
alt importable
Loader->>DB: create Suite row (Name=name)
DB-->>Loader: return Suite with ID
Loader->>Cache: cache suite ID
Loader->>Logger: log creation success
Loader-->>Client: return suite ID
else not importable
Loader->>Cache: cache nil
Loader->>Logger: (optional) no-import log
Loader-->>Client: return nil
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 15 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (15 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.4)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: oharan2 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/db/suites.go (1)
99-114: Consider using a map for exact suite name lookups.The linear scan through
testSuites(O(n)) is fine for ~50 entries, but a pre-builtmap[string]struct{}would give O(1) lookups and scale better if the list grows. This is optional as the current implementation works correctly.♻️ Optional refactor using a map
+var testSuitesSet map[string]struct{} + func init() { + testSuitesSet = make(map[string]struct{}, len(testSuites)) + for _, s := range testSuites { + testSuitesSet[s] = struct{}{} + } compiledTestSuitePatterns = make([]*regexp.Regexp, len(testSuitePatterns)) for i, p := range testSuitePatterns { compiledTestSuitePatterns[i] = regexp.MustCompile(p) } } func IsTestSuiteImportable(name string) bool { if name == "" { return false } - for _, s := range testSuites { - if s == name { - return true - } + if _, ok := testSuitesSet[name]; ok { + return true } for _, re := range compiledTestSuitePatterns { if re.MatchString(name) { return true } } return false }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/db/suites.go` around lines 99 - 114, Replace the O(n) scan over testSuites in IsTestSuiteImportable with an O(1) map lookup: create a pre-built map (e.g., testSuiteSet map[string]struct{}) populated from testSuites at package init (or when testSuites is defined), then update IsTestSuiteImportable to first check presence via _, ok := testSuiteSet[name] before falling back to iterating compiledTestSuitePatterns; keep the empty-name guard and regex checks unchanged and ensure the map is kept in sync with testSuites where they are populated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/db/suites.go`:
- Around line 81-86: The slice literal testSuitePatterns is missing its closing
brace which breaks compilation; close the slice declaration by adding the
terminating `}` (and a trailing comma if you prefer for multi-line literals) so
that `var testSuitePatterns = []string{ ... }` is properly closed before the
subsequent `var compiledTestSuitePatterns []*regexp.Regexp` declaration.
---
Nitpick comments:
In `@pkg/db/suites.go`:
- Around line 99-114: Replace the O(n) scan over testSuites in
IsTestSuiteImportable with an O(1) map lookup: create a pre-built map (e.g.,
testSuiteSet map[string]struct{}) populated from testSuites at package init (or
when testSuites is defined), then update IsTestSuiteImportable to first check
presence via _, ok := testSuiteSet[name] before falling back to iterating
compiledTestSuitePatterns; keep the empty-name guard and regex checks unchanged
and ensure the map is kept in sync with testSuites where they are populated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Pro Plus
Run ID: 4d1bd736-ee9d-4573-80c4-54eb7e013c9d
📒 Files selected for processing (3)
pkg/dataloader/prowloader/prow.gopkg/db/suites.gopkg/db/suites_test.go
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/db/suites.go (1)
103-107: Use a precomputed set for exact suite lookups in the hot path.
IsTestSuiteImportableruns per unknown suite; linear scanningtestSuiteseach call is avoidable. Build a set once during init for O(1) exact matches.Refactor sketch
var compiledTestSuitePatterns []*regexp.Regexp +var testSuiteSet map[string]struct{} // Invalid regexes panic at process start. func init() { + testSuiteSet = make(map[string]struct{}, len(testSuites)) + for _, s := range testSuites { + testSuiteSet[s] = struct{}{} + } + compiledTestSuitePatterns = make([]*regexp.Regexp, len(testSuitePatterns)) for i, p := range testSuitePatterns { compiledTestSuitePatterns[i] = regexp.MustCompile(p) } } @@ - for _, s := range testSuites { - if s == name { - return true - } - } + if _, ok := testSuiteSet[name]; ok { + return true + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/db/suites.go` around lines 103 - 107, IsTestSuiteImportable currently does a linear scan over testSuites for each call; create a package-level map[string]struct{} (e.g., testSuiteSet) initialized once (in init or where testSuites is defined) populated from testSuites, then change IsTestSuiteImportable to check membership via a single lookup in testSuiteSet instead of iterating the slice; ensure the new set is built before IsTestSuiteImportable can be called and replace the for-loop that checks s == name with a direct map lookup.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/db/suites.go`:
- Around line 82-83: The current regex literal `^lp-interop-` is too permissive
and should be tightened to enforce the documented format
`lp-interop-<product>--<suffix>`; replace the pattern `^lp-interop-` (in the
regex list in pkg/db/suites.go) with a stricter pattern such as
`^lp-interop-[^-]+--.+$` (or a more restrictive charset like
`^lp-interop-[A-Za-z0-9]+--[A-Za-z0-9-]+$`) so names must include a product
segment and the required double-hyphen suffix.
---
Nitpick comments:
In `@pkg/db/suites.go`:
- Around line 103-107: IsTestSuiteImportable currently does a linear scan over
testSuites for each call; create a package-level map[string]struct{} (e.g.,
testSuiteSet) initialized once (in init or where testSuites is defined)
populated from testSuites, then change IsTestSuiteImportable to check membership
via a single lookup in testSuiteSet instead of iterating the slice; ensure the
new set is built before IsTestSuiteImportable can be called and replace the
for-loop that checks s == name with a direct map lookup.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Pro Plus
Run ID: 1b9e0d64-ab2e-4328-94c7-780c283381ea
📒 Files selected for processing (1)
pkg/db/suites.go
| // LP interop naming: `lp-interop-<product>--<suffix>`. | ||
| `^lp-interop-`, |
There was a problem hiding this comment.
Regex pattern is broader than the documented suite format.
Line 82 documents lp-interop-<product>--<suffix>, but Line 83 (^lp-interop-) accepts malformed names (e.g., missing --<suffix>), which can import unintended suites.
Suggested tightening
var testSuitePatterns = []string{
// LP interop naming: `lp-interop-<product>--<suffix>`.
- `^lp-interop-`,
+ `^lp-interop-.+--.+$`,
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // LP interop naming: `lp-interop-<product>--<suffix>`. | |
| `^lp-interop-`, | |
| var testSuitePatterns = []string{ | |
| // LP interop naming: `lp-interop-<product>--<suffix>`. | |
| `^lp-interop-.+--.+$`, | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/db/suites.go` around lines 82 - 83, The current regex literal
`^lp-interop-` is too permissive and should be tightened to enforce the
documented format `lp-interop-<product>--<suffix>`; replace the pattern
`^lp-interop-` (in the regex list in pkg/db/suites.go) with a stricter pattern
such as `^lp-interop-[^-]+--.+$` (or a more restrictive charset like
`^lp-interop-[A-Za-z0-9]+--[A-Za-z0-9-]+$`) so names must include a product
segment and the required double-hyphen suffix.
There was a problem hiding this comment.
The pattern lp-interop-<product>--<suffix> in the comment is an example
There was a problem hiding this comment.
@oharan2, understood — the comment is illustrative, and the intentionally broad ^lp-interop- pattern is correct to cover all LP interop suite name variations without over-constraining them. My apologies for the noise.
📝 Learnings were identified and not saved (knowledge base disabled). Enable
|
Scheduling required tests: |
|
@oharan2: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@stbenjam Question regarding the best approach: would it be better to address the |
There was a problem hiding this comment.
Hi Ori,
Thanks for working on this. However, the proposal adds complexity by requiring two arrays to be maintained going forward.
I think it's better to refactor the existing code to use regex patterns exclusively. Please see below suggestions. This actually reduces pkg/db/suites.go's footprint instead of increasing it.
The only trade-off is a one-time migration: each existing exact name in testSuites becomes ^name$ — it is mechanical, low-risk, and never needs to be done again.
What do yo think @neisw?
| // FROM `openshift-gce-devel.ci_analysis_us.junit` \ | ||
| // GROUP BY testsuite | ||
| // ORDER BY count desc | ||
| var testSuites = []string{ |
There was a problem hiding this comment.
We convert this array to a RegEx pattern array, remove the sippy entry and always create it in populateTestSuitesInDB():
var TestSuitePatterns = []*regexp.Regexp{
regexp.MustCompile(`^openshift-tests$`),
...
regexp.MustCompile(`^lp-interop-`),
}
func populateTestSuitesInDB(db *gorm.DB) error {
s := models.Suite{Name: "sippy"} // Sippy synthetic tests.
if err := db.Clauses(clause.OnConflict{UpdateAll: true}).Create(&s).Error; err != nil {
return errors.Wrapf(err, "error loading suite into db: sippy")
}
return nil
}Advantage:
- Only 1 array to maintain.
- The footprint of
populateTestSuitesInDB()is significantly reduced.
There was a problem hiding this comment.
I'm going to opt for tread lightly. Part of our e2e tests initialize a new db and load some synthetic data. It is possible it is relying on some of the expected suites to be populated.
// Runs when the DB is set up / migrated.
func populateTestSuitesInDB(db *gorm.DB) error {
There was a problem hiding this comment.
@neisw, do you have examples of the E2E tests you are concerned about? I have already used an AI to analyze this repository to determine how they are used, but I assume those E2E tests are hosted in other repos. If you can provide the locations of those E2E tests, I can run the exposure risk analysis.
My current proposal handles sippy as a synthetic test, but I suspect your concern pertains to other E2E test suites; is that correct?
There was a problem hiding this comment.
Alternatively, we could take a more conservative approach. Instead of replacing testSuites, we could introduce:
var testSuitePatterns = []*regexp.Regexp{
regexp.MustCompile(`^lp-interop--`),
...
}We can keep populateTestSuitesInDB() as is to pre-seed the database from testSuites. Then, we can implement the proposal outlined in this PR comment.
I believe this should work without introducing unnecessary complexity, while also addressing your concern that some tests may require the database to be pre-seeded. What do you think?
There was a problem hiding this comment.
I did not run AI to evaluate the impact. Just suggested minimizing impact. 2e-scripts looks like it is running a loader so it may work.
Seems like this would still have the duplication of code for saving the suite
There was a problem hiding this comment.
Looking it over a bit more and thinking about the single array I would actually prefer we don't scan a large(ish) list of potential matches each time we miss a suite. Seems keeping the list small to only the expected dynamic names is better.
Consider #3478 as an alternate approach
| if suite.ID == 0 { | ||
| pl.suiteCache[name] = nil | ||
| } else { | ||
| id := suite.ID | ||
| pl.suiteCache[name] = &id | ||
| // No row - the exact suite name is not in the database (for example, by populateTestSuitesInDB) | ||
| if !db.IsTestSuiteImportable(name) { | ||
| pl.suiteCache[name] = nil | ||
| return nil | ||
| } | ||
| // If IsTestSuiteImportable() returns true (for example, pattern match), add the Suite row. | ||
| suite.Name = name | ||
| tx := pl.dbc.DB.Save(suite) | ||
| if tx.Error != nil { | ||
| log.WithError(tx.Error).Warningf("failed to create suite %q", name) | ||
| return nil | ||
| } | ||
| log.WithField("suite", name).Info("created new test suite") | ||
| } | ||
| return pl.suiteCache[name] | ||
| id := suite.ID | ||
| pl.suiteCache[name] = &id | ||
| return &id |
There was a problem hiding this comment.
With a single array implementation, the solution is more straightforward (no call to an external function):
| if suite.ID == 0 { | |
| pl.suiteCache[name] = nil | |
| } else { | |
| id := suite.ID | |
| pl.suiteCache[name] = &id | |
| // No row - the exact suite name is not in the database (for example, by populateTestSuitesInDB) | |
| if !db.IsTestSuiteImportable(name) { | |
| pl.suiteCache[name] = nil | |
| return nil | |
| } | |
| // If IsTestSuiteImportable() returns true (for example, pattern match), add the Suite row. | |
| suite.Name = name | |
| tx := pl.dbc.DB.Save(suite) | |
| if tx.Error != nil { | |
| log.WithError(tx.Error).Warningf("failed to create suite %q", name) | |
| return nil | |
| } | |
| log.WithField("suite", name).Info("created new test suite") | |
| } | |
| return pl.suiteCache[name] | |
| id := suite.ID | |
| pl.suiteCache[name] = &id | |
| return &id | |
| if suite.ID == 0 { | |
| pl.suiteCache[name] = nil | |
| for _, re := range db.TestSuitePatterns { | |
| if re.MatchString(name) { | |
| suite.Name = name | |
| result := pl.dbc.DB.Where("name = ?", name).FirstOrCreate(suite) | |
| if result.Error == nil { | |
| if result.RowsAffected > 0 { | |
| log.WithField("suite", name).Info("created new test suite") | |
| } | |
| pl.suiteCache[name] = &suite.ID | |
| } else { | |
| log.WithError(result.Error).Warningf("failed to create suite %q", name) | |
| } | |
| break | |
| } | |
| } | |
| } else { | |
| pl.suiteCache[name] = &suite.ID | |
| } | |
| return pl.suiteCache[name] |
| id := suite.ID | ||
| pl.suiteCache[name] = &id | ||
| // No row - the exact suite name is not in the database (for example, by populateTestSuitesInDB) | ||
| if !db.IsTestSuiteImportable(name) { |
There was a problem hiding this comment.
Could IsTestSuiteImportable do the save of the suite? Claude noted the duplication of logic. Wondering if the save could be refactored within db/suites so that both populateTestSuitesInDB and IsTestSuiteImportable used the same method.
Duplicate pattern found in:
- pkg/dataloader/prowloader/prow.go:1177-1181
- pkg/db/suites.go:85-92 (in populateTestSuitesInDB)
Both locations contain identical suite creation logic:
suite.Name = name
tx := DB.Save(suite)
if tx.Error != nil {
log.WithError(...).Warningf(...)
return ...
}
log.WithField("suite", name).Info("created new test suite")
Recommendation: Extract into shared function:
// In pkg/db/suites.go
func createSuite(db *gorm.DB, name string) (*models.Suite, error)
| log.WithField("suite", name).Info("created new test suite") | ||
| } | ||
| return pl.suiteCache[name] | ||
| id := suite.ID |
There was a problem hiding this comment.
I was curious if ID would be set properly in the struct here since it is created during the save. The indication is that it should be. I wonder if it is still worth a check to verify it isn't 0 after the save though. Worst case would be that the first time it is save that job would be ignored if the ID was still 0. Next query it should exists.
● Analysis of suite.ID Population After Save()
You've identified a critical question! Let me trace through the code path:
suite := &models.Suite{} // ID = 0, Name = ""
pl.dbc.DB.Where("name = ?", name).Find(&suite) // Find fails, suite.ID still 0
if suite.ID == 0 {
suite.Name = name // ID = 0, Name = "lp-interop-foo"
tx := pl.dbc.DB.Save(suite) // INSERT operation
// ... error handling ...
}
id := suite.ID //
pl.suiteCache[name] = &id
return &id
Expected GORM Behavior: YES, it should be populated
With GORM + PostgreSQL:
- Save() on a struct with ID == 0 triggers an INSERT
- PostgreSQL generates the ID via nextval()
- GORM uses the RETURNING id clause to retrieve it
- GORM populates suite.ID in the struct after the INSERT
|
Will be completed in #3478 |
Building on ci-test-mapping #616; this PR provides complmentry support for loading desired test suites based on a matching regex pattern,
instead of holding list of static suite names, ref, and reducing the need to add every suite name explicitly.
Main changes:
testSuitelist, addtestSuitePatternsand compile them on initTestIsTestSuiteImportablefor find matches, and use it infindSuiteSummary by CodeRabbit
New Features
Bug Fixes
Tests