TRT-2642: Add PostgreSQL data provider, seed data, and dev setup for Component Readiness#3481
stbenjam wants to merge 8 commits into openshift:main from
Conversation
Adds a PostgreSQL-based DataProvider implementation for Component Readiness, enabling local development and testing without BigQuery. The seed-data command creates deterministic test data covering all CR statuses (regression, improvement, missing, fallback, etc.) and syncs regressions. The e2e script uses seed data with the postgres provider, removing the BigQuery credential requirement. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds scripts/dev-setup.sh which stands up PostgreSQL and Redis containers seeded with deterministic Component Readiness data. Starting the sippy server is optional (--serve flag) so developers using devcontainers can connect to the database directly. Accessible via: make dev (seed only) or make dev SERVE=1 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pipeline controller notification
For optional jobs, comment
This repository is configured in: automatic mode
@stbenjam: This pull request references TRT-2642 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.
Details
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Walkthrough
Adds a Postgres-backed component-readiness data provider, deterministic synthetic DB seeding for local development, a dev Makefile target and a Podman/Docker dev-setup script to run Postgres+Redis and seed data, and updates the serve command to accept
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client
    participant Sippy as Sippy Serve
    participant Provider as PostgresProvider
    participant Cache as Redis Cache
    participant DB as PostgreSQL DB
    Client->>Sippy: GET /api/component_readiness
    Sippy->>Provider: QueryJobVariants / QueryReleases / QueryBaseTestStatus
    Provider->>Cache: Check cache for query
    alt Cache hit
        Cache-->>Provider: Cached result
    else Cache miss
        Provider->>DB: SQL queries (prow_jobs, prow_job_runs, test_results)
        DB-->>Provider: Rows
        Provider->>Provider: Aggregate, filter by variants, compute statuses
        Provider->>Cache: Store computed results
    end
    Provider-->>Sippy: TestStatus / JobRunStats
    Sippy-->>Client: JSON response
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Caution: Pre-merge checks failed. Please resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error, 2 warnings)
✅ Passed checks (14 passed)
✨ Finishing Touches
🧪 Generate unit tests (beta)
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: stbenjam
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Details
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
The default metrics port 12112 may already be in use by a running sippy instance. Use 22112 by default, overridable via SIPPY_METRICS_PORT. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This reverts commit a8be842.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
scripts/e2e.sh (1)
85-103: ⚠️ Potential issue | 🟠 Major
Wait for PostgreSQL before seeding, and fail if seeding fails.
The fixed `sleep 5` is brittle here. When Postgres is still starting, `./sippy seed-data` can fail and the script keeps going because that exit status is ignored. That leaves the rest of E2E booting against a half-initialized DB and turns startup timing into test flake.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/e2e.sh` around lines 85 - 103, The script uses a brittle fixed sleep before running GOCOVERDIR="$COVDIR" ./sippy seed-data against the database and ignores failures; replace the sleep+blind seed with a readiness loop that waits for PostgreSQL to accept connections (e.g., using pg_isready or attempting a simple psql/pg_connect against SIPPY_E2E_DSN/localhost:$PSQL_PORT) with a timeout, and ensure the seed command's exit status is propagated (fail the script on non-zero exit) so that ./sippy seed-data errors cause an immediate exit instead of continuing the E2E flow; update references to SIPPY_E2E_DSN, PSQL_PORT, and the ./sippy seed-data invocation accordingly.
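The readiness loop this prompt describes can be sketched as a small retry helper; the `pg_isready` probe, `PSQL_PORT`, and the 30-second timeout below are illustrative assumptions, not the script's actual contents:

```shell
#!/bin/sh
# Generic retry helper: runs the probe command given in "$@" until it
# succeeds or the timeout (in seconds, first argument) elapses.
wait_for() {
  timeout=$1
  shift
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Hypothetical usage in e2e.sh: wait for Postgres, then let a seed
# failure abort the script instead of being silently ignored.
# wait_for 30 pg_isready -h localhost -p "$PSQL_PORT" || exit 1
# GOCOVERDIR="$COVDIR" ./sippy seed-data || exit 1
```

Propagating the seed command's exit status (via `|| exit 1` or `set -e`) is what turns a half-initialized DB from a downstream test flake into an immediate, attributable failure.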
🧹 Nitpick comments (1)
pkg/api/componentreadiness/dataprovider/postgres/provider.go (1)
536-558: Reuse `fetchJobVariants` here instead of duplicating the lookup/parsing block.
This is the same batch `prow_jobs` variant fetch you already implemented for `queryTestStatus`, so keeping a second copy here makes the two paths easy to drift apart. Generalizing the helper to accept job IDs would keep the provider behavior aligned. As per coding guidelines, "Check pkg/util/ for existing helper functions before adding inline utility logic. Avoid calling the same utility function multiple times with identical arguments in the same code path."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/api/componentreadiness/dataprovider/postgres/provider.go` around lines 536 - 558, Replace the duplicated batch fetch/parse block with a call to the existing helper fetchJobVariants (the same one used by queryTestStatus): collect the unique job IDs as you already do into ids, then call fetchJobVariants(ids) and use its returned map to populate jobVariantMap instead of re-running the raw query and parseVariants; ensure you propagate any error returned by fetchJobVariants exactly as the current error handling does so behavior remains unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/sippy/seed_data.go`:
- Around line 66-68: The current DSN check only blocks "amazonaws.com" hosts;
change it to allow seeding only when the DSN host is local (e.g., "localhost",
"127.0.0.1", or a unix socket) and reject remote hosts otherwise unless an
explicit force flag is provided. Update the logic that currently inspects
f.DBFlags.DSN to parse the DSN (or extract host) and return an error like
"refusing to seed synthetic data into non-local database; pass --force to
override" when the host is not local and a new boolean force flag (e.g.,
SeedForce or Force) on the same options struct is false; if the force flag is
true, proceed. Reference f.DBFlags.DSN for the DSN source and add/consult the
new force flag on the same config struct used in seed_data.go.
- Around line 593-641: The current logic reduces testableRuns by 2 which causes
tests with counts.total == runCount to miss two real results; instead leave
testableRuns equal to runCount (do not subtract 2) and ensure any per-test loop
bounds use min(counts.total, testableRuns); then create two additional
infra-only runs beyond the runCount (append two extra runIDs and insert
ProwJobRun entries marked as infra failures) so the two infra runs exist without
stealing real test counts—adjust references in this block to testableRuns,
runCount, counts.total, and runIDs accordingly and mark those extra runs via
runsWithFailure as needed.
In `@Makefile`:
- Around line 65-66: The dev target currently only depends on sippy which only
creates a placeholder sippy-ng/build/index.js; update the Makefile so the
frontend is built before running dev (when SERVE=1) by having dev depend on the
frontend build target (or invoke the frontend build step) prior to calling
./scripts/dev-setup.sh; reference the make targets/steps sippy, dev, and the
frontend build (e.g., sippy-ng build target or equivalent) and ensure the real
sippy-ng/build/index.js is produced before ./scripts/dev-setup.sh runs.
In `@scripts/dev-setup.sh`:
- Around line 57-64: Replace the fixed sleep with a readiness loop that waits
for PostgreSQL to accept connections before running ./sippy seed-data;
specifically, in scripts/dev-setup.sh remove the hardcoded sleep 5 and instead
poll the database using a connection check (e.g., pg_isready or psql -h localhost
-p "$PSQL_PORT" -U postgres -d postgres -c '\q') in a retry loop with a short
sleep interval and an overall timeout, then proceed to set DSN and run ./sippy
seed-data --init-database --database-dsn="$DSN" only after the check succeeds.
---
Outside diff comments:
In `@scripts/e2e.sh`:
- Around line 85-103: The script uses a brittle fixed sleep before running
GOCOVERDIR="$COVDIR" ./sippy seed-data against the database and ignores
failures; replace the sleep+blind seed with a readiness loop that waits for
PostgreSQL to accept connections (e.g., using pg_isready or attempting a simple
psql/pg_connect against SIPPY_E2E_DSN/localhost:$PSQL_PORT) with a timeout, and
ensure the seed command's exit status is propagated (fail the script on non-zero
exit) so that ./sippy seed-data errors cause an immediate exit instead of
continuing the E2E flow; update references to SIPPY_E2E_DSN, PSQL_PORT, and the
./sippy seed-data invocation accordingly.
---
Nitpick comments:
In `@pkg/api/componentreadiness/dataprovider/postgres/provider.go`:
- Around line 536-558: Replace the duplicated batch fetch/parse block with a
call to the existing helper fetchJobVariants (the same one used by
queryTestStatus): collect the unique job IDs as you already do into ids, then
call fetchJobVariants(ids) and use its returned map to populate jobVariantMap
instead of re-running the raw query and parseVariants; ensure you propagate any
error returned by fetchJobVariants exactly as the current error handling does so
behavior remains unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 191735b7-8fe8-4b8f-82c5-4c968d53009e
📒 Files selected for processing (7)
- Makefile
- cmd/sippy/seed_data.go
- cmd/sippy/serve.go
- config/e2e-views.yaml
- pkg/api/componentreadiness/dataprovider/postgres/provider.go
- scripts/dev-setup.sh
- scripts/e2e.sh
@coderabbitai[bot]: changing LGTM is restricted to collaborators
Details
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Scheduling required tests:
Create 2 additional infra-only runs beyond runCount instead of stealing from the test result pool — tests with counts.total == runCount now get all their results. Also refactor fetchJobVariants into fetchJobVariantsByIDs to return errors instead of silently logging, and deduplicate the variant fetching in queryTestDetails. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Poll with psql in a retry loop (1s interval, 30s timeout) instead of a hardcoded sleep 5, so seed-data only runs after PostgreSQL is actually accepting connections. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 4
🧹 Nitpick comments (1)
cmd/sippy/seed_data.go (1)
753-759: Sort the variant entries before storing them.
Go map iteration is randomized, so this helper produces a different `pq.StringArray` order across runs. That weakens the determinism this seed path is trying to provide.
Proposed fix

```diff
 func variantMapToArray(m map[string]string) pq.StringArray {
 	result := make([]string, 0, len(m))
 	for k, v := range m {
 		result = append(result, k+":"+v)
 	}
+	sort.Strings(result)
 	return result
 }
```

Also add `sort` to the imports.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/sippy/seed_data.go` around lines 753 - 759, variantMapToArray produces nondeterministic ordering because Go map iteration is randomized; modify variantMapToArray to collect the map keys, sort them (add "sort" to the imports), then iterate the sorted keys to build and return the pq.StringArray (preserving the "key:value" format) so the output order is deterministic across runs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/sippy/seed_data.go`:
- Around line 633-654: The synthetic infra-only runs (the branch where i >=
runCount that sets overallResult = v1.JobInternalInfrastructureFailure) must
also set the model's InfrastructureFailure column; update the Updates map in the
loop that updates models.ProwJobRun so that when the condition i >= runCount is
true you add "infrastructure_failure": true (and ensure it's false for non-infra
runs), leaving the existing overall_result, succeeded and failed updates intact;
locate this in the for loop over runIDs that uses runsWithFailure and the
Updates(map[string]any{...}) call.
- Around line 567-576: The synthetic run timestamps are being spaced using
runCount but you generate totalRuns (runCount + infraRuns), causing the last
timestamp to fall past the release end; change the interval calculation to use
totalRuns-1 so timestamps lie within [start,end] (i.e. interval := window /
time.Duration(totalRuns-1)) and guard the divide when totalRuns <= 1 (set
interval = 0). Also fix the loop that currently iterates incorrectly (for i :=
range totalRuns) to iterate over the runIDs/slice (for i := range runIDs or for
i := 0; i < totalRuns; i++) so you fill all runs correctly; refer to
releaseTimeWindow, runCount, infraRuns, totalRuns, runIDs and the timestamp
generation block.
In `@scripts/dev-setup.sh`:
- Around line 57-67: The readiness loop currently runs the psql binary on the
host (the `psql -h localhost ...` check using variables `timeout` and
`elapsed`), which fails on machines without a local psql; change the probe to
run inside the DB container instead by replacing the host `psql` invocation with
a container exec call (use `docker exec` or `podman exec`) to run `pg_isready`
or `psql -U postgres -d postgres -c '\q'` in the running DB container; ensure
the loop still uses `timeout`/`elapsed` and exits with the same error behavior
if the container-side probe never succeeds, and prefer an env var (e.g.,
POSTGRES_CONTAINER) or auto-detect docker vs podman to locate the container.
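A sketch of the container-side probe that comment asks for; the runtime auto-detection and the `POSTGRES_CONTAINER` default below are assumptions about how dev-setup.sh might name things, not its actual variables:

```shell
#!/bin/sh
# Pick whichever container runtime is installed.
container_runtime() {
  if command -v podman >/dev/null 2>&1; then
    echo podman
  elif command -v docker >/dev/null 2>&1; then
    echo docker
  else
    return 1
  fi
}

# Probe Postgres from inside the container, so the host needs no psql binary.
db_ready() {
  runtime=$1
  container=$2
  "$runtime" exec "$container" pg_isready -U postgres >/dev/null 2>&1
}

# Hypothetical readiness loop using the helpers above:
# rt=$(container_runtime) || { echo "no container runtime found" >&2; exit 1; }
# elapsed=0
# until db_ready "$rt" "${POSTGRES_CONTAINER:-sippy-postgres}"; do
#   [ "$elapsed" -ge 30 ] && { echo "postgres never became ready" >&2; exit 1; }
#   sleep 1
#   elapsed=$((elapsed + 1))
# done
```

Running `pg_isready` inside the container keeps the host dependency surface to just podman or docker, which the script already requires.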
- Around line 83-100: The script builds optional GCS args into the unquoted
GCS_ARGS and then expands it into the ./sippy serve command, which breaks when
$GCS_SA_JSON_PATH contains spaces; change the logic to construct argv safely
using positional parameters: if GCS_SA_JSON_PATH is set, run set -- "$@"
--google-service-account-credential-file "$GCS_SA_JSON_PATH" (or initialize args
via set --) and then invoke ./sippy serve ... "$@" so the optional flag is
passed as separate quoted argv entries; update references to GCS_ARGS,
GCS_SA_JSON_PATH, and the ./sippy serve invocation accordingly.
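That argv-building fix can be sketched with the shell's positional parameters; the function name and the `printf` stand-in for the real `./sippy serve` invocation are illustrative, not the script's actual code:

```shell
#!/bin/sh
# Build the optional credential flag as real argv entries instead of an
# unquoted string, so paths containing spaces survive intact.
serve_args() {
  set --  # positional parameters are local to the function in POSIX shells
  if [ -n "${GCS_SA_JSON_PATH:-}" ]; then
    set -- --google-service-account-credential-file "$GCS_SA_JSON_PATH"
  fi
  # Stand-in for: ./sippy serve ... "$@"
  printf '%s\n' "$@"
}
```

Because `"$@"` expands each stored element as its own word, a path like `/home/me/my creds.json` reaches the program as a single argument, which an unquoted `$GCS_ARGS` expansion cannot guarantee.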
---
Nitpick comments:
In `@cmd/sippy/seed_data.go`:
- Around line 753-759: variantMapToArray produces nondeterministic ordering
because Go map iteration is randomized; modify variantMapToArray to collect the
map keys, sort them (add "sort" to the imports), then iterate the sorted keys to
build and return the pq.StringArray (preserving the "key:value" format) so the
output order is deterministic across runs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: d372fc89-f556-47e3-adc6-71376f2fe10e
📒 Files selected for processing (3)
- cmd/sippy/seed_data.go
- pkg/api/componentreadiness/dataprovider/postgres/provider.go
- scripts/dev-setup.sh
✅ Files skipped from review due to trivial changes (1)
- pkg/api/componentreadiness/dataprovider/postgres/provider.go
@coderabbitai[bot]: changing LGTM is restricted to collaborators
Details
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Scheduling required tests:

Scheduling required tests:
Replace 'sippy load' (slow GCS import) with 'sippy seed-data' for CI e2e tests. This seeds deterministic synthetic data instead of importing real data from GCS, which is much faster and provides the release data (4.22, 4.21, etc.) that the tests expect. Also adds --data-provider postgres to the serve command, a wait-for-API health check after port-forward setup, and cache priming for component readiness views before running tests. GCS credentials are still provisioned for BigQuery-dependent tests (e.g. TestRegressionCacheLoader). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
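The wait-for-API health check mentioned in that commit could look roughly like this; the endpoint path and timeout are assumptions, not what the setup script actually uses:

```shell
#!/bin/sh
# Poll an HTTP health endpoint until it answers 2xx or the timeout elapses.
api_ready() {
  url=$1
  timeout=${2:-60}
  elapsed=0
  until curl -sf "$url" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "API at $url not ready after ${timeout}s" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Hypothetical usage after the port-forward is established:
# api_ready "http://localhost:18080/healthz" 60 || exit 1
```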
🧹 Nitpick comments (1)
e2e-scripts/sippy-e2e-sippy-e2e-setup-commands.sh (1)
329-338: Quote the `$seed_pod` variable to prevent word splitting.
While pod names typically don't contain spaces, quoting is a best practice that prevents potential issues.
♻️ Proposed fix

```diff
-seed_pod=$(${KUBECTL_CMD} -n sippy-e2e get pod --selector=job-name=sippy-seed-job --output=jsonpath='{.items[0].metadata.name}')
-${KUBECTL_CMD} -n sippy-e2e logs ${seed_pod} > ${ARTIFACT_DIR}/sippy-seed.log 2>&1
+seed_pod=$(${KUBECTL_CMD} -n sippy-e2e get pod --selector=job-name=sippy-seed-job --output=jsonpath='{.items[0].metadata.name}')
+${KUBECTL_CMD} -n sippy-e2e logs "${seed_pod}" > "${ARTIFACT_DIR}/sippy-seed.log" 2>&1
 if [ ${seedRetVal} -ne 0 ]; then
   echo
   echo "=== SIPPY SEED JOB FAILURE DIAGNOSTICS ==="
   echo "=== Job status ==="
   ${KUBECTL_CMD} -n sippy-e2e describe job sippy-seed-job
   echo "=== Job pod status ==="
-  ${KUBECTL_CMD} -n sippy-e2e describe pod ${seed_pod}
+  ${KUBECTL_CMD} -n sippy-e2e describe pod "${seed_pod}"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@e2e-scripts/sippy-e2e-sippy-e2e-setup-commands.sh` around lines 329 - 338, The variable seed_pod is used unquoted when passed to kubectl and describe commands, risking word-splitting; update usages of ${seed_pod} to be quoted (e.g., "${seed_pod}") wherever it's expanded (lines calling ${KUBECTL_CMD} -n sippy-e2e logs ${seed_pod} and describe pod ${seed_pod}), and also quote other expansions like ${ARTIFACT_DIR} and the kubectl command expansions where appropriate to follow best practices (retain variable names KUBECTL_CMD, ARTIFACT_DIR, seedRetVal, and sippy-seed-job for locating the spots).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@e2e-scripts/sippy-e2e-sippy-e2e-setup-commands.sh`:
- Around line 329-338: The variable seed_pod is used unquoted when passed to
kubectl and describe commands, risking word-splitting; update usages of
${seed_pod} to be quoted (e.g., "${seed_pod}") wherever it's expanded (lines
calling ${KUBECTL_CMD} -n sippy-e2e logs ${seed_pod} and describe pod
${seed_pod}), and also quote other expansions like ${ARTIFACT_DIR} and the
kubectl command expansions where appropriate to follow best practices (retain
variable names KUBECTL_CMD, ARTIFACT_DIR, seedRetVal, and sippy-seed-job for
locating the spots).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 2699b215-65fb-4dc5-aeef-bac7f0417ee8
📒 Files selected for processing (6)
- cmd/sippy/seed_data.go
- e2e-scripts/sippy-e2e-sippy-e2e-setup-commands.sh
- e2e-scripts/sippy-e2e-sippy-e2e-test-commands.sh
- scripts/dev-setup.sh
- scripts/e2e.sh
- test/e2e/util/e2erequest.go
✅ Files skipped from review due to trivial changes (2)
- test/e2e/util/e2erequest.go
- cmd/sippy/seed_data.go
Scheduling required tests:
@stbenjam: The following test failed, say
Full PR test history. Your PR dashboard.
Details
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Summary
- `dataprovider.DataProvider` interface for Component Readiness, enabling local development and e2e testing without BigQuery
- `scripts/e2e.sh` to use the postgres provider and seed data, making GCS credentials optional
- `scripts/dev-setup.sh` and `make dev` target to stand up a seeded PostgreSQL + Redis environment for local development (optionally starts the sippy server with `--serve`/`SERVE=1`)

NOTE: We are not at a place where we could run real component readiness against prod's database, it is too much data, but postgres with seeded data allows extremely rapid deployment against controlled data. Should make the dev experience nicer, and Claude more successful once I add more tests in a follow-up PR.
Test plan
- `make dev` seeds database and prints connection info
- `make dev SERVE=1` seeds and starts sippy server
- `make e2e` passes with postgres data provider

🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Chores