TRT-2642: Add PostgreSQL data provider, seed data, and dev setup for Component Readiness#3481
stbenjam wants to merge 8 commits into openshift:main from
Conversation
Adds a PostgreSQL-based DataProvider implementation for Component Readiness, enabling local development and testing without BigQuery. The seed-data command creates deterministic test data covering all CR statuses (regression, improvement, missing, fallback, etc.) and syncs regressions. The e2e script uses seed data with the postgres provider, removing the BigQuery credential requirement. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds scripts/dev-setup.sh which stands up PostgreSQL and Redis containers seeded with deterministic Component Readiness data. Starting the sippy server is optional (--serve flag) so developers using devcontainers can connect to the database directly. Accessible via: make dev (seed only) or make dev SERVE=1 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pipeline controller notification
For optional jobs, comment
This repository is configured in: automatic mode
@stbenjam: This pull request references TRT-2642 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.
Details
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Walkthrough
Adds a Postgres-backed component-readiness data provider, deterministic synthetic DB seeding for local development, a dev Makefile target and a Podman/Docker dev-setup script to run Postgres+Redis and seed data, and updates the serve command to accept
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client
    participant Sippy as Sippy Serve
    participant Provider as PostgresProvider
    participant Cache as Redis Cache
    participant DB as PostgreSQL DB
    Client->>Sippy: GET /api/component_readiness
    Sippy->>Provider: QueryJobVariants / QueryReleases / QueryBaseTestStatus
    Provider->>Cache: Check cache for query
    alt Cache hit
        Cache-->>Provider: Cached result
    else Cache miss
        Provider->>DB: SQL queries (prow_jobs, prow_job_runs, test_results)
        DB-->>Provider: Rows
        Provider->>Provider: Aggregate, filter by variants, compute statuses
        Provider->>Cache: Store computed results
    end
    Provider-->>Sippy: TestStatus / JobRunStats
    Sippy-->>Client: JSON response
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Caution: Pre-merge checks failed. Please resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error, 2 warnings)
✅ Passed checks (14 passed)
✨ Finishing Touches
🧪 Generate unit tests (beta)
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: stbenjam
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Details
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
The default metrics port 12112 may already be in use by a running sippy instance. Use 22112 by default, overridable via SIPPY_METRICS_PORT. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This reverts commit a8be842.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
scripts/e2e.sh (1)
85-103: ⚠️ Potential issue | 🟠 Major
Wait for PostgreSQL before seeding, and fail if seeding fails.
The fixed `sleep 5` is brittle here. When Postgres is still starting, `./sippy seed-data` can fail and the script keeps going because that exit status is ignored. That leaves the rest of E2E booting against a half-initialized DB and turns startup timing into test flake.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/e2e.sh` around lines 85 - 103, The script uses a brittle fixed sleep before running GOCOVERDIR="$COVDIR" ./sippy seed-data against the database and ignores failures; replace the sleep+blind seed with a readiness loop that waits for PostgreSQL to accept connections (e.g., using pg_isready or attempting a simple psql/pg_connect against SIPPY_E2E_DSN/localhost:$PSQL_PORT) with a timeout, and ensure the seed command's exit status is propagated (fail the script on non-zero exit) so that ./sippy seed-data errors cause an immediate exit instead of continuing the E2E flow; update references to SIPPY_E2E_DSN, PSQL_PORT, and the ./sippy seed-data invocation accordingly.
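The readiness loop this prompt describes can be sketched as a small retry helper; the `pg_isready` probe, `PSQL_PORT`, and the 30-second timeout below are illustrative assumptions, not the script's actual contents:

```shell
#!/bin/sh
# Generic retry helper: runs the probe command given in "$@" until it
# succeeds or the timeout (in seconds, first argument) elapses.
wait_for() {
  timeout=$1
  shift
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Hypothetical usage in e2e.sh: wait for Postgres, then let a seed
# failure abort the script instead of being silently ignored.
# wait_for 30 pg_isready -h localhost -p "$PSQL_PORT" || exit 1
# GOCOVERDIR="$COVDIR" ./sippy seed-data || exit 1
```

Propagating the seed command's exit status (via `|| exit 1` or `set -e`) is what turns a half-initialized DB from a downstream test flake into an immediate, attributable failure.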
🧹 Nitpick comments (1)
pkg/api/componentreadiness/dataprovider/postgres/provider.go (1)
536-558: Reuse `fetchJobVariants` here instead of duplicating the lookup/parsing block.
This is the same batch `prow_jobs` variant fetch you already implemented for `queryTestStatus`, so keeping a second copy here makes the two paths easy to drift apart. Generalizing the helper to accept job IDs would keep the provider behavior aligned. As per coding guidelines, "Check pkg/util/ for existing helper functions before adding inline utility logic. Avoid calling the same utility function multiple times with identical arguments in the same code path."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/api/componentreadiness/dataprovider/postgres/provider.go` around lines 536 - 558, Replace the duplicated batch fetch/parse block with a call to the existing helper fetchJobVariants (the same one used by queryTestStatus): collect the unique job IDs as you already do into ids, then call fetchJobVariants(ids) and use its returned map to populate jobVariantMap instead of re-running the raw query and parseVariants; ensure you propagate any error returned by fetchJobVariants exactly as the current error handling does so behavior remains unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/sippy/seed_data.go`:
- Around line 66-68: The current DSN check only blocks "amazonaws.com" hosts;
change it to allow seeding only when the DSN host is local (e.g., "localhost",
"127.0.0.1", or a unix socket) and reject remote hosts otherwise unless an
explicit force flag is provided. Update the logic that currently inspects
f.DBFlags.DSN to parse the DSN (or extract host) and return an error like
"refusing to seed synthetic data into non-local database; pass --force to
override" when the host is not local and a new boolean force flag (e.g.,
SeedForce or Force) on the same options struct is false; if the force flag is
true, proceed. Reference f.DBFlags.DSN for the DSN source and add/consult the
new force flag on the same config struct used in seed_data.go.
- Around line 593-641: The current logic reduces testableRuns by 2 which causes
tests with counts.total == runCount to miss two real results; instead leave
testableRuns equal to runCount (do not subtract 2) and ensure any per-test loop
bounds use min(counts.total, testableRuns); then create two additional
infra-only runs beyond the runCount (append two extra runIDs and insert
ProwJobRun entries marked as infra failures) so the two infra runs exist without
stealing real test counts—adjust references in this block to testableRuns,
runCount, counts.total, and runIDs accordingly and mark those extra runs via
runsWithFailure as needed.
In `@Makefile`:
- Around line 65-66: The dev target currently only depends on sippy which only
creates a placeholder sippy-ng/build/index.js; update the Makefile so the
frontend is built before running dev (when SERVE=1) by having dev depend on the
frontend build target (or invoke the frontend build step) prior to calling
./scripts/dev-setup.sh; reference the make targets/steps sippy, dev, and the
frontend build (e.g., sippy-ng build target or equivalent) and ensure the real
sippy-ng/build/index.js is produced before ./scripts/dev-setup.sh runs.
In `@scripts/dev-setup.sh`:
- Around line 57-64: Replace the fixed sleep with a readiness loop that waits
for PostgreSQL to accept connections before running ./sippy seed-data;
specifically, in scripts/dev-setup.sh remove the hardcoded sleep 5 and instead
poll the database using a connection check (e.g., pg_isready or psql -h localhost
-p "$PSQL_PORT" -U postgres -d postgres -c '\q') in a retry loop with a short
sleep interval and an overall timeout, then proceed to set DSN and run ./sippy
seed-data --init-database --database-dsn="$DSN" only after the check succeeds.
---
Outside diff comments:
In `@scripts/e2e.sh`:
- Around line 85-103: The script uses a brittle fixed sleep before running
GOCOVERDIR="$COVDIR" ./sippy seed-data against the database and ignores
failures; replace the sleep+blind seed with a readiness loop that waits for
PostgreSQL to accept connections (e.g., using pg_isready or attempting a simple
psql/pg_connect against SIPPY_E2E_DSN/localhost:$PSQL_PORT) with a timeout, and
ensure the seed command's exit status is propagated (fail the script on non-zero
exit) so that ./sippy seed-data errors cause an immediate exit instead of
continuing the E2E flow; update references to SIPPY_E2E_DSN, PSQL_PORT, and the
./sippy seed-data invocation accordingly.
---
Nitpick comments:
In `@pkg/api/componentreadiness/dataprovider/postgres/provider.go`:
- Around line 536-558: Replace the duplicated batch fetch/parse block with a
call to the existing helper fetchJobVariants (the same one used by
queryTestStatus): collect the unique job IDs as you already do into ids, then
call fetchJobVariants(ids) and use its returned map to populate jobVariantMap
instead of re-running the raw query and parseVariants; ensure you propagate any
error returned by fetchJobVariants exactly as the current error handling does so
behavior remains unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 191735b7-8fe8-4b8f-82c5-4c968d53009e
📒 Files selected for processing (7)
- Makefile
- cmd/sippy/seed_data.go
- cmd/sippy/serve.go
- config/e2e-views.yaml
- pkg/api/componentreadiness/dataprovider/postgres/provider.go
- scripts/dev-setup.sh
- scripts/e2e.sh
@coderabbitai[bot]: changing LGTM is restricted to collaborators
Details
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Scheduling required tests:
Create 2 additional infra-only runs beyond runCount instead of stealing from the test result pool — tests with counts.total == runCount now get all their results. Also refactor fetchJobVariants into fetchJobVariantsByIDs to return errors instead of silently logging, and deduplicate the variant fetching in queryTestDetails. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Poll with psql in a retry loop (1s interval, 30s timeout) instead of a hardcoded sleep 5, so seed-data only runs after PostgreSQL is actually accepting connections. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 4
🧹 Nitpick comments (1)
cmd/sippy/seed_data.go (1)
753-759: Sort the variant entries before storing them.
Go map iteration is randomized, so this helper produces a different `pq.StringArray` order across runs. That weakens the determinism this seed path is trying to provide.
Proposed fix

```diff
 func variantMapToArray(m map[string]string) pq.StringArray {
 	result := make([]string, 0, len(m))
 	for k, v := range m {
 		result = append(result, k+":"+v)
 	}
+	sort.Strings(result)
 	return result
 }
```

Also add `sort` to the imports.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/sippy/seed_data.go` around lines 753 - 759, variantMapToArray produces nondeterministic ordering because Go map iteration is randomized; modify variantMapToArray to collect the map keys, sort them (add "sort" to the imports), then iterate the sorted keys to build and return the pq.StringArray (preserving the "key:value" format) so the output order is deterministic across runs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/sippy/seed_data.go`:
- Around line 633-654: The synthetic infra-only runs (the branch where i >=
runCount that sets overallResult = v1.JobInternalInfrastructureFailure) must
also set the model's InfrastructureFailure column; update the Updates map in the
loop that updates models.ProwJobRun so that when the condition i >= runCount is
true you add "infrastructure_failure": true (and ensure it's false for non-infra
runs), leaving the existing overall_result, succeeded and failed updates intact;
locate this in the for loop over runIDs that uses runsWithFailure and the
Updates(map[string]any{...}) call.
- Around line 567-576: The synthetic run timestamps are being spaced using
runCount but you generate totalRuns (runCount + infraRuns), causing the last
timestamp to fall past the release end; change the interval calculation to use
totalRuns-1 so timestamps lie within [start,end] (i.e. interval := window /
time.Duration(totalRuns-1)) and guard the divide when totalRuns <= 1 (set
interval = 0). Also fix the loop that currently iterates incorrectly (for i :=
range totalRuns) to iterate over the runIDs/slice (for i := range runIDs or for
i := 0; i < totalRuns; i++) so you fill all runs correctly; refer to
releaseTimeWindow, runCount, infraRuns, totalRuns, runIDs and the timestamp
generation block.
In `@scripts/dev-setup.sh`:
- Around line 57-67: The readiness loop currently runs the psql binary on the
host (the `psql -h localhost ...` check using variables `timeout` and
`elapsed`), which fails on machines without a local psql; change the probe to
run inside the DB container instead by replacing the host `psql` invocation with
a container exec call (use `docker exec` or `podman exec`) to run `pg_isready`
or `psql -U postgres -d postgres -c '\q'` in the running DB container; ensure
the loop still uses `timeout`/`elapsed` and exits with the same error behavior
if the container-side probe never succeeds, and prefer an env var (e.g.,
POSTGRES_CONTAINER) or auto-detect docker vs podman to locate the container.
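A sketch of the container-side probe that comment asks for; the runtime auto-detection and the `POSTGRES_CONTAINER` default below are assumptions about how dev-setup.sh might name things, not its actual variables:

```shell
#!/bin/sh
# Pick whichever container runtime is installed.
container_runtime() {
  if command -v podman >/dev/null 2>&1; then
    echo podman
  elif command -v docker >/dev/null 2>&1; then
    echo docker
  else
    return 1
  fi
}

# Probe Postgres from inside the container, so the host needs no psql binary.
db_ready() {
  runtime=$1
  container=$2
  "$runtime" exec "$container" pg_isready -U postgres >/dev/null 2>&1
}

# Hypothetical readiness loop using the helpers above:
# rt=$(container_runtime) || { echo "no container runtime found" >&2; exit 1; }
# elapsed=0
# until db_ready "$rt" "${POSTGRES_CONTAINER:-sippy-postgres}"; do
#   [ "$elapsed" -ge 30 ] && { echo "postgres never became ready" >&2; exit 1; }
#   sleep 1
#   elapsed=$((elapsed + 1))
# done
```

Running `pg_isready` inside the container keeps the host dependency surface to just podman or docker, which the script already requires.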
- Around line 83-100: The script builds optional GCS args into the unquoted
GCS_ARGS and then expands it into the ./sippy serve command, which breaks when
$GCS_SA_JSON_PATH contains spaces; change the logic to construct argv safely
using positional parameters: if GCS_SA_JSON_PATH is set, run set -- "$@"
--google-service-account-credential-file "$GCS_SA_JSON_PATH" (or initialize args
via set --) and then invoke ./sippy serve ... "$@" so the optional flag is
passed as separate quoted argv entries; update references to GCS_ARGS,
GCS_SA_JSON_PATH, and the ./sippy serve invocation accordingly.
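That argv-building fix can be sketched with the shell's positional parameters; the function name and the `printf` stand-in for the real `./sippy serve` invocation are illustrative, not the script's actual code:

```shell
#!/bin/sh
# Build the optional credential flag as real argv entries instead of an
# unquoted string, so paths containing spaces survive intact.
serve_args() {
  set --  # positional parameters are local to the function in POSIX shells
  if [ -n "${GCS_SA_JSON_PATH:-}" ]; then
    set -- --google-service-account-credential-file "$GCS_SA_JSON_PATH"
  fi
  # Stand-in for: ./sippy serve ... "$@"
  printf '%s\n' "$@"
}
```

Because `"$@"` expands each stored element as its own word, a path like `/home/me/my creds.json` reaches the program as a single argument, which an unquoted `$GCS_ARGS` expansion cannot guarantee.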
---
Nitpick comments:
In `@cmd/sippy/seed_data.go`:
- Around line 753-759: variantMapToArray produces nondeterministic ordering
because Go map iteration is randomized; modify variantMapToArray to collect the
map keys, sort them (add "sort" to the imports), then iterate the sorted keys to
build and return the pq.StringArray (preserving the "key:value" format) so the
output order is deterministic across runs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: d372fc89-f556-47e3-adc6-71376f2fe10e
📒 Files selected for processing (3)
- cmd/sippy/seed_data.go
- pkg/api/componentreadiness/dataprovider/postgres/provider.go
- scripts/dev-setup.sh
✅ Files skipped from review due to trivial changes (1)
- pkg/api/componentreadiness/dataprovider/postgres/provider.go
@coderabbitai[bot]: changing LGTM is restricted to collaborators
Details
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Scheduling required tests:

Scheduling required tests:
Replace 'sippy load' (slow GCS import) with 'sippy seed-data' for CI e2e tests. This seeds deterministic synthetic data instead of importing real data from GCS, which is much faster and provides the release data (4.22, 4.21, etc.) that the tests expect. Also adds --data-provider postgres to the serve command, a wait-for-API health check after port-forward setup, and cache priming for component readiness views before running tests. GCS credentials are still provisioned for BigQuery-dependent tests (e.g. TestRegressionCacheLoader). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
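The wait-for-API health check mentioned in that commit could look roughly like this; the endpoint path and timeout are assumptions, not what the setup script actually uses:

```shell
#!/bin/sh
# Poll an HTTP health endpoint until it answers 2xx or the timeout elapses.
api_ready() {
  url=$1
  timeout=${2:-60}
  elapsed=0
  until curl -sf "$url" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "API at $url not ready after ${timeout}s" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}

# Hypothetical usage after the port-forward is established:
# api_ready "http://localhost:18080/healthz" 60 || exit 1
```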
🧹 Nitpick comments (1)
e2e-scripts/sippy-e2e-sippy-e2e-setup-commands.sh (1)
329-338: Quote the `$seed_pod` variable to prevent word splitting.
While pod names typically don't contain spaces, quoting is a best practice that prevents potential issues.
♻️ Proposed fix

```diff
-seed_pod=$(${KUBECTL_CMD} -n sippy-e2e get pod --selector=job-name=sippy-seed-job --output=jsonpath='{.items[0].metadata.name}')
-${KUBECTL_CMD} -n sippy-e2e logs ${seed_pod} > ${ARTIFACT_DIR}/sippy-seed.log 2>&1
+seed_pod=$(${KUBECTL_CMD} -n sippy-e2e get pod --selector=job-name=sippy-seed-job --output=jsonpath='{.items[0].metadata.name}')
+${KUBECTL_CMD} -n sippy-e2e logs "${seed_pod}" > "${ARTIFACT_DIR}/sippy-seed.log" 2>&1
 if [ ${seedRetVal} -ne 0 ]; then
   echo
   echo "=== SIPPY SEED JOB FAILURE DIAGNOSTICS ==="
   echo "=== Job status ==="
   ${KUBECTL_CMD} -n sippy-e2e describe job sippy-seed-job
   echo "=== Job pod status ==="
-  ${KUBECTL_CMD} -n sippy-e2e describe pod ${seed_pod}
+  ${KUBECTL_CMD} -n sippy-e2e describe pod "${seed_pod}"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@e2e-scripts/sippy-e2e-sippy-e2e-setup-commands.sh` around lines 329 - 338, The variable seed_pod is used unquoted when passed to kubectl and describe commands, risking word-splitting; update usages of ${seed_pod} to be quoted (e.g., "${seed_pod}") wherever it's expanded (lines calling ${KUBECTL_CMD} -n sippy-e2e logs ${seed_pod} and describe pod ${seed_pod}), and also quote other expansions like ${ARTIFACT_DIR} and the kubectl command expansions where appropriate to follow best practices (retain variable names KUBECTL_CMD, ARTIFACT_DIR, seedRetVal, and sippy-seed-job for locating the spots).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@e2e-scripts/sippy-e2e-sippy-e2e-setup-commands.sh`:
- Around line 329-338: The variable seed_pod is used unquoted when passed to
kubectl and describe commands, risking word-splitting; update usages of
${seed_pod} to be quoted (e.g., "${seed_pod}") wherever it's expanded (lines
calling ${KUBECTL_CMD} -n sippy-e2e logs ${seed_pod} and describe pod
${seed_pod}), and also quote other expansions like ${ARTIFACT_DIR} and the
kubectl command expansions where appropriate to follow best practices (retain
variable names KUBECTL_CMD, ARTIFACT_DIR, seedRetVal, and sippy-seed-job for
locating the spots).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 2699b215-65fb-4dc5-aeef-bac7f0417ee8
📒 Files selected for processing (6)
- cmd/sippy/seed_data.go
- e2e-scripts/sippy-e2e-sippy-e2e-setup-commands.sh
- e2e-scripts/sippy-e2e-sippy-e2e-test-commands.sh
- scripts/dev-setup.sh
- scripts/e2e.sh
- test/e2e/util/e2erequest.go
✅ Files skipped from review due to trivial changes (2)
- test/e2e/util/e2erequest.go
- cmd/sippy/seed_data.go
Scheduling required tests:
@stbenjam: The following test failed, say
Full PR test history. Your PR dashboard.
Details
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Summary
- `dataprovider.DataProvider` interface for Component Readiness, enabling local development and e2e testing without BigQuery
- `scripts/e2e.sh` to use the postgres provider and seed data, making GCS credentials optional
- `scripts/dev-setup.sh` and `make dev` target to stand up a seeded PostgreSQL + Redis environment for local development (optionally starts the sippy server with `--serve`/`SERVE=1`)

NOTE: We are not at a place where we could run real component readiness against prod's database, it is too much data, but postgres with seeded data allows extremely rapid deployment against controlled data. Should make the dev experience nicer, and Claude more successful once I add more tests in a follow-up PR.
Test plan
- `make dev` seeds database and prints connection info
- `make dev SERVE=1` seeds and starts sippy server
- `make e2e` passes with postgres data provider

🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Chores