chore(deps): update ghcr.io/cloudnative-pg/cloudnative-pg docker tag to v1.30.0#56
Open
renovate[bot] wants to merge 1 commit into
This PR contains the following updates:
1.29.0 → 1.30.0
Warning
Some dependencies could not be looked up. Check the Dependency Dashboard for more information.
Release Notes
cloudnative-pg/cloudnative-pg (ghcr.io/cloudnative-pg/cloudnative-pg)
v1.30.0
Release date: Jun 29, 2026
Important changes
Updated the deprecation notice for native (in-tree) Barman Cloud support to reflect that it will now be removed in CloudNativePG 1.31.0, rather than 1.30.0. Users are still encouraged to migrate to the Barman Cloud Plugin. (#11083)
The `cluster` reference is now immutable on the `Database`, `Pooler`, `Publication`, `Subscription`, and `ScheduledBackup` resources. Pointing one of these objects at a different cluster has no well-defined semantics and previously left the controllers in an inconsistent state; the update is now rejected at the API server via a CEL validation rule. (#10743)
Features
Primary `Lease` for safe primary election: introduced a Kubernetes `Lease` object (named after the cluster) that acts as a mutex serializing primary promotion: the instance manager must hold the lease before acting as primary and releases it on clean shutdown so replicas can promote without waiting for the full TTL. Timings are configurable via the new `.spec.primaryLease` stanza. The lease is a promotion gate, not a fence. Primary isolation remains responsible for fencing. (#10627)
`DatabaseRole` CRD for declarative role management: introduced a `DatabaseRole` custom resource that manages a PostgreSQL role as a standalone Kubernetes object, instead of declaring it inline in the `Cluster`'s `.spec.managed.roles` stanza. Each role gets its own lifecycle, status, and RBAC, which suits GitOps workflows and lets role definitions live next to the applications that own them. The spec reuses the same `RoleConfiguration` structure as the inline method, so migrating a role is a matter of moving the stanza into its own manifest. A `databaseRoleReclaimPolicy` field (`retain`, the default, or `delete`) controls what happens to the role when the resource is deleted, mirroring persistent volumes. (#6155)
TLS client certificates for declarative roles: a `DatabaseRole` can now include a `clientCertificate` block to have the operator automatically generate and renew a TLS client certificate, signed by the cluster's client CA and stored in a `<databaserole-name>-client-cert` Secret. This enables password-free PostgreSQL `cert` authentication; the Secret is cleaned up when the feature is disabled or the `DatabaseRole` is deleted. (#10896)
PgBouncer image management via image catalogs: the `Pooler` resource can now reference an entry in an `ImageCatalog` or `ClusterImageCatalog` through the new `spec.pgbouncer.imageCatalogRef` field, centralizing PgBouncer image management. When a catalog entry is updated, all referencing `Poolers` are automatically reconciled and roll out the new image without any change to their spec. The resolved image is reported in `status.image`, and a new `status.phase` (`active`, `paused`, `inactive`, or `failed`), also surfaced as a `Phase` column in `kubectl get pooler`, summarizes the lifecycle. (#10568)
Enhancements
Enabled `pg_upgrade` in-place major upgrades to PostgreSQL 19 or later for clusters that use Image Volume extensions, building on the extension-path support added to `pg_upgrade` in PostgreSQL 19. During the upgrade `Job`, the source- and target-version extension images are mounted side by side, so the old server keeps its libraries and a failed upgrade reverts cleanly. (#10366)
Added TLS support for the `Pooler` metrics endpoint via `.spec.monitoring.tls.enabled`. When enabled, the metrics server is served over HTTPS, reusing the certificate and key from `.spec.pgbouncer.clientTLSSecret` and reloading it on every handshake to support rotation without a restart; the generated `PodMonitor` scrapes over `https` accordingly. (#10466)
Added a label selector to the `Cluster` scale subresource (`status.selector`), making a `Cluster` a valid `targetRef` for the Vertical Pod Autoscaler (VPA) and Horizontal Pod Autoscaler (HPA), which can now map a `Cluster` to its instance pods. Contributed by @sebv004. (#8996)
The operator now emits a `Warning` `PrimaryStatusCheckFailed` event on the `Cluster` when the primary pod is `Ready` from the kubelet perspective but the operator's `/pg/status` check fails and failover is deferred, giving users visibility into the deferral via `kubectl describe cluster`. (#10509)
Added the `ENABLE_WEBHOOK_NAMESPACE_SUFFIX` flag, which suffixes the operator's webhook configuration names with `-<OPERATOR_NAMESPACE>` so that multiple operator instances can coexist on the same cluster. The operator only looks up these configurations; users must create and maintain them. Contributed by @maxlengdell. (#10420)
The operator now reloads a CNPG-i plugin automatically when its pods are rolled: it watches the `EndpointSlices` backing plugin `Services` and re-enqueues every cluster using the plugin once the new pods become `Ready`, so an upgraded plugin is picked up without waiting for the next resync. (#10836)
Instance serial numbers are now assigned by reusing the lowest free slot among existing instance names, instead of always incrementing a global counter. Pod and PVC names stay stable across instance recreation (for example, an instance recreated after a node drain comes back with the same name), and serials freed by deleted instances are reclaimed. A new `Initialized` cluster condition reports whether the cluster has completed its first bootstrap, and `status.latestGeneratedNode` is deprecated: it is no longer written, but is preserved on the CRD for backward compatibility. (#10548)
Defaulting and validation now run during reconciliation as a fallback when admission webhooks are unavailable, or configured to ignore failures, so the operator no longer reconciles invalid or incomplete specs. Missing defaults are applied directly, and validation failures are surfaced in the resource status instead of failing silently later. (#10874)
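The lowest-free-slot serial assignment described in the enhancements above can be pictured with a small sketch. This is an illustration only, not the operator's code (the operator is written in Go): the function name `next_free_serial` is hypothetical, and the `<cluster>-<serial>` naming with serials starting at 1 is an assumption based on the `-1`, `-3` suffixes mentioned in these notes.

```python
import re


def next_free_serial(cluster_name: str, instance_names: list[str]) -> int:
    """Return the lowest positive serial not used by any existing instance name.

    Assumes instances are named "<cluster_name>-<serial>" (an assumption made
    for this sketch); names that do not match the pattern are ignored.
    """
    pattern = re.compile(rf"^{re.escape(cluster_name)}-(\d+)$")
    used = set()
    for name in instance_names:
        match = pattern.match(name)
        if match:
            used.add(int(match.group(1)))

    # Walk upward from 1 until a free slot is found, so a serial freed by a
    # deleted instance is reclaimed before any new, higher number is handed out.
    serial = 1
    while serial in used:
        serial += 1
    return serial


# Instance 2 was deleted, so its slot is reused instead of jumping to 4.
print(next_free_serial("pg", ["pg-1", "pg-3"]))  # 2
```

Compared with a monotonically increasing counter, this keeps Pod and PVC names stable when an instance is recreated, at the cost of scanning the existing names on each assignment.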
Security
CVE-2026-55769/GHSA-x8c2-3p4r-v9r6: `search_path` pinning on operator-issued connections: a database owner could plant overloaded built-in operators in the `public` schema and alter the `search_path` so that operator introspection probes, running as the cluster superuser, resolved those overloads before `pg_catalog`, a CWE-426 privilege-escalation chain (same class as CVE-2018-1058) that could lead to in-pod RCE via `COPY ... FROM PROGRAM`. The operator now pins `search_path = pg_catalog, public, pg_temp` on every pooled connection so it ships in the startup message and takes precedence over tenant-controlled defaults. (#10774, GHSA-x8c2-3p4r-v9r6)
GHSA-7qwx-x8ff-3px9: authenticated operator-to-instance-manager calls: the instance manager's remote webserver relied on network isolation rather than authentication for its operator-only control endpoints, so any party able to reach the pod's status port could invoke them, disrupting backup orchestration and WAL archival and reading operational metadata. (The upgrade endpoint is SHA-256-pinned, so this did not permit arbitrary code execution.) The operator now generates an in-memory ECDSA P-256 client certificate at startup and reconciles its SHA-256 fingerprint into the cluster status; the instance manager rejects requests to sensitive endpoints that do not present a matching certificate. This hardening is not backported; earlier releases should continue to restrict the status port with a `NetworkPolicy`. (#10579, GHSA-7qwx-x8ff-3px9)
CVE-2026-55765/GHSA-w3gf-xc94-wvmj: operator-side SCRAM-SHA-256 password encoding: the operator now SCRAM-SHA-256 encodes cleartext role passwords before issuing `CREATE/ALTER ROLE ... PASSWORD`, so the literal PostgreSQL parses (and that extensions such as `pg_stat_statements` or `pgaudit` may capture) is the SCRAM verifier rather than the cleartext secret. Pre-hashed (MD5 or SCRAM) values are forwarded unchanged, and the per-Secret annotation `cnpg.io/passwordPassthrough: "enabled"` opts out. (#10724, GHSA-w3gf-xc94-wvmj)
Changes
Added support for Kubernetes 1.36. (#10900)
Updated the default PostgreSQL version to 18.4. (#10719)
Updated the Kubernetes versions used to test the operator on public cloud providers. (#10720, #10563, #11033)
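For the certificate-fingerprint check described under GHSA-7qwx-x8ff-3px9 in the Security notes above, the core idea of pinning a client certificate by its SHA-256 fingerprint can be sketched as follows. This is a simplified illustration, not the operator's implementation: the function names are hypothetical, and representing the fingerprint as a lowercase hex string is an assumption, since the notes do not state the encoding.

```python
import hashlib
import hmac


def sha256_fingerprint(cert_der: bytes) -> str:
    """Hex-encoded SHA-256 digest of a DER-encoded certificate."""
    return hashlib.sha256(cert_der).hexdigest()


def is_trusted_client(presented_cert_der: bytes, expected_fingerprint: str) -> bool:
    """Accept a request only if the presented certificate matches the pinned fingerprint.

    hmac.compare_digest performs a constant-time comparison, which avoids
    leaking how many leading characters of the fingerprint matched.
    """
    presented = sha256_fingerprint(presented_cert_der)
    return hmac.compare_digest(presented, expected_fingerprint.lower())


# Placeholder bytes stand in for real DER-encoded certificates.
operator_cert = b"placeholder-operator-certificate"
pinned = sha256_fingerprint(operator_cert)
print(is_trusted_client(operator_cert, pinned))           # True
print(is_trusted_client(b"some-other-cert", pinned))      # False
```

Pinning a fingerprint avoids distributing a CA for a certificate that only lives in the operator's memory; the trade-off is that the pinned value must be refreshed whenever the certificate is regenerated.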
Fixes
Fixed declarative `Database`, `Publication`, and `Subscription` objects reporting a stale primary-side status forever after their cluster was demoted to a replica; the controller now re-checks the replica condition and watches the `Cluster` so a demotion is detected promptly. (#10871)
Fixed non-sequential pod names (for example `-1`, `-3`) caused by the instance serial counter being advanced before the corresponding `Job` and PVCs were created; the bump is now persisted only after those resources exist. (#10491)
Fixed deletion of a `Database`, `Publication`, or `Subscription` getting stuck in `Terminating` on a replica cluster, where the replica gate ran before the finalizer reconciler and the finalizer was never released. On a replica the PostgreSQL object is left to the primary cluster. (#10853)
Fixed a conflicting duplicate `Database` or `Subscription` with a `delete` reclaim policy dropping the PostgreSQL object owned by the surviving CR; the drop is now gated on a recorded reconciliation. (#10870)
Fixed the `postgres` superuser being left locked out after superuser access was disabled and then re-enabled, because the cached secret version was not invalidated and the password was never re-applied. Diagnosed by @mhartmann-jaconi. (#10834)
Fixed backups getting stuck in the `started` phase when the instance manager running them was restarted (for example by the in-place upgrade following an operator upgrade) before the backup reached `running`; the reconciliation is now rescheduled so the lost session is detected. (#10859)
Fixed `exec`/`attach` streaming to negotiate WebSocket with a SPDY fallback, restoring compatibility both with Kubernetes versions that have removed SPDY and with platforms such as OpenShift that reject WebSocket exec upgrades. Contributed by @bartscheers. (#10876, #10933)
Fixed resource leaks when concurrent `Backup` objects raced: backups now run in strict creation-time order, so an already-executing backup is never preempted by a newer one and its replication slot and PostgreSQL session are no longer orphaned on the primary. Contributed by @GabriFedi97. (#10747)
Fixed role reconciliation clearing the password on a PostgreSQL role when the referenced Secret could not be fetched; the role is now left untouched until the Secret becomes available, and per-action errors are aggregated for better visibility. (#10053)
Fixed a bootstrap failure where a metrics-exporter setup error (commonly a duplicate-key race with the controller) rolled back `streaming_replica` creation and wedged replica joins. The metrics-exporter step now runs in a separate transaction. Contributed by @BlaiseAntony. (#10749)
Fixed a `ScheduledBackup` controller loop that occurred when a `Backup` was created but its status patch never landed; the controller now adopts an existing `Backup` for the next iteration instead of looping on `AlreadyExists`. (#10612)
Fixed a nil-pointer panic when reconciling a `Pooler` whose `Cluster` has been deleted. (#10667)
Fixed bootstrap log handling so that all named log pipes (`postgres`, `postgres.csv`, and `postgres.json`) get consumers during `WithActiveInstance`, preventing regular files from being created in place of the named pipes. (#10043)
Fixed generation of invalid IPv6 URLs by wrapping the address in square brackets. Contributed by @Infinoid. (#10682)
Fixed an external cluster plugin still being treated as active when its configuration set `enabled: false`. (#10932)
Fixed a race during bootstrap recovery from an object store where the restore job could read a stale `Cluster` (primary not yet recorded and timeline still unset) and have its `.history` files rejected by the split-brain guard. When this happened, recovery stopped at the base backup's timeline and silently dropped transactions committed on later timelines. History files are now allowed while the cluster timeline is unset. Contributed by @dennispidun. (#10818)
Fixed a race where deleting an instance's PVCs could leave the instance permanently stuck: if the data PVC was removed while a WAL PVC was still terminating, the operator recreated the instance bound to the terminating volume, leaving the Pod unschedulable and blocking all further reconciliation. The operator now waits for terminating PVCs to be fully removed before recreating or reattaching an instance, and surfaces the wait through a log line and the cluster phase. (#11017)
Fixed a cache race during cluster creation when the server and client CA resolve to the same Secret (the default): a stale informer cache triggered a redundant `Create` that failed with `AlreadyExists` and could leave the cluster stuck in `Unable to create required cluster objects`. The operator now reuses the already-fetched CA Secret when the names match. (#10989)
Fixed the `pg_basebackup` bootstrap path overwriting or failing on a pre-existing `PGDATA` (for example after a replica Pod restart) by enforcing the same pre-flight directory check already applied by the other bootstrap methods; this also protects statically provisioned PVCs from being silently overwritten. (#11006)
Fixed the `Cluster` phase flapping between `Healthy` and a plugin-failure phase when a post-reconcile plugin hook returned an error; the `Healthy` phase is now registered as the last step of a successful reconciliation, so a loop that ends in a plugin error never reports `Healthy`. Contributed by @GabriFedi97. (#10421)
Fixed stale certificate data and partial reads after an external server's Secret was rotated (for example a CA bundle shrinking from two certificates to one): the file is now written atomically, so libpq always reads either the old or the new value, never a mix. Contributed by @Anand-240. (#10975)
Fixed plugin connectivity to use the plugin `Service` FQDN instead of its short name, avoiding failures when a cluster-level proxy is automatically injected into pods. Contributed by @kdautrey. (#10921)
Fixed excessive operator log noise from the per-request `Cluster` create/update validation webhook messages, now logged at `debug` instead of `info`. (#10984)
Fixed `spec.postgresql.parameters` accepting keys that are not valid PostgreSQL parameter names, which could inject arbitrary directives into `postgresql.conf`; key names are now validated by the webhook. (#11029)
Fixed a switchover deadlock when a WAL-archiver plugin was enabled on an existing cluster: with `primaryUpdateMethod: switchover` the primary could not be rolled out because a clean demotion needs the archiver sidecar that is still missing. The operator now recreates the primary Pod in place so the sidecar is injected and archiving resumes. The check also covers plugins that inject the archiver as a native sidecar (an init container with `restartPolicy: Always`), such as the Barman Cloud plugin. (#11032, #11059)
Fixed a cluster staying in `Setting up primary` indefinitely when the instance-creation Job exhausted its backoff limit; the operator now detects the terminal Job failure and marks the cluster unrecoverable, naming the failed Job and pointing to its logs. (#11035)
Fixed a first-primary bootstrap deadlock where a status-patch conflict after the data PVC was created but before the initialization Job was started left the orphan Pending PVC counted as an instance, blocking the bootstrap gate; the PVC-state reconciler now recreates the bootstrap Job reusing the assigned serial. (#11039)
Fixed external cluster names and secret selector references being joined into filesystem paths without validation, letting a `..` component or path separator escape the external secrets directory when the instance manager dumps connection material; these values are now rejected at the validating webhook and re-checked at the write site. Reported by @r0binak. (#11045)
Fixed a backup getting stuck in `pending` forever: the concurrent-backup gate ran on every reconcile and could overwrite an already-completed phase written asynchronously by the instance manager. The gate now runs only while the backup phase is still unset or `pending`. (#11056)
Fixed a declarative `VolumeSnapshot` backup being permanently marked as failed when a stale cache made the operator re-create a snapshot it had already provisioned, failing with `AlreadyExists`. The operator now tolerates the collision when the existing snapshot carries this backup's label and adopts it; a collision with a foreign snapshot still surfaces as an error. (#11071)
Fixed a volume snapshot backup being discarded on a transient instance-manager connection error (for example a dial timeout from a brief pod-network disruption) during the finalize step, even when its snapshots were already provisioned; such network errors are now retried instead of treated as terminal. (#11069)
Fixed a replica switchover losing its `status.demotionToken` when a reconcile was requeued between storing the token and cleaning up the transition metadata (for example a cleanup patch failing against a flaky webhook); the empty no-change token is no longer patched back over the stored value. (#11075)
`cnpg` plugin:
Fixed `kubectl cnpg psql` on Windows, where execution relied on a Unix-only system call and failed with "not supported by windows"; Windows now launches `kubectl exec` as a child process. Contributed by @Utkarsh-sharma47. (#10972)
Fixed an unbounded memory leak in `kubectl cnpg logs -f` on busy clusters, where a per-log-group timer was never released; timers are now reused across iterations. Contributed by @Anand-240. (#10976)
Supported versions
v1.29.2
Release date: Jun 29, 2026
Important changes
Updated the deprecation notice for native (in-tree) Barman Cloud support to reflect that it will now be removed in CloudNativePG 1.31.0, rather than 1.30.0. Users are still encouraged to migrate to the Barman Cloud Plugin. (#11083)
The `cluster` reference is now immutable on the `Database`, `Pooler`, `Publication`, `Subscription`, and `ScheduledBackup` resources. Pointing one of these objects at a different cluster has no well-defined semantics and previously left the controllers in an inconsistent state; the update is now rejected at the API server via a CEL validation rule. (#10743)
Enhancements
Enabled `pg_upgrade` in-place major upgrades to PostgreSQL 19 or later for clusters that use Image Volume extensions, building on the extension-path support added to `pg_upgrade` in PostgreSQL 19. During the upgrade `Job`, the source- and target-version extension images are mounted side by side, so the old server keeps its libraries and a failed upgrade reverts cleanly. (#10366)
Added a label selector to the `Cluster` scale subresource (`status.selector`), making a `Cluster` a valid `targetRef` for the Vertical Pod Autoscaler (VPA) and Horizontal Pod Autoscaler (HPA), which can now map a `Cluster` to its instance pods. Contributed by @sebv004. (#8996)
The operator now emits a `Warning` `PrimaryStatusCheckFailed` event on the `Cluster` when the primary pod is `Ready` from the kubelet perspective but the operator's `/pg/status` check fails and failover is deferred, giving users visibility into the deferral via `kubectl describe cluster`. (#10509)
The operator now reloads a CNPG-i plugin automatically when its pods are rolled: it watches the `EndpointSlices` backing plugin `Services` and re-enqueues every cluster using the plugin once the new pods become `Ready`, so an upgraded plugin is picked up without waiting for the next resync. (#10836)
Security and Supply Chain
CVE-2026-55769/GHSA-x8c2-3p4r-v9r6: `search_path` pinning on operator-issued connections: a database owner could plant overloaded built-in operators in the `public` schema and alter the `search_path` so that operator introspection probes, running as the cluster superuser, resolved those overloads before `pg_catalog`, a CWE-426 privilege-escalation chain (same class as CVE-2018-1058) that could lead to in-pod RCE via `COPY ... FROM PROGRAM`. The operator now pins `search_path = pg_catalog, public, pg_temp` on every pooled connection so it ships in the startup message and takes precedence over tenant-controlled defaults. (#10774, GHSA-x8c2-3p4r-v9r6)
CVE-2026-55765/GHSA-w3gf-xc94-wvmj: operator-side SCRAM-SHA-256 password encoding: the operator now SCRAM-SHA-256 encodes cleartext role passwords before issuing `CREATE/ALTER ROLE ... PASSWORD`, so the literal PostgreSQL parses (and that extensions such as `pg_stat_statements` or `pgaudit` may capture) is the SCRAM verifier rather than the cleartext secret. Pre-hashed (MD5 or SCRAM) values are forwarded unchanged, and the per-Secret annotation `cnpg.io/passwordPassthrough: "enabled"` opts out. (#10724, GHSA-w3gf-xc94-wvmj)
Changes
Added support for Kubernetes 1.36. (#10900)
Updated the default PostgreSQL version to 18.4. (#10719)
Updated the Kubernetes versions used to test the operator on public cloud providers. (#10720, #10563, #11033)
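The SCRAM-SHA-256 password encoding noted in the security section above amounts to deriving a verifier client-side, so the cleartext never appears in the SQL statement. The sketch below is not the operator's code: the function names are hypothetical, the 4096-iteration count and 16-byte salt are the usual PostgreSQL defaults taken as assumptions, SASLprep normalization of the password is skipped, and the pre-hashed detection is a simplified prefix/pattern check.

```python
import base64
import hashlib
import hmac
import os
import re
from typing import Optional

# Simplified pattern for an MD5-hashed PostgreSQL password ("md5" + 32 hex chars).
_MD5_RE = re.compile(r"^md5[0-9a-f]{32}$")


def scram_sha256_verifier(password: str, salt: Optional[bytes] = None,
                          iterations: int = 4096) -> str:
    """Build a verifier in the form PostgreSQL stores:
    SCRAM-SHA-256$<iterations>:<salt>$<StoredKey>:<ServerKey>
    """
    salt = os.urandom(16) if salt is None else salt
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()

    def b64(raw: bytes) -> str:
        return base64.b64encode(raw).decode("ascii")

    return f"SCRAM-SHA-256${iterations}:{b64(salt)}${b64(stored_key)}:{b64(server_key)}"


def encode_role_password(value: str) -> str:
    """Forward pre-hashed values unchanged; encode cleartext as a SCRAM verifier."""
    if value.startswith("SCRAM-SHA-256$") or _MD5_RE.match(value):
        return value
    return scram_sha256_verifier(value)


verifier = encode_role_password("s3cret")
print(verifier.split("$")[0])                     # SCRAM-SHA-256
print(encode_role_password(verifier) == verifier) # True: already hashed, passed through
```

The resulting string can be used as the password literal in `ALTER ROLE ... PASSWORD`, which is why statement-capturing extensions would then see the verifier rather than the secret.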
Fixes
Fixed `spec.postgresql.parameters` accepting keys that are not valid PostgreSQL parameter names, which could inject arbitrary directives into `postgresql.conf`; key names are now validated by the webhook. (#11029)
Fixed declarative `Database`, `Publication`, and `Subscription` objects reporting a stale primary-side status forever after their cluster was demoted to a replica; the controller now re-checks the replica condition and watches the `Cluster` so a demotion is detected promptly. (#10871)
Fixed non-sequential pod names (for example `-1`, `-3`) caused by the instance serial counter being advanced before the corresponding `Job` and PVCs were created; the bump is now persisted only after those resources exist. (#10491)
Fixed a switchover deadlock when a WAL-archiver plugin was enabled on an existing cluster: with `primaryUpdateMethod: switchover` the primary could not be rolled out because a clean demotion needs the archiver sidecar that is still missing. The operator now recreates the primary Pod in place so the sidecar is injected and archiving resumes. The check also covers plugins that inject the archiver as a native sidecar (an init container with `restartPolicy: Always`), such as the Barman Cloud plugin. (#11032, #11059)
Fixed a cluster staying in `Setting up primary` indefinitely when the instance-creation Job exhausted its backoff limit; the operator now detects the terminal Job failure and marks the cluster unrecoverable, naming the failed Job and pointing to its logs. (#11035)
Fixed deletion of a `Database`, `Publication`, or `Subscription` getting stuck in `Terminating` on a replica cluster, where the replica gate ran before the finalizer reconciler and the finalizer was never released. On a replica the PostgreSQL object is left to the primary cluster. (#10853)
Fixed a conflicting duplicate `Database` or `Subscription` with a `delete` reclaim policy dropping the PostgreSQL object owned by the surviving CR; the drop is now gated on a recorded reconciliation. (#10870)
Fixed the `postgres` superuser being left locked out after superuser access was disabled and then re-enabled, because the cached secret version was not invalidated and the password was never re-applied. Diagnosed by @mhartmann-jaconi. (#10834)
Fixed backups getting stuck in the `started` phase when the instance manager running them was restarted (for example by the in-place upgrade following an operator upgrade) before the backup reached `running`; the reconciliation is now rescheduled so the lost session is detected. (#10859)
Fixed `exec`/`attach` streaming to negotiate WebSocket with a SPDY fallback, restoring compatibility both with Kubernetes versions that have removed SPDY and with platforms such as OpenShift that reject WebSocket exec upgrades. Contributed by @bartscheers. (#10876, #10933)
Fixed resource leaks when concurrent `Backup` objects raced: backups now run in strict creation-time order, so an already-executing backup is never preempted by a newer one and its replication slot and PostgreSQL session are no longer orphaned on the primary. Contributed by @GabriFedi97. (#10747)
Fixed role reconciliation clearing the password on a PostgreSQL role when the referenced Secret could not be fetched; the role is now left untouched until the Secret becomes available, and per-action errors are aggregated for better visibility. (#10053)
Fixed a bootstrap failure where a metrics-exporter setup error (commonly a duplicate-key race with the controller) rolled back `streaming_replica` creation and wedged replica joins. The metrics-exporter step now runs in a separate transaction. Contributed by @BlaiseAntony. (#10749)
Fixed a `ScheduledBackup` controller loop that occurred when a `Backup` was created but its status patch never landed; the controller now adopts an existing `Backup` for the next iteration instead of looping on `AlreadyExists`. (#10612)
Fixed a nil-pointer panic when reconciling a `Pooler` whose `Cluster` has been deleted. (#10667)
Fixed bootstrap log handling so that all named log pipes (`postgres`, `postgres.csv`, and `postgres.json`) get consumers during `WithActiveInstance`, preventing regular files from being created in place of the named pipes. (#10043)
Fixed generation of invalid IPv6 URLs by wrapping the address in square brackets. Contributed by @Infinoid. (#10682)
Fixed an external cluster plugin still being treated as active when its configuration set `enabled: false`. (#10932)
Fixed a race during bootstrap recovery from an object store where the restore job could read a stale `Cluster` (primary not yet recorded and timeline still unset) and have its `.history` files rejected by the split-brain guard. When this happened, recovery stopped at the base backup's timeline and silently dropped transactions committed on later timelines. History files are now allowed while the cluster timeline is unset. Contributed by @dennispidun. (#10818)
Fixed a cache race during cluster creation when the server and client CA resolve to the same Secret (the default): a stale informer cache triggered a redundant `Create` that failed with `AlreadyExists` and could leave the cluster stuck in `Unable to create required cluster objects`. The operator now reuses the already-fetched CA Secret when the names match. (#10989)
Fixed the `pg_basebackup` bootstrap path overwriting or failing on a pre-existing `PGDATA` (for example after a replica Pod restart) by enforcing the same pre-flight directory check already applied by the other bootstrap methods; this also protects statically provisioned PVCs from being silently overwritten. (#11006)
Fixed the `Cluster` phase flapping between `Healthy` and a plugin-failure phase when a post-reconcile plugin hook returned an error; the `Healthy` phase is now registered as the last step of a successful reconciliation, so a loop that ends in a plugin error never reports `Healthy`. Contributed by @GabriFedi97. (#10421)
Fixed stale certificate data and partial reads after an external server's Secret was rotated (for example a CA bundle shrinking from two certificates to one): the file is now written atomically, so libpq always reads either the old or the new value, never a mix. Contributed by @Anand-240. (#10975)
Fixed plugin connectivity to use the plugin `Service` FQDN instead of its short name, avoiding failures when a cluster-level proxy is automatically injected into pods. Contributed by @kdautrey. (#10921)
Fixed excessive operator log noise from the per-request `Cluster` create/update validation webhook messages, now logged at `debug` instead of `info`. (#10984)
Fixed a first-primary bootstrap deadlock where a status-patch conflict after the data PVC was created but before the initialization Job was started left the orphan Pending PVC counted as an instance, blocking the bootstrap gate; the PVC-state reconciler now recreates the bootstrap Job reusing the assigned serial. (#11039)
Fixed external cluster names and secret selector references being joined into filesystem paths without validation, letting a `..` component or path separator escape the external secrets directory when the instance manager dumps connection material; these values are now rejected at the validating webhook and re-checked at the write site. Reported by @r0binak. (#11045)
Fixed a backup getting stuck in `pending` forever: the concurrent-backup gate ran on every reconcile and could overwrite an already-completed phase written asynchronously by the instance manager. The gate now runs only while the backup phase is still unset or `pending`. (#11056)
Fixed a declarative `VolumeSnapshot` backup being permanently marked as failed when a stale cache made the operator re-create a snapshot it had already provisioned, failing with `AlreadyExists`. The operator now tolerates the collision when the existing snapshot carries this backup's label and adopts it; a collision with a foreign snapshot still surfaces as an error. (#11071)
Fixed a volume snapshot backup being discarded on a transient instance-manager connection error (for example a dial timeout from a brief pod-network disruption) during the finalize step, even when its snapshots were already provisioned; such network errors are now retried instead of treated as terminal. (#11069)
Fixed a replica switchover losing its `status.demotionToken` when a reconcile was requeued between storing the token and cleaning up the transition metadata (for example a cleanup patch failing against a flaky webhook); the empty no-change token is no longer patched back over the stored value. (#11075)
`cnpg` plugin:
Fixed `kubectl cnpg psql` on Windows, where execution relied on a Unix-only system call and failed with "not supported by windows"; Windows now launches `kubectl exec` as a child process. Contributed by @Utkarsh-sharma47. (#10972)
Fixed an unbounded memory leak in `kubectl cnpg logs -f` on busy clusters, where a per-log-group timer was never released; timers are now reused across iterations. Contributed by @Anand-240. (#10976)
v1.29.1
Release date: May 8, 2026
Security and Supply Chain
CVE-2026-44477/GHSA-423p-g724-fr39: metrics exporter privilege escalation: the metrics exporter no longer authenticates as the `postgres` superuser. It now uses a dedicated `cnpg_metrics_exporter` role with `pg_monitor` privileges only, closing a chain that let a low-privilege database user gain PostgreSQL superuser. (GHSA-423p-g724-fr39)
Upgrade impact: custom monitoring queries that read user-owned tables, or use `target_databases: '*'` against databases where `PUBLIC CONNECT` has been revoked, need explicit `GRANT` statements to `cnpg_metrics_exporter`. See "Custom query privileges and safety" and "Manually creating the metrics exporter role" in the monitoring documentation.
For replica clusters, upgrade the source primary cluster before any replica clusters that consume from it. The `cnpg_metrics_exporter` role is created on the source primary and replicates downstream; a replica cluster upgraded first will scrape against a missing role until the source primary upgrades. The manual-recovery section linked above also covers replica clusters.
Schema-qualified catalog references in default monitoring queries: hardened the shipped monitoring configuration and documentation samples by qualifying every `pg_catalog` object explicitly. Unqualified references resolve through `search_path`, which a database user can manipulate to shadow built-in objects. (#10576)
Discoverable SBOM and provenance attestations: SBOM and SLSA provenance attached to operator container images now follow the OCI 1.1 Referrers spec, so standard registry tooling and supply-chain scanners can discover them automatically. (#10601)
CVE remediation in `github.com/jackc/pgx/v5`: bumped to v5.9.2 to pick up upstream fixes for CVE-2026-33816 (memory-safety in `pgproto3`) and GHSA-j88v-2chj-qfwx (SQL injection via simple-protocol dollar-quoted string handling). (#10437, #10499)
CVE remediation in the Go runtime: built with Go 1.26.3 to pick up upstream fixes in `crypto/x509`, `crypto/tls`, `net/http`, and `net` (CVE-2026-32280, CVE-2026-32281, CVE-2026-33810, CVE-2026-33814, CVE-2026-33811, CVE-2026-39825). (#10463, #10647)
Build pipeline hardening: the Go 1.26.3 bump also addresses CVE-2026-42501 (`cmd/go` module-checksum validation), reducing supply-chain exposure during release builds. The affected code paths are not reachable from the running operator. (#10647)
Changes
[…] `VerifyPeerCertificate` to `VerifyConnection`, which runs on every completed handshake (the former is skipped on resumed TLS 1.3 sessions). Session resumption is not enabled in CloudNativePG today, so this has no observable effect, but it future-proofs verification if session caching is introduced later. (#10478)
Fixes
Fixed a failover window where the former primary kept its primary label. If it returned during failover (for example, after a transient network partition), the `-rw` service kept routing to it, replicas could reconnect, and committed writes were lost to `pg_rewind`. The old primary is now labeled `unhealthy` to isolate it from service traffic during failover. (#10409)
Fixed failover not being triggered when the node hosting the primary becomes unreachable. The operator now reads the pod's `Ready` condition (flipped to `False` by the node controller when the kubelet stops reporting) instead of `ContainersReady`, which stays stale as `True` in that scenario. Combined with the spurious-failover guard (#10445), failover triggers only when Kubernetes itself marks the pod not Ready. (#10448)
Fixed spurious failovers caused by transient failures on the primary's HTTP status endpoint. (#10445)
Fixed escaping of backslashes and control characters in PostgreSQL configuration values. Previously, such characters in parameters like `log_line_prefix` could corrupt the configuration file or be silently stripped at runtime. (#10515)
Fixed `restore_command` construction to shell-quote each argument. Values such as a `destinationPath` containing whitespace (for example, `s3://my bucket/wal`) were word-split by the POSIX shell and passed to the WAL restore tool as separate arguments. (#10518)
Tightened `recoveryTarget` validation in the admission webhook: `targetXID` must now be a non-negative 32-bit integer, and `targetName` must be shorter than 64 bytes and free of ASCII control characters. Malformed values are rejected at admission instead of failing later during PostgreSQL recovery. (#10565)
Fixed snapshot restores failing when leftover `pgsql_tmp*` directories were present in the data directory. (#10447)
Fixed a deadlock occurring when PVC storage size and resource requests are changed simultaneously. (#10427)
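The `restore_command` word-splitting problem fixed in this release is easy to reproduce in isolation. The sketch below is only an illustration of per-argument shell quoting, not the operator's Go code: the tool path and the `--destination` flag are hypothetical placeholders, and only the `s3://my bucket/wal` value comes from the notes above.

```python
import shlex


def build_restore_command(tool: str, args: list[str]) -> str:
    """Join a command line for a POSIX shell, quoting every argument separately."""
    return " ".join(shlex.quote(part) for part in [tool, *args])


# Hypothetical tool name and flag, used only to show the effect of quoting.
tool = "/usr/local/bin/wal-restore-tool"
args = ["--destination", "s3://my bucket/wal"]

naive = " ".join([tool, *args])
quoted = build_restore_command(tool, args)

# Without quoting, the shell sees four words: the path is split at the space.
print(len(shlex.split(naive)))   # 4
# With quoting, the shell sees the original three words.
print(len(shlex.split(quoted)))  # 3
```

Quoting each argument individually, rather than the command line as a whole, keeps argument boundaries intact however many special characters a value contains.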
Configuration
📅 Schedule: (UTC)
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.