pull from upstream by fvalle1 · Pull Request #3 · Elemento-Modular-Cloud/kops

fvalle1 · 2026-05-15T14:09:33Z

No description provided.

fvalle1 · 2026-05-15T14:10:33Z

@Paolo-Beci @iliy27 let's rebase this :)

Adds a ChannelsBuilder that emits /etc/kubernetes/manifests/kops-channels.manifest. The pod runs one container per channel URL on a 60s interval; the bootstrap-channel container additionally patches the local node with control-plane labels via --bootstrap-node-labels and the downward API. The pod is system-node-critical because it owns the labels addons target for scheduling, and uses hostNetwork so VFS can reach the cloud metadata service before CNI is up. At this commit the static pod and protokube both apply channels in parallel; that is safe because apply is idempotent via manifest-hash annotations. The protokube side is removed in the next commit.

Now that the kops-channels static pod owns both responsibilities, drop the protokube-side reconciliation: the channels exec wrapper, the --channels and --node-name flags, the labeler call, and the host-side install of /opt/kops/bin/channels in the nodeup builder. The KubeBoot struct sheds Channels and NodeName; the sync loop is now an idle keep-alive for the gossip goroutines and will be removed alongside the legacy gossip code path.

The first apply fails while a control-plane node's apiserver is still starting; retry every 5s until it succeeds rather than waiting a full interval, which delays cluster bootstrap. Also reuse a cached kube client per iteration.

The kubelet maxPods calculation runs for AmazonVPC and Cilium-ENI networking and falls back to DefaultMachineType when the IMDS instance-type lookup fails. NewConfig only set DefaultMachineType for AmazonVPC, so a Cilium-ENI node would dereference a nil pointer if IMDS was unavailable.

Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>

Use kms:ViaService condition on KMS data actions

Signed-off-by: Moshe Vayner <moshe@vayner.me>

chore(channels): bump k8s versions in alpha channel

Adds the first Linode cloudup infrastructure task: VPC create/update support. This intentionally does not add Linode instances, volumes, load balancers, DNS, or full cluster bring-up support. Those should land in follow-up PRs.

Signed-off-by: Moshe Vayner <moshe@vayner.me>

linode: Add VPC cloudup task

nodeup: populate DefaultMachineType for Cilium-ENI clusters

Update coredns to v1.14.3

Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>

Downgrade coredns to v1.14.2

Upgrade containerd to v2.3.0

aws: Use amazonaws.com suffix for kms:ViaService in all partitions

channels: move from protokube to a static pod

chore(channels): promote to stable, bump node images, update recommended kOps versions

Bumps [actions/checkout](https://github.com/actions/checkout) from 6.0.3 to 7.0.0.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@df4cb1c...9c091bb)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 7.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

…ctions/checkout-7.0.0 build(deps): bump actions/checkout from 6.0.3 to 7.0.0

In e2e, `kops create cluster --channel=alpha` reads the channel from the kops master branch, so a PR's edits to channels/alpha or channels/stable are never exercised by its own e2e jobs. When kops is built from the PR checkout, the deployer now rewrites --channel to a file:// path into that checkout's channels/ directory (defaulting to alpha when --channel is unset), so the build uses the PR's channels. Downloaded release/marker binaries don't match the checkout and keep using master's channels.
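
A rough Go sketch of the rewrite described above. `channelFromCheckout` is a hypothetical helper, not the deployer's real function, and the checkout path in the demo is made up. It assumes a Unix-style absolute checkout path and does not cover the downloaded-binary case, which keeps master's channels.

```go
package main

import (
	"fmt"
	"path"
)

// channelFromCheckout turns a channel name into a file:// location inside
// the checkout's channels/ directory, defaulting to alpha when unset.
// Hypothetical helper for illustration only.
func channelFromCheckout(checkoutDir, channel string) string {
	if channel == "" {
		channel = "alpha"
	}
	return "file://" + path.Join(checkoutDir, "channels", channel)
}

func main() {
	fmt.Println(channelFromCheckout("/src/kops", ""))
	// prints "file:///src/kops/channels/alpha"
	fmt.Println(channelFromCheckout("/src/kops", "stable"))
	// prints "file:///src/kops/channels/stable"
}
```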

scaletest: bind etcd metrics to all interfaces

e2e: test the PR's own channels, not master's

The externalTrafficPolicy=Local source-IP-preservation tests only fail on Cilium (the client IP is SNATed to a pod IP instead of being preserved), tracked upstream in cilium/cilium#37613. Move the "implement NodePort and HealthCheckNodePort correctly when ExternalTrafficPolicy changes" skip into the Cilium block next to its sibling so other CNIs run the test. The hostNetwork "function for service endpoints" test was fixed in k8s 1.37 by kubernetes/kubernetes#139819 (it now reads spec.nodeName via the Downward API instead of os.Hostname()), so drop its skip gate from < 1.38 to < 1.37. Also clean up stale/incorrect issue references in the surrounding comments (wrong Azure issue, superseded hostname WIP PR, and the unrelated #129221).

tests/e2e: refine externalTrafficPolicy=Local and hostNetwork skips

Grow slices for explicit indexes while processing --set paths, so paths like cluster.spec.addons[0].manifest can create the first element.
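
The idea can be sketched in Go with a plain `[]string`. The real `--set` handling works on arbitrary struct fields, so `setAt` below is a simplified, hypothetical stand-in, and filling gaps with zero values is an assumption about the behaviour rather than something this commit message states.

```go
package main

import (
	"errors"
	"fmt"
)

// setAt stores v at index i, first appending zero values when i is past the
// end of the slice, so that index 0 works on an empty slice.
func setAt(s []string, i int, v string) ([]string, error) {
	if i < 0 {
		return s, errors.New("negative index")
	}
	for len(s) <= i {
		s = append(s, "")
	}
	s[i] = v
	return s, nil
}

func main() {
	var manifests []string // empty, like an unset cluster.spec.addons
	manifests, _ = setAt(manifests, 0, "first.yaml")
	manifests, _ = setAt(manifests, 2, "third.yaml")
	fmt.Printf("%q\n", manifests)
	// prints ["first.yaml" "" "third.yaml"]
}
```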

Configure the scenario to install Gateway API CRDs via cluster.spec.addons, using the Gateway API version documented by Istio 1.29.

Allow setting missing slice elements from the command line

Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>

Add managed Karpenter EC2NodeClass and NodePool

Preparation for something like #18495. Move away from direct comparison (== or !=) on the IG role in favor of helper methods such as HasNode() or HasControlPlane(). Also adds a hack test so we don't backtrack. This should help prepare for supporting more control plane roles.
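
To illustrate why helpers are easier to extend than direct comparison, here is a small Go sketch. Only the method names HasNode() and HasControlPlane() come from the description above; the `Role` and `InstanceGroup` types, the `Roles` field and the constants are hypothetical and do not mirror the kOps API types.

```go
package main

import "fmt"

// Role and the constants below are illustrative, not the kOps definitions.
type Role string

const (
	RoleControlPlane Role = "ControlPlane"
	RoleNode         Role = "Node"
)

// InstanceGroup is a hypothetical type that may carry several roles.
type InstanceGroup struct {
	Roles []Role
}

// hasRole reports whether the group carries role r.
func (ig InstanceGroup) hasRole(r Role) bool {
	for _, have := range ig.Roles {
		if have == r {
			return true
		}
	}
	return false
}

// HasControlPlane and HasNode hide how roles are stored, so call sites keep
// working if more control plane roles are added later.
func (ig InstanceGroup) HasControlPlane() bool { return ig.hasRole(RoleControlPlane) }
func (ig InstanceGroup) HasNode() bool         { return ig.hasRole(RoleNode) }

func main() {
	ig := InstanceGroup{Roles: []Role{RoleControlPlane}}
	fmt.Println(ig.HasControlPlane(), ig.HasNode())
	// prints "true false"
}
```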

The "Services should implement NodePort and HealthCheckNodePort correctly when ExternalTrafficPolicy changes" test was previously gated to Cilium only, but the e2e-kops-aws-cni-* periodic jobs show it also fails on flannel, kopeio and kube-router: the client source IP is SNATed to a pod IP instead of being preserved (kube-router instead times out reaching the local endpoint). It is the sole failure in those three jobs' latest runs. Move it out of the Cilium block into a condition covering cilium, flannel, kopeio and kube-router. amazon-vpc, calico and kindnet preserve the source IP and continue running the test. The sibling "externalTrafficPolicy=Local for type=NodePort" test passes on every non-Cilium CNI, so it stays gated to Cilium only.

…port tests/e2e: skip implement-NodePort ETP=Local test on more CNIs

Azure retired the pinned Ubuntu 24.04 daily images from the uksouth marketplace, so VMSS creation fails with PlatformImageNotFound and every Azure e2e job dies in the Up phase: The platform image 'Canonical:ubuntu-24_04-lts:server:24.04.202606120' is not available. `az vm image show` (the query path a deployment uses) confirms 24.04.202606120 (amd64) and 24.04.202606110 (arm64) are no longer available, while `az vm image list` still lists them from a stale catalog. 24.04.202606060 is the newest version that `az vm image show` confirms deployable for both server and server-arm64, so pin both arches to it.

chore(channels): pin Azure noble image to a deployable version

Switching from comparison on Role to helper.

The "Services should implement NodePort and HealthCheckNodePort correctly when ExternalTrafficPolicy changes" test fails on calico on GCE but passes on calico on AWS. On GCE the VPC drops packets with arbitrary calico pod-CIDR source/dest addresses, so calico must IPIP-encapsulate inter-node pod traffic (routes go via tunl0). The IPIP/masquerade path rewrites the ETP=Local NodePort traffic's source to the node's tunnel address (a pod-CIDR IP) instead of preserving the client IP. On AWS kops disables the EC2 source/dest check, so calico routes pod traffic natively over the VPC (dev ens5, no encapsulation) and the source IP is preserved. Extend the skip to calico when the cloud provider is GCE. amazon-vpc and kindnet continue running the test on both clouds.
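
The resulting skip condition for this one test, combining the CNI list from the earlier commit with the calico+GCE case, might look like the Go sketch below. The function name and the string identifiers for CNIs and clouds are assumptions; the actual e2e harness may express the skip differently.

```go
package main

import "fmt"

// skipImplementNodePortETPLocal reports whether the "implement NodePort and
// HealthCheckNodePort correctly when ExternalTrafficPolicy changes" test
// should be skipped. Hypothetical helper; identifiers are illustrative.
func skipImplementNodePortETPLocal(cni, cloud string) bool {
	switch cni {
	case "cilium", "flannel", "kopeio", "kube-router":
		// These CNIs fail the test regardless of cloud.
		return true
	case "calico":
		// calico fails only where it must IPIP-encapsulate, i.e. on GCE.
		return cloud == "gce"
	}
	// amazon-vpc and kindnet keep running the test on both clouds.
	return false
}

func main() {
	fmt.Println(skipImplementNodePortETPLocal("calico", "gce"))
	// prints "true"
	fmt.Println(skipImplementNodePortETPLocal("calico", "aws"))
	// prints "false"
	fmt.Println(skipImplementNodePortETPLocal("kindnet", "gce"))
	// prints "false"
}
```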

tests/e2e: also skip implement-NodePort ETP=Local test on calico+GCE

Bumps [actions/setup-go](https://github.com/actions/setup-go) from 6.4.0 to 6.5.0.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](actions/setup-go@4a36011...924ae3a)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-version: 6.5.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

…ctions/setup-go-6.5.0 build(deps): bump actions/setup-go from 6.4.0 to 6.5.0

Make cloud-controller-manager pods tolerate all taints

No new functionality yet. Adds four new role placeholders (etcd, scheduler, ccm and kcm) and sets up the CLI API as well as the accessor functions.

Add an experimental roles feature flag.

hakman and others added 29 commits May 18, 2026 08:53

nodeup: use shared system-component env vars for kops-channels

ee3e924

Use kms:ViaService condition on KMS data actions

749245b

./hack/update-expected.sh

10f1ac1

./hack/update-expected.sh

1d7888e

Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>

Merge pull request #18363 from rifelpet/kms-decrypt

10588d8

Use kms:ViaService condition on KMS data actions

chore(channels): bump k8s versions in alpha channel

c8c5320

Signed-off-by: Moshe Vayner <moshe@vayner.me>

Merge pull request #18367 from moshevayner/k8s-releases-2026-05-18

4c4a546

chore(channels): bump k8s versions in alpha channel

Update coredns to v1.14.3

60f2e71

hack/update-expected.sh

24fb2f8

Merge pull request #18316 from moshevayner/linode-pr-vpc

dfcdbd0

linode: Add VPC cloudup task

Merge pull request #18365 from hakman/populate-DefaultMachineType

32e00a7

nodeup: populate DefaultMachineType for Cilium-ENI clusters

Merge pull request #18368 from Jefftree/update-coredns-1.14.3

ef1b0f3

Update coredns to v1.14.3

Upgrade containerd to v2.3.0

23fe61d

./hack/generate-asset-hashes.sh

4879843

./hack/update-expected.sh

4d1618f

Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>

Downgrade coredns to v1.14.2

f766315

./hack/update-expected.sh

6450345

Merge pull request #18369 from hakman/coredns-1.14.2

6ca9207

Downgrade coredns to v1.14.2

Merge pull request #18364 from hakman/containerd-2.3.0

6e75cb6

Upgrade containerd to v2.3.0

Use amazonaws.com suffix for kms:ViaService in all partitions

addc93b

Merge pull request #18372 from rifelpet/kms-decrypt

6c16dd1

aws: Use amazonaws.com suffix for kms:ViaService in all partitions

tests/e2e/scalability: decouple client HTTP traffic in kops

bd22ab6

Merge pull request #18328 from hakman/channels

881d72e

channels: move from protokube to a static pod

Seed the node certificate lifetime skew hash with node name

a5422b1

kubernetes-prow Bot and others added 30 commits June 20, 2026 14:33

Merge pull request #18501 from hakman/channels-chores

9488c60

chore(channels): promote to stable, bump node images, update recommended kOps versions

scale-test: bind etcd metrics to all interfaces

6486f1f

Merge pull request #18505 from kubernetes/dependabot/github_actions/a…

595d5e8

…ctions/checkout-7.0.0 build(deps): bump actions/checkout from 6.0.3 to 7.0.0

Merge pull request #18503 from Jefftree/scale-etcd-metrics-listen-all

c1047de

scaletest: bind etcd metrics to all interfaces

Merge pull request #18504 from hakman/channels-e2e

7a0fb7e

e2e: test the PR's own channels, not master's

Update CCM pods to tolerate all taints

6750330

./hack/update-expected.sh

d770192

Merge pull request #18509 from rifelpet/skip-regex-etp-local-cilium

b4ac9fc

tests/e2e: refine externalTrafficPolicy=Local and hostNetwork skips

Allow setting missing slice elements from the command line

a35c985

Grow slices for explicit indexes while processing --set paths, so paths like cluster.spec.addons[0].manifest can create the first element.

tests/ai-conformance: install Gateway API CRDs through kOps addons

09e689d

Configure the scenario to install Gateway API CRDs via cluster.spec.addons, using the Gateway API version documented by Istio 1.29.

channels: apply direct manifests from spec.addons

6787d0a

Merge pull request #18511 from hakman/fix-set-on-slice-elements

01499f4

Allow setting missing slice elements from the command line

Add managed Karpenter EC2NodeClass and NodePool

f5aeae1

Signed-off-by: Ciprian Hacman <ciprian@hakman.dev>

Merge pull request #18497 from hakman/karpenter-nodeclass-nodepool

09b99cc

Add managed Karpenter EC2NodeClass and NodePool

Merge pull request #18515 from rifelpet/skip-etp-local-implement-node…

86c6410

…port tests/e2e: skip implement-NodePort ETP=Local test on more CNIs

Merge pull request #18516 from hakman/azure-fix-image

d61d193

chore(channels): pin Azure noble image to a deployable version

Merge pull request #18514 from cheftako/has_methods

0598fb7

Switching from comparison on Role to helper.

Merge pull request #18517 from rifelpet/skip-etp-local-gce-calico

07f4dd0

tests/e2e: also skip implement-NodePort ETP=Local test on calico+GCE

Merge pull request #18518 from kubernetes/dependabot/github_actions/a…

5fc08aa

…ctions/setup-go-6.5.0 build(deps): bump actions/setup-go from 6.4.0 to 6.5.0

Merge pull request #18510 from rifelpet/ccm-tolerate-all-taints

95f5a62

Make cloud-controller-manager pods tolerate all taints

Add an experimental roles feature flag.

0e45aa5

No new functionality yet. Adds four new role placeholders (etcd, scheduler, ccm and kcm) and sets up the CLI API as well as the accessor functions.

Merge pull request #18519 from cheftako/multipleControlRoles

568d0e0

Add an experimental roles feature flag.

fvalle1 wants to merge 1828 commits into Elemento-Modular-Cloud:master from kubernetes:master

13 participants
