Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leverage nodeAffinity rather than nodeSelector to target Control Plane nodes #2803

Merged
merged 1 commit into from
Feb 27, 2024

Conversation

vdkotkar
Copy link
Contributor

@vdkotkar vdkotkar commented Feb 21, 2024

What this PR does / why we need it:
The current CSI driver deployment manifests use a nodeSelector to target control-plane nodes for scheduling controller pods. Specifically an empty label value. Some distributions may leverage other label values, or populate the label value. This causes the CSI controller pods to not get scheduled.
The fix is to remove the nodeSelector and instead add a nodeAffinity that matches the label rather than the label AND its contents.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #2644

Testing done:
Verified that CSI controller pods are scheduled on master nodes and are in running state:

# kubectl get pods -n vmware-system-csi
NAME                                      READY   STATUS    RESTARTS       AGE
vsphere-csi-controller-7d6647cb8f-d7rnf   7/7     Running   1 (14h ago)   15h
vsphere-csi-controller-7d6647cb8f-hglbd   7/7     Running   2 (14h ago)   15h
vsphere-csi-controller-7d6647cb8f-whxrf   7/7     Running   1 (14h ago)   15h
vsphere-csi-node-2tvq9                    3/3     Running   3 (16h ago)    16h
vsphere-csi-node-2xglr                    3/3     Running   3 (16h ago)    16h
vsphere-csi-node-4wcqc                    3/3     Running   2 (16h ago)    16h
vsphere-csi-node-kbxcq                    3/3     Running   6 (16h ago)    16h
vsphere-csi-node-mbbmr                    3/3     Running   5 (16h ago)    16h
vsphere-csi-node-v58lh                    3/3     Running   3 (16h ago)    16h
vsphere-csi-webhook-74c94597b9-fq2cz      1/1     Running   0              16h

Ran block vanilla e2e tests to validate that CSI driver deployment is working fine and there is no regression.

Jenkins E2E Test Results: 
------------------------------

Ran 1 of 816 Specs in 411.769 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 815 Skipped
PASS

--
Ran 14 of 816 Specs in 6618.541 seconds
FAIL! -- 12 Passed | 2 Failed | 0 Pending | 802 Skipped
...

Special notes for your reviewer:

Release note:

Leverage nodeAffinity rather than nodeSelector to target Control Plane nodes

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 21, 2024
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Feb 21, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @vdkotkar. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Feb 21, 2024
@svcbot-qecnsdp
Copy link

Started vanilla Block pipeline... Build Number: 2551

@svcbot-qecnsdp
Copy link

Block vanilla build status: FAILURE 
Stage before exit: testbed-deployment 

@svcbot-qecnsdp
Copy link

Started vanilla Block pipeline... Build Number: 2552

@svcbot-qecnsdp
Copy link

Block vanilla build status: FAILURE 
Stage before exit: e2e-tests 
Jenkins E2E Test Results: 
------------------------------

Ran 1 of 816 Specs in 411.769 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 815 Skipped
PASS

Ginkgo ran 1 suite in 8m0.537202804s
Test Suite Passed
--
Ran 14 of 816 Specs in 6618.541 seconds
FAIL! -- 12 Passed | 2 Failed | 0 Pending | 802 Skipped
--- FAIL: TestE2E (6618.63s)
FAIL

Ginkgo ran 1 suite in 1h50m33.482246457s

Test Suite Failed

@vdkotkar vdkotkar changed the title [WIP] Leverage nodeAffinity rather than nodeSelector to target Control Plane nodes Leverage nodeAffinity rather than nodeSelector to target Control Plane nodes Feb 22, 2024
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 22, 2024
@divyenpatel
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 26, 2024
Copy link
Collaborator

@chethanv28 chethanv28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 27, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chethanv28, vdkotkar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 27, 2024
@k8s-ci-robot k8s-ci-robot merged commit 54ad3fe into kubernetes-sigs:master Feb 27, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Leverage Affinity rather than nodeSelector to target Control Plane Nodes
6 participants