*: Bump k8s.io and controller-runtime dependencies #10069

Open
wants to merge 10 commits into
base: main

timflannagan
Contributor

@timflannagan timflannagan commented Sep 19, 2024

Description

API changes

Code changes

CI changes

Docs changes

Context

Interesting decisions

Testing steps

Notes for reviewers

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

BOT NOTES:
resolves #9683

timflannagan and others added 2 commits September 19, 2024 18:18
Signed-off-by: timflannagan <timflannagan@gmail.com>

Co-authored-by: Sam Heilbron <samheilbron@gmail.com>
Co-authored-by: Tyler Schade <tyler.schade@solo.io>
Signed-off-by: timflannagan <timflannagan@gmail.com>
@timflannagan timflannagan requested a review from a team as a code owner September 19, 2024 19:03
Signed-off-by: timflannagan <timflannagan@gmail.com>
Signed-off-by: timflannagan <timflannagan@gmail.com>
@solo-changelog-bot

Issues linked to changelog:
#9683

Signed-off-by: timflannagan <timflannagan@gmail.com>
Note: these clients were manually generated using a solo-kit
that points to my local fork that implements solo-io#564.

The gateway & gloo clients were updated to adopt the generics support
introduced in the 1.31 client-go release. Namely, listers
and clients adopt this new approach.

The nested extauth and graphql APIs have updated hack/update-codegen.sh
bash scripts checked in with this commit, but I think we need to update
the solo-kit.json configuration for those directories since we weren't
previously committing their k8s clients.

Similarly, the "gloosnapshot" API doesn't need k8s clients generated either.

Signed-off-by: timflannagan <timflannagan@gmail.com>
Signed-off-by: timflannagan <timflannagan@gmail.com>
Signed-off-by: timflannagan <timflannagan@gmail.com>
-node_version='v1.29.2@sha256:51a1434a5397193442f0be2a297b488b6c919ce8a3931be0ce822606ea5ca245'
-kubectl_version='v1.29.2'
-kind_version='v0.20.0'
+node_version='v1.31.0@sha256:53df588e04085fd41ae12de0c3fe4c72f7013bba32a20e7325357a1ac94ba865'
Contributor Author

Quick note: go.mod specifies the 1.31.1 patch version, but I didn't see a 1.31.1 sha image in the kind releases, so I left this at 1.31.0. I don't think it matters too much w.r.t. patch version skew between k8s server and client versions.

@@ -1,6 +1,6 @@
-node_version='v1.25.16@sha256:5da57dfc290ac3599e775e63b8b6c49c0c85d3fec771cd7d55b45fae14b38d3b'
-kubectl_version='v1.25.16'
+node_version='v1.27.3@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72'
Contributor Author

We're jumping from 1.29 to 1.31 in Gloo, so I'm updating the min supported k8s version to 1.27 to maintain the N-3 matrix.

@@ -0,0 +1,12 @@
changelog:
Contributor Author

TODO: Fix this changelog.

@@ -35,3 +35,7 @@ func (s *switchAdapter) On(name string) {
func (s *switchAdapter) Off(name string) {
	s.gauge.WithLabelValues(name).Set(0.0)
}

func (s *switchAdapter) SlowpathExercised(name string) {
Contributor Author

Needed for controller-runtime 0.18.x due to client-go leaderelection changes.
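
For context, a minimal sketch of why the extra method is required. It assumes the leader-election metric interface in newer client-go releases has three methods (On, Off, SlowpathExercised); the interface is redeclared locally so the example is self-contained, and `fakeGauge` is a hypothetical stand-in for the Prometheus gauge used by the real adapter. The body of the real SlowpathExercised is not shown in the diff, so the one below is illustrative only.

```go
package main

import "fmt"

// leaderMetric mirrors (as an assumption) the method set that client-go's
// leader-election code expects from a metrics adapter after the bump.
type leaderMetric interface {
	On(name string)
	Off(name string)
	SlowpathExercised(name string)
}

// fakeGauge is a hypothetical stand-in for a labeled Prometheus gauge.
type fakeGauge map[string]float64

type switchAdapter struct {
	gauge    fakeGauge
	slowpath map[string]int
}

func newSwitchAdapter() *switchAdapter {
	return &switchAdapter{gauge: fakeGauge{}, slowpath: map[string]int{}}
}

func (s *switchAdapter) On(name string)  { s.gauge[name] = 1.0 }
func (s *switchAdapter) Off(name string) { s.gauge[name] = 0.0 }

// SlowpathExercised only counts calls in this sketch.
func (s *switchAdapter) SlowpathExercised(name string) { s.slowpath[name]++ }

// Compile-time check: removing any method above makes this line fail to build,
// which is the kind of error a dependency bump surfaces when an interface grows.
var _ leaderMetric = (*switchAdapter)(nil)

func main() {
	a := newSwitchAdapter()
	a.On("gloo")
	a.SlowpathExercised("gloo")
	a.Off("gloo")
	fmt.Println(a.gauge["gloo"], a.slowpath["gloo"])
}
```

The `var _ leaderMetric = ...` assertion is a cheap way to catch this class of breakage at build time rather than at the call site.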

@@ -10,10 +10,9 @@ ROOT_PKG=github.com/solo-io/gloo/projects/gateway/pkg/api/v1
CLIENT_PKG=${ROOT_PKG}/kube/client
APIS_PKG=${ROOT_PKG}/kube/apis

# Below code is copied from https://github.com/weaveworks/flagger/blob/master/hack/update-codegen.sh
Contributor Author

This file is generated by solo-kit. We updated the Go template file in solo-io/solo-kit#560. Note that the generated file still has a bug; I have an open PR to fix it in solo-io/solo-kit#564.

@@ -65,7 +66,7 @@ var _ = Describe("RetryOnUnavailableClientConstructor", func() {
// sanity check
resp, err := client.Validate(rootCtx, &validation.GlooValidationServiceRequest{})
Expect(err).NotTo(HaveOccurred())
-Expect(resp).To(Equal(res))
+Expect(resp).To(matchers.MatchProto(res))
Contributor Author

Needed due to the protobuf jump that was clumped with these dependency bumps.
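
A rough illustration of why a deep-equality matcher can start failing on protobuf messages while a semantic comparison keeps passing. This is an analogy using only the standard library: `response`, `sizeCache`, and `marshal` are hypothetical stand-ins for a generated message and its internal bookkeeping, not the actual generated code, and `semanticEqual` plays the role that a proto-aware comparison would.

```go
package main

import (
	"fmt"
	"reflect"
)

// response is a hypothetical stand-in for a generated protobuf message:
// one user-visible field plus internal state that serialization may touch.
type response struct {
	Status    string
	sizeCache int // internal bookkeeping, not part of the message's meaning
}

// marshal simulates serialization populating the internal cache.
func marshal(r *response) []byte {
	b := []byte(r.Status)
	r.sizeCache = len(b)
	return b
}

// structurallyEqual compares every field, including unexported ones,
// which is what a reflect.DeepEqual-based matcher does.
func structurallyEqual(a, b *response) bool {
	return reflect.DeepEqual(a, b)
}

// semanticEqual compares only the user-visible content.
func semanticEqual(a, b *response) bool {
	return a.Status == b.Status
}

func main() {
	want := &response{Status: "ok"}
	got := &response{Status: "ok"}
	marshal(got) // e.g. the message went over the wire

	fmt.Println("structural:", structurallyEqual(want, got))
	fmt.Println("semantic:  ", semanticEqual(want, got))
}
```

After `marshal`, the two values differ only in internal state, so the structural check reports a mismatch even though the messages carry the same content.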

Comment on lines +102 to +108
Controller: config.Controller{
	// see https://github.com/kubernetes-sigs/controller-runtime/issues/2937
	// in short, our tests reuse the same name (reasonably so) and the controller-runtime
	// package does not reset the stack of controller names between tests, so we disable
	// the name validation here.
	SkipNameValidation: ptr.To(true),
},
Contributor Author

The comment calls out why this is needed, but this is due to a recent c-r change that enforces stricter validation for controller names.
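
A simplified model of the behavior described above, assuming the validation works by remembering controller names in process-wide state. `usedNames`, `controllerOptions`, and `registerController` are hypothetical names for illustration and are not controller-runtime's implementation; only the `SkipNameValidation` field name comes from the diff.

```go
package main

import "fmt"

// usedNames models a process-wide registry of controller names.
// Nothing resets it between test cases, which is the crux of the problem.
var usedNames = map[string]bool{}

// controllerOptions is a hypothetical options struct with an opt-out flag.
type controllerOptions struct {
	SkipNameValidation *bool
}

func boolPtr(b bool) *bool { return &b }

// registerController rejects duplicate names unless validation is skipped.
func registerController(name string, opts controllerOptions) error {
	skip := opts.SkipNameValidation != nil && *opts.SkipNameValidation
	if skip {
		return nil
	}
	if usedNames[name] {
		return fmt.Errorf("controller with name %q already exists", name)
	}
	usedNames[name] = true
	return nil
}

func main() {
	// First test case registers the controller.
	fmt.Println(registerController("gateway", controllerOptions{}))
	// Second test case reuses the name in the same process and fails.
	fmt.Println(registerController("gateway", controllerOptions{}))
	// Opting out of validation lets the reused name through.
	fmt.Println(registerController("gateway", controllerOptions{SkipNameValidation: boolPtr(true)}))
}
```

Under this model, skipping validation in tests is the least invasive fix, since production code still registers each name once.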

"gen_kube_types": true,
"gen_kube_types": false,
Contributor Author

Moved this to #10079. For context, solo-kit was generating hack/*-codegen.sh bash scripts for these nested directories that were relevant, so toggling this off / removing this option helps us manage maintenance.

Contributor Author

@timflannagan timflannagan Sep 20, 2024

Just a quick note on these new uniquehash.go files. This is needed as we had a bug in GME's AccessPolicy caching that required us to introduce a new primitive in this library. See https://github.com/solo-io/gloo-mesh-enterprise/pull/17392 for more information. We aren't using this new method, but still wanted to provide context on these generated files as we're bumping skv2.

Contributor Author

The diff in these client-gen generated files is a bit confusing. Basically, client-go had a series of improvements in 1.31 to help adopt generics and cut down on the amount of generated code for consumers of this library. The gentype package defines the common Get/List/etc. interfaces now.

Contributor Author

Similar to the above comment about client-go generated code, client-go refactored the listers implementation to adopt a generics-based approach. See kubernetes/kubernetes#121574 for more information.
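
To make the shape of these generated-code diffs easier to read, here is a small sketch of the general idea: one generic implementation shared by every resource type instead of a near-identical generated type per kind. All names here (`object`, `lister`, `gateway`, `virtualService`) are hypothetical and this is not client-go's actual API.

```go
package main

import "fmt"

// object is a hypothetical minimal interface for namespaced resources.
type object interface {
	GetNamespace() string
	GetName() string
}

// lister is a single generic implementation; previously each resource kind
// would have carried its own generated copy of List and Get.
type lister[T object] struct {
	items []T
}

// List returns every item in the given namespace.
func (l lister[T]) List(namespace string) []T {
	var out []T
	for _, it := range l.items {
		if it.GetNamespace() == namespace {
			out = append(out, it)
		}
	}
	return out
}

// Get returns the named item and whether it was found.
func (l lister[T]) Get(namespace, name string) (T, bool) {
	for _, it := range l.items {
		if it.GetNamespace() == namespace && it.GetName() == name {
			return it, true
		}
	}
	var zero T
	return zero, false
}

type meta struct{ namespace, name string }

func (m meta) GetNamespace() string { return m.namespace }
func (m meta) GetName() string      { return m.name }

type gateway struct{ meta }
type virtualService struct{ meta }

func main() {
	gws := lister[gateway]{items: []gateway{{meta{"gloo-system", "gw"}}}}
	vss := lister[virtualService]{items: []virtualService{{meta{"default", "vs"}}}}

	fmt.Println(len(gws.List("gloo-system")), len(vss.List("gloo-system")))
	_, found := vss.Get("default", "vs")
	fmt.Println(found)
}
```

With this pattern the per-type generated file shrinks to a thin wrapper around the shared generic type, which is why the diffs mostly show deletions.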

Contributor Author

I wanted to highlight this change in the sea of generated code changes. The primary change here is the removal of typecasting to corev1 listers which was causing the regression suite to fail. IMO, doing this is a violation of our own lister abstraction (that manages corev1 listers under-the-hood) and any net new issues with performance regressions could be handled as a follow-up in solo-kit.

Comment on lines +54 to +56
EnableGatewayController: &wrappers.BoolValue{
	Value: true,
},
Contributor Author

Need to confirm with Tyler or Sam why this change is necessary.

Contributor

Oh yes, let's chat about this, I have some context and questions

Contributor

We followed up in Slack. This appears to be caused by some as-yet-unidentified proto changes that cause boolean values to be overridden in our tests. It impacts only this test: we require EnableGatewayController (edge gw) to be true in Settings, but because we define other values in the same struct, the default true value is not respected. The overriding empty value is used instead, so it ends up false.
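
One way such an override can arise, sketched with plain Go types rather than the real Settings proto. The root cause is still unknown per the discussion above, so this only models the symptom: `boolValue`, `gatewayOptions`, `OtherOption`, `defaults`, and `apply` are hypothetical names.

```go
package main

import "fmt"

// boolValue is a hypothetical stand-in for a protobuf bool wrapper:
// a nil pointer means "unset", a non-nil pointer carries an explicit value.
type boolValue struct{ Value bool }

// GetValue is nil-safe and reports false when the wrapper is unset.
func (b *boolValue) GetValue() bool {
	if b == nil {
		return false
	}
	return b.Value
}

type gatewayOptions struct {
	EnableGatewayController *boolValue
	OtherOption             *boolValue // hypothetical unrelated field
}

// defaults enables the gateway controller.
func defaults() gatewayOptions {
	return gatewayOptions{EnableGatewayController: &boolValue{Value: true}}
}

// apply replaces the whole struct when an override is supplied, so defaults
// for any field the override left unset are lost.
func apply(override *gatewayOptions) gatewayOptions {
	if override == nil {
		return defaults()
	}
	return *override
}

func main() {
	// The test sets only an unrelated field; the default true is dropped.
	implicit := apply(&gatewayOptions{OtherOption: &boolValue{Value: true}})
	fmt.Println(implicit.EnableGatewayController.GetValue())

	// Setting the field explicitly, as the temporary fix does, restores it.
	explicit := apply(&gatewayOptions{
		EnableGatewayController: &boolValue{Value: true},
		OtherOption:             &boolValue{Value: true},
	})
	fmt.Println(explicit.EnableGatewayController.GetValue())
}
```

In this model the explicit `Value: true` works because it no longer relies on the default surviving the override.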

Our plan is two-fold:

  1. Keep this temporary solution so we can merge this large change. This way the PR doesn't go out of date
  2. Immediately after, investigate what proto changes could lead to this and provide an explanation and fix

cc @timflannagan

Development

Successfully merging this pull request may close these issues.

Kubernetes 1.30/1.31
2 participants