Glooctl check retries #10017

Merged
merged 10 commits into from
Sep 11, 2024

@sheidkamp sheidkamp commented Sep 11, 2024

Description

Update the retry approach used to connect to the control plane when querying proxies as part of glooctl check.

This PR makes changes similar to #9966, but it updates the delay type as well as the number of retry attempts. #9966 also includes a timeout increase that is not present here. After merging this PR, we should decide whether to include that timeout as well (and let it go in as a community PR to give the contributor credit).

Code changes

  • Updated the delay type from BackOffDelay to FixedDelay to keep the UI responsive
  • Updated the delay interval from 100ms to 250ms and the number of retries from 5 to 60, allowing up to 15s (60 × 250ms) for the port forwarding to start working.

Testing steps

Validated the root cause by:

  • lowering the number of retries to 1
  • running make build-cli
  • using the built glooctl to run check, which fails with a connection refused error:
 _output/glooctl-darwin-arm64 check  
----------
glooctl binary version (1.0.0-sah1) differs from server components (v1.18.0-beta20) by at least a minor version.
Consider running:
_output/glooctl-darwin-arm64 upgrade --release=v1.18.0-beta20
----------

Checking Deployments... OK
Checking Pods... OK
Checking Upstreams... OK
Checking UpstreamGroups... OK
Checking AuthConfigs... OK
Checking RateLimitConfigs... OK
Checking VirtualHostOptions... OK
Checking RouteOptions... OK
Checking Secrets... OK
Checking VirtualServices... OK
Checking Gateways... OK
Checking Proxies... 7 Errors!
Checking rate limit server... OK

Skipping Kubernetes Gateway resources check -- Kubernetes Gateway integration not enabled

Skipping Gloo Instance check -- Gloo Federation not detected.
Error: 7 errors occurred:
	* rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:60957: connect: connection refused"
	* rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:60960: connect: connection refused"
	* rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:60963: connect: connection refused"
	* rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:60966: connect: connection refused"
	* rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:60969: connect: connection refused"
	* rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:60972: connect: connection refused"
	* rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp [::1]:60975: connect: connection refused"

The same changeset was applied to v1.17, and the updated glooctl was sent to the client who had the issue. With the updated build, the issue went away: https://solo-io-corp.slack.com/archives/C01B65XL57C/p1725466552622139?thread_ts=1723143874.312349&cid=C01B65XL57C

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

@github-actions bot added the "keep pr updated" (signals bulldozer to keep pr up to date with base branch) and "work in progress" (signals bulldozer to keep pr open, don't auto-merge) labels Sep 11, 2024

github-actions bot commented Sep 11, 2024

Visit the preview URL for this PR (updated for commit d736eb8):

https://gloo-edge--pr10017-glooctl-check-retrie-9kvq29pr.web.app

(expires Wed, 18 Sep 2024 18:34:40 GMT)


@solo-changelog-bot

Issues linked to changelog:
#10020

@sheidkamp removed the "work in progress" (signals bulldozer to keep pr open, don't auto-merge) label Sep 11, 2024
@soloio-bulldozer soloio-bulldozer bot merged commit bdcfd6b into main Sep 11, 2024
18 checks passed
@soloio-bulldozer soloio-bulldozer bot deleted the glooctl-check-retries branch September 11, 2024 20:26
sheidkamp added a commit that referenced this pull request Sep 11, 2024
Co-authored-by: soloio-bulldozer[bot] <48420018+soloio-bulldozer[bot]@users.noreply.github.com>
Co-authored-by: changelog-bot <changelog-bot>
tjons pushed a commit that referenced this pull request Sep 12, 2024
Co-authored-by: soloio-bulldozer[bot] <48420018+soloio-bulldozer[bot]@users.noreply.github.com>
Co-authored-by: changelog-bot <changelog-bot>