Improve rate limiter latency logging and add component-base metric #88134

Merged
merged 1 commit into from
Feb 27, 2020

@jennybuckley commented Feb 13, 2020

What type of PR is this?
/sig api-machinery
/priority important-soon
/kind feature
/cc @lavalamp

What this PR does / why we need it:
Follow up to #87740

Adds a more severe and infrequent log message as suggested in #87740 (comment)

Also adds a new client-go metric to track the rate limiter latency across all invocations.

Does this PR introduce a user-facing change?:

Add `rest_client_rate_limiter_duration_seconds` metric to component-base to track client side rate limiter latency in seconds. Broken down by verb and URL.

@k8s-ci-robot added the release-note, sig/api-machinery, size/M, priority/important-soon, kind/feature, cncf-cla: yes, sig/cluster-lifecycle, and sig/instrumentation labels Feb 13, 2020
@jennybuckley
Author

/retest

@jennybuckley
Author

jennybuckley commented Feb 13, 2020

/cc @logicalhan
Does this seem reasonable?

@jennybuckley changed the title from "Improve rate limiter latency logging and metrics" to "Improve rate limiter latency logging and add component-base metric" Feb 13, 2020
@fedebongio
Contributor

/assign @logicalhan

@jennybuckley
Author

@logicalhan addressed comments

@logicalhan left a comment
Member

/lgtm

We should probably add a clock to the logger so that we can add some unit tests for this. Then we can probably verify that this thing is doing what we think it's doing.

@k8s-ci-robot added the lgtm label Feb 25, 2020
@jennybuckley
Author

Ah there's something wrong with it now.

@logicalhan
Member

> Ah there's something wrong with it now.

I told you I didn't test it.

@k8s-ci-robot removed the lgtm label Feb 25, 2020
@jennybuckley
Author

I'll add a unit test :)

@logicalhan
Member

> I'll add a unit test :)

If you add a clock to the logger, it'll make testing much easier.

@k8s-ci-robot added the size/L label and removed the size/M label Feb 25, 2020
@logicalhan
Member

/retest

@logicalhan left a comment
Member

/lgtm

if time.Since(b.lastLogTime) > b.minLogInterval {
klog.V(level).Info(message)
b.lastLogTime = time.Now()
func (b *throttledLogger) attemptToLog() (klog.Level, bool) {

We should add a docstring on the return values. On my first super quick skim, I was super confused by the -1 (it makes sense once I read the code, though).

@k8s-ci-robot added the lgtm label Feb 26, 2020
@lavalamp
Member

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jennybuckley, lavalamp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label Feb 26, 2020
@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.


@k8s-ci-robot k8s-ci-robot merged commit 650220f into kubernetes:master Feb 27, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Feb 27, 2020
IonutBajescu added a commit to IonutBajescu/kubernetes that referenced this pull request Feb 24, 2021
We've found that the rate limiting metric wasn't exporting any metrics,
in spite of clearly seeing the metric in the disassembled binary.

As it turns out, the rest_client_rate_limiter_duration_seconds metric has
been added as part of the logging improvements, but it appears to have been
accidentally forgotten to be registered.

kubernetes#88134
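The failure mode described in this commit message, a metric that is defined and observed but never registered, can be illustrated with a toy registry. The `histogram` and `registry` types below are hypothetical and are not the component-base or Prometheus API; only the rate limiter metric name comes from this thread, and `example_registered_metric` is made up.

```go
package main

import "fmt"

// histogram is a toy metric that only keeps a count and a sum.
type histogram struct {
	name  string
	count int
	sum   float64
}

// Observe always succeeds, whether or not the histogram is registered.
func (h *histogram) Observe(v float64) {
	h.count++
	h.sum += v
}

// registry is a hypothetical stand-in for a metrics registry.
type registry struct{ metrics []*histogram }

// MustRegister makes a metric visible to Gather.
func (r *registry) MustRegister(h *histogram) {
	r.metrics = append(r.metrics, h)
}

// Gather returns the names of the metrics a scrape would export.
func (r *registry) Gather() []string {
	names := make([]string, 0, len(r.metrics))
	for _, m := range r.metrics {
		names = append(names, m.name)
	}
	return names
}

func main() {
	reg := &registry{}
	existing := &histogram{name: "example_registered_metric"}
	rateLimiterLatency := &histogram{name: "rest_client_rate_limiter_duration_seconds"}

	reg.MustRegister(existing) // the new metric's registration is missing

	rateLimiterLatency.Observe(0.25) // recorded in memory, never exported
	fmt.Println(reg.Gather())

	reg.MustRegister(rateLimiterLatency) // the fix: register it as well
	fmt.Println(reg.Gather())
}
```

In this sketch the observation call compiles and runs either way, which is consistent with the symbol being present in the binary while nothing appears in the exported metrics.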
k8s-publishing-bot pushed a commit to kubernetes/component-base that referenced this pull request Apr 9, 2021
kubernetes/kubernetes#88134

Kubernetes-commit: da9ffb8458d49b2f5710e1acd640b436d7163b3e
chenchun pushed a commit to chenchun/kubernetes that referenced this pull request Mar 20, 2024
chenchun pushed a commit to chenchun/kubernetes that referenced this pull request Mar 20, 2024
…!893)

Fix rest_client_rate_limiter_duration_seconds not registered
Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
kind/feature: Categorizes issue or PR as related to a new feature.
lgtm: "Looks good to me", indicates that a PR is ready to be merged.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
release-note: Denotes a PR that will be considered when it comes time to generate release notes.
sig/api-machinery: Categorizes an issue or PR as relevant to SIG API Machinery.
sig/cluster-lifecycle: Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.
sig/instrumentation: Categorizes an issue or PR as relevant to SIG Instrumentation.
size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.

6 participants