Restore tempo_request_duration_seconds metrics for querier_api_* requests #3403
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What this PR does:
Recent changes (I suspect #3300) broke middleware on our querier requests. We have HTTP middleware that provides tracing and metrics. Requests sent to the querier are routed slightly differently and due to this change we didn't get metrics anymore for the HTTP routes.
Issue
So
tempo_request_duration_seconds
does not show up for routes likequerier_api_search_tags
andquerier_api_traces_traceid
. GRPC routes were still instrumented correctly (/tempopb.Querier/FindTraceByID
and/tempopb.Querier/SearchTags
).How it works
Normal middleware flow:
When we set
stream_over_http_enabled: false
this will beWhen we set
stream_over_http_enabled: true
we switch up the order a bit:The bug: requests from query-frontend to querier are handled differently, they are pulled by the frontend worker process.
The frontend worker will pull a request from the frontend, unpack the GRPC requests and then push the HTTP request to the regular HTTP handler. This handler was the router (marked
(2)
), since this handler is after instrumentation we didn't get any metrics for it. By instead passing the start of the middleware chain ((1)
) we fix this.Adding
HTTPRouter
andHTTPHandler
toTempoServer
should make this more explicit.Which issue(s) this PR fixes:
Fixes #
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]