Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trace query API #252

Merged
merged 11 commits into from
Mar 30, 2022
Merged

Trace query API #252

merged 11 commits into from
Mar 30, 2022

Conversation

esnible
Copy link
Contributor

@esnible esnible commented Mar 16, 2022

This PR adds proxying of the Jaeger V2 HTTP query API and the Jaeger UI assets.

With this change, the Jaeger UI is available at http://localhost:8080/api/traces/v1/test-oidc/search/

Note that the first time logging in, it redirects away from traces. Not sure if that is correct or not.

Note that there are currently no multi-tenant backends for Jaeger, so the UI currently sees all traces. The work to divide the tenants will be done in Jaeger itself or another front-end, probably not in observatorium/api.

Copy link
Contributor

@jessicalins jessicalins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good, thanks @esnible - just a few suggestions/questions from my side
I think would be nice to also add a traces_test.go ?

api/traces/v1/http.go Outdated Show resolved Hide resolved
api/traces/v1/http.go Outdated Show resolved Hide resolved
api/traces/v1/http.go Outdated Show resolved Hide resolved
api/traces/v1/http.go Outdated Show resolved Hide resolved

// The <base href=> tag generated by Jaeger to tell the UI where to fetch static
// assets and query /api
const expectedBaseTag = `<base href="/" data-inject-target="BASE_URL"/>`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is Jaeger used by Obs API in a specific version? or would we expect the <base href=..> tag to be changed at some point?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Jaeger API is versioned, but the UI is not. This handler is for the UI.

I want to bring @pavolloffay into this discussion. Jaeger does something weird, which was introduced in jaegertracing/jaeger-ui#42 . The introduced <base> to support reverse proxies. I don't like their solution and I am doing something of the opposite.

To support some reverse proxy, Jaeger introduced a mechanism where the path known to the proxy is told to Jaeger. So if I have two tenants, tenant1 and tenant2, I can start a Jaeger instance that is told the base URL is /api/traces/v1/tenant1 and another that is told the base URL is /api/traces/v1/tenant2. The reverse proxy is responsible for remembering the host (or IP) of each tenant and forwarding properly.

What Observatorium does with this PR is offer the endpoint {host}/api/traces/v1/{tenant}/{suffix} and proxy adding header x-tenant: ... to {jaeger}/{suffix}. All queries are sent to the same --traces.read.endpoint={host:port}

The approach I am taking will be great when Jaeger has a multi-tenant back-end. This is in-plan, see jaegertracing/jaeger#3427 . (We might need a temporary work-around to get to that approach, and I plan to start on that work next).

The other approach is that when we add multitenancy to Jaeger we have Jaeger itself alter the BASE_URL instead of doing it in jaegerUIResponseModifier(). If that approach is taken we still don't need to version the <base> tag, we merely delete jaegerUIResponseModifier() from the Observatorium/API code.

@pavolloffay , Does the Jaeger community want to pass the tenant to the UI as a segment in the path or in a header?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we don't have control if this tag gets updated in Jaeger UI right? I think then the response won't get properly modified?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pavolloffay , Does the Jaeger community want to pass the tenant to the UI as a segment in the path or in a header?

There was no decision on this so far. But I think it will be a header to avoid changing the current APIs.

What Observatorium does with this PR is offer the endpoint {host}/api/traces/v1/{tenant}/{suffix} and proxy adding header x-tenant: ... to {jaeger}/{suffix}. All queries are sent to the same --traces.read.endpoint={host:port}

As you mentioned this is the final goal but it does not work with the Jaeger at the moment as there is no (soft) multitenancy support in Jaeger. As a temporary solution the API service could use tenant name/id as part (prefix) of the endpoint for the read endpoint.

main.go Outdated Show resolved Hide resolved
api/traces/v1/http.go Show resolved Hide resolved
@@ -18,7 +18,7 @@ type contextKey string
const (
prefixKey contextKey = "prefix"

prefixHeader string = "X-Forwarded-Prefix"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to export this in here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm trying to do is have Observatorium replace <base href="/"> with <base href="/api/traces/v1/{tenant}/"> for the reasons I explain at #252 (comment)

In the place I am doing this inside of httputil.ReverseProxy.ModifyResponse, and at this point the HTTP request Context doesn't have the prefix that it used. I could try to reconstruct the value applied by MiddlewareSetPrefixHeader(), but instead of doing that I just lookup that value Observatorium supplied for X-Forwarded-Prefix which happens to be exactly the value /api/traces/v1/{tenant}/ that I need to make the UI work.

I don't need to get the value from here, I could just use the literal "X-Forwarded-Prefix" myself. Or we can talk about the whole approach. What is your preference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have misunderstood. I only meant if we need have that constant accessible from other packages (i.e. the constant name is capitalized), if we don't use the constant outside of here.

main.go Outdated
),
)

// Static UI (.js and .css), not protected by RBAC -- any user can see UI
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if there's any reason we do not want to enforce authorization specifically on UI?

Also I'm wondering if following the convention we have with other signals would make sense here as well, i.e. having everything organize under one handler instead of splitting it into API, UI etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I go to http://localhost:8080/api/traces/v1/test-oidc/search/ I am redirected to http://localhost:5556/dex/auth/local/login?back=&state=vcqh2yns46maemaiyfh2k33xm where I am presented with a Dex login screen. So we do enforce authentication through the r.Use(authentication.WithAccessToken()) earlier in the function.

I felt like the RBAC permission for traces for for reading and writing traces, not UI. By not doing RBAC the UI comes up the UI tries to fetch trace "service names", and the user sees the Unauthorized message within a nice-looking Jaeger UI. Do you want me to add RBAC here to block the UI from coming up at all?

I originally used a single handler until I chose to use httputil.ReverseProxy.ModifyResponse. I didn't want that logic to run over the /static resources of the Jaeger UI, only the generated HTML. I split into two handlers to make that logic more clear.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation Ed. I get splitting the handlers, and yes, we do authentication, but I'm still wondering about the authorization. Can you elaborate on what you mean by the user will see Jaeger UI with unauthorized message? Where will this authorization check happen? I'm still kind of leaning more towards covering UI with authorization fully. I can think of theoretical example, where a tenant has e.g. only metrics allowed, but some other tenants might have both metrics and traces allowed. In such cases, what is the value of having that UI accessible to unauthorized tenant? For me it would be cleaner to cut off the tenant completely and prevent any kind of potential 'snooping', if they don't have authorization in the first place. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By not doing RBAC for the UI resources we allow the UI to load and make a failing API query for traces.

If I remove RBAC permission for traces the UI for http://localhost:8080/api/traces/v1/test-oidc/search/ looks like this:

no-trace-rbac

I thought it was cute to allow it to work this way. I am happy to secure the UI resources behind RBAC permission to read traces if that is your request.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just trying to think what's the most reasonable solution here, but I'm open to other opinions. I'm currently leaning more towards authorizing on all endpoints, since if the tenants is not authorized to read / write trace signals, there's no reason for the tenant to even be accessing the UI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense what @matej-g proposes - also it may avoid unnecessary loading of the UI + trial to fetch traces. So maybe is better to consistently to add authorization on all endpoints

NewHandler(labels prometheus.Labels, handler http.Handler) http.HandlerFunc
}

type nopInstrumentHandler struct{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is "mandatory" to have this type? I would avoid it if it is optional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pavolloffay I didn't write this code. Exactly the same code is in the log handler api/logs/v1/http.go and the metrics handler api/metrics/v1/http.go. I just copied it from there so that this code would be as close as possible to the rest of Observatorium.

@matej-g can you address this? Keep in or remove?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well how would the code look like if it was removed? I guess it's simply the default no-op option for instrumentation handler.

I'd say we could, again, improve on this, as all 3 signals use virtually the same code, but that's a question for a refactoring PR.

proxy/proxy.go Outdated Show resolved Hide resolved
test/e2e/interactive_test.go Outdated Show resolved Hide resolved
test/e2e/services.go Outdated Show resolved Hide resolved
@jessicalins jessicalins mentioned this pull request Mar 23, 2022
jessicalins
jessicalins previously approved these changes Mar 25, 2022
Copy link
Contributor

@jessicalins jessicalins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

api/traces/v1/http.go Show resolved Hide resolved
main.go Outdated
r.Use(authentication.WithTenantHeader(cfg.traces.tenantHeader, tenantIDs))

// V2 query API, protected by RBAC
r.Mount("/api/traces/v1/{tenant}/api",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking closely, I'm wondering why we're departing here from how the paths are mounted for metrics and traces. If I'm looking at the code correctly, we're mounting these paths, but we're handling everything on them via /*. Why aren't we handling only specific paths?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this comment. I have restricted things.
Now the API handler only proxies GET for /api/traces, /api/dependencies, and /api/services. The UI and static asset handlers still take /* but only proxy GET.

I am a bit confused about the code that I needed to do this. I am doing stripTenantPrefix("/api/traces/v1"() then

		r.Get("/traces*", c.instrument.NewHandler(
			prometheus.Labels{"group": "tracesv1api", "handler": "api"},
			proxyRead))

I do not understand why it works when I use r.Get("/traces*"...). I had expected to be doing `r.Get("/api/traces*"...)

Signed-off-by: Ed Snible <snible@us.ibm.com>
Signed-off-by: Ed Snible <snible@us.ibm.com>
Signed-off-by: Ed Snible <snible@us.ibm.com>
Signed-off-by: Ed Snible <snible@us.ibm.com>
Signed-off-by: Ed Snible <snible@us.ibm.com>
Signed-off-by: Ed Snible <snible@us.ibm.com>
Signed-off-by: Ed Snible <snible@us.ibm.com>
Signed-off-by: Ed Snible <snible@us.ibm.com>
Signed-off-by: Ed Snible <snible@us.ibm.com>
Signed-off-by: Ed Snible <snible@us.ibm.com>
Signed-off-by: Ed Snible <snible@us.ibm.com>
@esnible
Copy link
Contributor Author

esnible commented Mar 28, 2022

Let me give testing instructions.

There are two ways to test this. One way is to run make test-interactive. This allows the API to be tested but not the UI (at least on the Mac). (The reason is that Observatorium expects OIDC to have the same auth URL from the point of view of the browser and the point of view of Observatorium. From the browser the auth URL needs to be something like "localhost:55000/login" and from Observatorium it needs to be something like "interactive-test-dex:5555/login".)

The fun, interactive way that I test is by running three programs in three windows:

docker run --rm --name dex --publish 5556:5556 --volume /tmp/dex.yaml:/tmp/dex.yaml --volume /tmp/dex.pem:/tmp/dex.pem --volume /tmp/dex.key:/tmp/dex.key dexidp/dex:v2.30.0 dex serve /tmp/dex.yaml
cd ~/go/src/github.com/jaegertracing/jaeger
ADMIN_HTTP_HOST_PORT=localhost:14269 HTTP_SERVER_HOST_PORT=localhost:5778 COLLECTOR_GRPC_SERVER_HOST_PORT=localhost:14250 COLLECTOR_HTTP_SERVER_HOST_PORT=localhost:14268 QUERY_HTTP_SERVER_HOST_PORT=localhost:16686 QUERY_GRPC_SERVER_HOST_PORT=localhost:16685 PROCESSOR_JAEGER_BINARY_SERVER_HOST_PORT=localhost:9999 QUERY_UI_CONFIG=/tmp/ui.json make run-all-in-one
cd ~/go/src/github.com/observatorium/api 
go run main.go -tenants.config=/tmp/http-dex-tenants-email-claim.yaml --rbac.config=/tmp/rbac.yaml -traces.write.endpoint localhost:4317 -grpc.listen localhost:8317 -traces.read.endpoint http://localhost:16686 -logs.write.endpoint localhost:8080 -web.internal.listen localhost:8081 -web.listen localhost:8080 -log.level debug

(The config files needed for the above can be copied out of a make test-interactive e2e folder, I think.)

Copy link
Contributor

@matej-g matej-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good to me now, stellar job @esnible! ⭐

Copy link
Contributor

@jessicalins jessicalins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice work! 🌟

I can also see the Jaeger UI + query for traces when using make test-interactive

Screenshot from 2022-03-30 10-09-49

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants