Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optional vs required attributes in HTTP conventions and beyond #2114

Closed
lmolkova opened this issue Nov 10, 2021 · 26 comments · Fixed by #2469
Closed

Optional vs required attributes in HTTP conventions and beyond #2114

lmolkova opened this issue Nov 10, 2021 · 26 comments · Fixed by #2469
Assignees
Labels
area:semantic-conventions Related to semantic conventions semconv:HTTP spec:trace Related to the specification/trace directory

Comments

@lmolkova
Copy link
Contributor

lmolkova commented Nov 10, 2021

Based on instrumentation SIG meeting discussion

Problem

Optional attributes in semantic conventions leave the decision to populate them on the instrumentation library.
However, this decision should be in the user/backend/distro hands: this is where you know what information you'd like to see and pay for.
So one library that doesn't populate e.g. http.request_content_length (let's put aside the need to have it on spans) makes the decision for all backends and users of this library.

Proposal

TL;DR: Instrumentation should collect all useful attributes. When some attributes are truly optional (i.e. rarely needed), there should be a configuration option for them. This issue does not try to address configuration.

  • Instrumentation should populate all generally useful attributes (and we should probably clean them up at least for HTTP).
  • When there is a need for optional attributes (e.g. because the collection is too expensive and/or attributes are not generally useful), instrumentation libraries should provide configuration options and populate them in opt-in mode
    • is seems the current approach to leave it up to language+per group of attributes (e.g. http headers) might not scale well, but we can address it separately
  • The consumption side (distro/backend/user) should decide which attributes to export. E.g. exporter can ignore http.scheme if it has no use for it. Nothing new here.

What does it mean for HTTP (example)

This is a general semantic convention discussion, but if we go through this exercise for HTTP, here's what I suggest.

All of the below is just an example of this approach, we can discuss each of those suggestions separately from this issue

  1. Let's pick one set of common attributes to make collection consistent (perhaps a specific set worth a separate discussion)
  • http.url for clients
  • e.g. http.scheme, net.host.name, net.host.port, http.target for servers
    • it boils down to unification on http.host net.host.name and http.server_name
    • if scheme is only there to reconstruct url - do we really need it?
  1. It leaves us with a few more generally-useful attributes:
  • client and server:
    • http.method
    • http.status_code
  • server only:
    • http.route - I assume if your web framework supports it, you'd want it
    • http.server_name - it seems it's supported by a few instrumentations and if it is supported, it's important and generally-useful
  1. Following attributes should be collected from headers through header collection configuration - we should not have two different attributes for each of them. they are optional in a way that user/distro enables them

    • http.host
    • http.user_agent
    • http.request_content_length
    • http.response_content_length
    • http.client_ip - is there a way to populate it which is not X-Forwarded-For?
  2. it leaves us with probably not-so-commonly-used attributes which I cautiously suggest removing unless there is some important scenario behind them

    • http.flavor does anyone really need it?
    • http.request_content_length_uncompressed - it's not always HTTP client responsibility to decompress or to await for decompression to finish - the instrumentation is probably feasible in a subset of cases - should we keep it?
    • http.response_content_length_uncompressed

As a result, we'd have common attributes every instrumentation should populate, headers behind configuration and group 4 (which I hope we can remove)

@lmolkova lmolkova added the spec:trace Related to the specification/trace directory label Nov 10, 2021
@lmolkova
Copy link
Contributor Author

/cc @denisivan0v @pyohannes @anuraaga

@anuraaga
Copy link
Contributor

I definitely believe instrumentation should be able to fill all conventions as proposed here - instrumentation needs to provide that baseline which can be then trimmed down to needed bits through configuration or similar.

I don't know if there's much point in having classifiers like required and optional at that point - I can imagine backends providing a recommended semantic configuration, and users may customize this further. I don't think either of these would really be interested in whether it's labeled required here, they have much more knowledge of what conventions are useful with that backend, or in that user app, then a spec author.

Some reasons I can think of for excluding attributes

  • Backend completely ignores anyways
  • Backend can't handle high cardinality attributes
  • Backend deems certain attributes useful in their experience and try to reduce user cloud bill
  • User knows all their APIs use POST and don't need that information

These seem quite diverse and there are probably more. If OTel instrumentation is able to fill all conventions in order to provide a baseline, I'm not sure how to go further to navigate these waters and make any recommendations on what is required or not.

@kenfinnigan
Copy link
Member

These are very interesting points I hadn't really thought of, and I agree with @anuraaga's comments around applicability of required/optional delineation for backends.

However, I can see the benefit for some level of "required" information to enforce that all backends are able to understand those attributes and their meaning. Granted, it doesn't guarantee a backend actually utilizes the attribute in visualizations, or other ways.

For me, it comes down to whether it is a goal of OpenTelemetry to define a "required" set of attributes that all vendor backends need to understand/utilize? Or is it sufficient to define the possible set of attributes and their meaning?

If defining required attributes is not a goal, differentiating between attributes in required and optional terms likely does not add much value, and it can be left up to vendors and organizations as to what their needs are.

@pyohannes
Copy link
Contributor

For me, it comes down to whether it is a goal of OpenTelemetry to define a "required" set of attributes that all vendor backends need to understand/utilize?

OpenTelemetry doesn't define any requirements for vendor backends, but it does for instrumentation. By doing that, it should give vendor backends a baseline they can rely on, but it's completely up to vendor backends which attributes they utilize.

TL;DR: Instrumentation should collect all useful attributes. When some attributes are truly optional (i.e. rarely needed), there should be a configuration option for them. This issue does not try to address configuration.

But "truly optional" attributes would still be defined in the semantic conventions?

@lmolkova
Copy link
Contributor Author

lmolkova commented Nov 11, 2021

I guess we're talking about similar things, is this a fair summary?

  1. Some attributes that are always worth collecting despite perf impact
  2. Others that are useful, but not for everyone. Since collecting them adds overhead, we might want to optimize collection.
  3. Some additional, niche things, like custom headers - convention exists for prefix only or doesn't exist at all

And we need to pick a default level for instrumentation: 1 or 1+2. And then instrumentation allows to enable 2 or disable it depending on default we choose.

3 is always opt-in if defined.

On the backend side, backend is always free to use or drop attribute, I don't see any need to spec backend behavior here.

@kenfinnigan
Copy link
Member

Given vendors can pick and choose which attributes they utilize, we should look to have the smallest set of required attributes that aligns with all vendor backends.

@yurishkuro
Copy link
Member

Why can't the backend functionality gracefully degrade when certain attributes are not present? Can someone give an example of an attribute whose absence would be catastrophic/non-recoverable for the backend?

@lmolkova
Copy link
Contributor Author

lmolkova commented Nov 11, 2021

Why can't the backend functionality gracefully degrade when certain attributes are not present?

It can, the problem is that there is no path to a great experience when certain instrumentation chooses not to expose optional attributes.

E.g. http.route is optional currently, meaning that any instrumentation should feel free to skip it. Without route, there is no good way to query spans or build metrics for the given high-cardinality path. And there is nothing users or backends can do to improve it without changing instrumentation.

@lmolkova
Copy link
Contributor Author

Given vendors can pick and choose which attributes they utilize, we should look to have the smallest set of required attributes that aligns with all vendor backends.

assuming we pick the bare minimum, what do we do with other useful attributes? Do we require instrumentation to add them behind configuration? add depending on the author's choice? not add at all? I'm trying to create some clarity on this particular part and find defaults.

@kenfinnigan
Copy link
Member

Given vendors can pick and choose which attributes they utilize, we should look to have the smallest set of required attributes that aligns with all vendor backends.

assuming we pick the bare minimum, what do we do with other useful attributes? Do we require instrumentation to add them behind configuration? add depending on the author's choice? not add at all? I'm trying to create some clarity on this particular part and find defaults.

As you suggested previously, it feels like there are really 3 levels we need. The first is the bare minimum required attributes that most/all vendor backends will support, second is a group of attributes the community believes are beneficial but backends may not recognize, and then all other attributes.

I can see 2 and 3 being behind instrumentation flags, but 2 is enabled by default as the community sees benefit in those attributes

@anuraaga
Copy link
Contributor

For me, it comes down to whether it is a goal of OpenTelemetry to define a "required" set of attributes that all vendor backends need to understand/utilize? Or is it sufficient to define the possible set of attributes and their meaning?

I think currently the definition of a backend only involves OTLP

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/vendors.md

It would be a big addition to add attribute processing too as something a backend must do to support OTel. I'm not sure there is much benefit to it though - for example, even saying it's ok to not be any visualization as long as the attribute is supported. Does that mean it's actually supported?

I would expect then recommendations getting even more technical - all required attributes must be indexed and usable as filters in queries, for example. If going that far then I think the classifications do then provide action items for backends. But I guess OTel is not generally meant to spec out backends in such detail.

Otherwise I'm not sure what actually gets gained by having classifications vs not. It seems to only make certain attributes opt-in and others opt-out, but that seems to make the experience less intuitive, especially if some backend semantic config will just get copied in anyways.

@yurishkuro
Copy link
Member

Why can't the backend functionality gracefully degrade when certain attributes are not present?

It can, the problem is that there is no path to a great experience when certain instrumentation chooses not to expose optional attributes.

What is the path to great experience if the application chooses to disable tracing altogether? Disabling tracing is just an extreme case of not sending all useful attributes, i.e. verbosity level 0. The more data they send, the better/richer experience they get. I think any backend would still have to deal with apps that simply cannot use common instrumentation (maybe because they run on proprietary frameworks), and may not send the data that is 100% according to the spec. And again, the closer they are to the spec, the better service they get.

@lmolkova
Copy link
Contributor Author

lmolkova commented Nov 12, 2021

If user disables tracing, it's user's choice. using uncommon instrumentation or writing a custom one is also a user choice.
The goal of this issue is to make sure common instrumentations give user the choice to send more or less data. Currently this choice is on instrumentation

@iNikem
Copy link
Contributor

iNikem commented Nov 13, 2021

The goal of this issue is to make sure common instrumentations give user the choice to send more or less data.

Now I am confused :) Is this issue about providing configuration mechanism or about marking some attributes as required?

I think that http.route is a perfect counter-example to your arguments :) You say

E.g. http.route is optional currently, meaning that any instrumentation should feel free to skip it. Without route, there is no good way to query spans or build metrics for the given high-cardinality path. And there is nothing users or backends can do to improve it without changing instrumentation.

I understand it that you want to mark http.route as required attribute for SERVER spans. But it is not always possible for auto-instrumentation to provide it. My HTTP server may not use declarative routing that auto-instrumentation can easilyt understand. E.g. Netty server handler can use totally app specific if branching or whatnot.

And this is my general feedback: for me, required attribute means that ALL instrumentation libraries (including auto-instrumentations) MUST provide this attribute. So it is more about the capabilities of instrumentations than about the need of clients.

@anuraaga
Copy link
Contributor

route is a nice example - I feel it provides information on both sides.

  • route can be hard to populate with instrumentation as it generally needs a well-designed underlying framework

  • route is basically necessary for useful metrics for HTTP server

I feel the second point is more important though, and applies to tracing as well. I suspect most people that have used tracing know how important route is to be there when searching. I would call instrumentation that doesn't populate it to be "low quality instrumentation". It is fine for this to exist, instrumentation can only go as far as the framework being instrumented. It's why we generally use higher level frameworks, and why they naturally are easier to instrument. I don't feel the words required and optional capture this well. It's why I actually hope the distinction can just disappear, it doesn't seem to be helping anyone in practice.

@arminru arminru added area:semantic-conventions Related to semantic conventions semconv:HTTP labels Nov 16, 2021
@lmolkova
Copy link
Contributor Author

lmolkova commented Nov 17, 2021

I agree that required and optional words describe technical limitations that instrumentations and frameworks have, but do not help new instrumentation authors understand what and when they should provide.

So I'd change it to the following categories:

  • MUST/required -> provide always, no configuration necessary (we believe all frameworks and instrumentation should be able to provide them)
  • the rest of the attributes that are currently SHOULD/optional:
    • attributes instrumentation should provide if library/framework supports them (e.g. http.route) -> provide if available no configuration necessary
    • either opt-out (e.g. http.user_agent?) -> provide by default, but allow to disable - based on the app configuration
    • or opt-in (e.g. custom headers) -> don't provide by default, but allow to enable - based on app configuration

since provide always is not really different from provide if available, we can merge those two for simplicity into provide if available

@kenfinnigan
Copy link
Member

What would be the expected result for a provide always attribute if it's not there? Is it an error, or it falls back to a provide if available?

My one concern with merging provide always and provide if available is making attributes which should be there, less likely to be there as there is no longer an always requirement.

@pyohannes
Copy link
Contributor

My one concern with merging provide always and provide if available is making attributes which should be there, less likely to be there as there is no longer an always requirement.

I think @lmolkova's intention of merging those is to come with something like always provide if available, or must provide if available.

@kenfinnigan
Copy link
Member

Thanks @pyohannes, that helped my mental model.

I like the addition of always to indicate the required nature of the attribute.

@anuraaga
Copy link
Contributor

@kenfinnigan I think users need to be in control of their data (always 😂 ) - I can't imagine having any attribute that is not allowed to be disabled somehow, that seems to take too much control away. I am a fan of merging into provide if available personally.

@kenfinnigan
Copy link
Member

I see your point. I've been coming at it from the perspective of a minimum set of information provided to backends, but I appreciate OTeL doesn't control that aspect

@lmolkova
Copy link
Contributor Author

lmolkova commented Nov 22, 2021

it sounds like we have some preliminary consensus here, I'll send a PR to HTTP spec as a start. Once we have an agreement there, we can generalize it beyond HTTP (and it would require some tooling change around yml).

@lmolkova
Copy link
Contributor Author

lmolkova commented Mar 25, 2022

I'm making a revision of what OTel instrumentations populate (as a step to understand how to clarify what's required and possible) around sets of client and server attributes.

.NET

Attribute Client Server
http.url yes yes (constructed from fragments)
net.peer.ip no yes
net.peer.port no yes
net.peer.name no no
http.host yes (extracted from url) yes
net.host.ip no no
net.host.name no no
net.host.port no no
http.server_name no no
http.route no yes
http.target no yes
http.scheme metrics only metrics only
http.flavor yes (opt-in) yes (opt-in)
http.*content_length* no no

Java

Attribute Client Server
http.url yes no
net.peer.ip no no
net.peer.port yes no
net.peer.name yes no
http.host no yes (all)
net.host.ip no no
net.host.name no no
net.host.port no no
http.server_name no yes (but only Armeria server instrumentation)
http.route yes ( ratpack client 🤔) yes. Most instrumentations populate route, others fall back to path
http.target no yes, many construct it from path + query
http.scheme no yes
http.flavor yes yes
http.*content_length some some
http.*content_length_uncompressed no no

Go

Attribute Client Server
http.url yes no
net.peer.ip yes, only if request host is IP address yes, only if request host is IP address
net.peer.port yes yes
net.peer.name yes yes
http.host yes yes
net.host.ip yes yes
net.host.name yes yes
net.host.port yes yes
http.server_name no yes
http.route no available on filters, but not on middleware. When not available path or empty string is used
http.target no yes
http.scheme yes yes
http.flavor yes yes
http.*content_length yes yes
http.*content_length_uncompressed no no

Python

Attribute Client Server
http.url yes (all) yes
net.peer.ip only urllib some
net.peer.port no some
net.peer.name no some
http.host no yes (constructed )
net.host.ip no no
net.host.name no no
net.host.port no most
http.server_name no most
http.route no yes
http.target no yes
http.scheme no yes
http.flavor no yes
http.*content_length no no

JS

Attribute Client Server
http.url yes (constructed ) yes (constructed )
net.peer.ip yes no
net.peer.port yes no
net.peer.name yes (constructed) no
http.host yes (constructed) no
net.host.ip no yes
net.host.name no yes (constructed )
net.host.port no yes
http.server_name no yes (manual configuration)
http.route no yes
http.target yes yes
http.scheme yes no
http.flavor yes yes
http.*content_length* yes yes

Several observations:

  • Clients
    • all clients provide http.url.
    • clients inconsistently provide net.peer attributes
    • for some clients, url needs to be constructed
  • Servers
    • in most cases url needs to be constructed. Many server instrumentation don't provide url
    • all servers provide http.host, some provide http.server_name
    • net.host.* attribute usage is inconsistent and rare
    • net.host.name description does not suggest how it can be used for HTTP and how it should be populated
    • server instrumentations populate http.route, but fallback to path, which could be problematic for metrics

Proposal:

UPDATE: taking into account @trask's awesome suggestion on metrics needs (below)

Client: the only set of attributes clients should support is: http.url, net.peer.name, net.peer.port

  1. require http.url, net.peer.name (use IP if only IP is avaialble), net.peer.port (when not default)
  2. Make net.peer.ip attribute optional (opt-in?)

Server: the only set of attributes servers should support is: http.scheme, net.host.name, net.host.port, http.target, http.route

  1. Require net.host.name, ask for it's value to be the best available server name with specific priority (server-name, host header, ip address)
  2. Require (when available) http.route - it's an essential attribute for both server traces and metrics
  3. Remove http.host - host header should be populated (opt-in) using http.request.header.host
  4. Remove http.server_name - if it's available it's populated in net.host.name
  5. Make http.url optional (opt-in)
  6. Make net.host.ip attribute optional (opt-in?)
  7. Require to populate http.route only when it's available (i.e. guarantee low-cardinality)

@trask
Copy link
Member

trask commented Mar 25, 2022

I really like this proposal 👍 👍

Only two comments:

Make net.peer.* attributes optional (opt-out?)

These are important http client metric attributes. Our current thinking in Java instrumentation is that metric attributes will always be a subset of span attributes (on operations where we capture both spans and metrics).

TBD on http.route

Similarly, this is an important http server metric attribute.

@lmolkova
Copy link
Contributor Author

lmolkova commented Mar 25, 2022

thank you @trask! Agree, and added your suggestion. Will check existing instrumentation for http.route in the next few days. (update: done)

@jamesmoessis
Copy link
Contributor

metric attributes will always be a subset of span attributes (on operations where we capture both spans and metrics)

100% on this! Very useful for folks wanting to generate metrics from spans.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions semconv:HTTP spec:trace Related to the specification/trace directory
Projects
No open projects
Status: Done
10 participants