Trace Payload Collection #234

nirga · 2023-06-19T16:42:39Z

Continuing the work started by @ronyis at #219.

We're proposing semantic attributes for reporting payloads (of HTTP requests, DB queries, etc.) as part of OpenTelemetry. This should be optional for users who wish to use it and can provide consumers of OpenTelemetry with a standard option for getting payloads out of trace instrumentation.

Earlier (pre-OTel) instrumentation libraries have implemented this successfully, so I think the community will welcome it.

Would love to get your feedback and work on getting this accepted.

text/trace/0234-payload-collection.md

jsuereth · 2023-06-22T17:28:44Z

text/trace/0234-payload-collection.md

+(for example, `http.request.body`):
+
+- **Data attribute**: `<attribute>.content`. Holds the decoded content of the payload data. Alternatives: `.data`, `.payload`.
+- **Size attribute**: `<attribute>.size`. Holds the number of bytes of the encoded payload data. Alternatives: `.length`, `.bytes`.


Since we use protocol buffers, the size/length of content is implicitly recorded.

I believe even for JSON this is effectively true (you'll have a length/size available at runtime) so no need to encode it separately.

nirga · 2023-06-26

I think it's important to keep the size as a separate attribute, since we may have an actual different content size (for example if we truncated part of it).
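To illustrate why the two values can diverge, here is a minimal sketch, assuming the `.content` / `.size` suffixes from the diff above. The function name `payload_attributes` and the `max_content_bytes` parameter are hypothetical and not part of the proposal or of any OpenTelemetry SDK.

```python
def payload_attributes(prefix, raw, max_content_bytes=1024):
    """Build payload attributes for a span (illustrative only).

    `prefix` is an attribute name such as "http.request.body".
    `raw` is the encoded payload as bytes.
    """
    # The recorded content may be truncated to respect attribute limits.
    content = raw[:max_content_bytes].decode("utf-8", errors="replace")
    return {
        f"{prefix}.content": content,
        # The size always reflects the full encoded payload,
        # so a consumer can tell that truncation happened.
        f"{prefix}.size": len(raw),
    }


attrs = payload_attributes("http.request.body", b"a" * 5000)
```

Here the content attribute holds 1024 characters while the size attribute still reports 5000 bytes, which is information a reader could not recover from the content alone.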

jsuereth · 2023-06-22T17:29:39Z

text/trace/0234-payload-collection.md

+- **Size attribute**: `<attribute>.size`. Holds the number of bytes of the encoded payload data. Alternatives: `.length`, `.bytes`.
+- **Encoding attribute**: `<attribute>.encoding`. Holds the original attribute encoding type.
+  Predefined values should be declared (though users may decide to use custom values as well).
+  For example - `json`, `protobuf`, `avro`, `utf-8`.


Are you going to allow "versioning" of these protocols?

What's the path for adding new (custom) encodings?

nirga · 2023-06-26

Note that this is only the encoding originally used when sending the payload. The content structure inside the span should always be the same, regardless of the protocol. This is actually another reason why we should translate the payload into some common structure as proposed here.
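As a rough sketch of that idea, the snippet below decodes payloads into plain Python values and records the wire encoding separately. The helper names `decode_payload` and `describe_payload` are hypothetical, only two of the example encodings are handled, and nothing here reflects an agreed API.

```python
import json


def decode_payload(raw, encoding):
    """Decode an encoded payload into a common in-memory structure."""
    if encoding == "json":
        return json.loads(raw.decode("utf-8"))
    if encoding == "utf-8":
        return raw.decode("utf-8")
    # Other encodings (protobuf, avro, custom) would need their own decoders.
    raise ValueError(f"no decoder registered for encoding {encoding!r}")


def describe_payload(prefix, raw, encoding):
    """Return content in the common structure plus the original encoding."""
    return {
        f"{prefix}.content": decode_payload(raw, encoding),
        f"{prefix}.encoding": encoding,
    }


attrs = describe_payload("http.request.body", b'{"user": "ada"}', "json")
```

The `.encoding` attribute only describes how the payload travelled on the wire; the `.content` value has the same shape whichever decoder produced it.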

jsuereth · 2023-06-22T17:31:22Z

text/trace/0234-payload-collection.md

+#### `Payload` class
+
+We propose adding a class that will provide the related functionality for handling payload data.
+Both traces and logs could use instances of this class, which will assist sharing related functionality.


Since this was added in real tracing products, can you link to the APIs/SDKs they offered?

nirga · 2023-06-26

It's a bit different, since they just offered this instrumentation OOTB, so the API for the payload was hidden from the developer. See for example Epsagon's instrumentation for FastAPI (the exact structure of the data can be seen here).

jsuereth · 2023-06-22T17:32:30Z

text/trace/0234-payload-collection.md

+  )
+```
+
+### Payload attributes in Logs


+1 to resolving this discussion as a priority.

nirga · 2023-06-26

But should it affect the attribute names for spans as proposed here?

jsuereth · 2023-06-22T17:35:15Z

text/trace/0234-payload-collection.md

+
+Representing Null values in payload contents is required, as it is part of JSON and many other formats
+like Protobuf and Avro. Therefore we propose supporting it by spans and logs attributes APIs (while
+in OTel proto schema it is already supported).


Where is this supported in OTel proto schema? Do you mean the "AnyValue" being allowed to be "empty"?

I'd like more details about where/how you need Null support here. Are you trying to record payload attributes even if the payload is null? Could you just write an empty byte array for the Proto/Avro case?

nirga · 2023-06-26

Yes. Hmm, this can work. Updated this.

ronyis · 2023-06-27

The point here is that we need to support JSON payloads like '{"x": null}' and also null, and similar examples in other formats.
If the payload is encoded with the "AnyValue" type, this can be achieved by using empty objects. It still needs to be supported by the APIs, though.
In case we add a dedicated proto object, like the Value suggested in the alternative, we should also make sure it supports Null values.

jsuereth · 2023-06-28

I'd need a lot more justification on a new AnyValue type in OTLP before I'd be convinced. I think we already support the JSON payloads you mention.
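The "empty AnyValue as null" idea can be sketched as follows. This is a simplified illustration, not a conforming OTLP encoder: the field names follow my understanding of the JSON form of `AnyValue`, and the function name `to_any_value` is hypothetical.

```python
import json


def to_any_value(value):
    """Map a JSON-decoded Python value to an AnyValue-like dict."""
    if value is None:
        return {}  # an AnyValue with no field set stands in for null
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return {"boolValue": value}
    if isinstance(value, int):
        return {"intValue": value}
    if isinstance(value, float):
        return {"doubleValue": value}
    if isinstance(value, str):
        return {"stringValue": value}
    if isinstance(value, list):
        return {"arrayValue": {"values": [to_any_value(v) for v in value]}}
    if isinstance(value, dict):
        pairs = [{"key": k, "value": to_any_value(v)} for k, v in value.items()]
        return {"kvlistValue": {"values": pairs}}
    raise TypeError(f"unsupported payload value: {type(value).__name__}")


nested = to_any_value(json.loads('{"x": null}'))
top_level = to_any_value(json.loads("null"))
```

Both payloads from the comment above are representable this way: the nested null becomes an empty value under key `x`, and the top-level null becomes an empty value on its own.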

jsuereth · 2023-06-22T17:38:27Z

text/trace/0234-payload-collection.md

+Capturing payloads using appropriate APIs could assist in specifying different limits in the future (either
+more or less strict than those of 'regular' attributes), and mechanisms for shortening large payloads as well.
+
+### Handling sensitive data


I think it's worth calling out that payload collection likely requires some kind of flag to signal that sensitive data collection is ok.

Specifically - I think this proposal needs some kind of notion on a per-trace basis that denotes whether sensitive collection is "allowed" for that trace.

E.g. you can imagine a system where I issue an RPC with a special flag denoting that collection of payloads (and other sensitive data) is ok, where this would be turned on.

You can also imagine production systems that require user-consent before this data could be collected in a o11y system. We should absolutely have a mechanism to check (dynamically, per-request / per-trace) whether this data can be collected.

nirga · 2023-06-26

I wonder if this might be out of scope for this, as it should be resolved on the API level of the different SDKs. Here I suggest something much simpler: if we ever auto-instrument collection of payloads, the default should always be off, and users who decide to turn it on should know it will be turned on for everything.

ronyis · 2023-06-27

I'm also in favor of having this kind of flag regarding sensitive data collection, though I don't think it should be a requirement for the changes proposed in this OTEP.
At this point, users already collect sensitive data using the existing APIs and attributes.
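The two positions in this thread (a global switch that defaults to off, and an optional per-request opt-in) could be combined roughly as below. Everything here is hypothetical: `PayloadPolicy`, `may_collect_payload`, and the `x-collect-payload` header are illustration names only, not part of the OTEP or of any SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PayloadPolicy:
    enabled: bool = False                 # payload collection is off by default
    require_request_opt_in: bool = False  # if True, each request must opt in


def may_collect_payload(policy, request_headers):
    """Decide per request whether payload capture is allowed."""
    if not policy.enabled:
        return False
    if not policy.require_request_opt_in:
        return True  # enabled for everything, as in the simpler suggestion
    # Hypothetical header signalling that collection is OK for this request.
    return request_headers.get("x-collect-payload", "").lower() == "true"


default_policy = PayloadPolicy()
strict_policy = PayloadPolicy(enabled=True, require_request_opt_in=True)
```

With the default policy nothing is captured; with the strict policy only requests carrying the opt-in signal are captured, which is closer to the dynamic per-trace check described above.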

mishushakov · 2023-09-27T23:13:36Z

Would love to see that happen at Step CI 😉

axlrate · 2024-03-12T07:36:26Z

Should the payload be a span event instead of a span attribute? Most backends have higher size limits for span events.

mmanciop · 2024-03-12T15:48:13Z

> Should the payload be a span event instead of a span attribute? Most backends have higher size limits for span events.

Lumigo does this with span attributes, and the experience was that truncating to 1024, 2048, 4096 or another power of two was often good enough. (The complex bit was performing truncation that would keep the validity of the data format, e.g., JSON.) Users have means via SDK configuration to change the attribute truncation threshold. I think setting this data as span events would be very unintuitive in terms of data model. I don't think we should use an unintuitive data model to work around limitations of backends. Moreover, for an important use case like this, I can imagine backends making exceptions for the specific keys we'd use.
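One way to truncate while keeping JSON valid is to shorten string values inside the decoded structure rather than cutting the serialized text. The sketch below is my own illustration of that approach, not Lumigo's implementation; `shorten` and `truncate_json` are hypothetical names, and the result is best effort (a payload with many keys and no long strings may still exceed the limit).

```python
import json


def shorten(value, max_str):
    """Recursively cut string values to at most max_str characters."""
    if isinstance(value, str):
        return value if len(value) <= max_str else value[:max_str] + "..."
    if isinstance(value, list):
        return [shorten(v, max_str) for v in value]
    if isinstance(value, dict):
        return {k: shorten(v, max_str) for k, v in value.items()}
    return value  # numbers, booleans and null are left untouched


def truncate_json(text, limit):
    """Return valid JSON, shrinking strings until it fits in `limit` bytes."""
    value = json.loads(text)
    max_str = limit
    while True:
        out = json.dumps(shorten(value, max_str), separators=(",", ":"))
        # Stop when the output fits, or when strings cannot shrink further.
        if len(out.encode("utf-8")) <= limit or max_str == 0:
            return out
        max_str //= 2


big = json.dumps({"id": 1, "body": "x" * 5000})
small = truncate_json(big, 1024)
```

The output still parses with `json.loads`, and non-string fields such as `id` survive unchanged, which a plain byte-level cut of the serialized text would not guarantee.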

codefromthecrypt · 2024-07-02T05:35:01Z

Might be worthwhile to ping folks who did some things in this space, even if they have moved on since.

@mchandramouli who created blobs at Expedia for request/response logging. There are some ancient notes here, but I suspect the concepts are still relevant.
@pavolloffay who worked on the Hypertrace Java agent, which adds request/response collection.

chore: payload collection suggestion

4f0c336

nirga requested review from a team June 19, 2023 16:42

nirga mentioned this pull request Jun 19, 2023

Capture request and response bodies open-telemetry/semantic-conventions#857

Open

fix: spellcheck

8a5b030

kenfinnigan reviewed Jun 20, 2023

fix: length attribute changed to size

d2603e8

ronyis mentioned this pull request Jun 21, 2023

Trace Payload Collection #219

Closed

nirga requested a review from kenfinnigan June 21, 2023 08:48

jsuereth reviewed Jun 22, 2023

carlosalberto added the priority:p2 label Jun 26, 2023

fix: discussion with jsuereth

f0aae08

nirga requested a review from jsuereth June 26, 2023 17:20

tedsuo added the triaged label Jul 24, 2023

trask mentioned this pull request Aug 15, 2023

Elasticsearch picks up the body info open-telemetry/opentelemetry-java-instrumentation#5380

Closed

trask mentioned this pull request Sep 18, 2023

Allow request body to be collected as a span attribute open-telemetry/opentelemetry-java-instrumentation#8778

Open

nirga mentioned this pull request Nov 1, 2023

[Sandbox] OpenLLMetry cncf/sandbox#67

Closed

2 tasks

nirga mentioned this pull request Nov 27, 2023

🚀 Feature: contribute this to otel traceloop/openllmetry#213

Closed

1 task

trask mentioned this pull request May 14, 2024

Opentelemetry operator collecting request response/body/payload open-telemetry/opentelemetry-specification#3586

Closed
