-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Specific URL query string values should be redacted #971
base: main
Are you sure you want to change the base?
Specific URL query string values should be redacted #971
Conversation
Should we also add a few 'obvious' ones? e.g., 'password', 'secret', 'private', etc.? |
here's the default list used by Elastic APM: https://www.elastic.co/guide/en/apm/agent/java/1.x/config-core.html#config-sanitize-field-names |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we need to let users/sdks/distros to expand the blocklist via configuration, but this would be a purely incremental change.
I do find it ironic that there's a big "query string values are considered non-sensitive" info box in that section. But yeah, that's a good list of additions too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be easily configured in a standard way so users can add more to it.
I'm also surprised that this list is as short as it is.
I'd definitely prefer this over #961, since this is less intrusive for users who already capture query strings and unlike #961 this really handles values that are problematic from security point of view. With #961 the only chance for people to capture query strings is to turn it back on, but that again would capture problematic values as well. So I definitely prefer an approach that tries to handle problematic values over an all or nothing config. Since configuration came up: I think this list could be generalized and applied to multiple other things (e.g. Db query parameters, Elastic also applies this to x-form encoded request bodies (not yet part of SemConv, but could be), and there are probably many other attribtues). So the config could be a list of keys that are applied for sanitization to every attribute that can potentially contain sensitive information, not only for query strings. That could be also introduced incrementally; for now I think what the PR proposes is ok. |
If we look at what the user provided in the original report, here are some keys: It is important to ensure that sensitive query parameters are scrubbed by default. While this list is not exhaustive, the following parameters should be scrubbed as a starting point: "sig", "code", "key", "access_token", "client_secret", "password", "appkey", "app_key", "apikey", "apiKey", "api_key", "accesskey", "access_key", and "token". |
There are many potential candidates for additional query string keys to include. From initial CVE report against OpenTelemetry .Net:
From https://www.elastic.co/guide/en/apm/agent/java/1.x/config-core.html#config-sanitize-field-names:
From a process perspective of how to pick and choose (both now and in the future), it might be nice to say something along the lines of
|
* [`sig`](https://learn.microsoft.com/en-us/azure/storage/common/storage-sas-overview#sas-token) | ||
* [`X-Goog-Signature`](https://cloud.google.com/storage/docs/access-control/signed-urls) | ||
|
||
This list is subject to grow over time, but once a key is added to the list, removing it will be considered a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would these be considered as breaking changes?
For example, items in the current list have links to the official doc, one can imagine over years some of these technologies will be abandoned and docs would be retired.
I prefer this solution over #961, 👍 Question: In my comment and in @austinlparker's comment and in @reyang's comment (in chronological order), we discussed the idea of having generalized SDK capabilities for redaction, is this still something we want? If so I am happy to work towards a proposal for that. |
I'd definitely answer that with a yes. Users might want to have redaction capabilities they can use for data that doesn't conform to OTel semantic conventions. There have been related previous attempts: |
I agree with adding this verbiage as it helps clarify how the list was decided. |
As discussed during the SemConv SIG meeting I am going to provide a proposal for the "general problem", I will try to have a draft by the end of this week to share |
Here's the proposal: open-telemetry/oteps#255 |
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
Related to #860
Alternative proposal to #961
Changes
Specific URL query string values should now be redacted by default due to concerns around leaking credentials.
The list of problematic query string keys is documented in the semantic conventions and is subject to change over time as we learn about additional common usages of credentials found in query string values.