Add Environment Variables to process registry schema #672

Open
mjwolf opened this issue Jan 29, 2024 · 3 comments

@mjwolf
Contributor

mjwolf commented Jan 29, 2024

Environment variables are an important part of process data models, and they should be added to the process registry schema.

`process.env_vars` was previously part of #564, but it was removed because some open questions should be settled before it is added to the schema.

The questions to resolve include:

  1. Should env_vars be an object, with environment variable names as free-form leaf nodes, as suggested in a comment on #564?

Although this suggestion could have advantages, I'm not sure if it's possible. According to the Open Group standard, environment variable names shall not contain the character '=' (ref), but there are no other exclusions, so extended character sets, symbols, etc. are valid as part of environment variable names. I'm not sure if this works with OTel key names.

One alternative to using freeform keys is to store environment variables as a string array, such as this
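To make the two shapes under discussion concrete, here is a minimal Python sketch. It is only an illustration: the sample environment and both helper names are hypothetical, and the `process.env_vars` prefix is simply the name used earlier in this issue, not a settled attribute name.

```python
# Hypothetical sample environment, not taken from any real process.
env = {"PATH": "/usr/local/bin:/usr/bin", "USER": "ubuntu"}


def as_keyed_attributes(env, prefix="process.env_vars"):
    """Option A: one attribute per variable, the variable name is the last key segment."""
    return {f"{prefix}.{name}": value for name, value in env.items()}


def as_string_array(env):
    """Option B: a single array attribute holding NAME=value strings."""
    return [f"{name}={value}" for name, value in env.items()]


keyed = as_keyed_attributes(env)
array = as_string_array(env)
```

With the keyed form, an unusual variable name (one containing `.`, for instance) ends up inside the attribute key itself, which is the concern raised above. The array form keeps such names inside a value instead.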

  2. Should filtering be required or recommended? (discussion)

Environment variables could contain sensitive information, such as API keys, and this information should be redacted to prevent security problems. It should be decided if filtering is required (using MUST) or recommended (using SHOULD).
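As a rough illustration of what name-based filtering could look like, here is a small Python sketch. The pattern, the placeholder, and the function name are all assumptions made for this example. Nothing here is proposed wording for the schema, and a real deny-list would need careful review.

```python
import re

# Assumed, illustrative pattern for names that are likely to hold secrets.
SENSITIVE_NAME = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD)", re.IGNORECASE)


def redact(env, placeholder="REDACTED"):
    """Return a copy of env with the values of sensitive-looking names replaced."""
    return {
        name: (placeholder if SENSITIVE_NAME.search(name) else value)
        for name, value in env.items()
    }


# Hypothetical sample data.
sample = {"API_KEY": "abc123", "USER": "ubuntu"}
redacted = redact(sample)
```

A deny-list like this only catches names it anticipates, which is one reason the choice between MUST and SHOULD matters.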

@trask
Member

trask commented Jan 29, 2024

> 1. Should env_vars be an object, with environment variable names as free-form leaf nodes, as suggested in "Add additional process attributes to registry" #564 (comment)?

I believe so. We have a few other examples like this, and semconv tooling already supports `<key>` in attribute names (see HTTP headers, for example).

> 2. Should filtering be required or recommended?

Probably the easiest option is to make these attributes Opt-In (again, see HTTP headers for an example).

@svrnm
Member

svrnm commented Jan 30, 2024

Thanks for bringing this discussion out of the PR. For completeness, here is the part of the comment I made about using a format similar to the HTTP headers:

| Attribute | Type | Description | Examples |
|---|---|---|---|
| `process.environment_variable.<key>` | string | Process environment variables, `<key>` being the environment variable name, the value being the environment variable value. | `process.environment_variable.PATH="/usr/local/bin:/usr/bin"`; `process.environment_variable.USER="ubuntu"` |

> Although this suggestion could have advantages, I'm not sure if it's possible. According to the Open Group standard, environment variable names shall not contain the character '=' (ref), but have no other exclusions, so extended character sets, symbols, etc. are valid as part of environment variable names. I'm not sure if this works with OTel key names.

Looking into the reference you quoted (https://pubs.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap08.html), the following passage seems relevant as well:

> These strings have the form name=value; names shall not contain the character '='. For values to be portable across systems conforming to IEEE Std 1003.1-2001, the value shall be composed of characters from the portable character set (except NUL and as indicated below). There is no meaning associated with the order of strings in the environment. If more than one string in a process' environment has the same name, the consequences are undefined.

IEEE 1003.1 is POSIX, and the portable character set is a set of 103 characters (see also this source).

So this shouldn't be a concern. Additionally, the attribute naming specification states that every name MUST be a valid Unicode sequence, which should help even if a system operates outside of the POSIX portable character set.
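A minimal Python sketch of the two checks implied here, under stated assumptions: both helper names are hypothetical, and "valid Unicode sequence" is approximated as "can be encoded as UTF-8", which is my reading rather than wording from the specification. Because names cannot contain '=', splitting an entry at the first '=' is unambiguous.

```python
def split_entry(entry):
    """Split a NAME=value string at the first '='; names cannot contain '='."""
    name, sep, value = entry.partition("=")
    if not sep or not name:
        raise ValueError(f"not a name=value entry: {entry!r}")
    return name, value


def is_valid_unicode(name):
    """Return False for strings that cannot be encoded as UTF-8 (e.g. lone surrogates)."""
    try:
        name.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True
```

For example, `split_entry("OPTS=a=b")` yields the name `OPTS` and the value `a=b`. On POSIX systems, Python decodes undecodable environment bytes into lone surrogates, so a check like `is_valid_unicode` would flag such names before they are used as attribute keys.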

> 2. Should filtering be required or recommended?
>
> Probably the easiest is to make these attributes Opt-In (also can see HTTP headers for example)

+1 for this. With the `process.environment_variable.<key>` format, you can select the environment variables that you want to capture (and leave out the ones you don't), without needing to sanitize a single string holding all of them. So (some) filtering is implicit.
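A short Python sketch of that opt-in selection, offered only as an illustration: the function name and the `allowed_names` parameter are hypothetical, and the prefix is the singular form from the table row earlier in this comment, not a finalized attribute name.

```python
import os

# Prefix taken from the table row in this comment; the final name is undecided.
PREFIX = "process.environment_variable"


def select_env_attributes(allowed_names, environ=None):
    """Build attributes only for explicitly allowed variables that are actually set."""
    source = os.environ if environ is None else environ
    return {
        f"{PREFIX}.{name}": source[name]
        for name in allowed_names
        if name in source
    }


# Hypothetical environment: API_KEY is never captured because it is not allowed.
attrs = select_env_attributes(
    ["USER", "HOME"],
    environ={"USER": "ubuntu", "API_KEY": "abc123"},
)
```

Variables that are not on the list are never read into an attribute, so nothing has to be redacted after the fact.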

@joaopgrassi
Member

FYI @open-telemetry/semconv-system-approvers
