-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Define resource semantic convention stability and justification #162
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
# Resource Semantic Convention Stabiltiy Requirements | ||
|
||
Outline what changes are "safe" to perform for resource semantic conventions | ||
and which require major version bumps. | ||
|
||
## Motivation | ||
|
||
This outlines acceptable changes to semantic conventions for resources, allowing | ||
growth and evolution over time. | ||
|
||
While Telemetry Schemas propose a mechanism to define changes to semantic | ||
conventions in a way that users can evolve telemetry, this document first | ||
attempts to define what is considered a compatible change. | ||
|
||
## Explanation | ||
|
||
We take the assumption that `Resource`, as it exists today, is the identity of | ||
an SDK and is leveraged as such in a lot of assumptions throughout open | ||
telemetry. Specifically, if an SDK that used to produce a `Resource` of | ||
attributes XYZ instead produces ABC, these metrics would be considered from a | ||
different source. To that end, if a Metric produced on version x.y of | ||
OpenTelemetry SDK is incompatible with one produced on version x.{y+1} because | ||
of `Resource` changes, this would be considered breaking. | ||
|
||
This means the following: | ||
|
||
- `Resource` detectors should provide a comprehensive set of attributes in | ||
all or nothing fashion. if a resource detector produced attributes XYZ for | ||
a given compute, it should continue to do so for compatible versions. | ||
- Resource attributes (e.g. `cloud.provider`) may have new possible values | ||
defined as long as there is zero overlap with how the previous values were | ||
computed. For example, `acme-cloud` would never be inferred where previously | ||
`aws` was inferred. | ||
*Note: if `aws` was inferred incorrectly on "Acme cloud" previously, that would not be a violation.* | ||
- Adding new Resource attribute semantic conventions is allowed, however | ||
existing resource detectors MUST NOT use them, only new detectors would be | ||
allowed. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This last point is a bit vague. What counts as a new detector? If an attribute is added to the K8s resource definition, how would it be handled? |
||
|
||
If we assume capabilities of | ||
[Telemetry Schema conversion](https://github.com/open-telemetry/oteps/pull/161), | ||
then we can relax some restrictions as follows: | ||
Comment on lines
+39
to
+41
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So, the previous paragraph assumes no schemas? Schemas are already part of the spec and will be part of 1.4.0 spec, so I do not quite understand why we need to continue evolving the behavior in the schemaless world. |
||
|
||
- 1:1 Renaming of attribute names or labels is allowed. | ||
- New attributes are allowed to be generated in a resource detector. | ||
|
||
## Internal details | ||
|
||
- Resource semantic conventions will need to be defined at the | ||
`Resource Detector` level to ensure stability of identity of resources. | ||
- Attribute names and existing values cannot be renamed within stable versions | ||
(pending OTEP #161) | ||
|
||
### New Semantic Conventions | ||
|
||
Going forward, Resource semantic conventions will be outlined at a "Resource | ||
detection" level. While users can merge together resource detectors, each | ||
detector will provide a stable set of Resource attributes and labels. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How do resource detectors differ from the current namespacing regime for resource attributes? Is |
||
|
||
For Example, [k8s](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/resource/semantic_conventions/k8s.md) | ||
resources would be changed, such that for Node: | ||
|
||
#### Node | ||
|
||
A resource detector providing K8s node information MUST produce the following. | ||
|
||
**type:** `k8s.node` | ||
|
||
**Description:** A Kubernetes Node. | ||
|
||
<!-- semconv k8s.node --> | ||
| Attribute | Type | Description | Examples | Required | | ||
|---|---|---|---|---| | ||
| `k8s.node.name` | string | The name of the Node. | `node-1` | *No* | | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How does the stability requirement interact with optional attributes? |
||
| `k8s.node.uid` | string | The UID of the Node. | `1eb3a0c6-0477-4080-a9cb-0cb7db65c6a2` | *YES* | | ||
<!-- endsemconv --> | ||
|
||
Here, the `k8s.node.uid` is marked as required for this resource detector. | ||
|
||
## Trade-offs and mitigations | ||
|
||
What are some (known!) drawbacks? What are some ways that they might be mitigated? | ||
|
||
Note that mitigations do not need to be complete *solutions*, and that they do not need to be accomplished directly through your proposal. A suggested mitigation may even warrant its own OTEP! | ||
|
||
## Prior art and alternatives | ||
|
||
Currently, the [specification for stability states](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/versioning-and-stability.md#not-defined-semantic-conventions-stability): | ||
|
||
``` | ||
Semantic Conventions SHOULD NOT be removed once they are added. | ||
New conventions MAY be added to replace usage of older conventions, but the older conventions SHOULD NOT be removed. | ||
Older conventions SHOULD be marked as deprecated when they are replaced by newer conventions. | ||
``` | ||
|
||
This is both more restrictive and less restrictive than the current implementation. It has the consequence of potentially breaking metric/resource identity from version bumps in OpenTelemetry. | ||
|
||
One alternative for flexibility in resource labelling is to allow resources to | ||
contain "Identifying" and "Descriptive" attributes, where only identifying | ||
attributees are subjected to rigorous stability limitations. | ||
|
||
## Open questions | ||
|
||
- Should we be more flexible in Resource identity? | ||
- Should Resource attributes be outlined as "identifying" and "descriptive"? | ||
- Should we consolidate resource identity at the detector level? | ||
|
||
## Future possibilities | ||
|
||
As we improve the Telemetry Schema design, we may be able to improve what is | ||
considreed a compatible change for Resources. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit confused as to why we need this document.
I believe that now that starting from 1.4.0 we have Telemetry Schemas, the Schemas defines what changes are safe. If the change is representable as version change in the Schema then it is safe, everything else is unsafe.