OpenTelemetry TraceIdRatioBased sampler requirements following OTEP 235 #4166

jmacd · 2024-07-29T22:36:23Z

Changes

Updates the Trace SDK and TraceState handling specifications with OTEP 235 sampling thresholds. This PR depends on #4162, which introduces the concept of Trace Randomness. It is the second of two parts and focuses on thresholds.

- Revise the TraceIdRatioBased algorithm section. The existing TODO implies this is not a breaking change.
- Change the text about TraceIdRatioBased construction.
- Move the text about the TraceIdRatioBased description (left unmodified).

The content of OTEP 235 was revised for clarity by @kalyanaj in open-telemetry/oteps#261, and I've copied heavily from the final text of that still-unmerged OTEP. I introduced new content explaining how to compute thresholds from probabilities using variable precision, referring to the OTel Collector-Contrib pkg/sampling reference implementation. The new Go demonstration code is validated at https://go.dev/play/p/7eLM6FkuoA5.
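As a rough illustration of the threshold computation mentioned above, the following Go sketch converts a probability into a `th` value using a base precision plus one extra hex digit for each leading `f`. It is not the code from the playground link or from pkg/sampling; the name `thresholdFromProbability`, its error handling, and the choice to extend precision only for leading `f` digits are assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"strings"
)

const (
	numHexDigits = 14              // 56 bits of randomness, 4 bits per hex digit
	maxThreshold = uint64(1) << 56 // a threshold this large would reject every span
)

// thresholdFromProbability is a hypothetical helper, not an SDK API.
// It returns the rejection threshold for probability p as lowercase hex
// with trailing zeros removed.
func thresholdFromProbability(p float64, precision int) (string, error) {
	if !(p > 0 && p <= 1) || precision < 1 || precision > numHexDigits {
		return "", errors.New("need 0 < p <= 1 and 1 <= precision <= 14")
	}
	// p = frac * 2^exp with frac in [0.5, 1). Every 4 binary places of
	// negative exponent is one leading zero hex digit in p, which shows up
	// as a leading 'f' in the threshold, so add that many digits.
	_, exp := math.Frexp(p)
	precision += exp / -4
	if precision > numHexDigits {
		precision = numHexDigits
	}
	// Rejection threshold T = (1 - p) * 2^56.
	t := uint64(math.Round((1 - p) * float64(maxThreshold)))
	// Round T to the requested number of hex digits.
	if shift := uint(4 * (numHexDigits - precision)); shift > 0 {
		t += uint64(1) << (shift - 1)
		t = (t >> shift) << shift
	}
	if t >= maxThreshold {
		return "", errors.New("probability too small for this precision")
	}
	s := strings.TrimRight(fmt.Sprintf("%014x", t), "0")
	if s == "" {
		s = "0" // threshold zero: every span is sampled
	}
	return s, nil
}

func main() {
	for _, p := range []float64{1, 0.5, 0.25, 0.01} {
		th, err := thresholdFromProbability(p, 4)
		fmt.Println(p, th, err)
	}
}
```

With a base precision of 4, a probability of 0.5 encodes as `8` and 0.25 as `c`, since trailing zeros carry no information.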

A proof of concept for this specification along with #4162 can be found in open-telemetry/opentelemetry-go#5645.

Part of #3602.

Product of the Sampling SIG members @kentquirk @kalyanaj @oertl @PeterF778 and myself.

jmacd · 2024-07-30T15:37:03Z

Feedback from the OTel Spec SIG meeting discussion cc/ @jsuereth:

Please add a migration guide explaining how the sampler transition will work. In particular, it is not safe to begin using independent sampling on non-root spans until TraceIdRatioBased samplers have been replaced everywhere in a trace. Until then, it is only safe to continue using ParentBased sampling with a root TraceIdRatioBased decision.

Update: 68fa270
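To illustrate why the ParentBased arrangement stays safe during the migration, here is a minimal Go sketch of the decision flow. It does not use the OpenTelemetry SDK; the `parent` type and `parentBasedDecision` function are hypothetical names for this example. The point is that only the root makes a ratio decision, so differing TraceIdRatioBased algorithms in downstream services are never consulted.

```go
package main

import "fmt"

// parent describes the incoming span context (hypothetical type for
// illustration, not an SDK type).
type parent struct {
	valid   bool // false for a root span
	sampled bool // W3C sampled flag carried by the parent
}

// parentBasedDecision follows the parent's sampled flag when a parent
// exists and only consults the root sampler for root spans.
func parentBasedDecision(p parent, rootDecision func() bool) bool {
	if p.valid {
		return p.sampled
	}
	return rootDecision()
}

func main() {
	// Stand-in for a TraceIdRatioBased decision made at the root.
	root := func() bool { return true }

	fmt.Println(parentBasedDecision(parent{}, root))                            // true
	fmt.Println(parentBasedDecision(parent{valid: true, sampled: false}, root)) // false
}
```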

github-actions · 2024-08-07T03:17:34Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

This reduces the number of lines of diff in PR 4166, which replaces the entire `tracestate-probability-sampling.md` file with new contents. Part of #4166.

## Changes

Move a file, place a link to it, and explain that a change is in progress.

jmacd · 2024-08-15T14:51:43Z

@kalyanaj @PeterF778 @oertl @kentquirk Please take another look at this PR, especially the file tracestate-probability-sampling.md which now reads as a new file, not as a major rewrite. The contents are derived from open-telemetry/oteps#261.

jmacd · 2024-08-15T14:52:54Z

@open-telemetry/specs-trace-approvers @open-telemetry/specs-approvers @open-telemetry/technical-committee this PR has reached consensus in the Sampling SIG, we have multiple prototypes implemented, and we are looking for final approvals.

jpkrohling

Partial review, will try to complete by tomorrow.

jpkrohling · 2024-09-19T15:06:54Z

spec-compliance-matrix.md

@@ -87,6 +87,7 @@ formats is required. Implementing more than one format is optional.
 | [Built-in `SpanProcessor`s implement `ForceFlush` spec](specification/trace/sdk.md#forceflush-1) |          |     | +    |     | +      | +    | +      | +   | +    | +   | +    |       |
 | [Attribute Limits](specification/common/README.md#attribute-limits)                              | X        |     | +    |     | +      | +    | +      | +   |      |     |      |       |
 | Fetch InstrumentationScope from ReadableSpan                                                     |          |     | +    |     | +      |      |        | +   |      |     |      |       |
+| TraceIdRatioBased implements OpenTelemetry tracestate `th` field                                 |          |     |      |     |        |      |        |     |      |     |      |       |


Same question as the other PR: if this is required, shouldn't there be a couple of implementations lined up before the spec change is merged?

I have shared my draft, open-telemetry/opentelemetry-go#5645, and @oertl has already merged an equivalent sampler in the Java contrib repository. (I would add that the OTel-Collector-Contrib probabilistic sampler processor acts as a near-prototype.)

The connection with the probabilistic sampler is detailed in #4243 and has been described as an interoperability specification.

jpkrohling · 2024-09-19T16:31:39Z

specification/trace/tracestate-probability-sampling.md

+
+This proposal supports two sources of randomness:
+
+- **A custom source of randomness**: This proposal allows for a *random* (or pseudo-random) 56-bit value. We refer to this as `rv`. This can be generated and propagated through the `tracestate` header and the tracestate attribute in each span.


I think I commented on this elsewhere, but when should I, as a user, consider having a custom source of randomness?

This is meant to be part of #4162, which focuses on randomness. It says: "To enable sampling in this and other situations where TraceIDs lack sufficient randomness,"

However, I tried to stay away from the advanced use-cases some might mention. If you have a reason to use independent trace IDs and still want them to sample consistently, this is what you'd choose.

Continuing the connection with #4162. See this commit, which hopefully answers this question (thread).
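For the custom-randomness case discussed in this thread, a minimal Go sketch of generating and encoding an `rv` value might look like the following. The helper names are hypothetical, and the sketch assumes the OTEP 235 conventions that `rv` is written as exactly 14 lowercase hex digits under the `ot` tracestate key and that a span is kept when the randomness value is greater than or equal to the rejection threshold.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
)

// randomnessMask keeps the low 56 bits of a 64-bit value.
const randomnessMask = (uint64(1) << 56) - 1

// newRandomValue returns 56 random bits (hypothetical helper).
func newRandomValue() (uint64, error) {
	var b [8]byte
	if _, err := rand.Read(b[:]); err != nil {
		return 0, err
	}
	return binary.BigEndian.Uint64(b[:]) & randomnessMask, nil
}

// formatRV renders the value as an "ot" tracestate entry with a
// fixed-width, 14-digit lowercase hex rv field.
func formatRV(rv uint64) string {
	return fmt.Sprintf("ot=rv:%014x", rv&randomnessMask)
}

// shouldSample applies the assumed rule: keep the span when R >= T.
func shouldSample(rv, threshold uint64) bool {
	return rv >= threshold
}

func main() {
	rv, err := newRandomValue()
	if err != nil {
		panic(err)
	}
	half := uint64(1) << 55 // rejection threshold for 50% sampling
	fmt.Println(formatRV(rv), shouldSample(rv, half))
}
```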

jpkrohling

Other than my previous comments, LGTM!

jpkrohling · 2024-09-20T16:23:56Z

specification/trace/tracestate-probability-sampling.md

+
+The original TraceIdRatioBased sampler specification gave a workaround for the underspecified behavior, that it was safe to use for root spans: "It is recommended to use this sampler algorithm only for root spans (in combination with [`ParentBased`](./sdk.md#parentbased)) because different language SDKs or even different versions of the same language SDKs may produce inconsistent results for the same input."
+
+To avoid inconsistency during this transition, users SHOULD follow this guidance until all TraceIdRatioBased samplers used in a system have been upgraded to the modern `TraceIdRatioBased` specification based on W3C Trace Context Level 2 randomness.  After all `TraceIdRatioBased` samplers have been upgraded, it is safe to use `TraceIdRatioBased` sampler without also using the `ParentBased` sampler.


How can users assess that they reached this? Should we keep a table, showing from which versions which SDKs support the new spec?

Another way they can do this is to wait for all spans to have the W3C trace Random flag set across a system. How does that sound?
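One way to act on the Random-flag suggestion is to inspect the trace flags of incoming `traceparent` headers. The Go sketch below is an assumption-laden illustration rather than SDK code: it handles only version `00` headers with four dash-separated fields, and `hasRandomFlag` is a hypothetical name. It relies on W3C Trace Context Level 2 assigning bit `0x02` of the trace flags to the random-trace-id flag.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// randomFlag is the random-trace-id bit in the W3C trace flags byte.
const randomFlag = 0x02

// hasRandomFlag reports whether a version-00 traceparent header has the
// Random flag set. Hypothetical helper for illustration.
func hasRandomFlag(traceparent string) (bool, error) {
	parts := strings.Split(traceparent, "-")
	if len(parts) != 4 || len(parts[3]) != 2 {
		return false, fmt.Errorf("malformed traceparent %q", traceparent)
	}
	flags, err := strconv.ParseUint(parts[3], 16, 8)
	if err != nil {
		return false, err
	}
	return flags&randomFlag != 0, nil
}

func main() {
	for _, h := range []string{
		"00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-03", // sampled + random
		"00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01", // sampled only
	} {
		ok, err := hasRandomFlag(h)
		fmt.Println(ok, err)
	}
}
```

A system could count spans arriving without the flag and treat a sustained count of zero as a sign that the upgrade is complete.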

Nevay · 2024-09-28T17:00:54Z

specification/trace/tracestate-probability-sampling.md

+    }
+    // Raise precision by the number of leading 0s or Fs
+    _, expF := math.Frexp(probability)
+    _, expR := math.Frexp(1 - probability)


Increasing the precision by the number of leading 0s only affects probabilities close to 1. Is the difference between a sampling probability of e.g. 0.99999 and 0.999999 relevant or could this part be removed?

I would say it can be removed, yes. We believe fine-resolution probabilities close to 0 are important, but not close to 1.

I put this in for symmetry--so that rounding behavior near 0 and 1 would be the same, but I can easily be convinced to remove this line. It would mean that probabilities near 1 round up to exactly 1, subject to precision.
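To make the two `math.Frexp` lines under discussion concrete, this small Go sketch shows what each exponent contributes. The function name `extraDigits` is hypothetical. Because `Frexp` returns a fraction in [0.5, 1), dividing the exponent by -4 counts the leading zero hex digits of its argument: for `probability` these become leading `f` digits of the threshold, and for `1 - probability` they are the threshold's own leading `0` digits.

```go
package main

import (
	"fmt"
	"math"
)

// extraDigits returns how many hex digits of precision each Frexp line
// would add for probability p (hypothetical helper for illustration).
func extraDigits(p float64) (leadingF, leading0 int) {
	_, expF := math.Frexp(p)     // small p: threshold starts with 'f' digits
	_, expR := math.Frexp(1 - p) // p near 1: threshold starts with '0' digits
	return expF / -4, expR / -4
}

func main() {
	for _, p := range []float64{0.5, 0.9375, 0.95, 0.95 / 16, 0.01} {
		f, z := extraDigits(p)
		fmt.Printf("p=%v leadingF=%d leading0=%d\n", p, f, z)
	}
}
```

Only probabilities strictly above 0.9375 produce a nonzero second value, which is the asymmetry raised in this thread.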

I'm in favor of removing it to keep the relative loss of precision consistent across multiples of 16; it feels somewhat arbitrary to increase the precision for values above 0.9375. Increasing the precision by the number of leading 0s would change the first row to 0cccd / 2.0077358064973794E-7:

| Probability | Threshold w/ precision=4 | abs(1 - AdjustedCount * Probability) |
| --- | --- | --- |
| `0.95 / 16 ** 0` | `0ccd` | 3.212386963991065E-6 |
| `0.95 / 16 ** 1` | `f0ccd` | 3.212386963991065E-6 |
| `0.95 / 16 ** 6` | `ffffff0ccd` | 3.212386963991065E-6 |
| `0.90 / 16 ** 0` | `199a` | 6.78173001933402E-6 |
| `0.90 / 16 ** 1` | `f199a` | 6.78173001933402E-6 |
| `0.90 / 16 ** 6` | `ffffff199a` | 6.78173001933402E-6 |
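The error column in the table above can be reproduced with a short Go sketch. It assumes that a `th` value is a hex fraction with implied trailing zeros out to 14 digits, that the effective probability is `(2^56 - T) / 2^56`, and that the adjusted count is its reciprocal. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// probabilityFromThreshold converts a th value (1 to 14 hex digits) into
// the effective sampling probability it encodes.
func probabilityFromThreshold(th string) (float64, error) {
	if len(th) < 1 || len(th) > 14 {
		return 0, fmt.Errorf("threshold %q must have 1 to 14 hex digits", th)
	}
	t, err := strconv.ParseUint(th, 16, 64)
	if err != nil {
		return 0, err
	}
	// Omitted trailing digits are zeros, so shift up to 56 bits.
	t <<= uint(4 * (14 - len(th)))
	total := uint64(1) << 56
	return float64(total-t) / float64(total), nil
}

// relativeError computes abs(1 - AdjustedCount * Probability) for an
// intended probability p and the threshold that was actually encoded.
func relativeError(p float64, th string) (float64, error) {
	effective, err := probabilityFromThreshold(th)
	if err != nil {
		return 0, err
	}
	adjustedCount := 1 / effective
	return math.Abs(1 - adjustedCount*p), nil
}

func main() {
	for _, row := range []struct {
		p  float64
		th string
	}{{0.95, "0ccd"}, {0.95, "0cccd"}, {0.90, "199a"}} {
		e, err := relativeError(row.p, row.th)
		fmt.Println(row.p, row.th, e, err)
	}
}
```

Running it for 0.95 with `0ccd` and then `0cccd` shows the roughly sixteenfold drop in error that the extra digit buys.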

yuanyuanzhao3 · 2024-10-02T14:49:53Z

specification/trace/sdk.md

@@ -386,40 +389,41 @@ The default sampler is `ParentBased(root=AlwaysOn)`.

 #### TraceIdRatioBased


I think there is a point of confusion that we need to help users with.

At first glance, TraceIdRatioBased seems to imply that sampling is based on some ratio related to trace IDs. In fact there is quite some nuance. It is only truly related to a ratio of trace IDs in W3C Trace Context Level 2 when the randomness flag is set. Otherwise, it is actually based on a generated randomness value.

This is probably worth calling out explicitly.

I hope 6e29b0e addresses some of the higher-level detail that is missing.

Here is a statement to add the nuance you refer to: 77b51f8 😀
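The nuance raised in this thread can be sketched as a choice between two randomness sources. The Go example below is only an illustration under stated assumptions: an explicit `rv` field in the `ot` tracestate value takes precedence, and otherwise the rightmost 14 hex digits (56 bits) of the trace ID are used. The function name `randomness` and the simplified parsing are hypothetical, and the sketch does not model how an SDK should react when the Random flag is unset.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// randomness returns the 56-bit value used for the sampling decision.
// otValue is the value of the "ot" tracestate key, for example
// "th:c;rv:00000000000abc" (an empty string means no entry).
func randomness(traceID, otValue string) (uint64, error) {
	for _, field := range strings.Split(otValue, ";") {
		if strings.HasPrefix(field, "rv:") {
			v := field[len("rv:"):]
			if len(v) != 14 {
				return 0, fmt.Errorf("rv %q must be 14 hex digits", v)
			}
			return strconv.ParseUint(v, 16, 64)
		}
	}
	if len(traceID) != 32 {
		return 0, fmt.Errorf("trace ID %q must be 32 hex digits", traceID)
	}
	// Fall back to the least-significant 56 bits of the trace ID.
	return strconv.ParseUint(traceID[18:], 16, 64)
}

func main() {
	id := "4bf92f3577b34da6a3ce929d0e0e4736"
	fromID, _ := randomness(id, "th:c")
	fromRV, _ := randomness(id, "th:c;rv:00000000000abc")
	fmt.Printf("%014x %014x\n", fromID, fromRV)
}
```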

trask · 2024-10-11T01:23:30Z

specification/trace/sdk.md

+
+Note that the "ratio-based" part of this Sampler's name implies that
+it makes a probability decision directly from the TraceID, even though
+it was not not originally specified in an exact way.  In the present


Suggested change

-it was not not originally specified in an exact way. In the present
+it was not originally specified in an exact way. In the present

trask · 2024-10-11T01:24:06Z

specification/trace/sdk.md

+Note that the "ratio-based" part of this Sampler's name implies that
+it makes a probability decision directly from the TraceID, even though
+it was not not originally specified in an exact way.  In the present
+specification,the Sampler decision is more nuanced: only a portion of


Suggested change

-specification,the Sampler decision is more nuanced: only a portion of
+specification, the Sampler decision is more nuanced: only a portion of

jmacd mentioned this pull request Jul 29, 2024

Prototype for W3C Trace Context Level 2 support in TraceIDRatioBased sampler open-telemetry/opentelemetry-go#5645

Draft

OpenTelemetry trace SDK requirements for probability sampling following OTEP 235.

0524a3d

jmacd force-pushed the jmacd/otep235 branch from eb65467 to 0524a3d Compare July 29, 2024 22:57

jmacd marked this pull request as ready for review July 29, 2024 23:24

jmacd requested review from a team July 29, 2024 23:24

github-actions bot assigned jack-berg Jul 29, 2024

jmacd mentioned this pull request Jul 29, 2024

Update 'rv' value generation based on randomness flag + editorial changes to improve clarity open-telemetry/oteps#261

Open

linebreaks

c5453f8

jmacd mentioned this pull request Jul 30, 2024

Rename the experimental probability sampling specification #4168

Merged

jmacd commented Jul 30, 2024

specification/trace/tracestate-probability-sampling.md

github-actions bot added the Stale label Aug 7, 2024

jmacd mentioned this pull request Aug 7, 2024

Randomness requirements following W3C Trace Context level 2 #4162

Open

5 tasks

jmacd added 2 commits August 7, 2024 15:13

Merge branch 'main' of github.com:open-telemetry/opentelemetry-specification into jmacd/otep235

25a61fd

Add a migration section

68fa270

PeterF778 reviewed Aug 7, 2024

specification/trace/tracestate-handling.md (outdated)

github-actions bot removed the Stale label Aug 8, 2024

jmacd added 2 commits August 15, 2024 07:18

Merge branch 'main' of github.com:open-telemetry/opentelemetry-specification into jmacd/otep235

51f9794

lowercase hex

ba5a47b

jmacd added 3 commits August 15, 2024 07:46

spec-compliance-matrix.md

49673b7

merge w/ removed file

e51bea6

chlog

4afe1c7

kalyanaj reviewed Aug 15, 2024

kentquirk approved these changes Aug 20, 2024

specification/trace/tracestate-probability-sampling.md

tsloughter reviewed Sep 10, 2024

specification/trace/sdk.md

specification/trace/tracestate-probability-sampling.md

jmacd added 2 commits September 12, 2024 08:50

Merge branch 'main' of github.com:open-telemetry/opentelemetry-specification into jmacd/otep235

c40de50

spec-compliance: AlwaysOn too

15a9c6f

jmacd commented Sep 12, 2024

specification/trace/sdk.md (three review threads)

jpkrohling self-requested a review September 18, 2024 07:50

jpkrohling reviewed Sep 19, 2024

jpkrohling reviewed Sep 20, 2024

jmacd and others added 3 commits September 25, 2024 15:28

edits for jpkrohling

672fac2

Apply suggestions from code review

3c80d97

Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de>

Merge branch 'jmacd/otep235' of github.com:jmacd/opentelemetry-specification into jmacd/otep235

b2b37f7

jmacd requested review from a team as code owners September 25, 2024 22:29

yuanyuanzhao3 approved these changes Sep 26, 2024

PeterF778 reviewed Sep 27, 2024

specification/trace/tracestate-handling.md

PeterF778 reviewed Sep 27, 2024

specification/trace/sdk.md (outdated)

PeterF778 reviewed Sep 27, 2024

specification/trace/tracestate-probability-sampling.md (outdated)

PeterF778 reviewed Sep 27, 2024

specification/trace/tracestate-probability-sampling.md

jmacd added 2 commits September 27, 2024 16:09

algorithm

1bb0b31

move a sentence; drop a paragraph

2f0e387

Nevay reviewed Sep 28, 2024

Nevay mentioned this pull request Sep 28, 2024

Update TraceIdRatioBasedSampler to calculate sampling threshold according to OTEP 235 open-telemetry/opentelemetry-php#1391

Draft

yuanyuanzhao3 approved these changes Oct 2, 2024

jmacd mentioned this pull request Oct 3, 2024

Document interoperability for OpenTelemetry sampling specifications #4243

Open

PeterF778 approved these changes Oct 4, 2024

jmacd and others added 5 commits October 4, 2024 13:55

more overview

6e29b0e

nuance

77b51f8

Update specification/trace/tracestate-probability-sampling.md

a61fbdd

Co-authored-by: Tobias Bachert <git@b-privat.de>

Merge branch 'main' of github.com:open-telemetry/opentelemetry-specification into jmacd/otep235

59c329d

Merge branch 'jmacd/otep235' of github.com:jmacd/opentelemetry-specification into jmacd/otep235

d21f341

trask reviewed Oct 11, 2024
