Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segment silently fails to flush when a subsegment is created with a closed parent #456

Open
danielwhite opened this issue Mar 7, 2024 · 1 comment

Comments

@danielwhite
Copy link

If a subsegment is created where the parent segment has been closed, then the entire trace segment will fail to be flushed.

The underlying reason seems to be that Segment.openSegments becomes negative resulting in Segment.flush ignoring the segment because it will only handle a segment with exactly zero openSegments.

func (seg *Segment) flush() bool {
if (seg.openSegments == 0 && seg.EndTime > 0) || seg.ContextDone {

The current silent loss of the entire trace segment is very surprising and difficult to diagnose.

Example

A more complete test case demonstrates this:

func TestSegment_Close_out_of_order(t *testing.T) {
	ctx, td := NewTestDaemon()
	defer td.Close()
	ctx, seg := BeginSegment(ctx, "test")

	ctxSubSeg1, subSeg1 := BeginSubsegment(ctx, "TestSubsegment1")
	subSeg1.Close(nil) // closed before it is used as a parent
	_, subSeg2 := BeginSubsegment(ctxSubSeg1, "TestSubsegment2")
	subSeg2.Close(nil)

	seg.Close(nil)

	assert.Zero(t, seg.openSegments)
	assert.True(t, seg.Emitted)
}

Currently, this will fail with the following:

=== RUN   TestSegment_Close_out_of_order
2024-03-08T01:31:26+11:00 [INFO] Emitter using address: 127.0.0.1:32794
    segment_test.go:170: 
        	Error Trace:	segment_test.go:170
        	Error:      	Should be zero, but was -1
        	Test:       	TestSegment_Close_out_of_order
    segment_test.go:171: 
        	Error Trace:	segment_test.go:171
        	Error:      	Should be true
        	Test:       	TestSegment_Close_out_of_order
--- FAIL: TestSegment_Close_out_of_order (0.00s)
@danielwhite
Copy link
Author

I'd be happy to open a PR, but I'm not entirely sure the right course of action.

The Java SDK seems to handle a similar situation with a "NoOp" segment.

Something like the following seems to fix both the test case in the description and the issue I'm seeing in practice:

diff --git a/xray/segment.go b/xray/segment.go
index 4e96b34..f9e1507 100644
--- a/xray/segment.go
+++ b/xray/segment.go
@@ -314,6 +314,9 @@ func BeginSubsegment(ctx context.Context, name string) (context.Context, *Segmen
        if !seg.ParentSegment.Sampled {
                seg.Dummy = true
        }
+       if !seg.parent.InProgress {
+               seg.Dummy = true
+       }

I'm not sure if this is the best way to express this or if it risks hiding the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant