Describe the bug
The collector service currently shuts down in this order: servers -> writers -> collector_queue_processors (with drain).
The storage writers are closed first, and only then is the collector queue drained. As a result, the collector keeps accepting spans until the writers' close operation has finished.
When the collector close operation then drains the queue, the collector tries to write the queued spans to storage. Because the writer was closed first, the write panics and the spans are lost.
To Reproduce
Steps to reproduce the behavior:
1. Continuously generate a high volume of traffic to the collector service.
2. Stop the collector service process with CTRL + C, or soft kill the process.
3. The collector logs show a panic with the error message "Send on closed channel", and the process exits.
Expected behavior
The shutdown sequence should be: servers -> queue processors (with drain) -> writers
Current code (the span writer is closed before the collector):
svc.RunAndThen(func() {
	if closer, ok := spanWriter.(io.Closer); ok {
		err := closer.Close()
		if err != nil {
			logger.Error("failed to close span writer", zap.Error(err))
		}
	}
	if err := c.Close(); err != nil {
		logger.Error("failed to cleanly close the collector", zap.Error(err))
	}
})
Fix:
svc.RunAndThen(func() {
	if err := c.Close(); err != nil {
		logger.Error("failed to cleanly close the collector", zap.Error(err))
	}
	if closer, ok := spanWriter.(io.Closer); ok {
		err := closer.Close()
		if err != nil {
			logger.Error("failed to close span writer", zap.Error(err))
		}
	}
})
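To illustrate why the order matters, here is a minimal, self-contained sketch. It is not the collector's actual code: the `writer` and `collector` types, their methods, and the `shutdown` helper are all hypothetical stand-ins. The closed writer also returns an error instead of panicking, so that the program runs to completion and the lost spans can be counted.

```go
package main

import "fmt"

// writer is a hypothetical stand-in for a storage span writer.
type writer struct {
	closed bool
	stored []string
}

// WriteSpan stores a span, or fails if the writer is already closed.
// (The real failure reported here is a panic; an error keeps the sketch runnable.)
func (w *writer) WriteSpan(span string) error {
	if w.closed {
		return fmt.Errorf("writer closed, span %q lost", span)
	}
	w.stored = append(w.stored, span)
	return nil
}

func (w *writer) Close() error {
	w.closed = true
	return nil
}

// collector is a hypothetical stand-in for the collector and its span queue.
type collector struct {
	queue chan string
	w     *writer
	lost  int
}

func (c *collector) Enqueue(span string) { c.queue <- span }

// Close stops intake and drains whatever is still queued into the writer.
func (c *collector) Close() error {
	close(c.queue)
	for span := range c.queue {
		if err := c.w.WriteSpan(span); err != nil {
			c.lost++
		}
	}
	return nil
}

// shutdown queues three spans (a backlog at shutdown time), then closes the
// two components in the requested order and reports stored and lost spans.
func shutdown(collectorFirst bool) (stored, lost int) {
	w := &writer{}
	c := &collector{queue: make(chan string, 8), w: w}
	for i := 0; i < 3; i++ {
		c.Enqueue(fmt.Sprintf("span-%d", i))
	}
	if collectorFirst {
		c.Close() // drain while the writer is still open
		w.Close()
	} else {
		w.Close() // writer is gone before the drain starts
		c.Close()
	}
	return len(w.stored), c.lost
}

func main() {
	s, l := shutdown(false)
	fmt.Printf("writer first:    stored=%d lost=%d\n", s, l)
	s, l = shutdown(true)
	fmt.Printf("collector first: stored=%d lost=%d\n", s, l)
}
```

In this sketch, closing the writer first loses every queued span, while closing the collector first lets the drain finish against an open writer, which is the ordering the fix restores.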