
Recommended span batch write size #871

Open
jameshartig opened this issue Jul 17, 2024 · 3 comments
Labels: documentation, enhancement, priority: p2

Comments

@jameshartig

The trace README.md doesn't call WithMaxExportBatchSize, so the export batch size is the default of 512. This means that every 512 spans the batch span processor synchronously [1] exports the spans, and while that export is in flight new spans queue up until they reach MaxQueueSize, which defaults to 2048. If you regularly produce thousands of spans per second, you end up calling batchWriteSpans several times a second, needlessly using up the 4,800/minute quota [2], and if any of those RPC calls takes more than a couple hundred milliseconds you start dropping spans.

The quotas page [2] says that the "Maximum number of spans per PatchTraces call" is 25,000 (assuming that PatchTraces is the same as batchWriteSpans). Does that mean we can set the export batch size to 25,000? I know that many of Google's RPC methods have internal request size limits, and I'm concerned that setting it to 25,000 might trip one of them. Is there a recommended value?

There was a discussion [3] a few years ago that seemed to settle on 6k for that particular case, mostly because higher batch sizes were slower. Is that still the case? I also saw that the batching implementation has been replaced since then, which means there is no need to increase the gRPC pool size, since the batch span processor never has more than one outstanding RPC at a time, correct?

We have replaced the batch span processor internally with our own implementation to help improve the situation but I still believe the recommendations here are relevant for either one.

[1] https://github.com/open-telemetry/opentelemetry-go/blob/main/sdk/trace/batch_span_processor.go#L317
[2] https://cloud.google.com/trace/docs/quotas
[3] #456

@jameshartig
Author

We've gone with a limit of 20,000 for now. I'll report back if we have issues.
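
For readers on the stock batch span processor, a limit like this could be configured roughly as follows. This is a sketch rather than a recommendation: the 20,000 batch size is the value mentioned above, while the 40,000 queue size and the "my-project" project ID are placeholder assumptions, chosen only so the queue is larger than one batch.

```go
package main

import (
	"context"
	"log"

	texporter "github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace"
	"go.opentelemetry.io/otel"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	ctx := context.Background()

	// "my-project" is a placeholder project ID.
	exporter, err := texporter.New(texporter.WithProjectID("my-project"))
	if err != nil {
		log.Fatalf("creating Cloud Trace exporter: %v", err)
	}

	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exporter,
			// Batch size from this thread; not an official recommendation.
			sdktrace.WithMaxExportBatchSize(20000),
			// Assumed value: keep the queue larger than one batch so spans
			// can still be buffered while an export is in flight.
			sdktrace.WithMaxQueueSize(40000),
		),
	)
	// Flush any buffered spans before the process exits.
	defer func() {
		if err := tp.Shutdown(ctx); err != nil {
			log.Printf("shutting down tracer provider: %v", err)
		}
	}()

	otel.SetTracerProvider(tp)
}
```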

@aabmass added the documentation and priority: p2 labels Jul 29, 2024
@aabmass
Contributor

aabmass commented Jul 29, 2024

Thanks for raising the issue. We'll try to figure out good defaults and update the documentation.

@dashpole added the enhancement label Aug 14, 2024
@damemi
Member

damemi commented Sep 16, 2024

PatchTraces is the v1 Cloud Trace API, while BatchWriteSpans is the v2 API, which our exporter uses. There doesn't actually seem to be any limit on BatchWriteSpans. I'll work on a summary of @dashpole's performance tuning findings from #456 for the README.
