[Feature] Java agent self-observability #12595

Closed
2 of 3 tasks
wu-sheng opened this issue Sep 5, 2024 · 1 comment
Labels: agent (Language agent related), core feature (Core and important feature. Sometimes, break backwards compatibility), feature (New feature), java (Java agent related)

Comments

@wu-sheng
Member

wu-sheng commented Sep 5, 2024

Search before asking

  • I had searched in the issues and found no similar feature requirement.

Description

This has been on my mind for years. I am writing it down to see whether someone would be interested in implementing it.
The Java agent runs as code bundled inside the target process, so its runtime performance is hard to measure with traditional tools. I therefore want to propose an in-kernel self-observability implementation to measure tracing performance.

We should measure the agent performance using the following metrics:

  • created_tracing_context_counter - Counter. The number of created tracing contexts. It should carry the label created_by (values: sampler, propagated). created_by=propagated means the agent created the context because a downstream service added the sw8 header to force sampling. created_by=sampler means the local sampler created the context, regardless of which policy it uses.
  • finished_tracing_context_counter - Counter. The number of finished tracing contexts. The gap between created_tracing_context_counter and finished_tracing_context_counter should stay relatively stable; otherwise, unfinished contexts are accumulating and memory cost is increasing.
  • created_ignored_context_counter and finished_ignored_context_counter. Same concepts as the *_tracing_context_counter metrics, applied to ignored contexts.
  • interceptor_error_counter - Counter. The number of errors that happened in interceptor logic, with the labels plugin_name and inter_type (constructor, inst, static). Interceptor names are not added as labels, to avoid OOM: the number of plugins is only in the dozens and is predictable, but the number of interceptors will be in the hundreds.
  • possible_leaked_context_counter - Counter. The number of detected leaked contexts. It should carry the label source (values: tracing, ignore). source=tracing corresponds to today's shadow tracing context, which would now be measured.
  • tracing_context_performance - Histogram. For each successfully finished tracing context, it records every interceptor's time cost, measured in nanoseconds. The histogram buckets should be {0.01, 0.1, 0.5, 1, 3, 5, 10, 50, 100, 200, 500, 1000} ms. This shows the performance behavior of the tracing operations.
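
To make the list above concrete, here is a minimal, self-contained sketch of how labeled counters and the proposed histogram could be kept in memory. It uses only the JDK and is not the agent's meter API: the class name SelfObservabilitySketch and all of its methods are hypothetical, label sets are passed as plain strings for brevity, and the bucket values are treated as inclusive upper bounds, which the proposal does not specify.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLongArray;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: none of these names exist in the SkyWalking Java agent.
final class SelfObservabilitySketch {
    // Bucket bounds in milliseconds, copied from the proposal.
    // Assumption: each value is an inclusive upper bound.
    static final double[] BUCKETS_MS = {0.01, 0.1, 0.5, 1, 3, 5, 10, 50, 100, 200, 500, 1000};

    // One LongAdder per "name{labels}" series. Label values must stay
    // low-cardinality (plugin names, not interceptor names).
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();
    // One slot per bucket plus a final overflow slot for values above 1000 ms.
    private final AtomicLongArray histogram = new AtomicLongArray(BUCKETS_MS.length + 1);

    void increment(String name, String labels) {
        counters.computeIfAbsent(name + "{" + labels + "}", k -> new LongAdder()).increment();
    }

    long count(String name, String labels) {
        LongAdder adder = counters.get(name + "{" + labels + "}");
        return adder == null ? 0 : adder.sum();
    }

    // Sum of one metric over all of its label combinations.
    long total(String name) {
        return counters.entrySet().stream()
            .filter(e -> e.getKey().startsWith(name + "{"))
            .mapToLong(e -> e.getValue().sum())
            .sum();
    }

    // The created/finished gap that should stay relatively stable.
    long unfinishedTracingContexts() {
        return total("created_tracing_context_counter") - total("finished_tracing_context_counter");
    }

    // Index of the first bucket whose bound is >= the measured cost,
    // or BUCKETS_MS.length for the overflow slot.
    static int bucketIndex(long nanos) {
        double millis = nanos / 1_000_000.0;
        for (int i = 0; i < BUCKETS_MS.length; i++) {
            if (millis <= BUCKETS_MS[i]) {
                return i;
            }
        }
        return BUCKETS_MS.length;
    }

    // Interceptor cost is measured in nanoseconds and bucketed in milliseconds.
    void recordInterceptorCost(long nanos) {
        histogram.incrementAndGet(bucketIndex(nanos));
    }

    long bucketCount(int index) {
        return histogram.get(index);
    }

    public static void main(String[] args) {
        SelfObservabilitySketch meters = new SelfObservabilitySketch();
        meters.increment("created_tracing_context_counter", "created_by=sampler");
        meters.increment("created_tracing_context_counter", "created_by=propagated");
        meters.increment("finished_tracing_context_counter", "");
        meters.recordInterceptorCost(2_000_000L); // 2 ms
        System.out.println("unfinished contexts: " + meters.unfinishedTracingContexts());
        System.out.println("bucket index for 2 ms: " + bucketIndex(2_000_000L));
    }
}
```

A real implementation would presumably go through the agent's existing meter system so that OAP can receive the values; the sketch only illustrates the label layout, the created/finished gap, and the nanosecond-to-bucket mapping.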

Use case

SkyWalking OAP should accept these meters through native protocols, and build a new self-observability dashboard for the Java agent.

I also hope this inspires maintainers and contributors of other agents to add similar concepts, agent by agent. cc @apache/skywalking-committers

Related issues

No response

Are you willing to submit a pull request to implement this on your own?

  • Yes I am willing to submit a pull request on my own!


wu-sheng added the core feature, agent, feature, and java labels on Sep 5, 2024
wu-sheng self-assigned this on Sep 5, 2024
@weixiang1862
Member

Please assign this to me and @CzyerChen. We will collaborate to implement this feature.
