[WIP] [jaeger-v2] Refactor Configurations For Adaptive Sampling Processor #6039

mahadzaryab1 · 2024-10-03T00:26:32Z

Which problem is this PR solving?

Resolves [jaeger-v2] Refactor Adaptive Sampling Processor Configurations #6021

Description of the changes

Migration Guide

How was this change tested?

CI / Unit Tests / Integration Tests

Checklist

I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
I have signed all commits
I have added unit tests for the new functionality
I have run lint and test steps successfully
- for jaeger: make lint test
- for jaeger-ui: yarn lint and yarn test

Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>

codecov · 2024-10-03T00:36:47Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.91%. Comparing base (5598400) to head (f96fe53).
Report is 7 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6039      +/-   ##
==========================================
- Coverage   96.92%   96.91%   -0.02%     
==========================================
  Files         349      349              
  Lines       16599    16603       +4     
==========================================
+ Hits        16088    16090       +2     
- Misses        328      329       +1     
- Partials      183      184       +1

Flag	Coverage Δ
badger_v1	`7.98% <ø> (ø)`
badger_v2	`1.81% <ø> (ø)`
cassandra-4.x-v1	`15.75% <ø> (ø)`
cassandra-4.x-v2	`1.74% <ø> (ø)`
cassandra-5.x-v1	`15.75% <ø> (ø)`
cassandra-5.x-v2	`1.74% <ø> (ø)`
elasticsearch-6.x-v1	`18.69% <ø> (ø)`
elasticsearch-7.x-v1	`18.74% <ø> (-0.02%)`	⬇️
elasticsearch-8.x-v1	`18.94% <ø> (ø)`
elasticsearch-8.x-v2	`1.80% <ø> (ø)`
grpc_v1	`9.51% <ø> (ø)`
grpc_v2	`7.12% <ø> (+0.01%)`	⬆️
kafka-v1	`9.69% <ø> (ø)`
kafka-v2	`1.81% <ø> (ø)`
memory_v2	`1.81% <ø> (ø)`
opensearch-1.x-v1	`18.79% <ø> (-0.02%)`	⬇️
opensearch-2.x-v1	`18.80% <ø> (+0.01%)`	⬆️
opensearch-2.x-v2	`1.81% <ø> (+0.01%)`	⬆️
tailsampling-processor	`0.46% <ø> (ø)`
unittests	`95.70% <100.00%> (-0.02%)`	⬇️


mahadzaryab1 · 2024-10-03T01:48:50Z

plugin/sampling/strategyprovider/adaptive/options.go

+	// LeaderLeaseRefreshInterval is the duration to sleep if this processor is elected leader before
+	// attempting to renew the lease on the leader lock. NB. This should be less than FollowerLeaseRefreshInterval
+	// to reduce lock thrashing.
+	LeaderLeaseRefreshInterval time.Duration `mapstructure:"leader_lease_refresh_interval"`

-	// CalculationInterval determines how often new probabilities are calculated. E.g. if it is 1 minute,
-	// new sampling probabilities are calculated once a minute and each bucket will contain 1 minute worth
-	// of aggregated throughput data.
-	CalculationInterval time.Duration `mapstructure:"calculation_interval"`
+	// FollowerLeaseRefreshInterval is the duration to sleep if this processor is a follower
+	// (ie. failed to gain the leader lock).
+	FollowerLeaseRefreshInterval time.Duration `mapstructure:"follower_lease_refresh_interval"`


@yurishkuro do you have any thoughts on how to group these? I was thinking of grouping them into a type like `Synchronization`.

yurishkuro

I'm not seeing this change as an improvement. The new subgroupings are not logical or better than a flat list. Maybe this config is already good as is.

yurishkuro · 2024-10-03T05:30:42Z

cmd/jaeger/config.yaml

@@ -38,7 +38,8 @@ extensions:
    #   path: ./cmd/jaeger/sampling-strategies.json
    adaptive:
      sampling_store: some_store
-      initial_sampling_probability: 0.1
+      sampling:


The whole struct here is "adaptive sampling"; I think repeating "sampling" is redundant and does not add any clarity.

yurishkuro · 2024-10-03T05:33:07Z

plugin/sampling/strategyprovider/adaptive/options.go

-	// TODO implement manual overrides per service/operation.
-	TargetSamplesPerSecond float64 `mapstructure:"target_samples_per_second"`
+	Calculation Calculation `mapstructure:"calculation"`
+	Sampling    Sampling    `mapstructure:"sampling"`


I'm not sure this is an improvement. Most flags are related to calculation anyway. Some are related to boundary conditions, like min rate.

mahadzaryab1 · 2024-10-03T11:22:16Z

> I'm not seeing this change as an improvement. The new subgroupings are not logical or better than a flat list. Maybe this config is already good as is.

@yurishkuro Here is what the current full config would look like:

  remote_sampling:
    adaptive:
      sampling_store: some_store
      target_samples_per_second: 
      delta_tolerance: 
      calculation_interval: 
      aggregation_buckets: 
      calculation_buckets: 
      calculation_delay: 
      initial_sampling_probability: 
      min_sampling_probability: 
      min_samples_per_second: 
      leader_lease_refresh_interval: 
      follower_lease_refresh_interval: 
    http:
    grpc:

I saw two main groupings here. One for the collection of the data (sampling), and one for the computations (calculation). The config from the current PR would look something like:

  remote_sampling:
    adaptive:
      calculation: 
        aggregation_buckets:
        buckets:
        delay:
        delta_tolerance:
        interval:
      sampling:
        store: 
        initial_probability:
        min_probability: 
        min_rate:
        target_rate: 
      # potentially a better grouping for the following two
      leader_lease_refresh_interval: 
      follower_lease_refresh_interval: 
    http:
    grpc:

Let me know what you think. Do you see any other groupings or do you want to keep the current flat list? I can adjust the migration guide accordingly and close this PR out if we don't want any changes here.

yurishkuro · 2024-10-03T14:33:26Z

I find the flat list easier to read right now. The calc/sampling separation is artificial: it looks like it saves typing one word, but it actually makes the naming harder to understand. I don't think it's worth changing.

mahadzaryab1 · 2024-10-03T22:40:20Z

Closing as per discussion above

Refactor Configurations For Adaptive Sampling Processor

f96fe53

Signed-off-by: Mahad Zaryab <mahadzaryab1@gmail.com>

mahadzaryab1 closed this Oct 3, 2024
