Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PROF-9476] Managed string storage for interning over several profiles #607

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

AlexJF
Copy link
Contributor

@AlexJF AlexJF commented Sep 2, 2024

What does this PR do?

A brief description of the change being made with this pull request.

Motivation

What inspired you to submit this pull request?

Additional Notes

Anything else we should know when reviewing?

How to test the change?

Describe here in detail how the change can be validated.

@pr-commenter
Copy link

pr-commenter bot commented Sep 2, 2024

Benchmarks

Comparison

Benchmark execution time: 2024-09-02 17:22:16

Comparing candidate commit 93f84f5 in PR branch alexjf/prof-9476-managed-string-storage-11 with baseline commit cfae8af in branch main.

Found 33 performance improvements and 5 performance regressions! Performance is the same for 3 metrics, 2 unstable metrics.

scenario:credit_card/is_card_number/

  • 🟥 throughput [-328179529.906op/s; -327757347.793op/s] or [-52.952%; -52.884%]
  • 🟩 execution_time [-1.610µs; -1.610µs] or [-99.792%; -99.784%]

scenario:credit_card/is_card_number/ 3782-8224-6310-005

  • 🟩 execution_time [-105.105µs; -104.900µs] or [-100.014%; -99.819%]
  • 🟩 throughput [+1908508.470op/s; +1949780.080op/s] or [+20.056%; +20.489%]

scenario:credit_card/is_card_number/ 378282246310005

  • 🟩 execution_time [-97.085µs; -96.884µs] or [-100.018%; -99.812%]
  • 🟩 throughput [+1802756.943op/s; +1842609.113op/s] or [+17.498%; +17.885%]

scenario:credit_card/is_card_number/37828224631

  • 🟥 throughput [-328185740.165op/s; -327665325.695op/s] or [-52.962%; -52.878%]
  • 🟩 execution_time [-1.611µs; -1.610µs] or [-99.815%; -99.761%]

scenario:credit_card/is_card_number/378282246310005

  • 🟩 execution_time [-95.145µs; -94.975µs] or [-100.005%; -99.826%]
  • 🟩 throughput [+1913455.843op/s; +1950285.668op/s] or [+18.204%; +18.554%]

scenario:credit_card/is_card_number/37828224631000521389798

  • 🟩 execution_time [-94.521µs; -94.436µs] or [-99.973%; -99.883%]
  • 🟩 throughput [+4087297.237op/s; +4118062.664op/s] or [+38.644%; +38.935%]

scenario:credit_card/is_card_number/x371413321323331

  • 🟩 execution_time [-22.834µs; -22.774µs] or [-100.033%; -99.767%]

scenario:credit_card/is_card_number_no_luhn/

  • 🟥 throughput [-328384065.243op/s; -327962008.988op/s] or [-52.982%; -52.914%]
  • 🟩 execution_time [-1.610µs; -1.610µs] or [-99.791%; -99.784%]

scenario:credit_card/is_card_number_no_luhn/ 3782-8224-6310-005

  • 🟩 execution_time [-86.655µs; -86.555µs] or [-99.979%; -99.864%]
  • 🟩 throughput [+3172921.368op/s; +3204598.361op/s] or [+27.500%; +27.775%]

scenario:credit_card/is_card_number_no_luhn/ 378282246310005

  • 🟩 execution_time [-80.102µs; -79.947µs] or [-100.018%; -99.824%]
  • 🟩 throughput [+3354640.462op/s; +3391521.999op/s] or [+26.865%; +27.161%]

scenario:credit_card/is_card_number_no_luhn/37828224631

  • 🟥 throughput [-328388260.118op/s; -327975498.458op/s] or [-52.985%; -52.919%]
  • 🟩 execution_time [-1.610µs; -1.610µs] or [-99.792%; -99.783%]

scenario:credit_card/is_card_number_no_luhn/378282246310005

  • 🟩 execution_time [-78.279µs; -78.143µs] or [-100.009%; -99.835%]
  • 🟩 throughput [+3664218.479op/s; +3705186.952op/s] or [+28.679%; +29.000%]

scenario:credit_card/is_card_number_no_luhn/37828224631000521389798

  • 🟩 execution_time [-94.686µs; -94.573µs] or [-99.989%; -99.869%]
  • 🟩 throughput [+4261865.383op/s; +4287914.262op/s] or [+40.358%; +40.605%]

scenario:credit_card/is_card_number_no_luhn/x371413321323331

  • 🟩 execution_time [-22.845µs; -22.777µs] or [-100.052%; -99.752%]

scenario:normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...

  • 🟩 execution_time [-314.406µs; -314.188µs] or [-99.945%; -99.876%]

scenario:normalization/normalize_name/normalize_name/bad-name

  • 🟩 execution_time [-27.984µs; -27.971µs] or [-99.899%; -99.855%]

scenario:normalization/normalize_name/normalize_name/good

  • 🟩 execution_time [-16.722µs; -16.716µs] or [-99.888%; -99.851%]

scenario:normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...

  • 🟩 execution_time [-618.664µs; -618.528µs] or [-99.915%; -99.893%]

scenario:normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて

  • 🟩 execution_time [-388.449µs; -388.127µs] or [-99.925%; -99.842%]

scenario:normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters

  • 🟩 execution_time [-191.484µs; -191.412µs] or [-99.913%; -99.876%]

scenario:normalization/normalize_service/normalize_service/[empty string]

  • 🟩 execution_time [-45.120µs; -45.102µs] or [-99.923%; -99.885%]

scenario:normalization/normalize_service/normalize_service/test_ASCII

  • 🟩 execution_time [-49.368µs; -49.323µs] or [-99.905%; -99.812%]

scenario:normalization/normalize_trace/test_trace

  • 🟥 execution_time [+17.324ns; +21.667ns] or [+5.549%; +6.941%]

scenario:redis/obfuscate_redis_string

  • 🟩 execution_time [-2.077µs; -1.851µs] or [-5.292%; -4.718%]

scenario:sql/obfuscate_sql_string

  • 🟩 execution_time [-3.024µs; -2.975µs] or [-4.106%; -4.039%]

scenario:tags/replace_trace_tags

  • 🟩 execution_time [-416.787ns; -413.262ns] or [-15.143%; -15.015%]

Candidate

Candidate benchmark details

Group 1

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 93f84f5 1725297011 alexjf/prof-9476-managed-string-storage-11
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... execution_time 279.634ns 281.692ns ± 0.451ns 281.735ns ± 0.200ns 281.904ns 282.106ns 282.453ns 284.448ns 0.96% -0.320 11.963 0.16% 0.032ns 1 200
normalization/normalize_name/normalize_name/bad-name execution_time 33.617ns 34.430ns ± 0.521ns 34.433ns ± 0.221ns 34.633ns 35.031ns 35.246ns 39.468ns 14.62% 4.468 42.124 1.51% 0.037ns 1 200
normalization/normalize_name/normalize_name/good execution_time 21.509ns 21.854ns ± 0.121ns 21.852ns ± 0.072ns 21.920ns 22.040ns 22.129ns 22.560ns 3.24% 1.017 4.926 0.55% 0.009ns 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... execution_time [281.629ns; 281.754ns] or [-0.022%; +0.022%] None None None
normalization/normalize_name/normalize_name/bad-name execution_time [34.358ns; 34.502ns] or [-0.210%; +0.210%] None None None
normalization/normalize_name/normalize_name/good execution_time [21.837ns; 21.871ns] or [-0.076%; +0.076%] None None None

Group 2

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 93f84f5 1725297011 alexjf/prof-9476-managed-string-storage-11
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
benching string interning on wordpress profile execution_time 138.235µs 138.562µs ± 0.162µs 138.538µs ± 0.082µs 138.622µs 138.836µs 139.081µs 139.442µs 0.65% 1.828 6.473 0.12% 0.011µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
benching string interning on wordpress profile execution_time [138.540µs; 138.585µs] or [-0.016%; +0.016%] None None None

Group 3

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 93f84f5 1725297011 alexjf/prof-9476-managed-string-storage-11
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... execution_time 577.678ns 591.949ns ± 5.433ns 592.464ns ± 3.250ns 595.368ns 600.033ns 602.737ns 608.497ns 2.71% -0.192 0.266 0.92% 0.384ns 1 200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて execution_time 438.330ns 452.828ns ± 3.603ns 453.682ns ± 0.331ns 453.958ns 454.324ns 454.752ns 467.900ns 3.13% -2.640 9.582 0.79% 0.255ns 1 200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters execution_time 192.772ns 202.258ns ± 4.025ns 203.433ns ± 3.080ns 204.593ns 207.468ns 214.376ns 214.521ns 5.45% 0.350 0.986 1.99% 0.285ns 1 200
normalization/normalize_service/normalize_service/[empty string] execution_time 37.008ns 43.307ns ± 7.200ns 40.222ns ± 2.571ns 46.654ns 58.897ns 59.212ns 59.496ns 47.92% 1.332 0.417 16.58% 0.509ns 1 200
normalization/normalize_service/normalize_service/test_ASCII execution_time 62.746ns 69.835ns ± 5.735ns 67.942ns ± 3.001ns 69.236ns 80.084ns 81.197ns 82.352ns 21.21% 0.946 -0.632 8.19% 0.406ns 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... execution_time [591.196ns; 592.702ns] or [-0.127%; +0.127%] None None None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて execution_time [452.328ns; 453.327ns] or [-0.110%; +0.110%] None None None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters execution_time [201.700ns; 202.815ns] or [-0.276%; +0.276%] None None None
normalization/normalize_service/normalize_service/[empty string] execution_time [42.309ns; 44.305ns] or [-2.304%; +2.304%] None None None
normalization/normalize_service/normalize_service/test_ASCII execution_time [69.040ns; 70.630ns] or [-1.138%; +1.138%] None None None

Group 4

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 93f84f5 1725297011 alexjf/prof-9476-managed-string-storage-11
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
tags/replace_trace_tags execution_time 2.332µs 2.337µs ± 0.006µs 2.336µs ± 0.001µs 2.337µs 2.349µs 2.352µs 2.402µs 2.85% 6.267 57.328 0.27% 0.000µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
tags/replace_trace_tags execution_time [2.336µs; 2.338µs] or [-0.037%; +0.037%] None None None

Group 5

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 93f84f5 1725297011 alexjf/prof-9476-managed-string-storage-11
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
redis/obfuscate_redis_string execution_time 37.206µs 37.281µs ± 0.030µs 37.278µs ± 0.012µs 37.292µs 37.319µs 37.343µs 37.552µs 0.74% 3.562 31.260 0.08% 0.002µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
redis/obfuscate_redis_string execution_time [37.276µs; 37.285µs] or [-0.011%; +0.011%] None None None

Group 6

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 93f84f5 1725297011 alexjf/prof-9476-managed-string-storage-11
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
two way interface execution_time 17.236µs 39.534µs ± 35.255µs 39.061µs ± 3.136µs 41.906µs 52.753µs 66.743µs 507.224µs 1198.53% 11.783 153.563 88.95% 2.493µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
two way interface execution_time [34.648µs; 44.420µs] or [-12.359%; +12.359%] None None None

Group 7

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 93f84f5 1725297011 alexjf/prof-9476-managed-string-storage-11
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_trace/test_trace execution_time 320.832ns 331.682ns ± 6.501ns 329.691ns ± 2.272ns 335.407ns 341.534ns 344.677ns 377.964ns 14.64% 2.320 12.011 1.96% 0.460ns 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_trace/test_trace execution_time [330.781ns; 332.583ns] or [-0.272%; +0.272%] None None None

Group 8

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 93f84f5 1725297011 alexjf/prof-9476-managed-string-storage-11
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
credit_card/is_card_number/ execution_time 3.250ns 3.427ns ± 0.017ns 3.429ns ± 0.001ns 3.429ns 3.431ns 3.434ns 3.444ns 0.43% -9.722 94.198 0.49% 0.001ns 1 200
credit_card/is_card_number/ throughput 290401870.306op/s 291801344.829op/s ± 1512749.646op/s 291664502.574op/s ± 75492.818op/s 291742458.176op/s 291836776.387op/s 292417672.787op/s 307698180.754op/s 5.50% 9.747 94.594 0.52% 106967.553op/s 1 200
credit_card/is_card_number/ 3782-8224-6310-005 execution_time 85.678ns 87.385ns ± 1.049ns 87.177ns ± 0.526ns 87.841ns 88.944ns 90.373ns 94.899ns 8.86% 2.548 13.384 1.20% 0.074ns 1 200
credit_card/is_card_number/ 3782-8224-6310-005 throughput 10537515.230op/s 11445271.281op/s ± 133555.391op/s 11470888.504op/s ± 69174.822op/s 11519560.678op/s 11601009.748op/s 11657928.474op/s 11671545.184op/s 1.75% -2.253 10.862 1.16% 9443.792op/s 1 200
credit_card/is_card_number/ 378282246310005 execution_time 80.442ns 82.480ns ± 0.837ns 82.364ns ± 0.369ns 82.776ns 84.098ns 84.609ns 86.153ns 4.60% 0.851 1.736 1.01% 0.059ns 1 200
credit_card/is_card_number/ 378282246310005 throughput 11607263.244op/s 12125392.709op/s ± 122083.487op/s 12141258.576op/s ± 54445.028op/s 12188378.754op/s 12307850.680op/s 12362999.406op/s 12431274.637op/s 2.39% -0.761 1.503 1.00% 8632.606op/s 1 200
credit_card/is_card_number/37828224631 execution_time 3.265ns 3.428ns ± 0.016ns 3.429ns ± 0.001ns 3.430ns 3.432ns 3.436ns 3.471ns 1.21% -9.041 86.101 0.48% 0.001ns 1 200
credit_card/is_card_number/37828224631 throughput 288137942.898op/s 291741498.324op/s ± 1465338.084op/s 291624336.949op/s ± 77215.758op/s 291702072.197op/s 291874746.800op/s 292234614.027op/s 306299422.029op/s 5.03% 9.129 87.010 0.50% 103615.050op/s 1 200
credit_card/is_card_number/378282246310005 execution_time 78.445ns 80.373ns ± 0.742ns 80.323ns ± 0.469ns 80.814ns 81.562ns 82.631ns 82.874ns 3.18% 0.319 0.929 0.92% 0.052ns 1 200
credit_card/is_card_number/378282246310005 throughput 12066459.063op/s 12443043.689op/s ± 114515.003op/s 12449691.845op/s ± 72590.872op/s 12508042.910op/s 12626194.489op/s 12693726.420op/s 12747708.045op/s 2.39% -0.242 0.832 0.92% 8097.433op/s 1 200
credit_card/is_card_number/37828224631000521389798 execution_time 66.638ns 68.126ns ± 0.501ns 68.126ns ± 0.229ns 68.357ns 68.608ns 69.402ns 72.377ns 6.24% 3.141 26.110 0.73% 0.035ns 1 200
credit_card/is_card_number/37828224631000521389798 throughput 13816479.302op/s 14679550.659op/s ± 105634.225op/s 14678620.035op/s ± 49407.618op/s 14727244.210op/s 14837170.055op/s 14899517.104op/s 15006472.233op/s 2.23% -2.758 22.542 0.72% 7469.468op/s 1 200
credit_card/is_card_number/x371413321323331 execution_time 21.408ns 22.903ns ± 0.586ns 22.867ns ± 0.401ns 23.255ns 23.961ns 24.364ns 24.566ns 7.43% 0.373 -0.063 2.55% 0.041ns 1 200
credit_card/is_card_number/x371413321323331 throughput 40706334.724op/s 43690793.674op/s ± 1109186.752op/s 43731079.829op/s ± 757747.219op/s 44530271.898op/s 45231255.936op/s 46049899.136op/s 46712124.668op/s 6.82% -0.236 -0.133 2.53% 78431.347op/s 1 200
credit_card/is_card_number_no_luhn/ execution_time 3.252ns 3.429ns ± 0.017ns 3.431ns ± 0.001ns 3.431ns 3.433ns 3.435ns 3.448ns 0.52% -9.727 94.176 0.49% 0.001ns 1 200
credit_card/is_card_number_no_luhn/ throughput 289987010.403op/s 291628528.315op/s ± 1514515.166op/s 291485767.351op/s ± 47613.458op/s 291528817.008op/s 291676653.357op/s 292084418.029op/s 307483183.619op/s 5.49% 9.750 94.534 0.52% 107092.394op/s 1 200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 execution_time 66.184ns 67.908ns ± 0.480ns 67.866ns ± 0.248ns 68.125ns 68.687ns 69.095ns 69.955ns 3.08% 0.442 2.997 0.71% 0.034ns 1 200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 throughput 14295004.491op/s 14726577.055op/s ± 103823.594op/s 14734843.875op/s ± 53883.880op/s 14784656.836op/s 14869260.310op/s 15059622.133op/s 15109455.950op/s 2.54% -0.340 2.923 0.70% 7341.437op/s 1 200
credit_card/is_card_number_no_luhn/ 378282246310005 execution_time 62.137ns 63.054ns ± 0.402ns 62.999ns ± 0.230ns 63.260ns 63.794ns 64.043ns 64.623ns 2.58% 0.818 1.115 0.64% 0.028ns 1 200
credit_card/is_card_number_no_luhn/ 378282246310005 throughput 15474352.787op/s 15860007.808op/s ± 100592.999op/s 15873219.087op/s ± 57855.149op/s 15928061.072op/s 15994189.785op/s 16057126.100op/s 16093549.199op/s 1.39% -0.772 0.993 0.63% 7112.999op/s 1 200
credit_card/is_card_number_no_luhn/37828224631 execution_time 3.262ns 3.430ns ± 0.017ns 3.431ns ± 0.001ns 3.432ns 3.434ns 3.436ns 3.454ns 0.66% -9.611 92.256 0.48% 0.001ns 1 200
credit_card/is_card_number_no_luhn/37828224631 throughput 289546826.829op/s 291590621.592op/s ± 1474196.903op/s 291466129.044op/s ± 56674.731op/s 291508264.088op/s 291589702.784op/s 291977698.984op/s 306524845.482op/s 5.17% 9.638 92.583 0.50% 104241.463op/s 1 200
credit_card/is_card_number_no_luhn/378282246310005 execution_time 59.221ns 60.752ns ± 0.458ns 60.773ns ± 0.260ns 61.003ns 61.574ns 61.896ns 62.055ns 2.11% -0.103 1.000 0.75% 0.032ns 1 200
credit_card/is_card_number_no_luhn/378282246310005 throughput 16114708.570op/s 16461224.529op/s ± 124302.526op/s 16454637.142op/s ± 70520.018op/s 16527520.891op/s 16683111.924op/s 16777960.337op/s 16885842.412op/s 2.62% 0.170 1.027 0.75% 8789.516op/s 1 200
credit_card/is_card_number_no_luhn/37828224631000521389798 execution_time 66.407ns 67.410ns ± 0.376ns 67.421ns ± 0.226ns 67.645ns 67.920ns 68.431ns 69.583ns 3.21% 1.062 6.002 0.56% 0.027ns 1 200
credit_card/is_card_number_no_luhn/37828224631000521389798 throughput 14371227.387op/s 14835079.346op/s ± 82326.959op/s 14832265.288op/s ± 49764.035op/s 14882017.501op/s 14959664.464op/s 15044226.583op/s 15058753.625op/s 1.53% -0.950 5.487 0.55% 5821.395op/s 1 200
credit_card/is_card_number_no_luhn/x371413321323331 execution_time 21.201ns 22.465ns ± 0.475ns 22.430ns ± 0.344ns 22.804ns 23.301ns 23.492ns 23.639ns 5.39% 0.091 -0.427 2.11% 0.034ns 1 200
credit_card/is_card_number_no_luhn/x371413321323331 throughput 42302249.284op/s 44533853.036op/s ± 940003.068op/s 44582571.576op/s ± 690877.754op/s 45236358.518op/s 46016170.143op/s 46576136.463op/s 47167006.992op/s 5.80% 0.008 -0.413 2.11% 66468.254op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
credit_card/is_card_number/ execution_time [3.425ns; 3.429ns] or [-0.068%; +0.068%] None None None
credit_card/is_card_number/ throughput [291591692.277op/s; 292010997.381op/s] or [-0.072%; +0.072%] None None None
credit_card/is_card_number/ 3782-8224-6310-005 execution_time [87.239ns; 87.530ns] or [-0.166%; +0.166%] None None None
credit_card/is_card_number/ 3782-8224-6310-005 throughput [11426761.788op/s; 11463780.773op/s] or [-0.162%; +0.162%] None None None
credit_card/is_card_number/ 378282246310005 execution_time [82.364ns; 82.596ns] or [-0.141%; +0.141%] None None None
credit_card/is_card_number/ 378282246310005 throughput [12108473.112op/s; 12142312.306op/s] or [-0.140%; +0.140%] None None None
credit_card/is_card_number/37828224631 execution_time [3.425ns; 3.430ns] or [-0.067%; +0.067%] None None None
credit_card/is_card_number/37828224631 throughput [291538416.558op/s; 291944580.089op/s] or [-0.070%; +0.070%] None None None
credit_card/is_card_number/378282246310005 execution_time [80.270ns; 80.476ns] or [-0.128%; +0.128%] None None None
credit_card/is_card_number/378282246310005 throughput [12427173.011op/s; 12458914.367op/s] or [-0.128%; +0.128%] None None None
credit_card/is_card_number/37828224631000521389798 execution_time [68.056ns; 68.195ns] or [-0.102%; +0.102%] None None None
credit_card/is_card_number/37828224631000521389798 throughput [14664910.771op/s; 14694190.546op/s] or [-0.100%; +0.100%] None None None
credit_card/is_card_number/x371413321323331 execution_time [22.822ns; 22.984ns] or [-0.355%; +0.355%] None None None
credit_card/is_card_number/x371413321323331 throughput [43537071.057op/s; 43844516.290op/s] or [-0.352%; +0.352%] None None None
credit_card/is_card_number_no_luhn/ execution_time [3.427ns; 3.431ns] or [-0.069%; +0.069%] None None None
credit_card/is_card_number_no_luhn/ throughput [291418631.078op/s; 291838425.551op/s] or [-0.072%; +0.072%] None None None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 execution_time [67.841ns; 67.974ns] or [-0.098%; +0.098%] None None None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 throughput [14712188.103op/s; 14740966.006op/s] or [-0.098%; +0.098%] None None None
credit_card/is_card_number_no_luhn/ 378282246310005 execution_time [62.999ns; 63.110ns] or [-0.088%; +0.088%] None None None
credit_card/is_card_number_no_luhn/ 378282246310005 throughput [15846066.586op/s; 15873949.031op/s] or [-0.088%; +0.088%] None None None
credit_card/is_card_number_no_luhn/37828224631 execution_time [3.427ns; 3.432ns] or [-0.067%; +0.067%] None None None
credit_card/is_card_number_no_luhn/37828224631 throughput [291386312.080op/s; 291794931.105op/s] or [-0.070%; +0.070%] None None None
credit_card/is_card_number_no_luhn/378282246310005 execution_time [60.689ns; 60.816ns] or [-0.105%; +0.105%] None None None
credit_card/is_card_number_no_luhn/378282246310005 throughput [16443997.395op/s; 16478451.664op/s] or [-0.105%; +0.105%] None None None
credit_card/is_card_number_no_luhn/37828224631000521389798 execution_time [67.358ns; 67.462ns] or [-0.077%; +0.077%] None None None
credit_card/is_card_number_no_luhn/37828224631000521389798 throughput [14823669.622op/s; 14846489.071op/s] or [-0.077%; +0.077%] None None None
credit_card/is_card_number_no_luhn/x371413321323331 execution_time [22.399ns; 22.531ns] or [-0.293%; +0.293%] None None None
credit_card/is_card_number_no_luhn/x371413321323331 throughput [44403577.651op/s; 44664128.421op/s] or [-0.293%; +0.293%] None None None

Group 9

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 93f84f5 1725297011 alexjf/prof-9476-managed-string-storage-11
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
sql/obfuscate_sql_string execution_time 70.453µs 70.660µs ± 0.090µs 70.652µs ± 0.054µs 70.708µs 70.795µs 70.908µs 71.294µs 0.91% 2.039 11.868 0.13% 0.006µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
sql/obfuscate_sql_string execution_time [70.648µs; 70.673µs] or [-0.018%; +0.018%] None None None

Group 10

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 93f84f5 1725297011 alexjf/prof-9476-managed-string-storage-11
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
write only interface execution_time 2.075µs 2.951µs ± 1.860µs 2.761µs ± 0.018µs 2.781µs 3.063µs 4.419µs 24.798µs 798.07% 10.319 109.412 62.88% 0.132µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
write only interface execution_time [2.693µs; 3.209µs] or [-8.737%; +8.737%] None None None

Baseline

Omitted due to size.

Copy link
Member

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks quite nice!

The blast radius is not very big, and to be honest, 99% of it seems to be the change to make apis take both an id or a string.

I think if we can solve that one, then the diff for this feature would be quite small + it'll be very close to zero cost for libdatadog clients that aren't using it, which seems like a really nice property.

Comment on lines +130 to 139
#[derive(Copy, Clone, Default, Debug)]
pub struct Label<'a> {
pub key: CharSlice<'a>,
pub key_id: u32,

/// At most one of the following must be present
pub str: CharSlice<'a>,
pub str_id: u32,
pub num: i64,

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not for the PoC, but it occurs to me that perhaps we could introduce a new type that can represent either a charslice or an interned string:

struct {
  kind: byte;
  union {
    slice: ddog_Slice_CChar
    id: u32
  }
}

We could tweak the ddog_Slice_CChar definition to even make sure that the final structure/union is the same size a char slice today (as it's kinda wasteful -- a ptr + a whole 64-bit length for the string which we'll never need)

(It's less clear to me if this nicety would be only for the ffi or if it would make sense to propagate to the rust-level API...)

Comment on lines +405 to +409
/// # Arguments
/// * `sample_types`
/// * `period` - Optional period of the profile. Passing None/null translates to zero values.
/// * `start_time` - Optional time the profile started at. Passing None/null will use the current
/// time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Missing the new argument ;)

@@ -38,6 +41,7 @@ pub struct Profile {
stack_traces: FxIndexSet<StackTrace>,
start_time: SystemTime,
strings: StringTable,
string_storage: Option<Rc<RwLock<ManagedStringStorage>>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for the first version, but my concurrency background does immediately make me think that perhaps we could get better performance by having something more specific than just wrapping this whole thing with a lock.

For instance: If we give up concurrent writing and assume only a single writer, we may be able to do a more of a transaction-like thing where the writer can do whatever it wants, and the reader needs to do a few reads and may "abort" and need to restart if it was concurrent with a writer that mutated something it wanted.

Comment on lines +41 to +80
pub unsafe extern "C" fn ddog_prof_ManagedStringStorage_intern(
storage: ManagedStringStorage,
string: Option<&CharSlice>,
) -> ManagedStringStorageInternResult {
(|| {
let storage = get_inner_string_storage(storage, true)?;

let string: &CharSlice = string.expect("non null string");
let string: &str = CStr::from_ptr(string.as_ptr())
.to_str()
.expect("valid utf8 string");

let string_id = storage
.write()
.expect("acquisition of write lock on string storage should succeed")
.intern(string);

anyhow::Ok(string_id)
})()
.context("ddog_prof_Profile_serialize failed")
.into()
}

#[must_use]
#[no_mangle]
pub unsafe extern "C" fn ddog_prof_ManagedStringStorage_unintern(
storage: ManagedStringStorage,
id: u32,
) -> ManagedStringStorageResult {
(|| {
let storage = get_inner_string_storage(storage, true)?;
storage
.read()
.expect("acquisition of read lock on string storage should succeed")
.unintern(id);
anyhow::Ok(())
})()
.context("ddog_prof_Profile_serialize failed")
.into()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would make sense to have versions of these methods that take in arrays of strings/indexes as an optimization.

Specifically, thus far most API calls we did to libdatadog on the "hot path" are big, heavy things, and we do a very small amount of them, so any overhead in ffi and whatnot is quite small.

For interning, we'll often be interning multiple values in a row (or even just pairs -- function name and filename), so on the caller side could cheaply stack-allocate an array, put all the strings in there, and feed them all in one step to libdatadog.

Comment on lines +331 to +339
if function.name_id > 0 || function.system_name_id > 0 || function.filename_id > 0 {
name = self.resolve(function.name_id);
system_name = self.resolve(function.system_name_id);
filename = self.resolve(function.filename_id);
} else {
name = self.intern(function.name);
system_name = self.intern(function.system_name);
filename = self.intern(function.filename)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess seeing this pattern of "id or not id" in the code makes me "double down" on thinking that it would be nice if we could have a string representation that could either be a string or a pre-interned id, so that most of these functions would not need any change (and the change would be in intern or something like that).

(It would also solve the problem of right now the code forces us to always use an id or not use an id, but doesn't really barf if we try to mix them up, although it will do the wrong thing.)

Comment on lines +61 to +66
fn get_string(&self, id: StringId) -> Rc<str> {
self.set
.get_index(id.to_offset())
.expect("StringId to have a valid interned index")
.clone()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: We should probably return an error (or option) to the caller, rather than blowing up if an id is incorrect

Comment on lines +163 to +167
pub fn unintern(&self, id: u32) {
let data = self.get_data(id);
let usage_count = &data.usage_count;
usage_count.set(usage_count.get() - 1);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Should we immediately clean the entry if usage_count gets below zero?

Comment on lines +111 to +127
self.id_to_data.retain(|_, data| {
/*let used_in_this_gen = data
.last_usage_gen
.get()
.map_or(false, |last_usage_gen| last_usage_gen == self.current_gen);
if used_in_this_gen || *id == 0 {
// Empty string (id = 0) or anything that was used in the gen
// we are now closing, is kept alive
return true;
}*/
if data.usage_count.get() > 0 {
return true;
}

self.str_to_id.remove_entry(&data.str);
false
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a great suggestion for this one, but I'll point out that this may get a bit expensive, especially as we accumulate strings.

(GC-like generational storage anyone? 🔥 Half-kidding, half-not-kidding ;D)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PoC it seems fine, but yeah I think once we want to merge this in, it's worth writing down a bit of text to explain what all the weirdness going on in here is doing.

(I was able to follow it along with a lot of context, but I think it's going to be hard to wrap our heads around this once this gets in)


storage
.write()
.expect("acquisition of write lock on string storage should succeed")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think expect panics on error; since we already have a structure for returning an error from this function, I think it's a bit nicer to just return back an error ;)

Comment on lines +44 to +59
fn intern(&mut self, item: Rc<str>) -> StringId {
// For performance, delay converting the [&str] to a [String] until
// after it has been determined to not exist in the set. This avoids
// temporary allocations.
let index = match self.set.get_index_of(&item) {
Some(index) => index,
None => {
let (index, _inserted) = self.set.insert_full(item.clone());
// This wouldn't make any sense; the item couldn't be found so
// we try to insert it, but suddenly it exists now?
debug_assert!(_inserted);
index
}
};
StringId::from_offset(index)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It occurred to me that while ideally all this structure would be lock-free and whatnot, inserting and claiming an index is actually the part that would be really nice to be able to do in a non-blocking manner.

I was googling and didn't find a lot of interesting solid/production pre-existing data structure implementations with the kinds of characteristics we'd want (which is super disappointing -- I miss you java.util.concurrent), but yea, may be worth keeping an eye out if we find something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe https://crates.io/crates/dashmap ? But we'd need to solve the index assignment (easy if we don't do index reuse, really annoying if we want to do that)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants