Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CodSpeed reporting wrong numbers #82

Open
overlookmotel opened this issue Jul 16, 2024 · 4 comments
Open

CodSpeed reporting wrong numbers #82

overlookmotel opened this issue Jul 16, 2024 · 4 comments

Comments

@overlookmotel
Copy link

overlookmotel commented Jul 16, 2024

The problem

I've seen quite a few times recently that benchmark measures CodSpeed gives on PRs are erroneous.

This is a problem when doing perf work, as you can't tell if what you're doing is good or not.

e.g. PR oxc-project/oxc#4214 showed initially a giving 0 speed-up, but then benchmarks re-ran after the PR below it in the stack was merged, and suddenly it shows 6% perf improvement. https://codspeed.io/oxc-project/oxc/branches/07-12-perf_semantic_reduce_lookups

That's wrong. The PR gives 0 perf improvement.

Reason was that in the last run, CodSpeed did the comparison to 2 commits back (3016f03), rather than 1 back. So 6% result shown included the perf boost of oxc-project/oxc#4213 which is the commit that preceded it.

Why?

I am not sure why this has started happening recently. Could be:

  1. Changes at CodSpeed's end.
  2. Caused by our switch to using Graphite merge queue.

Solutions

  1. Raise with CodSpeed.
  2. If they can't fix, investigate if we can handle it somehow at our end.

Because we intercept and store bench results and upload them to CodSpeed our end, we could potentially get our Github action to check that benchmarks for previous commit have completed and been uploaded to CodSpeed already, before submitting results for current commit. If not, wait until they are.

@overlookmotel
Copy link
Author

overlookmotel commented Jul 26, 2024

Appears to be a CodSpeed thing, unrelated to Graphite merge queue.

Another example from today: oxc-project/oxc#4476
https://codspeed.io/oxc-project/oxc/branches/07-26-perf_sourcemap_pre_allocate_string_buf_while_encoding

bench

Codspeed gets "stuck" comparing to "base" commit ccb1835, whereas the PR was rebased on latest main, so the merge commit for that PR that benchmarks are running on is based on latest main 42a2519.

@overlookmotel
Copy link
Author

@overlookmotel overlookmotel transferred this issue from oxc-project/backlog Jul 26, 2024
@overlookmotel overlookmotel changed the title Investigate CodSpeed reporting wrong numbers CodSpeed reporting wrong numbers Jul 26, 2024
@Boshen
Copy link
Member

Boshen commented Jul 27, 2024

The conclusion is base should always trigger benchmark on any Rust changes. This case is change in rustc version.

@Boshen Boshen closed this as not planned Won't fix, can't repro, duplicate, stale Jul 27, 2024
@overlookmotel overlookmotel transferred this issue from oxc-project/oxc Jul 29, 2024
@overlookmotel overlookmotel reopened this Jul 29, 2024
@overlookmotel
Copy link
Author

overlookmotel commented Jul 29, 2024

I've moved this back to backlog and reopened.

I didn't take quite the same conclusion from this as Boshen. It seems to me that the problem is a race condition - if a PR completes its benchmark run before the latest commit on main completes its, then CodSpeed uses previous commit on main as the base for comparison.

We could fix this by adding a check to the upload action to make sure benchmarks aren't still running on base commit. If they are, wait for it to finish before uploading. As we're already handling uploading benchmarks ourselves for purpose of sharding, we can handle this quite easily.

I think this problem has become more common recently because of our use of Graphite merge queue means it's constantly merging stuff and running benchmarks in quick succession.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants