Ideas for improving E2E test developer experience #33532

Closed · 5 of 10 tasks
noisysocks opened this issue Jul 19, 2021 · 9 comments

Labels
  • [Type] Automated Testing: Testing infrastructure changes impacting the execution of end-to-end (E2E) and/or unit tests.
  • [Type] Overview: Comprehensive, high-level view of an area of focus, often with multiple tracking issues.

Comments

@noisysocks (Member) commented Jul 19, 2021

A few things we could try to improve the experience of writing and running E2E tests:

noisysocks added the [Type] Automated Testing and [Type] Overview labels on Jul 19, 2021

@gziolo (Member) commented Aug 2, 2021

> Fix Core version in .wp-env.json to a known git commit which is updated automatically via a PR every week. This might make it less disruptive (i.e. doesn't block every single developer) when a Core change breaks Gutenberg CI.

Yes, it's super annoying, but it doesn't happen that often. It all depends on what we consider a priority here. The current setup enforces that the conflicts in the Gutenberg plugin are fixed as soon as possible. Some of those issues are obvious mistakes, like a duplicated function/class definition; others are related to changes in the default theme. There is definitely room for improvement in the process to find a better balance between developer experience and ensuring that the Gutenberg plugin is compatible with WP core's trunk.
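
For context, .wp-env.json already lets the core field point at a specific ref of the WordPress/WordPress repository, so the weekly-bump idea is mostly about automating updates to that value. A minimal sketch, where the pinned ref is a placeholder:

```json
{
	"core": "WordPress/WordPress#<pinned-branch-or-commit>",
	"plugins": [ "." ]
}
```

An automated weekly PR would then only need to bump that ref to whichever Core revision last passed the Gutenberg E2E suite.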

> Look at splitting the 4 npm run test-e2e actions into 6 actions. This might speed up E2E test runs on a PR.

Looking at a random PR:

  1. Env setup took ~4-5 minutes.
  2. Test execution took ~11-13 minutes.

Splitting into more jobs gives a major improvement in the second part of the CI job: a quick estimate would be the ~11-13 minutes reduced to hopefully ~7-8 minutes on a single node. The drawback is that we would not only use 2 more nodes but also add ~8-10 minutes of cumulative execution time for the extra env setup (on nodes 5 and 6). It would be essential to figure out if there is a way to share at least the build part between more nodes so we could scale to as many nodes as we need.

I also think @desrosj did some testing with splitting e2e tests into more nodes. It would be great to see what he learned.
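
To make the numbers concrete, a 6-way split would look roughly like the sketch below in the workflow file. The job name and the sharding command are illustrative only, not the actual Gutenberg workflow:

```yaml
e2e:
  runs-on: ubuntu-latest
  strategy:
    fail-fast: false
    matrix:
      part: [0, 1, 2, 3, 4, 5]
  steps:
    - uses: actions/checkout@v2
    - name: Install dependencies and build
      run: |
        npm ci
        npm run build
    - name: Run one sixth of the E2E suite
      # List every spec, keep every sixth one for this node, then run only those.
      run: |
        npm run test-e2e -- --listTests \
          | awk 'NR % 6 == ${{ matrix.part }}' \
          | xargs npm run test-e2e --
```

Each extra node repeats the checkout/install/build steps above, which is exactly the ~4-5 minutes of env setup that gets multiplied.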

@kevin940726 (Member)

> It would be essential to figure out if there is a way to share at least the build part between more nodes so we could scale to as many nodes as we need.

I tried it once in my fork, and to my surprise, sometimes building the app from scratch is actually faster than downloading and unpacking the build artifacts from GitHub. I'm still confident that there must be some way to reduce the execution time, though; we just have to do some more experiments.

@mtias (Member) commented Aug 5, 2021

I think this needs to be scoped down a little bit.

  • Not clear what the value of a dashboard would be compared to its overhead: where would it run? What would we code it with? How would we maintain and update it? Who would monitor it?
  • Screencasts: also not sure what value they provide over the setup cost. It doesn't seem like something crucial at this point or worth investing too much time into.

@gziolo (Member) commented Aug 6, 2021

> Screencasts: also not sure what value they provide over the setup cost. It doesn't seem like something crucial at this point or worth investing too much time into.

The current implementation proposed in #33506 slows down test execution by a few minutes per CI node, so your comment is valid. It could be useful for debugging, but it should probably be disabled if there is a performance penalty involved.

@kevin940726 (Member) commented Aug 7, 2021

I don't think a couple of minutes of slowdown in tests is that much of a problem, though. E2E tests are already very slow; slowing each one down by a few seconds shouldn't outweigh the debugging benefits it brings. I've already encountered several tests which only fail in CI and are hard to debug/reproduce locally. Oftentimes the only option we have is to skip the tests and risk regressing the bug in future PRs. Furthermore, you can think of the slowdown as an emulation of CPU throttling, which helps us build more resilient tests. I've already found several flaky tests in that PR because of the slowdown; they were just hidden in plain sight, waiting to surface in some random PRs. In conclusion, I think the trade-offs are worth it. (In addition, it's currently implemented so that it's only enabled in CI by default, so there is no performance penalty during development.)
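
As a rough illustration of that last point, the toggle can live in the shared E2E setup and key off the CI environment variable; the recorder helper below is a placeholder, and #33506 has the actual wiring:

```js
// Illustrative sketch only: record screencasts in CI, keep local runs unaffected.
const SHOULD_RECORD_SCREENCAST = !! process.env.CI;

beforeEach( async () => {
	if ( SHOULD_RECORD_SCREENCAST ) {
		// Placeholder helper, not a real API; see #33506 for the real implementation.
		await startScreencastRecorder( page );
	}
} );
```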

A dashboard can also help here. IMO, flaky tests that fail only rarely aren't worth fixing ASAP; we need a way to identify and prioritize each flaky test by its failure rate. I'm aware of the complexity a dashboard could bring, hence I opened a separate issue, #33809, to track it.

@vcanales (Member)

> • Look at automatically retrying E2E tests. This might help with stability.

Regarding this, I'm opening #33979 in order to experiment with re-running failed jobs instead of the entire workflow. I might look into automatically retrying if this works out; otherwise, my thought is that retrying full workflows would add way too much time to be worth it.

@kevin940726 (Member)

@vcanales There's already #31682, which works, but the consensus seems to be that we want to have a dashboard first to record all the failing tests.
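
For reference, per-test retries in Jest come down to a one-liner in the E2E setup file. This is only the generic mechanism; #31682 may wire it up differently:

```js
// Requires the jest-circus test runner. Each failing test is retried up to
// 2 extra times before Jest reports it as failed.
jest.retryTimes( 2 );
```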

@noisysocks (Member, Author) commented Aug 11, 2021

> Yes, it's super annoying, but it doesn't happen that often. It all depends on what we consider a priority here. The current setup enforces that the conflicts in the Gutenberg plugin are fixed as soon as possible.

Agree that we need to fix conflicts as soon as possible and keep Gutenberg tested against the latest WordPress trunk. But I don't think conflicts should block all developers (there are a lot of us now! 😀) from working, and I really don't think we should have to deal with conflicts at very stressful times, e.g. on plugin release day. Being a Gutenberg developer should be fun and chill.

> It would be essential to figure out if there is a way to share at least the build part between more nodes so we could scale to as many nodes as we need.

100%. Ideally the parallelised jobs that run E2E tests would happen after a single non-parallelised setup job. Maybe this won't improve total performance all that much, but it would definitely improve re-run performance, which I think is a big deal: many developers spend a lot of time waiting for failed tests to re-run.
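
A rough sketch of that shape, where a single setup job builds once and the parallelised E2E jobs reuse the output via an artifact. All names and paths are placeholders:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: |
          npm ci
          npm run build
      - uses: actions/upload-artifact@v2
        with:
          name: e2e-build
          path: build/   # placeholder for whatever the E2E jobs actually need

  e2e:
    needs: build
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        part: [0, 1, 2, 3, 4, 5]
    steps:
      - uses: actions/checkout@v2
      - uses: actions/download-artifact@v2
        with:
          name: e2e-build
          path: build/
      - run: npm run test-e2e   # plus whatever sharding the matrix node uses
```

A re-run of a single failed e2e job can then reuse the existing artifact and skip the build, which is where most of the re-run savings would come from.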

> Not clear what the value of a dashboard would be compared to its overhead: where would it run? What would we code it with? How would we maintain and update it? Who would monitor it?

I don't think we can systematically address flaky tests unless we measure what we want to improve. That's the value of a dashboard. I'm thinking it could be a GitHub Action that runs daily, scrapes the E2E test logs, and publishes to a static GitHub Pages site. If we have to set up separate hosting, a database, etc., then I agree that the overhead is probably too high and it would end up basically unmaintained. (I think gutenberg.run suffers from this.)
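
The shape I have in mind is roughly the following; every name and script here is a placeholder, and it only shows the daily-cron-plus-Pages mechanics:

```yaml
name: flaky-test-report
on:
  schedule:
    - cron: '0 0 * * *'   # once a day

jobs:
  report:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Collect recent E2E results via the GitHub API
        run: node bin/collect-e2e-results.js   # placeholder script
      - name: Publish the static report to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3   # one commonly used Pages-deploy action
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./report   # placeholder output directory
```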

> Screencasts: also not sure what value they provide over the setup cost. It doesn't seem like something crucial at this point or worth investing too much time into.

No real opinion on this one. I trust @kevin940726 😛

@annezazu (Contributor)

Considering this hasn't been scoped down further and hasn't had much traction in a few years, I'm going to close this out, but I welcome folks to either reopen it or start a new issue with more relevant/recent info about improving the E2E test developer experience.
