Parallelize all jenkins stages #18901

Closed
wants to merge 2 commits into from


jsoriano
Member

@jsoriano jsoriano commented Jun 2, 2020

Some related stages are grouped inside other stages. Even though this makes sense logically, it prevents the nested stages from running in parallel, so some of these grouped stages take a long time.

Remove the stages that contain nested stages and move the nested stages directly to the root parallel block, so that all stages run in parallel.
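
To make the expected effect concrete, here is a minimal sketch of how the wall-clock time of the build changes when nested stages move to the root parallel block. It is only a model, not the pipeline itself: the stage names are borrowed from the workspace directories in the CI log, all durations are made-up values in minutes, and it assumes there are always enough free workers.

```python
# Hypothetical durations in minutes; these are NOT measured values from the real pipeline.
GROUPED = {
    "Auditbeat": {
        "Auditbeat-oss-Linux": 10,
        "Auditbeat-crosscompile": 12,
        "Auditbeat-x-pack": 9,
        "Auditbeat-x-pack-Windows": 8,
    },
    "Metricbeat": {
        "Metricbeat-OSS-Unit-tests": 30,
    },
}


def grouped_wall_time(groups):
    """Groups run in parallel, but the stages nested in a group run one after another."""
    return max(sum(stages.values()) for stages in groups.values())


def flat_wall_time(groups):
    """Every nested stage is a direct child of the root parallel block."""
    return max(d for stages in groups.values() for d in stages.values())


print(grouped_wall_time(GROUPED))  # 39: the Auditbeat group (10 + 12 + 9 + 8) is the bottleneck
print(flat_wall_time(GROUPED))     # 30: only the slowest single stage matters
```

In this model the flat layout only helps when some group's total is longer than the slowest single stage, which is the situation this change targets.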

@jsoriano jsoriano self-assigned this Jun 2, 2020
@botelastic botelastic bot added needs_team Team:Automation and removed needs_team labels Jun 2, 2020
@jsoriano jsoriano force-pushed the parallelize-jenkins-stages branch from b904796 to 37ef9a2 Compare June 2, 2020 15:41
@elasticmachine
Collaborator

elasticmachine commented Jun 2, 2020

💔 Build Failed



Build stats

  • Build Cause: [Pull request #18901 updated]

  • Start Time: 2020-06-03T08:35:19.121+0000

  • Duration: 76 min 18 sec

Test stats 🧪

Test Results
Failed 0
Passed 9180
Skipped 1569
Total 10749

Steps errors


  • Name: Make -C libbeat testsuite
    • Description: make -C libbeat testsuite

    • Duration: 31 min 57 sec

    • Start Time: 2020-06-03T08:57:54.956+0000

    • log

Log output


[2020-06-03T09:51:10.174Z] + FILE=winlogbeat/build/coverage/full.cov
[2020-06-03T09:51:10.174Z] + [ -f winlogbeat/build/coverage/full.cov ]
[2020-06-03T09:51:10.174Z] + FILE=journalbeat/build/coverage/full.cov
[2020-06-03T09:51:10.174Z] + [ -f journalbeat/build/coverage/full.cov ]
[2020-06-03T09:51:10.661Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats
[2020-06-03T09:51:10.973Z] + find . -type f -name TEST*.xml -path */build/* -delete
[2020-06-03T09:51:10.985Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Lint
[2020-06-03T09:51:11.064Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Libbeat-stress-tests
[2020-06-03T09:51:11.135Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Elastic-Agent-Mac-OS-X
[2020-06-03T09:51:11.210Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Winlogbeat-oss-crosscompile
[2020-06-03T09:51:11.287Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Elastic-Agent-x-pack
[2020-06-03T09:51:11.357Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Auditbeat-crosscompile
[2020-06-03T09:51:11.432Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Auditbeat-oss-Mac-OS-X
[2020-06-03T09:51:11.505Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Dockerlogbeat
[2020-06-03T09:51:11.580Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Filebeat-x-pack-Mac-OS-X
[2020-06-03T09:51:11.652Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Heartbeat-Mac-OS-X
[2020-06-03T09:51:11.721Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Journalbeat-oss
[2020-06-03T09:51:11.791Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Libbeat-crosscompile
[2020-06-03T09:51:11.869Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Generators-Metricbeat-Linux
[2020-06-03T09:51:11.939Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Generators-Beat-Linux
[2020-06-03T09:51:12.008Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Functionbeat-x-pack
[2020-06-03T09:51:12.076Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Filebeat-Mac-OS-X
[2020-06-03T09:51:12.145Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Elastic-Agent-x-pack-Windows
[2020-06-03T09:51:12.214Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Generators-Metricbeat-Mac-OS-X
[2020-06-03T09:51:12.284Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-Mac-OS-X
[2020-06-03T09:51:12.352Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Auditbeat-x-pack-Mac-OS-X
[2020-06-03T09:51:12.421Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-OSS-Unit-tests
[2020-06-03T09:51:12.490Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Generators-Beat-Mac-OS-X
[2020-06-03T09:51:12.560Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Functionbeat-Mac-OS-X-x-pack
[2020-06-03T09:51:12.628Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-x-pack-Mac-OS-X
[2020-06-03T09:51:12.696Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-crosscompile
[2020-06-03T09:51:12.765Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Heartbeat-oss
[2020-06-03T09:51:12.835Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Auditbeat-x-pack-Windows
[2020-06-03T09:51:12.904Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Auditbeat-x-pack
[2020-06-03T09:51:12.973Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Auditbeat-oss-Windows
[2020-06-03T09:51:13.043Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Heartbeat-Windows
[2020-06-03T09:51:13.112Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Libbeat-x-pack
[2020-06-03T09:51:13.180Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Winlogbeat-Windows-x-pack
[2020-06-03T09:51:13.248Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Filebeat-Windows
[2020-06-03T09:51:13.318Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Filebeat-x-pack-Windows
[2020-06-03T09:51:13.387Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Winlogbeat-Windows
[2020-06-03T09:51:13.457Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Packetbeat-oss
[2020-06-03T09:51:13.527Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Auditbeat-oss-Linux
[2020-06-03T09:51:13.596Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Functionbeat-Windows
[2020-06-03T09:51:13.664Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-Windows
[2020-06-03T09:51:13.732Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-x-pack-Windows
[2020-06-03T09:51:13.800Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Filebeat-x-pack
[2020-06-03T09:51:13.868Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Filebeat-oss
[2020-06-03T09:51:13.938Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Libbeat-oss
[2020-06-03T09:51:14.007Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-OSS-Integration-tests
[2020-06-03T09:51:14.084Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-Python-integration-tests
[2020-06-03T09:51:14.154Z] Running in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-x-pack
[2020-06-03T09:51:14.612Z] + cat
[2020-06-03T09:51:14.612Z] + /usr/local/bin/runbld ./runbld-script
[2020-06-03T09:51:14.612Z] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
[2020-06-03T09:51:21.205Z] runbld>>> runbld started
[2020-06-03T09:51:21.205Z] runbld>>> 1.6.11/a66728ff8f4356963772e6e6d2069392fa06acbe
[2020-06-03T09:51:23.121Z] runbld>>> The following profiles matched the job 'Beats/beats-beats-mbp/PR-18901' in order of occurrence in the config (last value wins).
[2020-06-03T09:51:24.062Z] runbld>>> Debug logging enabled.
[2020-06-03T09:51:24.062Z] runbld>>> Storing result
[2020-06-03T09:51:24.322Z] runbld>>> Store result: created {:total 2, :successful 2, :failed 0} 1
[2020-06-03T09:51:24.322Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200603095123-4D4EE047
[2020-06-03T09:51:24.322Z] runbld>>> Adding system facts.
[2020-06-03T09:51:25.267Z] runbld>>> Adding vcs info for the latest commit:  a687867d2399bdcd29d3d7b72547943b4aa2a6bc
[2020-06-03T09:51:25.267Z] runbld>>> >>>>>>>>>>>> SCRIPT EXECUTION BEGIN >>>>>>>>>>>>
[2020-06-03T09:51:25.267Z] runbld>>> Adding /usr/lib/jvm/java-8-openjdk-amd64/bin to the path.
[2020-06-03T09:51:25.267Z] Processing JUnit reports with runbld...
[2020-06-03T09:51:25.267Z] + echo 'Processing JUnit reports with runbld...'
[2020-06-03T09:51:25.842Z] runbld>>> <<<<<<<<<<<< SCRIPT EXECUTION END <<<<<<<<<<<<
[2020-06-03T09:51:25.842Z] runbld>>> DURATION: 14ms
[2020-06-03T09:51:25.842Z] runbld>>> STDOUT: 40 bytes
[2020-06-03T09:51:25.842Z] runbld>>> STDERR: 49 bytes
[2020-06-03T09:51:25.842Z] runbld>>> WRAPPED PROCESS: SUCCESS (0)
[2020-06-03T09:51:25.842Z] runbld>>> Searching for build metadata in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats
[2020-06-03T09:51:27.236Z] runbld>>> Storing build metadata: 
[2020-06-03T09:51:27.236Z] runbld>>> Adding test report.
[2020-06-03T09:51:27.236Z] runbld>>> Searching for junit test output files with the pattern: TEST-.*\.xml$ in: /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats
[2020-06-03T09:51:28.177Z] runbld>>> Found 114 test output files
[2020-06-03T09:51:28.440Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-OSS-Integration-tests/metricbeat/build/TEST-go-integration-graphite.xml
[2020-06-03T09:51:28.440Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-OSS-Integration-tests/metricbeat/build/TEST-go-integration-windows.xml
[2020-06-03T09:51:28.701Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-openmetrics.xml
[2020-06-03T09:51:28.701Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-cloudfoundry.xml
[2020-06-03T09:51:28.701Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-iis.xml
[2020-06-03T09:51:28.701Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-istio.xml
[2020-06-03T09:51:28.701Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-activemq.xml
[2020-06-03T09:51:28.701Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-tomcat.xml
[2020-06-03T09:51:30.622Z] runbld>>> Test output logs contained: Errors: 0 Failures: 0 Tests: 10599 Skipped: 1330
[2020-06-03T09:51:30.623Z] runbld>>> Storing result
[2020-06-03T09:51:30.623Z] runbld>>> FAILURES: 0
[2020-06-03T09:51:30.623Z] runbld>>> Store result: updated {:total 2, :successful 2, :failed 0} 2
[2020-06-03T09:51:30.623Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200603095123-4D4EE047
[2020-06-03T09:51:30.623Z] runbld>>> Email notification disabled by environment variable.
[2020-06-03T09:51:30.623Z] runbld>>> Slack notification disabled by environment variable.
[2020-06-03T09:51:36.405Z] Running on Jenkins in /var/lib/jenkins/workspace/Beats_beats-beats-mbp_PR-18901
[2020-06-03T09:51:36.626Z] [INFO] getVaultSecret: Getting secrets
[2020-06-03T09:51:36.671Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-06-03T09:51:37.417Z] + chmod 755 generate-build-data.sh
[2020-06-03T09:51:37.417Z] + ./generate-build-data.sh https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-18901/ https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-18901/runs/4 FAILURE 4578007
[2020-06-03T09:51:37.417Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-18901/runs/4/steps/?limit=10000 -o steps-info.json
[2020-06-03T09:51:38.761Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats-beats-mbp/PR-18901/runs/4/tests/?status=FAILED -o tests-errors.json

@jsoriano jsoriano marked this pull request as ready for review June 3, 2020 08:37
@kuisathaverat
Contributor

kuisathaverat commented Jun 3, 2020

Most of the grouped tasks take less than 10 min (most of them less than 5 min), so putting them in parallel does not change the duration of the build, because they will finish before other stages do (e.g. the Metricbeat stages take 30 min each). Putting everything in parallel will also run all stages, whereas a group does not: when one stage in a group fails, the rest of that group does not run. In addition, launching everything in parallel consumes more resources (more workers) and can have the side effect of increasing the duration of the build, because there are no free workers to launch stages on.
However, we can give it a try, but the more I think about it, the more I think we should split the thing up by beat.
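
The two behavioural differences described above (stopping a group at its first failure, and needing more workers at once) can be sketched as follows. This is a simplified model under stated assumptions, not Jenkins code: the stage names are hypothetical, and it assumes one worker per branch that is running at the same time.

```python
def stages_run(group, failing, fail_fast):
    """Return the stages of one group that actually execute, in order.

    failing: set of stage names that fail.
    fail_fast=True models a sequential group that stops at its first failure;
    fail_fast=False models the same stages placed in the root parallel block,
    where a failure in one stage does not prevent the others from running.
    """
    executed = []
    for name in group:
        executed.append(name)
        if fail_fast and name in failing:
            break
    return executed


def peak_workers(groups, flat):
    """Workers needed at the start of the build, assuming one worker per parallel branch."""
    return sum(len(g) for g in groups) if flat else len(groups)


# Hypothetical stage names, for illustration only.
beat = ["build", "unit tests", "integration tests"]
groups = [beat, ["other beat build", "other beat tests"]]

print(stages_run(beat, {"build"}, fail_fast=True))   # ['build']
print(stages_run(beat, {"build"}, fail_fast=False))  # ['build', 'unit tests', 'integration tests']
print(peak_workers(groups, flat=False))              # 2
print(peak_workers(groups, flat=True))               # 5
```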

@jsoriano
Member Author

jsoriano commented Jun 4, 2020

I guess we have to find a balance between the fact that we are going to add more stages per beat and the overhead that more parallelization can have on worker occupancy.

Most of the grouped tasks take less than 10 min (most of them less than 5 min), so putting them in parallel does not change the duration of the build, because they will finish before other stages do (e.g. the Metricbeat stages take 30 min each).

My main motivation for this change is that I have already seen several builds waiting up to 10 or 20 minutes for the last Auditbeat or Heartbeat stages to finish, and we need to add more stages to cover more cases (#17411). So builds for beats that currently only have two or three ~10-minute stages are going to end up with at least a couple more stages. This is why I decided to parallelize Auditbeat while I was adding more stages in #18835, and I would expect something similar for other beats.

Putting everything in parallel will also run all stages, whereas a group does not: when one stage in a group fails, the rest of that group does not run.

I am actually not sure we want this in all cases. It is OK to block the build at a lint stage if the code doesn't even compile, but, for example, I would like to have the feedback from all stages for a beat, to see whether a test fails on all platforms or not.

In addition, launching everything in parallel consumes more resources (more workers) and can have the side effect of increasing the duration of the build, because there are no free workers to launch stages on.

Yep, I thought about this, but we are also making many other efforts to have more selective testing, so maybe one thing compensates for the other. Also, more workers are going to be launched, but they are going to be occupied for less time, which might help to improve the general throughput.

However, we can give it a try, but the more I think about it, the more I think we should split the thing up by beat.

Do you mean to have one pipeline per beat? Or completely separate beats to different repos as apm-server? I think this would be a bigger discussion to have 🙂

Maybe something else we can try in order to reduce the overhead on workers is to serialize per platform instead of per beat, but this would require a lot of manual assignment of stages to workers. I would go with full parallelization for now and see how it works.
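
The trade-off discussed in this thread can be explored with a small scheduling model. This is a rough sketch under stated assumptions, not a description of how Jenkins assigns workers: durations are hypothetical minutes, branches are handed out in list order to whichever worker becomes free first, and worker start-up time is ignored.

```python
import heapq


def makespan(branch_durations, workers):
    """Greedy list scheduling: each branch, in order, goes to the worker that is free first."""
    free_at = [0] * workers  # time at which each worker becomes available
    heapq.heapify(free_at)
    end = 0
    for duration in branch_durations:
        start = heapq.heappop(free_at)
        finish = start + duration
        heapq.heappush(free_at, finish)
        end = max(end, finish)
    return end


# Hypothetical example: one group of four short stages plus one 30-minute stage.
grouped = [10 + 12 + 9 + 8, 30]  # two branches; the group runs its stages in sequence
flat = [10, 12, 9, 8, 30]        # five branches in the root parallel block

print(makespan(grouped, workers=5))  # 39
print(makespan(flat, workers=5))     # 30
print(makespan(grouped, workers=2))  # 39
print(makespan(flat, workers=2))     # 49
print(sum(flat))                     # 69 worker-minutes in both layouts
```

With enough workers the flat layout finishes sooner, and the total worker-minutes are the same in both layouts. With a scarce pool and this particular ordering, the long stage starts late and the flat build takes longer, which matches the concern about free workers. The result depends on the order in which branches are scheduled.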

@kuisathaverat
Contributor

Do you mean to have one pipeline per beat? Or completely separate beats to different repos as apm-server? I think this would be a bigger discussion to have

Different pipelines, one per beat

@jsoriano
Member Author

Closing this for now; there are other ongoing efforts to refactor the pipeline.

@jsoriano jsoriano closed this Jul 29, 2020
Labels
review Team:Automation Label for the Observability productivity team [zube]: In Review