Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ci][1/n] Rebuild Docker images if necessary #11329

Merged
merged 1 commit into from
Jun 9, 2022

Conversation

driazati
Copy link
Member

@driazati driazati commented May 16, 2022

This sets us up to rebuild Docker images and uses them in later stages in the same build. This is not yet enabled (see the commented out code in DockerBuild.groovy.j2. If the build is running on main, then the images are uploaded to Docker Hub automatically once the run is complete. To avoid landing a complicated infra PR all at once this mostly adds the scaffolding and a follow up PR will: 1. upload images to tlcpack instead of tlcpackstaging 2. dynamically determine the docker image to use based on the latest hash 3. always rebuild docker images if an updated one hasn't been uploaded

cc @Mousius @areusch

@driazati driazati force-pushed the rebuild_images2 branch 10 times, most recently from f45948c to 7fbdda7 Compare May 17, 2022 22:40
@driazati driazati changed the title [ci][wip] Rebuild Docker images if necessary [ci] Rebuild Docker images if necessary May 18, 2022
@driazati driazati marked this pull request as ready for review May 18, 2022 18:01
Copy link
Contributor

@areusch areusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @driazati, added some comments

jenkins/Jenkinsfile.j2 Outdated Show resolved Hide resolved
jenkins/Jenkinsfile.j2 Outdated Show resolved Hide resolved
jenkins/Jenkinsfile.j2 Outdated Show resolved Hide resolved
tests/scripts/git_utils.py Outdated Show resolved Hide resolved
tests/scripts/should_rebuild_docker.py Outdated Show resolved Hide resolved
tests/scripts/should_rebuild_docker.py Show resolved Hide resolved
@github-actions github-actions bot requested a review from Mousius May 18, 2022 19:30
@driazati driazati force-pushed the rebuild_images2 branch 3 times, most recently from 677df48 to e21ffa8 Compare May 18, 2022 22:38
@areusch
Copy link
Contributor

areusch commented May 18, 2022

cool! i think this looks good to me, will let @Mousius have a look.

also cc @leandron

@driazati
Copy link
Member Author

just a note for reviewers: the code is 99% done but will take a couple more iterations through CI once it’s verified before it can be merged (ie right now it still has testing code specific to this PR)

Copy link
Member

@Mousius Mousius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Phew, this Jenkinsfile is getting fun 😸 Thanks for working through this @driazati, is it possible to document some of the flow here in terms of using ECR as an intermediary cache before Docker Hub? Not necessarily this PR but it'd be great to have as a reference

tests/scripts/git_utils.py Outdated Show resolved Hide resolved
tests/python/ci/test_ci.py Outdated Show resolved Hide resolved
tests/python/ci/test_ci.py Outdated Show resolved Hide resolved
tests/scripts/cmd_utils.py Outdated Show resolved Hide resolved
tests/python/ci/test_ci.py Outdated Show resolved Hide resolved
jenkins/Jenkinsfile.j2 Outdated Show resolved Hide resolved
jenkins/Jenkinsfile.j2 Outdated Show resolved Hide resolved
jenkins/Jenkinsfile.j2 Outdated Show resolved Hide resolved
).trim()
def tag = "${date_Ymd_HMS}-${upstream_revision.substring(0, 8)}"
{% for image in images %}
update_docker({{ image.name }}, "tlcpackstaging/test_{{ image.name }}:${tag}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is literally for testing, but I'll leave a comment just in case 😸

Suggested change
update_docker({{ image.name }}, "tlcpackstaging/test_{{ image.name }}:${tag}")
update_docker({{ image.name }}, "tlcpackstaging/{{ image.name }}:${tag}")

jenkins/Jenkinsfile.j2 Outdated Show resolved Hide resolved
@driazati driazati force-pushed the rebuild_images2 branch 4 times, most recently from 97688a8 to 5c4613d Compare May 31, 2022 20:03
@driazati driazati changed the title [ci] Rebuild Docker images if necessary [ci] Add code to rebuild Docker images if necessary May 31, 2022
@driazati driazati changed the title [ci] Add code to rebuild Docker images if necessary [ci][1/n] Add code to rebuild Docker images if necessary May 31, 2022
@driazati driazati changed the title [ci][1/n] Add code to rebuild Docker images if necessary [ci][1/n] Rebuild Docker images if necessary May 31, 2022
@driazati
Copy link
Member Author

@Mousius I've re-worked this PR a bit, I'm weary about landing it all at once since it can potentially have a big impact on CI times. In a follow up PR I'll enable the docker image rebuilds based on the Docker Hub hashes behind a flag in Jenkins so we can turn it off if we need to, but this one is ready to go now

@driazati driazati requested a review from Mousius June 1, 2022 14:59
@driazati driazati force-pushed the rebuild_images2 branch 2 times, most recently from 11ff3d4 to a462745 Compare June 2, 2022 22:56
@driazati
Copy link
Member Author

driazati commented Jun 3, 2022

@tvm-bot rerun

Copy link
Member

@Mousius Mousius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@driazati I left a few questions but I'm ok with follow up in the next in the PR series. I also manually triggered CI to re-run as the bot didn't seem to be doing it?

timeout(time: max_time, unit: 'MINUTES') {
try {
cmake_build(docker_type, path, make_flag)
// always run cpp test when build
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment has been sneaking around with this function but actually no longer relates to this function?

).trim()
def tag = "${date_Ymd_HMS}-${upstream_revision.substring(0, 8)}"
{% for image in images %}
update_docker({{ image.name }}, "tlcpackstaging/test_{{ image.name }}:${tag}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume this is just to avoid conflicts with the over night build?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, in the follow up(s) we'll switch it over

@driazati driazati force-pushed the rebuild_images2 branch 2 times, most recently from 9110565 to aa75d80 Compare June 7, 2022 20:33
@driazati
Copy link
Member Author

driazati commented Jun 7, 2022

I also manually triggered CI to re-run as the bot didn't seem to be doing it?

the bot is broken but fixes are in flight, we're looking into better ways to test these kinds of things before deploying (e.g. the merge functionality also isn't working due to some auth errors I haven't been able to sort out yet)

This rebuilds Docker images and uses them in later stages in the same build. If the build is running on `main`, then the images are uploaded to Docker Hub automatically once the run is complete. Images are always rebuilt, but Docker Hub functions as a cache. If there have been no changes to `docker/` since the last available hash on Docker Hub, then the build will just use the images from Hub.
@Mousius Mousius merged commit 6d557ff into apache:main Jun 9, 2022
@Mousius
Copy link
Member

Mousius commented Jun 9, 2022

Nice work @driazati, looking forward to seeing this fully functional 😸

Kathryn-cat pushed a commit to Kathryn-cat/tvm that referenced this pull request Jun 10, 2022
This rebuilds Docker images and uses them in later stages in the same build. If the build is running on `main`, then the images are uploaded to Docker Hub automatically once the run is complete. Images are always rebuilt, but Docker Hub functions as a cache. If there have been no changes to `docker/` since the last available hash on Docker Hub, then the build will just use the images from Hub.
driazati added a commit to driazati/tvm that referenced this pull request Jun 10, 2022
This activates the code from apache#11329 behind a couple Jenkins flags:
* `USE_AUTOUPDATED_DOCKER_IMAGES` - whether or not to use the dynamically determined docker images in the build
* `PUSH_DOCKER_IMAGES` - whether or not to push built images to `tlcpack` on `main`

These are there so we can quickly toggle this behavior off it breaks anything. Once this is on though, any docker changes in a PR should be reflected entirely in that build, so there would be no need for a separate PR to update the Docker images.

We will have to see if this works in practice, as the docker image builds do download quite a bit of data (which can flake) and add some runtime overhead (about 30m), so when an update lands all PRs will end up needing to rebuild until the merged commit finishes.
driazati added a commit to driazati/tvm that referenced this pull request Jun 10, 2022
This activates the code from apache#11329 behind a couple Jenkins flags:
* `USE_AUTOUPDATED_DOCKER_IMAGES` - whether or not to use the dynamically determined docker images in the build
* `PUSH_DOCKER_IMAGES` - whether or not to push built images to `tlcpack` on `main`

These are there so we can quickly toggle this behavior off it breaks anything. Once this is on though, any docker changes in a PR should be reflected entirely in that build, so there would be no need for a separate PR to update the Docker images.

We will have to see if this works in practice, as the docker image builds do download quite a bit of data (which can flake) and add some runtime overhead (about 30m), so when an update lands all PRs will end up needing to rebuild until the merged commit finishes.
driazati added a commit to driazati/tvm that referenced this pull request Jun 10, 2022
This activates the code from apache#11329 behind a couple Jenkins flags:
* `USE_AUTOUPDATED_DOCKER_IMAGES` - whether or not to use the dynamically determined docker images in the build
* `PUSH_DOCKER_IMAGES` - whether or not to push built images to `tlcpack` on `main`

These are there so we can quickly toggle this behavior off it breaks anything. Once this is on though, any docker changes in a PR should be reflected entirely in that build, so there would be no need for a separate PR to update the Docker images.

We will have to see if this works in practice, as the docker image builds do download quite a bit of data (which can flake) and add some runtime overhead (about 30m), so when an update lands all PRs will end up needing to rebuild until the merged commit finishes.
driazati added a commit to driazati/tvm that referenced this pull request Jun 10, 2022
This activates the code from apache#11329 behind a couple Jenkins flags:
* `USE_AUTOUPDATED_DOCKER_IMAGES` - whether or not to use the dynamically determined docker images in the build
* `PUSH_DOCKER_IMAGES` - whether or not to push built images to `tlcpack` on `main`

These are there so we can quickly toggle this behavior off it breaks anything. Once this is on though, any docker changes in a PR should be reflected entirely in that build, so there would be no need for a separate PR to update the Docker images.

We will have to see if this works in practice, as the docker image builds do download quite a bit of data (which can flake) and add some runtime overhead (about 30m), so when an update lands all PRs will end up needing to rebuild until the merged commit finishes.
driazati added a commit to driazati/tvm that referenced this pull request Jun 13, 2022
This activates the code from apache#11329 behind a couple Jenkins flags:
* `USE_AUTOUPDATED_DOCKER_IMAGES` - whether or not to use the dynamically determined docker images in the build
* `PUSH_DOCKER_IMAGES` - whether or not to push built images to `tlcpack` on `main`

These are there so we can quickly toggle this behavior off it breaks anything. Once this is on though, any docker changes in a PR should be reflected entirely in that build, so there would be no need for a separate PR to update the Docker images.

We will have to see if this works in practice, as the docker image builds do download quite a bit of data (which can flake) and add some runtime overhead (about 30m), so when an update lands all PRs will end up needing to rebuild until the merged commit finishes.
driazati added a commit to driazati/tvm that referenced this pull request Jun 14, 2022
This activates the code from apache#11329 behind a couple Jenkins flags:
* `USE_AUTOUPDATED_DOCKER_IMAGES` - whether or not to use the dynamically determined docker images in the build
* `PUSH_DOCKER_IMAGES` - whether or not to push built images to `tlcpack` on `main`

These are there so we can quickly toggle this behavior off it breaks anything. Once this is on though, any docker changes in a PR should be reflected entirely in that build, so there would be no need for a separate PR to update the Docker images.

We will have to see if this works in practice, as the docker image builds do download quite a bit of data (which can flake) and add some runtime overhead (about 30m), so when an update lands all PRs will end up needing to rebuild until the merged commit finishes.
juda pushed a commit to juda/tvm that referenced this pull request Jun 21, 2022
This rebuilds Docker images and uses them in later stages in the same build. If the build is running on `main`, then the images are uploaded to Docker Hub automatically once the run is complete. Images are always rebuilt, but Docker Hub functions as a cache. If there have been no changes to `docker/` since the last available hash on Docker Hub, then the build will just use the images from Hub.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants