Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New docker image based on pure ubuntu and gracefully handle services #6080

Merged
merged 2 commits into from
Sep 8, 2023

Conversation

unkcpz
Copy link
Member

@unkcpz unkcpz commented Jul 10, 2023

Supercede #6072

Details are described in aiidateam/AEP#39
I commend on the PR for some technical details and the link to s6 toolkit.

The PR is from aiidateam directly so the push to ghcr.io can be triggered. I also set the arm64 test only triggered from PR from aiidateam to prevent the cost from malicious PRs from the forked repository.

  • CI tests
  • Clean up the environment variable pass to the s6-overlay startup scripts.
  • iron-up
  • Update the documentation of using docker to run aiida.

@unkcpz unkcpz changed the title Docker stack New docker image based on pure ubuntu and gracefully handle services Jul 10, 2023
@unkcpz unkcpz marked this pull request as draft July 10, 2023 10:10
@codecov
Copy link

codecov bot commented Jul 10, 2023

Codecov Report

Patch coverage has no change and project coverage change: +0.07% 🎉

Comparison is base (4e0e7d8) 79.64% compared to head (a399e21) 79.71%.
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6080      +/-   ##
==========================================
+ Coverage   79.64%   79.71%   +0.07%     
==========================================
  Files         546      546              
  Lines       39811    39987     +176     
==========================================
+ Hits        31704    31870     +166     
- Misses       8107     8117      +10     

see 4 files with indirect coverage changes

☔ View full report in Codecov by Sentry.

📢 Have feedback on the report? Share it here.

@unkcpz unkcpz force-pushed the docker-stack branch 5 times, most recently from d4f6c9d to 564c2c0 Compare July 10, 2023 15:49
@unkcpz
Copy link
Member Author

unkcpz commented Jul 10, 2023

I push the image to ghcr.io, to test the PR image, just run:

docker run -it ghcr.io/aiidateam/aiida-core:pr-6080 bash

@unkcpz unkcpz marked this pull request as ready for review July 10, 2023 16:00
pyproject.toml Outdated Show resolved Hide resolved
.github/workflows/docker.yml Show resolved Hide resolved
All commands `build`, `tests`, and `up` will use the locally detected platform and use a version tag based on the state of the local git repository.
However, you can also specify a custom platform or version with the `--platform` and `--version` parameters, example: `doit build --arch=amd64 --version=my-version`.

You can specify target stacks to build with `--target`, example: `doit build --target base --target base`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doit not only used for local build and test, but it is the command run on the CI action directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To see how it works in CI, you can run this command locally from .docker folder.

.docker/base-with-services/Dockerfile Outdated Show resolved Hide resolved

RUN mamba install --yes \
--channel conda-forge \
postgresql=${PGSQL_VERSION} && \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

postgresql is installed from conda forge to avoid the version may not exist from apt repo.

# Install erlang.
RUN apt-get update --yes && \
apt-get install --yes --no-install-recommends \
erlang \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is tricky to install the correct RMQ version because the erlang version needs to compatible with the final rabbitmq version as shown in https://www.rabbitmq.com/which-erlang.html

.docker/base/Dockerfile Outdated Show resolved Hide resolved
.docker/dodo.py Outdated Show resolved Hide resolved
.docker/dodo.py Outdated Show resolved Hide resolved
@unkcpz
Copy link
Member Author

unkcpz commented Jul 16, 2023

Hi @sphuber, I update the AEP with the details. Let me know if any more information is needed.

@sphuber
Copy link
Contributor

sphuber commented Sep 4, 2023

Is this fully ready for final review @unkcpz . I notice the tests are not yet passing?

@unkcpz unkcpz force-pushed the docker-stack branch 7 times, most recently from 7b2976a to cd02a89 Compare September 4, 2023 11:55
@unkcpz
Copy link
Member Author

unkcpz commented Sep 4, 2023

Is this fully ready for final review @unkcpz . I notice the tests are not yet passing?

Hi @sphuber, yes it is ready to review. I also rebase the commits to every commit are specific to one task.
The failed tests were from rabbitmq test, the changes have nothing to do with it, I rebased so should all be fine now.

Edit: now the test failed for install CI test and seems caused by pymatgen?

@sphuber
Copy link
Contributor

sphuber commented Sep 4, 2023

The test-install tests still fail. Two out of three failures seem to be due to a new pymatgen version released two days ago.

The third failure is not (necessarily) related to that. I have seen it before in my branch where I am introducing the pydantic dependency. Somehow, this affects the test that tries to unload the profile, which now somehow holds on to a reference of the backend. I have no clue what is causing this or how to fix it. I have lost many an hour trying to debug this already...

@sphuber
Copy link
Contributor

sphuber commented Sep 4, 2023

Here is the culprit: materialsproject/pymatgen@118c245
Apparently, properties was deprecated for Specie.

@sphuber
Copy link
Contributor

sphuber commented Sep 4, 2023

@unkcpz I have a tentative fix for the pymatgen problems in #6109. Could you please squash your commits on this branch into one and then rebase on that branch to see if it fixes the tests. In my PR I don't think the relevant workflow is triggered.

@unkcpz unkcpz force-pushed the docker-stack branch 2 times, most recently from edfbdd5 to 00620b9 Compare September 4, 2023 15:23
@unkcpz
Copy link
Member Author

unkcpz commented Sep 4, 2023

Could you please squash your commits on this branch into one and then rebase on that branch to see if it fixes the tests. In my PR I don't think the relevant workflow is triggered.

Yes, done! But seems #6109 still fails with those three tests.

@sphuber
Copy link
Contributor

sphuber commented Sep 5, 2023

Yes, done! But seems #6109 still fails with those three tests.

No, there is just one test remaining. The two related to pymatgen are fixed, but as I already explained above, the third one is not related and I have no clue what causes it.

@unkcpz
Copy link
Member Author

unkcpz commented Sep 5, 2023

I see. But it is strange to me why pre-commit CI in #6109 is not failed but failed in this PR? It also has the old pymatgen version installed (python3.9).

@sphuber
Copy link
Contributor

sphuber commented Sep 5, 2023

But it is strange to me why pre-commit CI in #6109 is not failed but failed in this PR?

It also failed in my PR originally, but I fixed it. The problem is that since we are now using API of two different versions, pylint will complain for the syntax of the version that is not installed. So we need to add disable statements for both versions of the API.

@unkcpz unkcpz force-pushed the docker-stack branch 2 times, most recently from a9191d2 to df6aece Compare September 6, 2023 11:49
Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @unkcpz . I have added some initial comments. I have also commented on the AEP with a question on the image names. Let's discuss that there first, before we start changing it here.

.github/workflows/docker-push.yml Outdated Show resolved Hide resolved
.github/workflows/docker-merge-tags.yml Outdated Show resolved Hide resolved
.github/actions/load-image/action.yml Outdated Show resolved Hide resolved
.github/actions/create-dev-env/action.yml Outdated Show resolved Hide resolved
.github/workflows/docker.yml Show resolved Hide resolved
docs/source/intro/run_docker.rst Show resolved Hide resolved
docs/source/intro/run_docker.rst Show resolved Hide resolved
docs/source/intro/run_docker.rst Show resolved Hide resolved
Minimal working version of the basic docker image build locally

base image can run

First working with-service stack

adapt dodo script to build for correct arch

Use mamba image back again

workable s6-overlay solution

Properly tear down the aiida daemon

Fixes CI

test pass locally

fix env variable in s6 scripts

suppress dev version warning

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

push ghcr only when it is pull request form org

upload artifact which is time consuming only for PR from org

Add a CI action only for container build

not run some container action from fork

Add doc for the docker build and how to use the new docker stack

Fixes the package permission

pre-commit fix

- pg log to home folder

Fixes CI errors

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Update pyproject.toml

Update .github/workflows/docker.yml

exclude .docker files from yapf and pylint

Update .docker/dodo.py

With git, vim, ssh client pre-installed

more doc on how to start the exist container

Add .local/bin to PATH for packages install by user

Pin PyYAML<5.3

Looks like the particular issue is caused by a new release of Cython which breaks PyYAML builds.
yaml/pyyaml#724 yaml/pyyaml#724
Pinning PyYAML to earlier version seems to do the trick for now...

Remove doit but use docker baked directly

Rename image to aiida-core-base and aiida-core-with-services
@sphuber sphuber self-requested a review September 8, 2023 09:33
Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff, thanks a lot @unkcpz

@sphuber sphuber merged commit 9e80834 into main Sep 8, 2023
50 checks passed
@sphuber sphuber deleted the docker-stack branch September 8, 2023 09:34
sphuber pushed a commit that referenced this pull request Nov 14, 2023
The current Docker image provided with `aiida-core` depends on
`aiida-prerequisites` as a base image. This image is maintained outside
of the `aiida-core` repo, making it additional maintenance to keep it up
to date when a new `aiida-core` version is released. In addition, the
`aiida-prerequisites` image is no longer maintained because the AiiDAlab
stack now depends on another base image.

Finally, the `aiida-prerequisites` design had shortcomings as to how the
required services, PostgreSQL and RabbitMQ, are handled. They had to be
started manually and were not cleanly stopped on container shutdown.

An AEP was submitted to add two Docker images to `aiida-core` that
simplifies their maintenance and that improve the usability by properly
and automatically handling the services. See the AEP for more details:
https://aep.readthedocs.io/en/latest/009_improved_docker_images/readme.html

Cherry-pick: 9e80834
sphuber pushed a commit that referenced this pull request Nov 15, 2023
The current Docker image provided with `aiida-core` depends on
`aiida-prerequisites` as a base image. This image is maintained outside
of the `aiida-core` repo, making it additional maintenance to keep it up
to date when a new `aiida-core` version is released. In addition, the
`aiida-prerequisites` image is no longer maintained because the AiiDAlab
stack now depends on another base image.

Finally, the `aiida-prerequisites` design had shortcomings as to how the
required services, PostgreSQL and RabbitMQ, are handled. They had to be
started manually and were not cleanly stopped on container shutdown.

An AEP was submitted to add two Docker images to `aiida-core` that
simplifies their maintenance and that improve the usability by properly
and automatically handling the services. See the AEP for more details:
https://aep.readthedocs.io/en/latest/009_improved_docker_images/readme.html

Cherry-pick: 9e80834
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants