Integration testing: Automate running notebooks upon PR to update base image #44

abarciauskas-bgse opened this issue Feb 13, 2024 · 5 comments

Goal:
Whenever an update is proposed for the base image of the VEDA JupyterHub environment, such as #43, a set of automated tests should run against that image, asserting that the core functionality we expect from the VEDA JupyterHub still works.

Proposed steps:

  • identify or adapt an existing set of notebooks from https://github.com/NASA-IMPACT/veda-docs to use for integration testing
  • review the models of other projects, such as pangeo-forge, which runs a suite of simple automated tests maintained in the same repo, and maap-data-system-tests, which is a more complex system built around a test harness
  • propose a solution for VEDA, aiming to reduce duplication with veda-docs where possible - e.g. can we use some of the notebooks in veda-docs as our tests?
  • implement (a sketch of one possible shape for such a test follows this list)
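
A minimal sketch of what one of these automated tests could look like, assuming a hypothetical tests/notebooks/smoke_test.ipynb and the nbformat/nbclient libraries from the Jupyter ecosystem; this is one possible shape, not a settled design:

```python
# Execute a notebook headlessly inside the candidate image and fail
# loudly if any cell errors. The notebook path is hypothetical.
import nbformat
from nbclient import NotebookClient

def run_notebook(path: str, timeout: int = 600) -> None:
    nb = nbformat.read(path, as_version=4)
    # NotebookClient.execute() raises CellExecutionError if any cell
    # fails, which surfaces as a nonzero exit code in CI.
    NotebookClient(nb, timeout=timeout, kernel_name="python3").execute()

if __name__ == "__main__":
    run_notebook("tests/notebooks/smoke_test.ipynb")
```

Run inside the proposed image (e.g. as a container step in the PR workflow), this catches hard failures, though not subtler issues like new warnings.
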
@jsignell

I have been thinking about this ticket and looked at the examples provided. I don't know if this is actually the kind of thing that is well-suited to automation. In the examples the tests are super basic. They are similar to the tests that run on conda-forge, which are basically just "can you import the package?". In practice there are so many little things that can go wrong: small warnings that crop up, workarounds that may no longer be necessary, and so on.
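
For context, a conda-forge-style smoke test is roughly this small (the package list below is illustrative, not VEDA's actual environment spec):

```python
# Hypothetical "can you import it?" check; package names are examples.
import importlib

import pytest

CORE_PACKAGES = ["xarray", "rioxarray", "dask", "pystac_client"]

@pytest.mark.parametrize("name", CORE_PACKAGES)
def test_import(name):
    importlib.import_module(name)  # fails the test if the import breaks
```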

Here are my reasons why I think this might be a job for a human:

  1. The notebooks require access to data to run.
  2. It is likely that the notebooks will require small tweaks.
  3. We want to minimize sprawl in the veda-docs repo. Whenever possible docs should go in more general places.

I guess it depends on the frequency at which we expect to update the image (I'm assuming a monthly time-scale).

@batpad (Collaborator) commented Mar 29, 2024

The repo2docker GitHub Action does this in, IMHO, a really nice way: jupyterhub/repo2docker-action#83

@ranchodeluxe how do you feel about keeping most of the structure here, but moving to using the repo2docker GitHub Action, which will give us a lot of this stuff for free? If that seems reasonable to you, I can dig in a bit more and make a separate ticket for moving the Action, which will make things like these integration tests significantly easier to do.

cc @yuvipanda

@batpad (Collaborator) commented Mar 29, 2024

Just to comment on approaches between this and #45 -

I do think we likely want to do both. Ideally, the automated tests would use "smaller" notebooks, with fewer external dependencies, less data access, etc., that we'd craft specifically for testing, so we are intentional about what we are testing, rather than reusing the existing notebooks from docs. We should also accept that the automated tests won't catch every conceivable type of issue, but they should give us a good baseline for catching common regressions and bugs.
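
One hedged sketch of how a suite of such purpose-built notebooks could run under pytest; the tests/notebooks/ directory and the timeout are assumptions, not an agreed layout:

```python
# Discover every purpose-built test notebook and execute each one as
# its own pytest case. Paths and timeout are illustrative assumptions.
from pathlib import Path

import nbformat
import pytest
from nbclient import NotebookClient

NOTEBOOKS = sorted(Path("tests/notebooks").glob("*.ipynb"))

@pytest.mark.parametrize("nb_path", NOTEBOOKS, ids=lambda p: p.name)
def test_notebook_executes(nb_path):
    nb = nbformat.read(nb_path, as_version=4)
    # A CellExecutionError from any failing cell fails just this case.
    NotebookClient(nb, timeout=600, kernel_name="python3").execute()
```

Keeping these notebooks tiny and self-contained would sidestep the data-access and tweak-churn concerns raised above.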

I do also think it's probably a good idea to come up with a check-list for manual testing before we release new images onto the hubs. It should clearly state the steps to follow, the notebooks to test, and the expected outputs. Ideally, we should have a document that "anyone" can use to test, without any prior context or knowledge of the notebooks being tested.

For the automated tests, I'd push strongly to re-use the pattern from the repo2docker Action mentioned above - it seems very elegant, and I would rather use upstream-maintained code where possible than invent our own stuff. I'm happy to ticket that out in more detail and figure out moving toward that goal in the coming quarter if there are no objections.

For the manual tests, @abarciauskas-bgse @wildintellect @jsignell, do you have a good sense of how best to coordinate on a check-list with clear instructions for testing the notebooks in docs, which we could link to whenever we open a PR for a new image release, making that manual testing part of our release process?

@ranchodeluxe (Collaborator)

> The repo2docker GitHub Action does this in, IMHO, a really nice way: jupyterhub/repo2docker-action#83
>
> @ranchodeluxe how do you feel about keeping most of the structure here, but moving to using the repo2docker GitHub Action, which will give us a lot of this stuff for free? If that seems reasonable to you, I can dig in a bit more and make a separate ticket for moving the Action, which will make things like these integration tests significantly easier to do.
>
> cc @yuvipanda

I'm down for whatever since they are just images, the white-hot kernel of the internet. As long as there is flexibility and easy-to-navigate troubleshooting, let's do one, both, or All The Things ™️

@wildintellect (Collaborator)

I think we're all in agreement that the core of this ticket is about testing new images, and that the veda-docs notebooks are not the right vehicle for that.
So we should focus on the checklist of things we need to verify; based on that, we can decide what can be automated and what must stay manual (human).
