Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pyedr data packaging and deploy on push #54

Merged
merged 7 commits into from
Nov 7, 2022
Merged

Conversation

IAlibay
Copy link
Member

@IAlibay IAlibay commented Jul 15, 2022

Fixes #52
Towards #58

Changes:

  • Adds pyedr/tests/data/* to the pyedr manifest.
  • Removes deploy on push so we don't spam everyone to approve the workflow every time something gets merged into master.

@codecov
Copy link

codecov bot commented Jul 15, 2022

Codecov Report

Base: 89.70% // Head: 82.97% // Decreases project coverage by -6.73% ⚠️

Coverage data is based on head (526d9ae) compared to base (2c0efef).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #54      +/-   ##
==========================================
- Coverage   89.70%   82.97%   -6.74%     
==========================================
  Files           7        3       -4     
  Lines         505      276     -229     
==========================================
- Hits          453      229     -224     
+ Misses         52       47       -5     
Impacted Files Coverage Δ
pyedr/pyedr/pyedr.py 82.19% <100.00%> (-0.53%) ⬇️
panedr/panedr/tests/test_edr.py
pyedr/pyedr/tests/datafiles.py
pyedr/pyedr/tests/test_edr.py
panedr/panedr/__init__.py

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change date in CHANGELOG.

ChangeLog Outdated Show resolved Hide resolved
@@ -1,3 +1,4 @@
include ../README.rst ../LICENSE.txt
include ../ChangeLog ../AUTHORS
include setup.py setup.cfg
include pyedr/tests/data/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be more specific so that backup files are not included??? Not sure if it’s worth the effort but wanted to mention it.

Copy link
Member Author

@IAlibay IAlibay Jul 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand what you're trying convey here - essentially the issue we have is that the datafiles needed to run the tests, using say pytest --pyargs pyedr.tests aren't being included in the package. This means that the tests won't run.

That being said, if we don't want to include the tests files here that's fine too, we just won't tell folks that they can test locally outside of a source installation. In some ways I would much prefer that because the tarball for pyedr with the test files is ~ 15 MB, compared to 11.8 kB without.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just meant that instead of * use *.xvg or whatever more specific pattern includes the files.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the point of not including test files with the package — seems overkill to make it 15 MB.

If we exclude the files then the tests should produce a message that state how one can get them.

I also don’t know how down-stream packagers like it if the tests cannot be run easily.

Is there a way to include the files in the tar file but exclude from wheel and conda???

Copy link
Member Author

@IAlibay IAlibay Jul 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, in that case my vote would be towards it's not worth it. We shouldn't have backup files in there to begin with, and specific including file types means remembering to update the manifest every time we add a new file type (which can easily be forgotten about).

If we want to have the test files here, we should just bundle them all - that being said, I'm increasingly convinced that we shouldn't bundle them here, 14MB as a tarball means ~ 50 MB unpacked, that seems like too much of a burden, especially for what is likely to be an MDA core dependency.
edit: sorry didn't see your second message

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops hadn't seen your second message come in @orbeckst - :/ it might be possible to make the tarball & wheels different, but then it's hard to make conda pick up something else that isn't the release or the pypi tarball (if that makes sense).

Essentially it gets rather messy.

One option here is to make cat.edr and cat.xvg much smaller given they are ~ 14M each (compared to ~ 100kB for the other files), @jbarnoud would need to weigh in on this here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I recall correctly, I left the whole tests out initially to have the less cumbersome experience for the users. If pyedr is meant to be even leaner, it makes sense to have the same approach. I also like packages being light, I am currently rewriting some of my code in rust in part because shipping gigabytes of dependences was becoming impractical.

I am not a huge fan of shipping the tests without the data though, as the data is an integral part of the tests. I should still be able to live with it.

Copy link
Member

@orbeckst orbeckst Jul 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pretty much agree with everything (edit:) @IAlibay said. (I hadn’t seen @jbarnoud ‘s comment while writing)

I’d be pragmatic for right now, though, and err on the side of “correctness” and completeness and merge the pr as is to create a package that includes everything you need to check that it’s working as advertised.

Then we have a basis to work from and we can decide how best to shrink the package footprint: if we can make the files smaller and keep the package under, say, 1 MB, then that would be a good way. Otherwise , removing the test files is probably the way to go.

I’ll leave it to you and @jbarnoud if you want to merge the pr or continue discussion.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested using much smaller files in #55 and coverage seems unaffected, so if we want to reduce the test size I can get that PR ready for merging

@IAlibay IAlibay changed the title [0.6.1] Fix pyedr data packaging and deploy on push [HOLD DECISION - 0.6.1] Fix pyedr data packaging and deploy on push Jul 18, 2022
@orbeckst
Copy link
Member

I’m ok with including the test files for 0.6.1 to get it done, even though it makes for a fat distribution. For another release we can decide if we want to remove the test files again to make it a slim package.

@IAlibay
Copy link
Member Author

IAlibay commented Aug 2, 2022

From discussions I think we've agreed that including the data files isn't necessary. Not 100% sure on steps forward, but for now the 0.6.0 release is fine to move on with for the edr reader in MDA.

@orbeckst
Copy link
Member

orbeckst commented Sep 1, 2022

Should we merge this PR to include the data, then look at PR #55 and check that we can reduce the test data size?

@IAlibay IAlibay mentioned this pull request Sep 2, 2022
@IAlibay IAlibay changed the title [HOLD DECISION - 0.6.1] Fix pyedr data packaging and deploy on push Fix pyedr data packaging and deploy on push Nov 6, 2022
@IAlibay
Copy link
Member Author

IAlibay commented Nov 6, 2022

Latest commit should have addressed things, repo is now 600 KB unpacked which seems reasonable? (you could fit two copies of the repo on a floppy disk!)

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy.

@@ -1,3 +1,6 @@
include ../README.rst ../LICENSE.txt
include ../ChangeLog ../AUTHORS
include setup.py setup.cfg
include pyedr/tests/data/*.xvg
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how relevant this is here, but finding the error with the user guide was a bit of a pain (MDAnalysis/mdanalysis#3885). So maybe it would be kinder to our future selves to include pyedr/tests/data/* instead of specifying file types explicitly? Might be that I'm missing something about the purpose of a MANIFEST.in as compared to what we fixed in the PR linked above though, in which case don't mind me.

Otherwise, LGTM

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a similar thing but @orbeckst is probably more right imho, it's more dangerous to ship unknown files to users than to ship missing ones. Ideally any files in data should have tests so problems should pop out of testpypi CI (which we don't have yet but will do soon due to this separate action I'm building).

Note: The main difference from the MDA issue is that if stuff isn't in the manifest then it won't show up at all in the tarball here.

@IAlibay IAlibay merged commit 1094467 into master Nov 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove deploy on merge
4 participants