Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimized and refactored common compound code. #3005

Merged
merged 2 commits into from
Feb 13, 2021

Conversation

mnmelo
Copy link
Member

@mnmelo mnmelo commented Oct 23, 2020

Fixes #3000

Changes made in this Pull Request:

Performance benchmarks, in the discussion of #3000, are quite promising, especially with caching.

Problem

Cached compound masks work great and can give large speedups, but invalidating them is tough: if I have an AtomGroup with cached masks for segments and the user moves a residue between segments those cached masks should no longer be valid, but how to notify an AtomGroup of something occurring at the ResidueGroup/SegmentGroup level?

Pointers appreciated.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@pep8speaks
Copy link

pep8speaks commented Oct 23, 2020

Hello @mnmelo! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 329:1: W293 blank line contains whitespace
Line 333:1: W293 blank line contains whitespace
Line 348:1: W293 blank line contains whitespace
Line 362:1: W293 blank line contains whitespace
Line 366:1: W293 blank line contains whitespace
Line 369:1: W293 blank line contains whitespace

Line 816:36: E127 continuation line over-indented for visual indent
Line 819:36: E127 continuation line over-indented for visual indent
Line 1711:47: E128 continuation line under-indented for visual indent

Comment last updated at 2021-02-08 05:22:51 UTC

@codecov
Copy link

codecov bot commented Oct 23, 2020

Codecov Report

Merging #3005 (a639166) into develop (258a73e) will decrease coverage by 0.06%.
The diff coverage is 98.79%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3005      +/-   ##
===========================================
- Coverage    93.14%   93.08%   -0.07%     
===========================================
  Files          171      171              
  Lines        22750    22706      -44     
  Branches      3218     3209       -9     
===========================================
- Hits         21191    21135      -56     
- Misses        1502     1514      +12     
  Partials        57       57              
Impacted Files Coverage Δ
package/MDAnalysis/lib/_cutil.pyx 100.00% <ø> (ø)
package/MDAnalysis/core/groups.py 98.43% <98.79%> (-0.16%) ⬇️
package/MDAnalysis/coordinates/H5MD.py 91.08% <0.00%> (-3.47%) ⬇️
package/MDAnalysis/core/universe.py 96.83% <0.00%> (-0.98%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 258a73e...a639166. Read the comment docs.

@mnmelo
Copy link
Member Author

mnmelo commented Oct 26, 2020

In #2376 I propose a solution to the cache invalidation problem, which is to cache at the Universe level.

Copy link
Member

@richardjgowers richardjgowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the DRY changes in this PR a lot. I can't see where the caching is, has this been removed for now? If so I'm happy to merge these changes.

@mnmelo
Copy link
Member Author

mnmelo commented Feb 7, 2021

I have dropped the caching for this PR, so it's more contained and easier to review. (Also, the caching invalidation mechanism is what is still up for discussion).

Gonna clean up these lingering bugs and update soon (should've tested locally 😬).

Performance increased and code duplication reduced.
@mnmelo mnmelo force-pushed the feature-common-compound-splitting branch from cd85aea to 4d4752c Compare February 7, 2021 19:56
@mnmelo
Copy link
Member Author

mnmelo commented Feb 7, 2021

Ok, this is ready for review. I guess codecov hit a roadbump or something because only groups.py was touched and diff coverage is almost 100% there.

@mnmelo mnmelo marked this pull request as ready for review February 7, 2021 23:17
@mnmelo
Copy link
Member Author

mnmelo commented Feb 8, 2021

To get my feet wet in asv I also implemented the performance benchmarks I had posted in #3000. I really don't know what the status is on asv usage, so guidance is appreciated.

@mnmelo
Copy link
Member Author

mnmelo commented Feb 11, 2021

Ping @richardjgowers. I think this is ready to merge. There seem to be some inconsistencies in Codecov computation triggering a fail there.
Also, this PR removes a couple hundred lines of repeated code, and the coverage may indeed go down just because of that, even if all remaining lines are as covered as before.

@richardjgowers richardjgowers merged commit 42c5417 into develop Feb 13, 2021
@richardjgowers richardjgowers deleted the feature-common-compound-splitting branch February 13, 2021 14:12
@richardjgowers
Copy link
Member

@mnmelo thanks!

PicoCentauri pushed a commit to PicoCentauri/mdanalysis that referenced this pull request Mar 30, 2021
* Optimized and refactored common compound code.

Performance increased and code duplication reduced.

* Added compound-splitting performance benchmarks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use of compounds involves excessive sorting and fancy-indexing of AtomGroups and compound indices.
4 participants