add cache for slow hdu indexing #278

Merged
merged 4 commits into spacetelescope:main from get_hdu_cache
Mar 19, 2024

Conversation

braingram
Collaborator

@braingram braingram commented Mar 6, 2024

Opening a jwst file with many (~550) extensions (level 2 output for nirspec mos mode) takes ~9 seconds when skip_fits_update=True. Opening the same file with skip_fits_update=False takes longer, ~26 seconds. Surprisingly, a large portion of the added time is spent in calls to get_hdu.

With cProfile and skip_fits_update=False, opening the file takes ~60 seconds (due to profiling overhead). The profile, rendered with snakeviz, is as follows:

Screen Shot 2024-03-06 at 10 17 06 AM
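
For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of collecting a cumulative-time profile with the standard-library cProfile and pstats modules. The `open_file` function is a hypothetical stand-in for the real open call, which is not shown in this PR:

```python
import cProfile
import io
import pstats


def open_file():
    # Hypothetical stand-in for the slow open call being measured.
    return sum(i * i for i in range(10_000))


profiler = cProfile.Profile()
profiler.enable()
result = open_file()
profiler.disable()

# Render the five most expensive entries, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Calling `profiler.dump_stats(path)` instead of printing writes a stats file that snakeviz can render.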

Zooming into _load_from_schema reveals ~40 seconds in get_hdu:
Screen Shot 2024-03-06 at 10 17 57 AM

This PR adds an hdu_cache to skip repeated indexing of the hdulist (where a single lookup can, in some conditions for this file, take 2-3 ms). With this PR, opening the same file with skip_fits_update=False (and no profiling) takes 12 seconds, and with skip_fits_update=True it still takes 9 seconds (most of this is spent in asdf.open, as the tree is quite large). Running cProfile with skip_fits_update=False takes 24 seconds, and zooming in to get_hdu reveals 5 seconds spent there (about 20% of the profiled runtime, down from 66% without this PR):
Screen Shot 2024-03-06 at 10 23 19 AM
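
The caching idea can be sketched roughly as below. This is not the code from this PR: `SlowHDUList` and `HDUCache` are hypothetical names, and the stand-in list only imitates a container whose lookup by (name, version) requires a linear scan. The point is that each distinct key hits the slow path once and later requests come from a dict.

```python
class SlowHDUList:
    """Hypothetical stand-in for an HDU list with slow (name, ver) lookup."""

    def __init__(self, hdus):
        self._hdus = list(hdus)
        self.lookups = 0  # number of slow scans performed

    def __getitem__(self, key):
        self.lookups += 1
        # Linear scan over all extensions, as a model of slow indexing.
        for hdu in self._hdus:
            if (hdu["name"], hdu["ver"]) == key:
                return hdu
        raise KeyError(key)


class HDUCache:
    """Hypothetical wrapper that remembers HDUs already looked up."""

    def __init__(self, hdulist):
        self._hdulist = hdulist
        self._cache = {}

    def get_hdu(self, name, ver=1):
        key = (name, ver)
        try:
            return self._cache[key]
        except KeyError:
            # First request for this key: do the slow lookup once and store it.
            hdu = self._hdulist[key]
            self._cache[key] = hdu
            return hdu


# ~550 extensions, matching the scale described above.
hdulist = SlowHDUList({"name": "SCI", "ver": v} for v in range(1, 551))
cache = HDUCache(hdulist)
for _ in range(3):
    hdu = cache.get_hdu("SCI", 550)
```

A cache like this assumes the underlying list is not modified while the cache is alive; if extensions can be added or removed, the cache would need to be invalidated.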

Checklist

  • added entry in CHANGES.rst (either in Bug Fixes or Changes to API)
  • updated relevant tests
  • updated relevant documentation
  • updated relevant milestone(s)
  • added relevant label(s)


codecov bot commented Mar 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 64.95%. Comparing base (4d7c3a6) to head (15618ee).
Report is 16 commits behind head on main.

❗ Current head 15618ee differs from pull request most recent head ced15e7. Consider uploading reports for the commit ced15e7 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #278      +/-   ##
==========================================
+ Coverage   64.84%   64.95%   +0.11%     
==========================================
  Files         103      104       +1     
  Lines        5694     5718      +24     
==========================================
+ Hits         3692     3714      +22     
- Misses       2002     2004       +2     


@braingram braingram marked this pull request as ready for review March 6, 2024 17:59
@braingram braingram requested a review from a team as a code owner March 6, 2024 17:59

@hbushouse hbushouse left a comment


LGTM

@braingram braingram merged commit d8d47fc into spacetelescope:main Mar 19, 2024
19 checks passed
@braingram braingram deleted the get_hdu_cache branch March 19, 2024 14:57
2 participants