FITS reader #2821

Closed
profjsb opened this issue Sep 17, 2019 · 3 comments

Labels
cuIO cuIO issue

Comments

profjsb commented Sep 17, 2019

This is a request for a GPU FITS reader. Such a reader would be a welcome and critical component as the community starts to move data pipelines from CPU-centric to GPU-centric workflows.

The common image exchange format in astronomy is FITS (Flexible Image Transport System), and there are well-supported CPU-centric packages for reading (and writing) FITS, such as PyFITS (https://pythonhosted.org/pyfits/) and astropy.io (https://docs.astropy.org/en/stable/io/fits/). In many data pipelines, it is common to read FITS files from disk, combine and manipulate the images/spectra (e.g., as operations on NumPy arrays), and then write the results back to disk. The reduction pipeline pypeit (https://github.com/pypeit/PypeIt/tree/master/pypeit) is a good example package to see the end-to-end manipulation of FITS files for science.

With the relatively recent introduction of neural-network-based steps for astronomical image processing (e.g., our package deepCR, https://github.com/profjsb/deepCR, https://arxiv.org/abs/1907.09500), the current best practice for using GPUs is to read FITS data from disk, push the data to a GPU Tensor in PyTorch, apply machine learning models, and then convert the Tensor back to a CPU-based NumPy array. This round trip adds overhead. We'd like to be able to read FITS files directly into a GPU Tensor in PyTorch (and the like). Of course, writing FITS files directly from GPU Tensors would be a next step.
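For readers less familiar with this workflow, the CPU round trip described above looks roughly like the sketch below. It assumes astropy and PyTorch are installed; the function name `process_fits_on_gpu` and the `model` argument are made up for illustration and are not part of deepCR or any other package.

```python
def process_fits_on_gpu(fits_path, model=None):
    """Read a FITS image on the CPU, process it on the GPU, return a NumPy array.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so the sketch can be loaded without astropy/torch present.
    from astropy.io import fits
    import torch

    # 1. CPU read. FITS stores big-endian values, so astype() is used to get a
    #    native-endian float32 copy that torch.from_numpy() will accept.
    image = fits.getdata(fits_path).astype("float32")

    # 2. Host-to-device copy (falls back to the CPU if no GPU is visible).
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    tensor = torch.from_numpy(image).to(device)

    # 3. Apply the model; the identity stands in for a real network here.
    with torch.no_grad():
        result = tensor if model is None else model(tensor)

    # 4. Device-to-host copy back into NumPy, ready to be written with astropy.
    return result.cpu().numpy()
```

Steps 1, 2 and 4 are the overhead in question: the pixels are decoded on the CPU, copied to the device, and copied back. A GPU-side reader would aim to remove steps 1 and 2.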

If a FITS reader is developed that makes it easy to construct a tensor variable on the GPU, it will enable our community to develop entirely GPU-based image processing pipelines. Many of our image manipulations are very amenable to the massive parallelism afforded by GPUs. As someone leading an astronomy-meets-machine-learning group at UC Berkeley, I'm personally excited about this as we start to make use of GPU-based clusters, such as the new "Perlmutter" system at NERSC (https://www.nersc.gov/systems/perlmutter/).

@mrocklin
Collaborator

For context, FITS is a very common format in the astronomy community. It's not my favorite file format, but it does have the advantage of being very simple (if I remember correctly).

cc'ing @mjsamoht because this seems like it may be an easy cuIO task. It seems similar in spirit to #2727, but probably both slightly lower impact and lower effort.
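To make the "very simple" point concrete, here is a minimal standard-library sketch of decoding a primary image HDU. It relies on the basic FITS layout (2880-byte blocks, 80-character header cards, big-endian data) and deliberately ignores extensions, BSCALE/BZERO scaling, and string values containing "/". The function names are hypothetical and this is not how cuIO or astropy implement it.

```python
import math
import struct

BLOCK = 2880  # FITS files are organised in 2880-byte blocks
CARD = 80     # each header record ("card") is 80 ASCII characters

# BITPIX value -> struct format character (FITS data is big-endian)
BITPIX_FMT = {8: "B", 16: "h", 32: "i", 64: "q", -32: "f", -64: "d"}


def parse_primary_header(raw):
    """Return (header dict, byte offset of the data unit) for the primary HDU."""
    header = {}
    for pos in range(0, len(raw), CARD):
        card = raw[pos:pos + CARD].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            # The header is blank-padded to a whole number of blocks,
            # so the data starts at the next block boundary.
            return header, (pos // BLOCK + 1) * BLOCK
        if card[8:10] == "= ":
            # Drop an optional trailing comment. This is enough for the
            # numeric cards used below, not for arbitrary string values.
            header[keyword] = card[10:].split("/")[0].strip()
    raise ValueError("no END card found")


def read_primary_image(raw):
    """Decode the primary image into a flat list of values plus its shape."""
    header, start = parse_primary_header(raw)
    bitpix = int(header["BITPIX"])
    naxis = int(header["NAXIS"])
    # NAXIS1 is the fastest-varying axis, so reverse to get a row-major shape.
    shape = tuple(int(header["NAXIS%d" % i]) for i in range(naxis, 0, -1))
    count = math.prod(shape) if shape else 0
    values = struct.unpack_from(">%d%s" % (count, BITPIX_FMT[bitpix]), raw, start)
    return list(values), shape
```

Usage would be along the lines of `values, shape = read_primary_image(open(path, "rb").read())`, where `path` is a placeholder. Once the small header is parsed, the payload is one contiguous big-endian array, so most of the remaining work is a byte swap and a type conversion.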

@mrocklin added the cuIO label Sep 17, 2019

@datametrician
Contributor

I'm not sure this is the right repo for this request. I believe https://github.com/NVIDIA/DALI would be better since they maintain the kernels for image IO.

@mrocklin
Collaborator

Thanks @datametrician. I'll move this over.
