Update NumPyDataLoader to provide axis option #20

Open
jcfr opened this issue Aug 1, 2023 · 0 comments

jcfr commented Aug 1, 2023

This issue was created to document improvements to NumPyDataLoader contributed through:

and originally discussed in:


Originally posted by @dzenanz in Slicer/Slicer#6733 (comment), Slicer/Slicer#6733 (comment), Slicer/Slicer#6733 (comment) and Slicer/Slicer#6733 (comment)

My principal motivation for this PR is the desire to conveniently visualize dumps of NCHW 4D tensors. Having an option to load them as "channel last/fastest" would be nice, but I really want PyTorch's convention to be the default.

Even when reordering the axes so that the channel is last, with the updated reader I get an all-black image of "4 components". Visualizing something (vector norm, component average, or even just the first component) would be more useful than an all-black image.
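As a rough illustration of the three fallbacks mentioned above, here is a minimal NumPy sketch that collapses the channel axis of a channel-last array into one scalar per voxel. The function name `to_scalar` and its `mode` argument are hypothetical; this is not how the loader currently behaves.

```python
import numpy as np

def to_scalar(volume, mode="norm"):
    """Collapse the last (channel) axis of a channel-last array to one value per voxel.

    Hypothetical helper for illustration only; not part of NumPyDataLoader.
    """
    if mode == "norm":
        # Euclidean length of each voxel's component vector
        return np.linalg.norm(volume, axis=-1)
    if mode == "mean":
        # Average over components
        return volume.mean(axis=-1)
    if mode == "first":
        # Keep only the first component
        return volume[..., 0]
    raise ValueError(f"unknown mode: {mode!r}")

# Small stand-in for a 4-component volume stored channel-last
tensor = np.random.rand(5, 8, 16, 4).astype(np.float32)
scalar = to_scalar(tensor, "norm")
print(scalar.shape)  # (5, 8, 16)
```

Any of the three modes yields an ordinary scalar volume, which can then be displayed like any other grayscale image.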

Also, we could employ a heuristic: the dimensions with the largest sizes could be considered spatial. So (10, 4, 368, 640) would be interpreted as a 4D image with 4 timepoints and a 3D size of 640x368x10 (ijk). Prefer timepoints over components unless the size is 3, which can be easily visualized as RGB.
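One way this heuristic could look in code is sketched below. The function name `guess_axes` and its return format are assumptions made for illustration; ties between equally sized axes are resolved by keeping the earlier axis as spatial, which is an arbitrary choice.

```python
def guess_axes(shape):
    """Guess which axes of a 4D shape are spatial and what the remaining axis means.

    Hypothetical sketch of the proposed heuristic: the three largest axes are
    spatial; the leftover axis is "component" if its size is 3 (RGB-like),
    otherwise "time".
    Returns (spatial_axes, extra_axis, extra_kind).
    """
    if len(shape) != 4:
        raise ValueError("this sketch only handles 4D shapes")
    # Axis indices ordered by size, largest first (stable for equal sizes)
    by_size = sorted(range(4), key=lambda axis: shape[axis], reverse=True)
    spatial_axes = tuple(sorted(by_size[:3]))
    extra_axis = by_size[3]
    extra_kind = "component" if shape[extra_axis] == 3 else "time"
    return spatial_axes, extra_axis, extra_kind

spatial, extra, kind = guess_axes((10, 4, 368, 640))
print(spatial, extra, kind)  # (0, 2, 3) 1 time
```

For the shape from the example, axes 0, 2 and 3 are treated as spatial (kji order, so 640x368x10 in ijk) and axis 1 holds the 4 timepoints.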

Of course, all of these complications stem from .npy not having suitable metadata.

Originally posted by @lassoan in Slicer/Slicer#6733 (comment)

We could have auto, NCDHW, NKJIC axis options. Auto could implement the heuristics you described.
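To make the difference between the two explicit options concrete, here is a small NumPy sketch. It assumes that NKJIC means the same data with the channel axis moved last (D, H, W corresponding to k, j, i); that reading, and the function name, are assumptions rather than something the loader defines.

```python
import numpy as np

def ncdhw_to_channel_last(array):
    """Move axis 1 (channels) to the end: (N, C, D, H, W) -> (N, D, H, W, C).

    Hypothetical helper; assumes "NKJIC" denotes this channel-last layout.
    """
    if array.ndim != 5:
        raise ValueError("expected a 5D array in NCDHW order")
    return np.moveaxis(array, 1, -1)

batch = np.zeros((2, 4, 10, 36, 64))
print(ncdhw_to_channel_last(batch).shape)  # (2, 10, 36, 64, 4)
```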

My only concern is that sooner or later users will report that the loader randomly breaks. We experienced this with ITK, as the ITK vector image reader has such a heuristic, too. After being annoyed by seemingly random load failures for a long time, we were shocked to realize that loading fails for all data sets that have 3 or 4 time points, simply because they are interpreted as channel data instead of a time sequence.

Not much better, but maybe a slightly more predictable heuristic would be to decide based on the (composite) file extension. For example, something.torch.npy could make NCDHW order the default.
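A minimal sketch of the extension-based rule, using only the standard library, might look like this. The function name `default_axes_for` and the `"auto"` fallback are hypothetical; only the `.torch.npy` to NCDHW mapping comes from the suggestion above.

```python
from pathlib import Path

def default_axes_for(filename, fallback="auto"):
    """Pick a default axis order from a composite file extension.

    Hypothetical sketch: "*.torch.npy" selects NCDHW, anything else
    falls back to the given default.
    """
    suffixes = [suffix.lower() for suffix in Path(filename).suffixes]
    if suffixes[-2:] == [".torch", ".npy"]:
        return "NCDHW"
    return fallback

print(default_axes_for("something.torch.npy"))  # NCDHW
print(default_axes_for("something.npy"))        # auto
```

Unlike a shape-based guess, the result depends only on the file name, so the same file always loads the same way regardless of how many timepoints or channels it happens to contain.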

Why don't you save the result using pynrrd? You could use nrrd.write and specify the axis kinds like this:

nrrd.write("test.nrrd", a, {'kinds': ['RGB-color', 'domain', 'domain']})

How do you load the image into Slicer? Drag-and-drop is pretty tedious, and it always shows the data as a newly loaded image (there is no way to update an existing image). You could avoid both the tedious drag-and-drop and the manual axis setup by using slicerio:

np.save("path/to/myfile.npy", someTensor)
nodeID = slicerio.server.file_load("path/to/myfile.npy", "NumpyArrayFile", {'axes': 'NCDHW'})

slicerio.server.file_load loads a data set in Slicer (if Slicer is not running then it launches it). If you load the node from a standard file format then you can reload it by simply calling slicerio.server.node_reload(id=nodeID).
