Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(memh5): zarr support #169

Merged
merged 7 commits into from
May 30, 2022
Merged

feat(memh5): zarr support #169

merged 7 commits into from
May 30, 2022

Conversation

nritsche
Copy link
Contributor

@nritsche nritsche commented Apr 1, 2021

TODO:

Also see radiocosmology/draco#130
and chime-experiment/ch_pipeline#82

Closes https://github.com/chime-experiment/Pipeline/issues/33

Edit by @anjakefala
As of 2021-08-23, here is the discussed remaining work for this PR, along with any unchecked boxes above:

  • remove references to chunkify()
  • Ignore compression and chunking options when saving to .h5 (log DEBUG level)
  • add zarr to the base chime environment on Cedar, before we set the pipeline running with this new code
  • make sure the docs can build and upload to readthedocs; one way is to either make bitshuffle an optional dependency for building the docs, or build the docs and upload them through readthedocs through github actions (see ch_util)
    - Update: ch_util actually pushes the docs to github pages, not readthedocs: ci(docs): move to github-pages chime-experiment/ch_util#10
  • clean up commits made by @anjakefala
  • @jrs65 and @anjakefala review the full PR
  • get a successful run with test_daily and then get a successful run with the full pipeline.

@nritsche nritsche force-pushed the rn/zarr branch 2 times, most recently from a6ab52e to 8b6656c Compare April 13, 2021 23:02
caput/mpiarray.py Outdated Show resolved Hide resolved
@nritsche nritsche force-pushed the rn/zarr branch 2 times, most recently from 1a61bcc to 22a159d Compare April 19, 2021 20:05
@nritsche nritsche force-pushed the rn/zarr branch 4 times, most recently from e6bcdcb to 977f32a Compare April 28, 2021 23:58
@radiocosmology radiocosmology deleted a comment from lgtm-com bot Apr 29, 2021
@nritsche nritsche force-pushed the rn/zarr branch 6 times, most recently from 191d37a to c03d3f8 Compare May 4, 2021 02:31
@lgtm-com
Copy link

lgtm-com bot commented May 7, 2021

This pull request introduces 1 alert when merging 695c556 into 2fc0149 - view on LGTM.com

new alerts:

  • 1 for Variable defined multiple times

@lgtm-com
Copy link

lgtm-com bot commented May 12, 2021

This pull request introduces 1 alert when merging 7499ed7 into 69a7bfa - view on LGTM.com

new alerts:

  • 1 for Variable defined multiple times

@nritsche nritsche force-pushed the rn/zarr branch 3 times, most recently from dcd6fa2 to ceef244 Compare May 13, 2021 17:54
@nritsche nritsche requested a review from jrs65 July 6, 2021 17:58
@nritsche nritsche marked this pull request as ready for review July 9, 2021 20:33
@anjakefala anjakefala force-pushed the rn/zarr branch 5 times, most recently from 8c37c6e to 21f081b Compare August 20, 2021 21:50
setup.py Show resolved Hide resolved
caput/pipeline.py Outdated Show resolved Hide resolved
setup.py Outdated Show resolved Hide resolved
caput/config.py Outdated Show resolved Hide resolved
caput/fileformats.py Outdated Show resolved Hide resolved
@@ -0,0 +1,77 @@
import numpy as np
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tristpinsm have you had a look at these tests? Do they seem reasonable? I think I remember there were a bunch of edge cases you needed to deal with, I wonder if we should add specific tests for them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I guess it could be useful to test the behaviour of some edge cases, if only to be aware of their outcomes. These could include:

  • precision set to 0/inf/NaN
  • weight set to 0/inf/NaN
  • value set to inf/NaN
  • combinations of the above
  • not an edge case exactly but truncating negative numbers

More generally a more robust test that I was using when developing these functions was to generate an ensemble of random numbers, truncate them, and check that the distribution of truncation errors was what you expect. That might be cumbersome and difficult to implement in this context though.

Another thing to note is there is a comment that claims one of the tests fails...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a few special cases in for this.

caput/memh5.py Outdated Show resolved Hide resolved
deep_group_copy(
self,
f,
convert_attribute_strings=convert_attribute_strings,
convert_dataset_strings=convert_dataset_strings,
file_format=file_format,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why these need to get passed into deep_group_copy???

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be very confusing. We should try to simplify.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem is with file_format, compression opts have been removed.

caput/memh5.py Outdated Show resolved Hide resolved
caput/memh5.py Outdated Show resolved Hide resolved
@@ -712,12 +770,11 @@ def to_hdf5(

Parameters
----------
filename : str, h5py.File or h5py.Group
f : str, h5py.File or h5py.Group
File to write dataset into.
dataset : string
Name of dataset to write into. Should not exist.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs documentation for all parameters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow, they are not documented anywhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need doc entry for chunks

caput/pipeline.py Show resolved Hide resolved
Comment on lines 19 to 59
@pytest.fixture
def h5_file_select_parallel(datasets, h5_file):
"""Prepare an HDF5 file for the select_parallel tests."""
if comm.rank == 0:
m1 = mpiarray.MPIArray.wrap(datasets[0], axis=0, comm=MPI.COMM_SELF)
m2 = mpiarray.MPIArray.wrap(datasets[1], axis=0, comm=MPI.COMM_SELF)
container = MemGroup(distributed=True, comm=MPI.COMM_SELF)
container.create_dataset("dset1", data=m1, distributed=True)
container.create_dataset("dset2", data=m2, distributed=True)
container.to_hdf5(h5_file)

comm.Barrier()

yield h5_file, datasets

fname = "tmp_test_memh5_select_parallel.h5"
comm.Barrier()

if comm.rank == 0:
rm_all_files(h5_file)


@pytest.fixture
def zarr_file_select_parallel(datasets, zarr_file):
"""Prepare a Zarr file for the select_parallel tests."""
if comm.rank == 0:
m1 = mpiarray.MPIArray.wrap(datasets[0], axis=0, comm=MPI.COMM_SELF)
m2 = mpiarray.MPIArray.wrap(datasets[1], axis=0, comm=MPI.COMM_SELF)
container = MemGroup(distributed=True, comm=MPI.COMM_SELF)
container.create_dataset("dset1", data=m1, distributed=True)
container.create_dataset("dset2", data=m2, distributed=True)
container.to_hdf5(fname)
container.to_file(zarr_file, file_format=fileformats.Zarr)

comm.Barrier()

yield fname, datasets
yield zarr_file, datasets

comm.Barrier()

# tear down

if comm.rank == 0:
file_names = glob.glob(fname + "*")
for fname in file_names:
os.remove(fname)
rm_all_files(zarr_file)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does seem like this could factor out the memh5 container creation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nicety, but not very important.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do this when I fix the tests.

@jrs65
Copy link
Contributor

jrs65 commented Feb 24, 2022

I added a bunch of comments, there not all that actionable, some are just notes to myself wondering how something is meant to work.

setup.py Outdated Show resolved Hide resolved
@@ -0,0 +1,77 @@
import numpy as np
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a few special cases in for this.

@tristpinsm
Copy link
Contributor

The last two commits I added (allowing non-contiguous arrays in truncate and an additional check on attributes) should probably be discussed

caput/config.py Outdated
@@ -510,7 +510,7 @@ class Project:

mode = file_format(default='zarr')
"""
options = ("hdf5", "zarr", None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without a None default option, the caput.fileformats.check_file_format guessing functionality is broken.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tristpinsm is looking into this.

nritsche and others added 7 commits May 30, 2022 11:29
Add support for serialising data into Zarr format files. This enables
compression for distributed containers.

Co-authored-by: Anja Kefala <anja.kefala@gmail.com>
Co-authored-by: Tristan Pinsonneault-Marotte <tristpinsm@gmail.com>
Co-authored-by: Richard Shaw <richard@phas.ubc.ca>
…ites

This significantly cleans up the handling of distributed file writes and
condenses the logic into a single flow for all file types.
Co-authored-by: Tristan Pinsonneault-Marotte <tristpinsm@gmail.com>
@jrs65 jrs65 merged commit 75987f1 into master May 30, 2022
@jrs65 jrs65 deleted the rn/zarr branch May 30, 2022 18:34
@anjakefala
Copy link
Contributor

👏 👏 👏 👏

@jrs65
Copy link
Contributor

jrs65 commented May 30, 2022

👏 👏 👏 👏

Slowly getting there. Now I'm fighting to merge mpiarray in!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants