feat(transform): task to apply reduction ops to a container #241
Conversation
Force-pushed from 0e1796a to 3fd9659
I don't see any obvious flaws in this implementation (modulo a few points where I got confused and left comments), but the logic is complicated enough that it would be good to have someone else take a look.
Force-pushed from 9fd3e58 to a4ba5db
I'm going to echo Simon's comments here. This is nice, and I think it could be broadly useful, but I fear it might be too smart for its own good. The actual task we want to do (variance over freq, variance over elevation) could probably be implemented in under 20 lines. This generic version is ~300 lines, and might not be flexible enough for what we want.
One of my fears is that it's pretty hard to specify a sensible generic reduction of the data when you want to support different weightings (unweighted, masked, and weighted) and have everything behave consistently. Some of the common reductions have fairly natural weighted definitions (e.g. mean, variance), but others don't: how do you do a weighted prod, or a weighted min?
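As a concrete illustration of the ambiguity: one candidate for a weighted prod is the weighted geometric mean, but even that doesn't reduce to the ordinary product when every weight is one, so there's no obviously "right" choice.

    import numpy as np

    x = np.array([2.0, 3.0, 4.0])
    w = np.ones_like(x)

    # Weighted geometric mean: one plausible definition of a "weighted prod"
    wprod = np.exp(np.sum(w * np.log(x)) / np.sum(w))

    print(wprod)       # ~2.884, the geometric mean
    print(np.prod(x))  # 24.0, what the unweighted prod actually returns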
I'm not sure how exactly to go about simplifying this. One thing I might suggest is defining a common API, like:
    import numpy as np
    from typing import Literal


    def reduction(
        data: np.ndarray,
        weight: np.ndarray,
        axis: int,
        weighting: Literal["none", "masked", "weighted"],
    ) -> tuple[np.ndarray, np.ndarray]:
        """Perform a reduction of data and weight, returning a reduced data and weight."""
        ...
and then having specific implementations for each of a more limited set of operations that you actually want to support and that are well defined.
Something like:
    def reduction_var(
        data: np.ndarray,
        weight: np.ndarray,
        axis: int,
        weighting: Literal["none", "masked", "weighted"],
    ) -> tuple[np.ndarray, np.ndarray]:
        if weighting == "none":
            v = data.var(axis=axis)
        else:
            if weighting == "masked":
                # Only the mask matters: flatten the weights to 0/1
                w = (weight > 0).astype(weight.dtype)
            else:
                w = weight.copy()
            # Normalise the weights to average one along the reduction axis,
            # so that constant weights reproduce the unweighted variance
            w *= w.shape[axis] / w.sum(axis=axis, keepdims=True)
            v = np.var(w**0.5 * data, axis=axis)
        return v, np.ones_like(v)
The above is a little ugly, so maybe there's a better approach, but I imagine you get the general idea.
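For example, a hypothetical call to the sketch above (shapes made up) would look like:

    rng = np.random.default_rng(0)
    data = rng.standard_normal((4, 8))
    weight = rng.uniform(0.0, 2.0, size=(4, 8))

    # Reduce over axis 1, leaving one variance per row
    v, vw = reduction_var(data, weight, axis=1, weighting="weighted")
    print(v.shape)  # (4,)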
Force-pushed from df01fe1 to d9eaf0f
Force-pushed from 2e64493 to 7da4aed
Force-pushed from ed63c39 to 86dc848
Force-pushed from 2298e91 to 5b9e99f
Force-pushed from 9d31e3d to 5eabac1
Force-pushed from 7076057 to b145a27
Force-pushed from b145a27 to 0e7da70
Force-pushed from 0e7da70 to 715bba4
I think this looks okay. It might be worth adding a comment to _make_output_container pointing out the choice of index_map[0] for the value of the rolled-up axis.
Force-pushed from 715bba4 to 82fdc54
Apply a reduction op across up to n-1 axes of a rank-n dataset. The reduction can be applied to more than one dataset in the container, and axis downselection can be applied to n-1 axes as well. A weight mask can optionally be generated, which masks values where the weights are zero and is broadcast to the dataset shape if possible. This mask can also be applied before doing the reduction, using a masked array.
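As a rough sketch of the mask-then-reduce idea (a minimal illustration with made-up names, not the task's actual code):

    import numpy as np

    data = np.arange(12, dtype=np.float64).reshape(3, 4)
    weight = np.ones_like(data)
    weight[:, 0] = 0.0  # pretend the first channel is flagged

    # Mask values where the weight is zero, then reduce with a masked array
    masked = np.ma.MaskedArray(data, mask=(weight == 0))
    var = masked.var(axis=1).filled(np.nan)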
The task tries to optimize redistribution by first minimizing the number of redistributions that happen, and then choosing the best possible axes to redistribute over if one is needed. Currently the axis choice is just a descending sort by axis length, as sketched below.
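A rough sketch of that axis choice (hypothetical names, not the actual implementation):

    def pick_distributed_axis(shape: tuple[int, ...], reduce_axes: set[int]) -> int:
        # Candidates are the axes we are not reducing over; prefer the
        # longest so the data stays evenly spread across ranks.
        candidates = [ax for ax in range(len(shape)) if ax not in reduce_axes]
        return max(candidates, key=lambda ax: shape[ax])

    pick_distributed_axis((64, 1024, 4096), {0})  # -> 2, the longest remaining axis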
I've tested this a fair bit, but I wouldn't be surprised if there are still some edge cases that could break it. It generally works as expected when used to generate variance-over-freq and variance-over-el datasets in the CHIME daily pipeline.
Requires radiocosmology/caput#242