feat(mpiarray): ufunc, __array_finalize__, and __getitem__ handling for MPIArray #162

Merged: 80 commits from the mpiarray branch into master, May 30, 2022

Conversation

@anjakefala (Contributor) commented on Feb 1, 2021

If used with draco, requires: radiocosmology/draco#125

Subclassing NumPy arrays 101

view(MPIArray) -> __array_finalize__
MPIArray[slice] -> __getitem__ -> __array_finalize__
MPIArray() -> __new__ -> __array_finalize__

ufuncs are NumPy's universal functions, applied element by element to arrays, such as np.add() and np.multiply(). Their .reduce method reduces (e.g. sums) over an axis, their .outer method applies the operation to all pairs of elements from two arrays, and their .at method performs the operation in place at given indices.

ufunc -> __array_ufunc__

More links on the roles these functions play in writing subclasses of NumPy arrays can be found here: https://github.com/chime-experiment/Pipeline/issues/81
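
To make the dispatch chains above concrete, here is a minimal, self-contained sketch of an ndarray subclass showing when each hook fires. It is illustrative only: the TaggedArray class and its tag attribute are made up for this example and are not part of MPIArray.

import numpy as np


class TaggedArray(np.ndarray):
    def __new__(cls, shape, tag=None, **kwargs):
        obj = super().__new__(cls, shape, **kwargs)
        obj.tag = tag
        return obj  # __array_finalize__ has already run by this point

    def __array_finalize__(self, obj):
        # Called for explicit construction (obj is None), view casting,
        # and slicing/templating; copy attributes from the source array.
        if obj is None:
            return
        self.tag = getattr(obj, "tag", None)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Every ufunc call (np.add, np.multiply, .reduce, .outer, .at, ...)
        # is routed through here: strip the subclass, run the plain ufunc,
        # then re-wrap the result.
        stripped = [np.asarray(x) for x in inputs]
        result = getattr(ufunc, method)(*stripped, **kwargs)
        if isinstance(result, np.ndarray):
            result = result.view(TaggedArray)
            result.tag = self.tag
        return result


a = TaggedArray((4,), tag="demo")     # __new__ -> __array_finalize__
a[:] = 1.0
b = np.arange(4.0).view(TaggedArray)  # view casting -> __array_finalize__
c = a[1:3]                            # __getitem__ -> __array_finalize__
d = np.add(a, b)                      # __array_ufunc__, method="__call__"
s = np.add.reduce(b)                  # __array_ufunc__, method="reduce"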

New Exceptions

A new AxisException will be added. It will be raised when an operation would compromise the integrity of the distributed axis of an MPIArray.
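
For reference, a minimal sketch of what such an exception class could look like (the actual definition and docstring in caput.mpiarray may differ):

class AxisException(Exception):
    """Raised when an operation would compromise the integrity of the
    distributed axis of an MPIArray."""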

__getitem__

  • When you use global_slice to index into a distributed axis, it returns an array on the rank that holds the index, and None on every other rank.
  • When you slice a non-distributed axis, __getitem__ returns an MPIArray whose distributed axis index may be lower than in the original array, depending on whether the slice drops axes. The length of the distributed axis is assumed to be unchanged, and an mpiutil.split_local call is made.
  • When you slice directly into a distributed axis, the index is applied to the local array on each rank and a regular numpy array is returned. This behaviour will be deprecated and replaced with raising an AxisException. A sketch of these cases follows this list.
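
The sketch below illustrates the three cases; it is meant to be run under mpirun, the shapes and variable names are made up for the example, and the constructor call follows the one used later in this thread.

from mpi4py import MPI
from caput import mpiarray

comm = MPI.COMM_WORLD
darr = mpiarray.MPIArray((comm.size, 6), axis=0, comm=comm)
darr[:] = comm.rank

# global_slice into the distributed axis: the rank that owns global row 0
# gets the data, every other rank gets None.
row0 = darr.global_slice[0]

# Slicing a non-distributed axis returns an MPIArray, still distributed
# over axis 0.
sub = darr[:, 2:4]

# Directly indexing the distributed axis acts on the *local* array and
# returns a plain numpy array; this is the behaviour slated for
# deprecation in favour of raising AxisException.
local_row = darr[0]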

ufunc

  • When a ufunc is executed, both inputs and outputs are converted to standard numpy arrays, and then the numpy ufunc is called.
  • All input MPIArrays must be distributed along the same axis; an AxisException is raised if they are not.
  • If no outputs are provided, the results are converted back to MPIArrays. The new array is distributed over the same axis, or possibly one axis lower for reduce methods. The keepdims kwarg is handled.
  • If you pass an axis kwarg to the ufunc, it must not be the distributed axis. Operations along the distributed axis are not allowed.
  • For operations that normally return a scalar, the scalar is wrapped into a 1D array distributed across axis 0.
  • To produce the MPIArray output, MPIArray.wrap(..) is called, which involves calls to mpiutil.split_local and mpiutil.allreduce. A sketch of these rules follows this list.
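
A sketch of these rules in use, again under mpirun; this is illustrative only, and the exception behaviour is as described above rather than verified here.

import numpy as np
from mpi4py import MPI
from caput import mpiarray

comm = MPI.COMM_WORLD
a = mpiarray.MPIArray((comm.size, 4), axis=0, comm=comm)
a[:] = comm.rank

# Element-wise ufuncs between MPIArrays distributed over the same axis,
# or between an MPIArray and a scalar, return a new MPIArray.
b = np.add(a, a)
c = a * 2.0

# A reduce along a non-distributed axis is allowed; the result is still
# an MPIArray distributed over axis 0.
row_sums = np.add.reduce(a, axis=1)

# A reduce along the distributed axis is not allowed and should raise
# AxisException.
# np.add.reduce(a, axis=0)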

__array_finalize__

This is called whenever a user calls .view() or __new__, or broadcasts on an MPIArray. It finalizes the creation of the output MPIArray. view()s also occur as part of __getitem__ calls.

If it is a __new__ call, it does nothing.

If we are inside an np.ndarray.view() call, it also does nothing; this should only happen when we are within wrap().

If we are in a view() of an existing MPIArray, it grabs the attributes from the source array.
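
Put together, these three branches amount to the standard __array_finalize__ pattern sketched below. The class and attribute names (_axis, _comm) are placeholders and not necessarily what caput.mpiarray uses.

import numpy as np


class _SketchMPIArray(np.ndarray):
    def __array_finalize__(self, obj):
        if obj is None:
            # Reached via __new__: __new__ sets the attributes itself.
            return
        if type(obj) is np.ndarray:
            # Plain ndarray.view() call, which should only happen inside
            # wrap(); wrap() fills in the distributed-axis metadata afterwards.
            return
        # view()/__getitem__/broadcast on an existing distributed array:
        # carry the attributes over from the source array.
        self._axis = getattr(obj, "_axis", None)
        self._comm = getattr(obj, "_comm", None)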

Misc

If a user wishes to create an MPIArray from an ndarray, they should use MPIArray.wrap(). They should not use ndarray.view(MPIArray).
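
A usage sketch of this recommendation; the exact wrap() signature is assumed here to accept (array, axis, comm) and may differ.

import numpy as np
from mpi4py import MPI
from caput import mpiarray

comm = MPI.COMM_WORLD

# Each rank holds one row of the global array.
local = np.full((1, 4), comm.rank, dtype=np.float64)

# wrap() checks the local shapes across ranks (split_local/allreduce under
# the hood) and builds a consistent MPIArray distributed over the given axis.
dist = mpiarray.MPIArray.wrap(local, axis=0, comm=comm)

# Avoid this: a bare view carries no valid distributed-axis metadata.
# bad = local.view(mpiarray.MPIArray)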


@jrs65 (Contributor) commented on Feb 2, 2021

@anjakefala we touched on this yesterday, but what I think would help a lot would be to collect a list of MPIArray calls that we would like to work and agree on what their behaviour should be (for which we might want to circulate around to chime-analysis for opinions). Then we can turn them all into unit test cases.

Something like:

# Setup array
from mpi4py import MPI
from caput import mpiarray

comm = MPI.COMM_WORLD
dist_array = mpiarray.MPIArray((comm.size, 4), comm=comm, axis=0)
dist_array[:] = comm.rank

# I think this one is uncontroversial
assert dist_array.sum(axis=1) == 4 * comm.rank

# Should this be allowed?
# assert dist_array.sum(axis=0) == ???

# What should this do? It seems the two sensible options are that it sums over all axes, giving the same scalar everywhere...
assert dist_array.sum() == 4 * comm.size * (comm.size - 1) // 2

# ... or that it just ignores the distributed axis, giving a distributed 1-d array
assert dist_array.sum() == 4 * comm.rank

@anjakefala force-pushed the mpiarray branch 3 times, most recently from 951d27a to bcbd75f (February 4, 2021)
@lgtm-com (bot) commented on Feb 4, 2021

This pull request introduces 1 alert when merging d2c1a05 into 4a85a3d - view on LGTM.com

new alerts:

  • 1 for Suspicious unused loop iteration variable

@anjakefala force-pushed the mpiarray branch 12 times, most recently from 3131917 to 4286cee (February 12, 2021)
@anjakefala marked this pull request as ready for review on February 16, 2021
@jrs65 self-requested a review on February 16, 2021
Review thread on caput/tests/test_mpiarray.py, lines 410 to 415 (outdated, resolved):
dist_arr_add = dist_arr + dist_arr

# Check that you can add two numpy arrays,
# if they are distributed along the same axes
# Check that you can multiply a numpy array by a scalar
assert (dist_arr_add == dist_arr_scalar).all()

Also, note that .all() is an MPIArray reduction over all axes, and we hadn't clearly worked out the semantics for that yet.

Seeing how it works with .all() maybe pushes it in the direction that you should get one value per rank.

(Six resolved review threads on caput/mpiarray.py, now outdated.)
@anjakefala (Contributor, Author) commented:

Linter is failing due to pylint updates. This PR is ready for another conversation now.

@anjakefala force-pushed the mpiarray branch 3 times, most recently from 9cfe4e2 to 9190207 (March 1, 2021)
anjakefala and others added 21 commits on May 27, 2022:
"Generally accepted style in Python is to avoid staticmethods unless you have a good reason"
@jrs65 force-pushed the mpiarray branch 2 times, most recently from 27b2380 to b9a80a1 (May 30, 2022)
@jrs65 merged commit 43f44a6 into master on May 30, 2022
@jrs65 deleted the mpiarray branch on May 30, 2022
@anjakefala (Contributor, Author) commented:

😭


Successfully merging this pull request may close these issues.

Allow reduce operations on the distributed axis, locally, for MPIArrays