Reducers should preserve parameters
#2544
In the following, the lists have a parameter:
>>> original = ak.with_parameter([[1], [], [2, 3]], "hello", "there")
>>> original
<Array [[1], [], [2, 3]] type='3 * [var * int64, parameters={"hello": "ther...'>
>>> ak.sum(original, axis=-1, keepdims=True)
<Array [[1], [0], [5]] type='3 * 1 * int64'>
but the output array has lists without parameters. In fact, the input lists were variable-length and the output lists are fixed-size with a length of 1. That's already a different type. When we keep parameters, it is because we want to keep the type unchanged, but that's already not what's happening in the above.
How disruptive of a backward-incompatible change would it be to preserve the
Separately, the type (and therefore the parameters) of the reduced data should be maintained for some reducers. (Above the horizontal line, parameter-preservation is a question that is independent of which reducer we're talking about. In this section, it depends on which reducer we're talking about.)
In the following, the numbers have a parameter:
>>> original = ak.unflatten(ak.with_parameter([1, 2, 3], "hello", "there"), [1, 0, 2])
>>> original
<Array [[1], [], [2, 3]] type='3 * var * int64[parameters={"hello": "there"}]'>
>>> ak.sum(original, axis=-1)
<Array [1, 0, 5] type='3 * int64'>
>>> ak.sum(original, axis=-1, keepdims=True)
<Array [[1], [0], [5]] type='3 * 1 * int64'>
but the output array has numbers without parameters. It seems that
The same thing could be said of non-reducer statistics that reduce dimension:
It quickly gets to be a complicated landscape. We want to come up with a simple rule for this, so that it's easy to predict what's going to happen (and whether some observed behavior is a bug or not).
The simplest rule would be that they should never preserve their parameters, which is the status quo. Maybe this isn't what one wants, but parameters can be reintroduced, and at least it's easy to know what's going to happen.
The next-simplest rule would be that only
The next-to-next-simplest rule would be to also include
The "next-simplest rule" above looks the most reasonable to me, until we start talking about units. If we're propagating units, using Pint on all of the ufuncs and implementing it ourselves for the reducers (because we always have to implement special features on reducers; there's no avoiding it), then we'd have to do special things for sums versus products, and
parameters
I've not been operating under that impression hitherto. I've applied this rule for nominal parameters like
I think we decided on a policy not to do this (see #1943 (comment)).
Yes 😞. Depending upon your view for non-nominal parameters, it could be that these should always be preserved, whilst nominal parameters are governed by the rules you outline above. My rules would be:
Unit handling does get complicated, but we already need to consider units for reducers in #2545. I'm hoping that the unit conversion will happen first, and thereafter units are just another parameter (that we don't want to lose!)
Oh, right. I'm neutral on whether
In the implementation, you might find that you need to make it special. Out of the functions in your list,
>>> array = ak.unflatten(ak.with_parameter([1, 2, 3, 4, 5], "__units__", "cm"), [3, 0, 2])
>>> array
<Array [[1, 2, 3], [], [4, 5]] type='3 * var * int64[parameters={"__units__...'>
>>> ak.prod(array, axis=-1)
<Array [6, 1, 20] type='3 * int64'>
Technically, the units of the first element of the output should be
I don't think any of the others are incompatible with units, and the statistical functions implemented by reducers may be able to inherit the correct unit propagation automatically once the reducers are done.
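To make the difficulty with products concrete, here is a plain-Python sketch that uses neither Awkward Array nor Pint. The helper `prod_with_unit_exponent` is hypothetical and exists only for illustration: it tracks, for each inner list, the power to which the unit would be raised by multiplying that list's elements together.

```python
import math


def prod_with_unit_exponent(lists, unit):
    """For each inner list, return (product, unit, exponent).

    Multiplying n values that each carry `unit` gives a result in
    `unit` raised to the power n, so the exponent is the list length.
    """
    return [(math.prod(values), unit, len(values)) for values in lists]


# Same data as the ak.prod example above, with "cm" on every number.
reduced = prod_with_unit_exponent([[1, 2, 3], [], [4, 5]], "cm")

for product, unit, exponent in reduced:
    print(f"{product} {unit}^{exponent}")
```

Because the inner lists have different lengths, the exponents differ from element to element, so a single `__units__` parameter on the output array could not describe every element of a ragged product.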
Yep, good spot (I was thinking about the group properties of the numerics). I wonder what the mathematical analogue is for the units.
Version of Awkward Array
main
Description and code to reproduce
ak.min(array, axis=-1, keepdims=True)
should preserve the parameters of the original array.