Fix numpy.ma.fix_invalid issue in NumPy 2.1.0 by replacing with numpy.ma.masked_invalid #2042

Merged
kounelisagis merged 1 commit into dev from agis/fix-numpy-2.21.0
Aug 20, 2024

kounelisagis

For an unknown reason, numpy.ma.fix_invalid behaves differently between NumPy 2.1.0 and NumPy 2.0.0. Specifically, when passed a pandas Series containing a numpy.nan value, numpy.ma.fix_invalid now modifies the Series in place, even when the copy argument is left at its default value of True. The issue occurs with a pandas Series but not, for example, with a plain NumPy array.

In any case, we don't actually need numpy.ma.fix_invalid, since we handle NaNs later with numpy.nan_to_num. It is simpler (and likely faster) to use numpy.ma.masked_invalid, which just creates a MaskedArray instance from which we can obtain the mask.
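
As a rough illustration of the approach described above (not the actual code changed in this PR), the mask can be taken from numpy.ma.masked_invalid and the NaNs replaced separately with numpy.nan_to_num. The helper name `invalid_mask` and the fill value of 0.0 are assumptions made for this sketch only.

```python
import numpy as np


def invalid_mask(values):
    """Hypothetical helper: boolean mask of NaN/inf entries in `values`."""
    # masked_invalid only masks the invalid entries; it does not fill them.
    masked = np.ma.masked_invalid(values)
    # getmaskarray always returns a full boolean array, even if nothing is masked.
    return np.ma.getmaskarray(masked)


values = np.array([1.0, 2.0, np.nan, 0.0, 1.0])
mask = invalid_mask(values)

# NaNs are replaced later and separately; 0.0 is an assumed fill value.
cleaned = np.nan_to_num(values, nan=0.0)
```

Here `values` keeps its NaN, `mask` marks only the third entry, and `cleaned` is a new array with the NaN replaced.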

cc: @jdblischak
Closes #2040


>>> import pandas as pd, numpy as np
>>> pd.__version__; np.__version__
'2.2.2'
'2.1.0'
>>> my_series = pd.Series([1.0, 2.0, np.nan, 0.0, 1.0])
>>> my_series
0    1.0
1    2.0
2    NaN
3    0.0
4    1.0
dtype: float64
>>> np.ma.fix_invalid(my_series)
masked_array(data=[1.0, 2.0, --, 0.0, 1.0],
             mask=[False, False,  True, False, False],
       fill_value=1e+20)
>>> my_series
0    1.000000e+00
1    2.000000e+00
2    1.000000e+20
3    0.000000e+00
4    1.000000e+00
dtype: float64

>>> import pandas as pd, numpy as np
>>> pd.__version__; np.__version__
'2.2.2'
'2.0.0'
>>> my_series = pd.Series([1.0, 2.0, np.nan, 0.0, 1.0])
>>> my_series
0    1.0
1    2.0
2    NaN
3    0.0
4    1.0
dtype: float64
>>> np.ma.fix_invalid(my_series)
masked_array(data=[1.0, 2.0, --, 0.0, 1.0],
             mask=[False, False,  True, False, False],
       fill_value=1e+20)
>>> my_series
0    1.0
1    2.0
2    NaN
3    0.0
4    1.0
dtype: float64
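
The difference between the two sessions above is whether the input is modified by the call. A minimal sketch of a check for that kind of regression follows; `leaves_input_unchanged` and `fill_in_place` are hypothetical names used only here, and the demo uses a plain NumPy array so that it does not depend on any particular NumPy or pandas version. The same helper should also accept a pandas Series, since it only relies on numpy.asarray.

```python
import numpy as np


def leaves_input_unchanged(func, data):
    """Hypothetical helper: call func(data), then compare data to a snapshot."""
    snapshot = np.array(data, dtype=float, copy=True)
    func(data)
    # equal_nan=True so that NaN entries compare equal to themselves.
    return np.array_equal(np.asarray(data, dtype=float), snapshot, equal_nan=True)


def fill_in_place(arr):
    """Deliberately mutating function, standing in for the unwanted behavior."""
    arr[np.isnan(arr)] = 1e20
    return arr


sample = np.array([1.0, 2.0, np.nan, 0.0, 1.0])

# masked_invalid copies by default, so the input should be left alone.
unchanged = leaves_input_unchanged(np.ma.masked_invalid, sample)

# The mutating stand-in overwrites the NaN, so the check reports a change.
mutated_ok = leaves_input_unchanged(fill_in_place, sample.copy())
```

A check of this shape could be run against numpy.ma.fix_invalid with a pandas Series to reproduce the report on a given NumPy version.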

@teo-tsirpanis changed the title from "Fix numpy.ma.fix_invalid issue in NumPy 2.21.0" to "Fix numpy.ma.fix_invalid issue in NumPy 2.1.0" on Aug 20, 2024
@teo-tsirpanis

> numpy.ma.fix_invalid behaves differently between NumPy 2.1.0 and NumPy 2.0.0

Is there an issue on NumPy about that? Can you open one if not?

@teo-tsirpanis

The change seems fine. I launched the nightlies from this branch and will approve if they pass.

@kounelisagis

> > numpy.ma.fix_invalid behaves differently between NumPy 2.1.0 and NumPy 2.0.0
>
> Is there an issue on NumPy about that? Can you open one if not?

I couldn't find an existing issue, but I can open a new one.

@teo-tsirpanis left a comment

One previously failing job now succeeds. Thanks!

@kounelisagis changed the title from "Fix numpy.ma.fix_invalid issue in NumPy 2.1.0" to "Fix numpy.ma.fix_invalid issue in NumPy 2.1.0 by replacing with numpy.ma.masked_invalid" on Aug 20, 2024
@kounelisagis merged commit cbdc6ed into dev on Aug 20, 2024
61 checks passed
@kounelisagis deleted the agis/fix-numpy-2.21.0 branch on August 20, 2024 15:35
Successfully merging this pull request may close these issues.

The nightly build with earliest supported numpy job failed on Sunday (2024-08-18)
2 participants