[BUG] nsmallest & nlargest are not raising an error for string dtype #13945

Closed
galipremsagar opened this issue Aug 23, 2023 · 0 comments · Fixed by #13946
Labels: bug (Something isn't working), Python (Affects Python cuDF API.)

Describe the bug
`nsmallest` and `nlargest` should raise an error when called on a string column, as pandas does, but cuDF currently returns a result.

Steps/Code to reproduce bug

In [1]: import cudf

In [2]: s = cudf.Series(['a', 'b', 'c', 'd'])

In [3]: s.nlargest(1)
Out[3]: 
3    d
dtype: object

In [4]: s.to_pandas().nlargest(1)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 s.to_pandas().nlargest(1)

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/core/series.py:4134, in Series.nlargest(self, n, keep)
   4036 def nlargest(
   4037     self, n: int = 5, keep: Literal["first", "last", "all"] = "first"
   4038 ) -> Series:
   4039     """
   4040     Return the largest `n` elements.
   4041 
   (...)
   4132     dtype: int64
   4133     """
-> 4134     return algorithms.SelectNSeries(self, n=n, keep=keep).nlargest()

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/core/algorithms.py:1277, in SelectN.nlargest(self)
   1275 @final
   1276 def nlargest(self):
-> 1277     return self.compute("nlargest")

File /nvme/0/pgali/envs/cudfdev/lib/python3.10/site-packages/pandas/core/algorithms.py:1317, in SelectNSeries.compute(self, method)
   1315 dtype = self.obj.dtype
   1316 if not self.is_valid_dtype_n_method(dtype):
-> 1317     raise TypeError(f"Cannot use method '{method}' with dtype {dtype}")
   1319 if n <= 0:
   1320     return self.obj[[]]

TypeError: Cannot use method 'nlargest' with dtype object

Expected behavior
Raise an error similar to pandas.
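
As a rough illustration of the requested behavior, the sketch below guards a top-n selection with a dtype check and raises the same message pandas produces in the traceback above. It is plain Python and does not use cuDF or pandas; `check_select_n_dtype`, `nlargest`, and the set of rejected dtype names are hypothetical stand-ins, not the actual fix.

```python
import heapq

# Hypothetical set of dtype names to reject; an assumption for illustration only.
REJECTED_DTYPES = {"object", "str", "string"}


def check_select_n_dtype(dtype_name: str, method: str) -> None:
    """Raise TypeError for dtypes that nlargest/nsmallest should not accept."""
    if dtype_name in REJECTED_DTYPES:
        # Message text mirrors the pandas error shown in the traceback.
        raise TypeError(f"Cannot use method '{method}' with dtype {dtype_name}")


def nlargest(values, n, dtype_name):
    """Return the n largest values after validating the (stand-in) dtype name."""
    check_select_n_dtype(dtype_name, "nlargest")
    return heapq.nlargest(n, values)
```

With this guard, `nlargest([3, 1, 2], 1, "int64")` returns the largest value, while passing `"object"` as the dtype name raises `TypeError` before any selection happens.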

Environment overview (please complete the following information)

  • Environment location: [Bare-metal]
  • Method of cuDF install: [from source]
@galipremsagar galipremsagar added bug Something isn't working Python Affects Python cuDF API. labels Aug 23, 2023
@galipremsagar galipremsagar self-assigned this Aug 23, 2023
rapids-bot bot pushed a commit that referenced this issue Aug 24, 2023
closes #13945 

This PR makes `nsmallest` and `nlargest` raise an error whose message exactly matches pandas.

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #13946