Deprecate strings_to_categorical in cudf.read_parquet (#13540)
This PR deprecates `strings_to_categorical` in `cudf.read_parquet` because the name is ambiguous to end users and the parameter no longer serves a real purpose. Neither pandas nor pyarrow has this parameter; it was exclusive to `cudf`.

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Nghia Truong (https://github.com/ttnghia)

URL: #13540
galipremsagar authored Jun 9, 2023
1 parent be501f5 commit c270986
Showing 3 changed files with 18 additions and 3 deletions.
6 changes: 6 additions & 0 deletions python/cudf/cudf/io/parquet.py
@@ -448,6 +448,12 @@ def read_parquet(
 ):
     """{docstring}"""

+    if strings_to_categorical is not False:
+        warnings.warn(
+            "`strings_to_categorical` is deprecated and will be removed in "
+            "a future version of cudf.",
+            FutureWarning,
+        )
     # Do not allow the user to set file-opening options
     # when `use_python_file_object=False` is specified
     if use_python_file_object is False:
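
The guard added to `read_parquet` follows a common deprecation pattern: warn whenever the caller passes anything other than the default value. A minimal, standard-library-only sketch of that pattern is shown here. `read_table` and `count_future_warnings` are hypothetical names used for illustration, not part of cudf, and the function returns a placeholder rather than reading a file. The `stacklevel=2` argument is an addition that does not appear in the commit.

```python
import warnings


def read_table(path, strings_to_categorical=False):
    """Hypothetical stand-in for a reader with a deprecated keyword."""
    # Warn on any non-default value (True, None, ...), using the same
    # `is not False` check as the commit.
    if strings_to_categorical is not False:
        warnings.warn(
            "`strings_to_categorical` is deprecated and will be removed in "
            "a future version.",
            FutureWarning,
            stacklevel=2,  # assumption: attribute the warning to the caller
        )
    return {"path": path}  # placeholder result; no file is read


def count_future_warnings(**kwargs):
    """Call read_table and count the FutureWarnings it emits."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        read_table("data.parquet", **kwargs)
    return sum(issubclass(w.category, FutureWarning) for w in caught)
```

Because the check is `is not False`, passing `True` or `None` both trigger the warning, while omitting the argument or passing `False` explicitly stays silent.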
10 changes: 7 additions & 3 deletions python/cudf/cudf/tests/test_parquet.py
@@ -31,6 +31,7 @@
     TIMEDELTA_TYPES,
     assert_eq,
     assert_exceptions_equal,
+    expect_warning_if,
     set_random_null_mask_inplace,
 )

@@ -310,9 +311,12 @@ def test_parquet_reader_strings(tmpdir, strings_to_categorical, has_null):
     assert os.path.exists(fname)

     if strings_to_categorical is not None:
-        gdf = cudf.read_parquet(
-            fname, engine="cudf", strings_to_categorical=strings_to_categorical
-        )
+        with expect_warning_if(strings_to_categorical is not False):
+            gdf = cudf.read_parquet(
+                fname,
+                engine="cudf",
+                strings_to_categorical=strings_to_categorical,
+            )
     else:
         gdf = cudf.read_parquet(fname, engine="cudf")

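
The updated test wraps the call in `expect_warning_if`, so a warning is expected only when the condition is true. The sketch below is a standard-library approximation of such a helper, written for illustration only. It is not the implementation in cudf's testing utilities. The names `expect_warning_if_sketch` and `deprecated_call` are hypothetical, and turning warnings into errors when the condition is false is an assumption about the desired behavior.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def expect_warning_if_sketch(condition, warning=FutureWarning):
    """Require `warning` inside the block only when `condition` is true."""
    if condition:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            yield
        # Fail if the block did not emit the expected warning category.
        if not any(issubclass(w.category, warning) for w in caught):
            raise AssertionError(f"expected {warning.__name__}")
    else:
        with warnings.catch_warnings():
            # Assumption: an unexpected warning should fail loudly.
            warnings.simplefilter("error")
            yield


def deprecated_call(flag=False):
    """Hypothetical function that warns for a non-default `flag`."""
    if flag is not False:
        warnings.warn("`flag` is deprecated.", FutureWarning)
    return flag
```

Used as `with expect_warning_if_sketch(flag is not False): deprecated_call(flag)`, one parametrized test can cover both the warning and the non-warning paths, which is the role the helper plays in the test change.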
5 changes: 5 additions & 0 deletions python/cudf/cudf/utils/ioutils.py
@@ -169,6 +169,11 @@
 strings_to_categorical : boolean, default False
     If True, return string columns as GDF_CATEGORY dtype; if False, return a
     as GDF_STRING dtype.
+    .. deprecated:: 23.08
+        This parameter is deprecated and will be removed in a future
+        version of cudf.
 categorical_partitions : boolean, default True
     Whether directory-partitioned columns should be interpreted as categorical
     or raw dtypes.
