[FEA] Make replace_re multiline behavior consistent with contains_re #9845

Closed
andygrove opened this issue Dec 6, 2021 · 2 comments · Fixed by #9878
Labels
  - feature request: New feature or request
  - libcudf: Affects libcudf (C++/CUDA) code
  - strings: strings issues (C++ and Python)

Comments

@andygrove
Contributor

andygrove commented Dec 6, 2021

Is your feature request related to a problem? Please describe.

Given the input string "A\nB\nC" and the regexp pattern "^B", contains_re will not find a match but replace_re will.
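
The two behaviors described above can be reproduced outside cudf. The following is a minimal sketch that uses Python's standard `re` module as a stand-in (an assumption made for illustration; it is not the cudf API). The default mode corresponds to what the issue reports for contains_re, and `re.MULTILINE` corresponds to what it reports for replace_re.

```python
import re

text = "A\nB\nC"
pattern = "^B"

# Default mode: ^ anchors only at the start of the whole string,
# so the "B" on the second line is not a match.
default_found = re.search(pattern, text) is not None
default_replaced = re.sub(pattern, "X", text)

# Multiline mode: ^ also anchors immediately after each newline,
# so the "B" on the second line is matched and replaced.
multiline_found = re.search(pattern, text, flags=re.MULTILINE) is not None
multiline_replaced = re.sub(pattern, "X", text, flags=re.MULTILINE)

print(default_found, repr(default_replaced))      # False 'A\nB\nC'
print(multiline_found, repr(multiline_replaced))  # True 'A\nX\nC'
```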

Describe the solution you'd like
I would like replace_re to match the behavior of contains_re, where ^ matches only at the start of the string and not after newlines within the string.

Describe alternatives you've considered
None

Additional context
None

andygrove added the feature request and Needs Triage labels Dec 6, 2021
davidwendt self-assigned this Dec 6, 2021
@davidwendt
Contributor

I can add the regex_flags parameter to replace_re, as in contains_re, so that the ^ and $ matching behavior is the same in both.
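
To illustrate the idea of one flags parameter shared by both operations, here is a rough sketch in Python. The helpers `model_contains_re` and `model_replace_re` are hypothetical names invented for this example, and Python's `re` flags stand in for `regex_flags`; none of this is the actual libcudf signature.

```python
import re
from typing import List


def model_contains_re(strings: List[str], pattern: str, flags: int = 0) -> List[bool]:
    """Hypothetical model: True for each string where the pattern is found."""
    return [re.search(pattern, s, flags) is not None for s in strings]


def model_replace_re(strings: List[str], pattern: str, repl: str, flags: int = 0) -> List[str]:
    """Hypothetical model: replace pattern matches, honoring the same flags."""
    return [re.sub(pattern, repl, s, flags=flags) for s in strings]


strings = ["A\nB\nC", "B"]

# With the same (default) flags, both helpers treat ^ and $ as
# anchoring only at the start and end of the whole string.
found_default = model_contains_re(strings, "^B")
start_default = model_replace_re(strings, "^B", "X")
end_default = model_replace_re(strings, "B$", "X")

# With MULTILINE passed to both, ^ and $ also anchor at line boundaries.
found_multi = model_contains_re(strings, "^B", re.MULTILINE)
start_multi = model_replace_re(strings, "^B", "X", re.MULTILINE)
end_multi = model_replace_re(strings, "B$", "X", re.MULTILINE)

print(found_default, start_default, end_default)
print(found_multi, start_multi, end_multi)
```

Because both helpers take the same flags argument, a caller cannot end up with search and replace disagreeing about where ^ and $ anchor unless different flags are passed on purpose.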

davidwendt added the strings and libcudf labels Dec 7, 2021
@andygrove
Contributor Author

That sounds great. Thanks @davidwendt

rapids-bot bot pushed a commit that referenced this issue Dec 15, 2021
Closes #9845 

Adds a `cudf::strings::regex_flags` parameter to the `cudf::strings::replace_re` functions so the matching logic will be the same as for `cudf::strings::contains_re` which already has this parameter.

This is a breaking change: it adds a new parameter and changes the default behavior of `replace_re` to be consistent with the default behavior of `contains_re`. To get the previous default behavior, specify the `regex_flags::MULTILINE` flag.
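
The effect of the default change on existing callers can be sketched as follows. This is an illustration only: it uses Python's `re` module, and both function names are hypothetical stand-ins for the old and new `replace_re` defaults rather than real cudf calls.

```python
import re


def replace_old_default(s: str, pattern: str, repl: str) -> str:
    """Hypothetical model of the previous default: always multiline."""
    return re.sub(pattern, repl, s, flags=re.MULTILINE)


def replace_new(s: str, pattern: str, repl: str, flags: int = 0) -> str:
    """Hypothetical model of the new API: multiline only when requested."""
    return re.sub(pattern, repl, s, flags=flags)


text = "A\nB\nC"

before = replace_old_default(text, "^B", "X")                   # old behavior
after_default = replace_new(text, "^B", "X")                    # new default
after_explicit = replace_new(text, "^B", "X", re.MULTILINE)     # opt back in

print(repr(before), repr(after_default), repr(after_explicit))
```

Under these assumptions, code that relied on ^ or $ matching at interior line boundaries changes its output silently after the upgrade, so such call sites would need the multiline flag added explicitly.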

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Mike Wilson (https://github.com/hyperbolic2346)

URL: #9878
bdice removed the Needs Triage label Mar 4, 2024