Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/7.0] Fix two auto-atomicity Regex bugs #74834

Merged
merged 3 commits into from
Aug 31, 2022

Conversation

github-actions[bot]
Copy link
Contributor

Backport of #74726 to release/7.0

/cc @stephentoub

Customer Impact

Testing

Risk

IMPORTANT: Is this backport for a servicing release? If so and this change touches code that ships in a NuGet package, please make certain that you have added any necessary package authoring and gotten it explicitly reviewed.

We walk concatenations in order to combine adjacent loops, e.g. `a+a+a+` becomes `a{3,}`.  We also combine loops with individual items that are compatible, e.g. `a+ab` becomes `a{2,}b`. However, we're doing these operations on atomic loops as well, which is sometimes wrong. Since an atomic loop consumes as much as possible and never gives anything back, combining it with a subsequent loop will end up essentially ignoring any minimum specified in the latter loop. We thus can't combine atomic loops if the second loop has a minimum; this includes the case where the second "loop" is just an individual item.
We currently consider \w and \b non-overlapping, which allows a \w loop followed by a \b to be made atomic.  The problem with this is that \b is zero-width, and it could be followed by something that does overlap with the \w. When matching at a location that is a word boundary, it is possible the first loop could give up something that matches the subsequent construct, and thus it can't be made atomic. (We could probably restrict this further to still allow atomicity when the first loop has a non-0 lower bound, but it doesn't appear to be worth the complication.)
@ghost
Copy link

ghost commented Aug 30, 2022

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

Issue Details

Backport of #74726 to release/7.0

/cc @stephentoub

Customer Impact

Testing

Risk

IMPORTANT: Is this backport for a servicing release? If so and this change touches code that ships in a NuGet package, please make certain that you have added any necessary package authoring and gotten it explicitly reviewed.

Author: github-actions[bot]
Assignees: -
Labels:

area-System.Text.RegularExpressions

Milestone: -

@danmoseley
Copy link
Member

Approved. Correctness of core functionality

@carlossanlop
Copy link
Member

The CI failure was a cancellation/timeout.
Approved and signed off.
Ready to merge. :shipit:

@carlossanlop carlossanlop merged commit a0cc4ea into release/7.0 Aug 31, 2022
@carlossanlop carlossanlop deleted the backport/pr-74726-to-release/7.0 branch August 31, 2022 02:06
@ghost ghost locked as resolved and limited conversation to collaborators Sep 30, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants