-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Regexp_replace support regexp [databricks] #4063
Regexp_replace support regexp [databricks] #4063
Conversation
951d0b7
to
98c415a
Compare
build |
3 similar comments
build |
build |
build |
16a3f5c
to
d4bc916
Compare
build |
Signed-off-by: Andy Grove <andygrove@nvidia.com>
d4bc916
to
1f5d80d
Compare
build |
build |
case '^' | '$' if replace => | ||
// this is a bit extreme and it would be good to replace with finer-grained | ||
// rules | ||
throw new RegexUnsupportedException("regexp_replace on GPU does not support ^ or $") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I filed #4091 as a follow-on issue to improve support here
build |
Failed with:
|
build |
build |
build failed against spark 3.3.0
|
rep: Expression): GpuExpression = GpuStringReplace(lhs, regexp, rep) | ||
}), | ||
(a, conf, p, r) => new GpuRegExpReplaceMeta(a, conf, p, r)).disabledByDefault( | ||
"The GPU implementation of regexp_replace is disabled by default. " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Capitalization is incorrect since this is tacked onto a prefix to make the sentence, and also this ends up being very redundant in the end as seen in the generated docs. Probably can just say "the implementation is not 100% compatible. See the compatibility guide for more details.
.
rep: Expression): GpuExpression = GpuStringReplace(lhs, regexp, rep) | ||
}), | ||
(a, conf, p, r) => new GpuRegExpReplaceMeta(a, conf, p, r)).disabledByDefault( | ||
"The GPU implementation of regexp_replace is disabled by default. " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same capitalization and redundancy issues here.
} | ||
}), | ||
(a, conf, p, r) => new GpuRegExpReplaceMeta(a, conf, p, r)).disabledByDefault( | ||
"The GPU implementation of regexp_replace is disabled by default. " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same capitalization and redundancy issues here.
} | ||
}), | ||
(a, conf, p, r) => new GpuRegExpReplaceMeta(a, conf, p, r)).disabledByDefault( | ||
"The GPU implementation of regexp_replace is disabled by default. " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same capitalization and redundancy issues here.
@@ -2950,8 +2950,8 @@ object GpuOverrides extends Logging { | |||
("str", TypeSig.STRING, TypeSig.STRING), | |||
("regexp", TypeSig.lit(TypeEnum.STRING), TypeSig.STRING)), | |||
(a, conf, p, r) => new GpuRLikeMeta(a, conf, p, r)).disabledByDefault( | |||
"The GPU implementation of rlike is not " + | |||
"compatible with Apache Spark. See the compatibility guide for more information."), | |||
"The GPU implementation of rlike is disabled by default. " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same capitalization and redundancy issues here.
override def tagExprForGpu(): Unit = { | ||
expr.regexp match { | ||
case Literal(null, _) => | ||
willNotWorkOnGpu(s"RegExpReplace with null pattern is not supported on GPU") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Catalyst class not being placed on the GPU should already be printed before this, so should be able to simplify to something like:
willNotWorkOnGpu(s"RegExpReplace with null pattern is not supported on GPU") | |
willNotWorkOnGpu(s"null pattern is not supported on GPU") |
applies to all the other similar willNotWorkOnGpu
calls added
|
||
GpuOverrides.extractLit(expr.pos).foreach { lit => | ||
if (lit.value.asInstanceOf[Int] != 1) { | ||
willNotWorkOnGpu("Only a search starting position of 1 is supported") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
willNotWorkOnGpu("Only a search starting position of 1 is supported") | |
willNotWorkOnGpu("only a search starting position of 1 is supported") |
|
||
GpuOverrides.extractLit(expr.pos).foreach { lit => | ||
if (lit.value.asInstanceOf[Int] != 1) { | ||
willNotWorkOnGpu("Only a search starting position of 1 is supported") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
willNotWorkOnGpu("Only a search starting position of 1 is supported") | |
willNotWorkOnGpu("only a search starting position of 1 is supported") |
Co-authored-by: Jason Lowe <jlowe@nvidia.com>
Co-authored-by: Jason Lowe <jlowe@nvidia.com>
build |
buuild |
build |
build |
build |
build |
build |
Closes #4001
Note that this PR disables
regexp_replace
by default where it was enabled before and fell back to CPU if the pattern used any regular expression syntax, and performed a literal string replacement on GPU in other cases.