[FEA] better regexp_extract support #7900

Closed
nvliyuan opened this issue Mar 20, 2023 · 1 comment
Labels: duplicate, feature request

Comments

@nvliyuan
Collaborator

I would like the regexp_extract function to support alternation terms that end with a line anchor, for example:

from pyspark.sql.functions import regexp_extract
from pyspark.sql.types import StructType, StructField, StringType
schema = StructType([
    StructField('id', StringType(), True),
    StructField('text', StringType(), True)
])
data = [('1', 'aaa12'),
        ('2', 'bbb123'),
        ('3', 'ccc')]
df = spark.createDataFrame(data=data, schema=schema)
result = df.withColumn("numbers", regexp_extract("text", "(12$|123$)", 1))
result.show()
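
For reference, here is a minimal sketch in plain Python of the values I would expect in the numbers column. It assumes Python's `re` module treats this particular pattern the same way Spark's Java regex engine does, and it relies on the documented regexp_extract behavior of returning an empty string when nothing matches. The helper name `extract_group` is hypothetical and only there for illustration; it is not part of Spark or the plugin.

```python
import re

# Same pattern as in the regexp_extract call above.
PATTERN = r"(12$|123$)"


def extract_group(text, pattern=PATTERN, idx=1):
    """Hypothetical stand-in for regexp_extract on a single string.

    Returns the requested capture group of the first match, or an
    empty string when the pattern does not match.
    """
    match = re.search(pattern, text)
    if match is None:
        return ""
    return match.group(idx) or ""


rows = [("1", "aaa12"), ("2", "bbb123"), ("3", "ccc")]
expected = [(row_id, text, extract_group(text)) for row_id, text in rows]

for row in expected:
    print(row)
```

Under those assumptions the three rows yield `12`, `123`, and an empty string respectively.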


@nvliyuan nvliyuan added feature request New feature or request ? - Needs Triage Need team to review and classify labels Mar 20, 2023
@NVnavkumar
Collaborator

This is a duplicate of #6882.

@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label Mar 21, 2023
@mattahrens mattahrens closed this as not planned Won't fix, can't repro, duplicate, stale Mar 21, 2023
@mattahrens mattahrens added the duplicate This issue or pull request already exists label Mar 21, 2023