
[FEA] support spark.sql.legacy.timeParserPolicy #50

Closed

revans2 opened this issue May 29, 2020 · 3 comments
Assignees
andygrove

Labels
feature request (New feature or request), P0 (Must have for release), SQL (part of the SQL/Dataframe plugin)

Comments

revans2 (Collaborator) commented May 29, 2020

Is your feature request related to a problem? Please describe.
When parsing dates and times, it would be good if we could also follow the spark.sql.legacy.timeParserPolicy config.
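
For reference, a spark-shell sketch of what the config controls (Spark 3.x; the value "2020-5-29" and the pattern are illustrative only):

```scala
import org.apache.spark.sql.functions.to_date
import spark.implicits._

// "2020-5-29" parses under the legacy SimpleDateFormat parser but not under
// the new DateTimeFormatter-based parser, since "MM" requires two digits.
val df = Seq("2020-5-29").toDF("s")

spark.conf.set("spark.sql.legacy.timeParserPolicy", "EXCEPTION") // the default
// df.select(to_date($"s", "yyyy-MM-dd")).show() // throws SparkUpgradeException

spark.conf.set("spark.sql.legacy.timeParserPolicy", "CORRECTED")
df.select(to_date($"s", "yyyy-MM-dd")).show() // row is null

spark.conf.set("spark.sql.legacy.timeParserPolicy", "LEGACY")
df.select(to_date($"s", "yyyy-MM-dd")).show() // 2020-05-29
```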

revans2 added the feature request, ? - Needs Triage, and SQL labels May 29, 2020
sameerz (Collaborator) commented Sep 22, 2020

Trace through where this config is used in Spark, and if the plugin cannot match the same functionality, fall back to the CPU.
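
For context, the config surfaces in Spark 3.x through SQLConf as a LegacyBehaviorPolicy value; a minimal sketch of reading it (class locations as of Spark 3.0, worth re-verifying against the target Spark version):

```scala
import org.apache.spark.sql.internal.{LegacyBehaviorPolicy, SQLConf}

// LegacyBehaviorPolicy is an Enumeration with EXCEPTION (the default),
// CORRECTED, and LEGACY. SQLConf.get reads the active session's conf.
val policy: LegacyBehaviorPolicy.Value = SQLConf.get.legacyTimeParserPolicy
val needsCpuFallback = policy == LegacyBehaviorPolicy.LEGACY
```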

sameerz removed the ? - Needs Triage label Sep 22, 2020
sameerz added the P0 label Sep 22, 2020
sameerz added this to the Oct 26 - Nov 6 milestone Oct 23, 2020
andygrove self-assigned this Oct 23, 2020
andygrove (Contributor) commented Nov 11, 2020

The default value for spark.sql.legacy.timeParserPolicy is EXCEPTION, in which case Spark throws an exception if any of the following functions is unable to parse data using the specified pattern, and suggests that the conversion may work with LEGACY. If the config is set to CORRECTED, the conversion returns null instead of throwing an exception.

  • unix_timestamp
  • from_unixtime
  • from_utc_timestamp
  • to_unix_timestamp
  • to_utc_timestamp
  • to_date
  • to_timestamp
  • date_format

I propose that we follow the same behavior but fall back to the CPU under LEGACY for these functions, until we have a reason to add support for specific legacy formats that are no longer supported in Spark 3.0 and later. If we do add such support, we can then fall back to the CPU only for the legacy formats that we do not support.
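
A minimal sketch of that proposal, with a hypothetical willNotWorkOnGpu callback standing in for the plugin's real expression-tagging hook:

```scala
import org.apache.spark.sql.internal.{LegacyBehaviorPolicy, SQLConf}

// Hypothetical check run while tagging a date/time expression such as
// to_date or unix_timestamp for GPU execution.
def tagTimeParserPolicy(willNotWorkOnGpu: String => Unit): Unit =
  SQLConf.get.legacyTimeParserPolicy match {
    case LegacyBehaviorPolicy.LEGACY =>
      // Legacy SimpleDateFormat semantics are not implemented on the GPU,
      // so keep the expression on the CPU.
      willNotWorkOnGpu("spark.sql.legacy.timeParserPolicy=LEGACY is not supported")
    case _ =>
      // EXCEPTION and CORRECTED use the new parser, whose behavior the GPU
      // implementation aims to match, so no fallback is needed here.
      ()
  }
```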

andygrove (Contributor) commented

Resolved by #1113 for functions, and I filed a follow-on, #1111, for handling this for CSV reads.
