[BUG] test_to_date_with_window_functions failed in non-UTC nightly CI #10186

Closed
res-life opened this issue Jan 11, 2024 · 2 comments · Fixed by #10189

@res-life
Collaborator

res-life commented Jan 11, 2024

Describe the bug

[2024-01-10T17:39:58.141Z] FAILED ../../src/main/python/window_function_test.py::test_to_date_with_window_functions[DATAGEN_SEED=1704903198, INJECT_OOM, IGNORE_ORDER({'local': True})] - pyspark.sql.utils.IllegalArgumentException: Part of the plan is not columnar class org.apache.spark.sql.execution.ProjectExec

[2024-01-10T17:39:58.141Z] Project [cast(gettimestamp(cast(date_1#301254 as string), yyyy-MM-dd, Some(Canada/Newfoundland), false) as date) AS my_date#301262, id#301253, date_2#301255]

[2024-01-10T17:39:58.141Z] +- Scan ExistingRDD[id#301253,date_1#301254,date_2#301255]

Steps/Code to reproduce bug
Reported by nightly CI: rapids_it-non-utc-dev #33

[2024-01-10T17:39:58.140Z] ../../src/main/python/window_function_test.py:1714: 

[2024-01-10T17:39:58.140Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

[2024-01-10T17:39:58.140Z] ../../src/main/python/asserts.py:637: in assert_gpu_and_cpu_are_equal_sql

[2024-01-10T17:39:58.140Z]     assert_gpu_and_cpu_are_equal_collect(do_it_all, conf, is_cpu_first=is_cpu_first)

[2024-01-10T17:39:58.140Z] ../../src/main/python/asserts.py:595: in assert_gpu_and_cpu_are_equal_collect

[2024-01-10T17:39:58.140Z]     _assert_gpu_and_cpu_are_equal(func, 'COLLECT', conf=conf, is_cpu_first=is_cpu_first, result_canonicalize_func_before_compare=result_canonicalize_func_before_compare)

[2024-01-10T17:39:58.140Z] ../../src/main/python/asserts.py:503: in _assert_gpu_and_cpu_are_equal

[2024-01-10T17:39:58.140Z]     from_gpu = run_on_gpu()

[2024-01-10T17:39:58.140Z] ../../src/main/python/asserts.py:496: in run_on_gpu

[2024-01-10T17:39:58.140Z]     from_gpu = with_gpu_session(bring_back, conf=conf)

[2024-01-10T17:39:58.140Z] ../../src/main/python/spark_session.py:164: in with_gpu_session

[2024-01-10T17:39:58.140Z]     return with_spark_session(func, conf=copy)

[2024-01-10T17:39:58.140Z] /opt/conda/lib/python3.9/contextlib.py:79: in inner

[2024-01-10T17:39:58.140Z]     return func(*args, **kwds)

[2024-01-10T17:39:58.140Z] ../../src/main/python/spark_session.py:131: in with_spark_session

[2024-01-10T17:39:58.140Z]     ret = func(_spark)

[2024-01-10T17:39:58.140Z] ../../src/main/python/asserts.py:205: in <lambda>

[2024-01-10T17:39:58.140Z]     bring_back = lambda spark: limit_func(spark).collect()

[2024-01-10T17:39:58.140Z] ../../../spark-3.1.1-bin-hadoop3.2/python/pyspark/sql/dataframe.py:677: in collect

[2024-01-10T17:39:58.140Z]     sock_info = self._jdf.collectToPython()

[2024-01-10T17:39:58.140Z] /home/jenkins/agent/workspace/jenkins-rapids_it-non-utc-dev-33/jars/spark-3.1.1-bin-hadoop3.2/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py:1304: in __call__

[2024-01-10T17:39:58.140Z]     return_value = get_return_value(

[2024-01-10T17:39:58.140Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

[2024-01-10T17:39:58.140Z] 

[2024-01-10T17:39:58.140Z] a = ('xro1742283', <py4j.java_gateway.GatewayClient object at 0x7feeacd142e0>, 'o1742282', 'collectToPython')

[2024-01-10T17:39:58.140Z] kw = {}

[2024-01-10T17:39:58.140Z] converted = IllegalArgumentException('Part of the plan is not columnar class org.apache.spark.sql.execution.ProjectExec\nProject [...:79)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:238)\n\tat java.lang.Thread.run(Thread.java:750)\n', None)

[2024-01-10T17:39:58.140Z] 

[2024-01-10T17:39:58.140Z]     def deco(*a, **kw):

[2024-01-10T17:39:58.140Z]         try:

[2024-01-10T17:39:58.140Z]             return f(*a, **kw)

[2024-01-10T17:39:58.140Z]         except py4j.protocol.Py4JJavaError as e:

[2024-01-10T17:39:58.140Z]             converted = convert_exception(e.java_exception)

[2024-01-10T17:39:58.140Z]             if not isinstance(converted, UnknownException):

[2024-01-10T17:39:58.141Z]                 # Hide where the exception came from that shows a non-Pythonic

[2024-01-10T17:39:58.141Z]                 # JVM exception message.

[2024-01-10T17:39:58.141Z] >               raise converted from None

[2024-01-10T17:39:58.141Z] E               pyspark.sql.utils.IllegalArgumentException: Part of the plan is not columnar class org.apache.spark.sql.execution.ProjectExec

[2024-01-10T17:39:58.141Z] E               Project [cast(gettimestamp(cast(date_1#301254 as string), yyyy-MM-dd, Some(Canada/Newfoundland), false) as date) AS my_date#301262, id#301253, date_2#301255]

[2024-01-10T17:39:58.141Z] E               +- Scan ExistingRDD[id#301253,date_1#301254,date_2#301255]


@res-life res-life added bug Something isn't working ? - Needs Triage Need team to review and classify labels Jan 11, 2024
@res-life res-life self-assigned this Jan 11, 2024
@res-life
Collaborator Author

Reproduced by

TZ=America/New_York ./integration_tests/run_pyspark_from_build.sh -s -k test_to_date_with_window_functions

When TZ=Asia/Shanghai, it passes. America/New_York is a DST (daylight saving time) time zone, which we currently do not support on the GPU, so the test case should be updated.

../../src/main/python/window_function_test.py::test_to_date_with_window_functions[DATAGEN_SEED=1704956309, INJECT_OOM, IGNORE_ORDER({'local': True})] ### CPU RUN ###
### GPU RUN ###
24/01/11 01:58:42 WARN GpuOverrides: 
        !Exec <ProjectExec> cannot run on GPU because not all expressions can be replaced
          @Expression <Alias> cast(gettimestamp(cast(date_1#16 as string), yyyy-MM-dd, Some(America/New_York), false) as date) AS my_date#24 could run on GPU
            !Expression <Cast> cast(gettimestamp(cast(date_1#16 as string), yyyy-MM-dd, Some(America/New_York), false) as date) cannot run on GPU because Timezone America/New_York is not supported yet. Only Non DST (daylight saving time) timezone is supported.
              !Expression <GetTimestamp> gettimestamp(cast(date_1#16 as string), yyyy-MM-dd, Some(America/New_York), false) cannot run on GPU because Timezone America/New_York is not supported yet. Only Non DST (daylight saving time) timezone is supported.
                @Expression <Cast> cast(date_1#16 as string) could run on GPU
                  @Expression <AttributeReference> date_1#16 could run on GPU
                @Expression <Literal> yyyy-MM-dd could run on GPU
          @Expression <AttributeReference> id#15 could run on GPU
          @Expression <AttributeReference> date_2#17 could run on GPU
          ! <RDDScanExec> cannot run on GPU because GPU does not currently support the operator class org.apache.spark.sql.execution.RDDScanExec
            @Expression <AttributeReference> id#15 could run on GPU
            @Expression <AttributeReference> date_1#16 could run on GPU
            @Expression <AttributeReference> date_2#17 could run on GPU

24/01/11 01:58:43 ERROR GpuOverrideUtil: Encountered an exception applying GPU overrides java.lang.IllegalArgumentException: Part of the plan is not columnar class org.apache.spark.sql.execution.ProjectExec
Project [cast(gettimestamp(cast(date_1#16 as string), yyyy-MM-dd, Some(America/New_York), false) as date) AS my_date#24, id#15, date_2#17]
+- Scan ExistingRDD[id#15,date_1#16,date_2#17]

java.lang.IllegalArgumentException: Part of the plan is not columnar class org.apache.spark.sql.execution.ProjectExec
Project [cast(gettimestamp(cast(date_1#16 as string), yyyy-MM-dd, Some(America/New_York), false) as date) AS my_date#24, id#15, date_2#17]
+- Scan ExistingRDD[id#15,date_1#16,date_2#17]
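
The fallback message above says that only non-DST time zones are supported, which is why the result depends on the TZ value. As a rough illustration, the sketch below checks whether a zone's UTC offset changes during a year. `observes_dst` is a hypothetical helper, not part of the plugin or its test suite, and the plugin's own support check may use different criteria.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def observes_dst(tz_name: str, year: int = 2024) -> bool:
    """Hypothetical helper: True if the zone's UTC offset changes within `year`.

    Samples local noon on each day of the year, so it avoids the
    skipped or repeated hours around a transition.
    """
    tz = ZoneInfo(tz_name)
    start = datetime(year, 1, 1, 12, tzinfo=tz)
    offsets = {(start + timedelta(days=d)).utcoffset() for d in range(366)}
    return len(offsets) > 1


if __name__ == "__main__":
    for name in ("America/New_York", "Canada/Newfoundland", "Asia/Shanghai", "UTC"):
        print(name, observes_dst(name))
```

A test could use a check like this to decide whether to expect a CPU fallback for the timestamp expressions when the session time zone observes DST. It only looks at one year, so a zone that used DST in the past but no longer does is reported as non-DST.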

@res-life
Collaborator Author

Fix: #10189

@sameerz sameerz removed the ? - Needs Triage Need team to review and classify label Jan 12, 2024