[BUG] Shim service provider failure when using jar built with -DallowConventionalDistJar #5596

Closed
jlowe opened this issue on May 23, 2022 · 1 comment · Fixed by #5599

Assignees: gerashegalov
Labels: bug (Something isn't working)

Comments

jlowe (Member) commented on May 23, 2022

Describe the bug
Running with a jar built with -DallowConventionalDistJar=true failed on startup:

22/05/23 17:53:19 ERROR SparkContext: Error initializing SparkContext.
java.lang.AssertionError: assertion failed: Classpath should contain the resource for META-INF/services/com.nvidia.spark.rapids.SparkShimServiceProvider
	at scala.Predef$.assert(Predef.scala:223)
	at com.nvidia.spark.rapids.ShimLoader$.detectShimProvider(ShimLoader.scala:307)
	at com.nvidia.spark.rapids.ShimLoader$.findShimProvider(ShimLoader.scala:355)
	at com.nvidia.spark.rapids.ShimLoader$.initShimProviderIfNeeded(ShimLoader.scala:100)
	at com.nvidia.spark.rapids.ShimLoader$.getShimClassLoader(ShimLoader.scala:250)
	at com.nvidia.spark.rapids.ShimLoader$.loadClass(ShimLoader.scala:383)
	at com.nvidia.spark.rapids.ShimLoader$.newInstanceOf(ShimLoader.scala:389)
	at com.nvidia.spark.rapids.ShimLoader$.newDriverPlugin(ShimLoader.scala:418)
	at com.nvidia.spark.SQLPlugin.driverPlugin(SQLPlugin.scala:29)

Examining the contents of the jar shows the service provider file is present but empty:

$ jar tvf dist/target/rapids-4-spark_2.12-22.06.0-SNAPSHOT-cuda11.jar | grep SparkShimServiceProvider
     0 Mon May 23 12:48:28 CDT 2022 META-INF/services/com.nvidia.spark.rapids.SparkShimServiceProvider
   779 Mon May 23 12:47:10 CDT 2022 com/nvidia/spark/rapids/SparkShimServiceProvider.class
  1392 Mon May 23 12:47:08 CDT 2022 com/nvidia/spark/rapids/shims/spark321/SparkShimServiceProvider$.class
  2037 Mon May 23 12:47:08 CDT 2022 com/nvidia/spark/rapids/shims/spark321/SparkShimServiceProvider.class
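
For reference, the META-INF/services file follows the standard java.util.ServiceLoader convention: it should list the fully qualified name of each shim provider, one per line (here, com.nvidia.spark.rapids.shims.spark321.SparkShimServiceProvider). The following minimal Scala sketch (a hypothetical InspectShimServiceFile helper, not ShimLoader's actual detection code) shows why a resource that is present but empty still yields no provider to load:

// Hypothetical helper, for illustration only; ShimLoader's real detection
// logic differs, but the service-file convention it relies on is the same.
import scala.io.Source

object InspectShimServiceFile {
  private val resourcePath =
    "META-INF/services/com.nvidia.spark.rapids.SparkShimServiceProvider"

  def main(args: Array[String]): Unit = {
    val url = Option(getClass.getClassLoader.getResource(resourcePath))
    // Read the provider class names: one fully qualified name per line,
    // blank lines and '#' comments ignored per the ServiceLoader contract.
    val providers = url.toSeq.flatMap { u =>
      Source.fromURL(u, "UTF-8").getLines().map(_.trim)
        .filter(line => line.nonEmpty && !line.startsWith("#"))
    }
    // A zero-byte file (as in the jar listing above) produces an empty list
    // here even though the resource itself exists on the classpath.
    println(s"providers found: ${providers.size}")
    providers.foreach(println)
  }
}

Against a correctly built dist jar this prints the spark321 provider name; against the jar listed above it finds the resource but no entries.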

Steps/Code to reproduce bug

  • Build jar for Spark 3.2.1 via: mvn clean package -Dbuildver=321 -DskipTests -DallowConventionalDistJar=true
  • Run jar against Spark 3.2.1
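
The zero-byte service entry can also be confirmed programmatically after the build. A small sketch, assuming the dist jar path is passed as a command-line argument (the object name is hypothetical):

import java.util.zip.ZipFile

object CheckShimServiceEntry {
  private val entryName =
    "META-INF/services/com.nvidia.spark.rapids.SparkShimServiceProvider"

  def main(args: Array[String]): Unit = {
    // e.g. dist/target/rapids-4-spark_2.12-22.06.0-SNAPSHOT-cuda11.jar
    val jar = new ZipFile(args(0))
    try {
      Option(jar.getEntry(entryName)) match {
        // With the bug reproduced, the entry exists but its size is 0 bytes.
        case Some(e) => println(s"$entryName: ${e.getSize} bytes")
        case None    => println(s"$entryName: missing")
      }
    } finally {
      jar.close()
    }
  }
}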

Expected behavior
RAPIDS Accelerator starts up without an exception.

Environment details
Spark 3.2.1

jlowe added the bug (Something isn't working) and ? - Needs Triage (Need team to review and classify) labels on May 23, 2022
jlowe changed the title from "[BUG] Shim service provider failure when using jar built with" to "[BUG] Shim service provider failure when using jar built with -DallowConventionalDistJar" on May 23, 2022
gerashegalov self-assigned this on May 23, 2022
gerashegalov added this to the May 23 - Jun 3 milestone on May 23, 2022
gerashegalov (Collaborator) commented:

There is a conflict between the truncate and unzip steps in the conventional call path.

gerashegalov removed the ? - Needs Triage (Need team to review and classify) label on May 23, 2022
gerashegalov added a commit that referenced this issue May 24, 2022
This PR closes #5596 by moving the truncate step to just before the shim service file concatenation generates a new provider list.

The empty ShimServiceProvider file created during init-properties is not populated via concat in the conventional jar path, and it later overwrites the original file from the aggregator jar. It is also bad practice to have side effects while still initializing properties. This PR corrects both issues.

Signed-off-by: Gera Shegalov <gera@apache.org>
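
A minimal sketch of the ordering problem the commit message describes, using java.nio.file operations and hypothetical names (unpackAggregator, the conventionalJar flag, temp files) in place of the dist module's actual Maven/ant steps:

import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}

object TruncateOrderingSketch {
  // Contents a correctly built jar carries in the shim service file.
  private val providerLine =
    "com.nvidia.spark.rapids.shims.spark321.SparkShimServiceProvider\n"

  // Stand-in for unpacking the aggregator jar, whose copy of the file is populated.
  private def unpackAggregator(serviceFile: Path): Unit =
    Files.write(serviceFile, providerLine.getBytes(UTF_8))

  def main(args: Array[String]): Unit = {
    // -DallowConventionalDistJar=true takes the path that never runs the concat step.
    val conventionalJar = true

    // Buggy order: the file is truncated during init-properties, and that empty
    // copy ends up overwriting the aggregator's populated file because the
    // conventional path never concatenates a new provider list.
    val buggy = Files.createTempFile("shim-services-buggy", "")
    unpackAggregator(buggy)
    Files.write(buggy, Array.emptyByteArray) // the init-time empty copy wins
    println(s"buggy order: ${Files.size(buggy)} bytes") // 0

    // Fixed order (#5599): truncate only immediately before concatenation
    // generates a new list, so the conventional path keeps the populated file.
    val fixed = Files.createTempFile("shim-services-fixed", "")
    unpackAggregator(fixed)
    if (!conventionalJar) {
      Files.write(fixed, Array.emptyByteArray) // concat would then repopulate it
    }
    println(s"fixed order: ${Files.size(fixed)} bytes") // non-zero
  }
}

The point of the fix is that truncation becomes a side effect of generating a new provider list rather than of property initialization, which also addresses the bad-practice concern noted in the commit message.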