
java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/ReadSupport #672

Closed
pvchandu opened this issue Sep 4, 2020 · 8 comments
Labels: bug (Something isn't working)


pvchandu commented Sep 4, 2020

I am trying out the new RAPIDS Accelerator for Databricks and am running the mortgage notebook to get started.
I followed the instructions in the documentation: https://nvidia.github.io/spark-rapids/docs/get-started/getting-started-with-rapids-accelerator-on-databricks.html

When I run the code cell that reads the data, it fails with the following error.

Error:
java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/ReadSupport

Full Error:

Py4JJavaError Traceback (most recent call last)
<command-1671055577733705> in <module>
3 # we want a few big files instead of lots of small files
4 spark.conf.set('spark.sql.files.maxPartitionBytes', '200G')
5 acq = read_acq_csv(spark, orig_acq_path)
6 acq.repartition(12).write.parquet(tmp_acq_path, mode='overwrite')
7 perf = read_perf_csv(spark, orig_perf_path)

<command-1671055577733703> in read_acq_csv(spark, path)
82 .option('delimiter', '|')
83 .schema(_csv_acq_schema)
84 .load(path)
85 .withColumn('quarter', _get_quarter_from_csv_file_name())
86

/databricks/spark/python/pyspark/sql/readwriter.py in load(self, path, format, schema, **options)
176 self.options(**options)
177 if isinstance(path, basestring):
178 return self._df(self._jreader.load(path))
179 elif path is not None:
180 if type(path) != list:

/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
1303 answer = self.gateway_client.send_command(command)
1304 return_value = get_return_value(
1305 answer, self.gateway_client, self.target_id, self.name)
1306
1307 for temp_arg in temp_args:

/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
126 def deco(*a, **kw):
127 try:
128 return f(*a, **kw)
129 except py4j.protocol.Py4JJavaError as e:
130 converted = convert_exception(e.java_exception)

/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
326 raise Py4JJavaError(
327 "An error occurred while calling {0}{1}{2}.\n".
328 format(target_id, ".", name), value)
329 else:
330 raise Py4JError(

Py4JJavaError: An error occurred while calling o385.load.
: java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/ReadSupport
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:151)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:151)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at com.databricks.backend.daemon.driver.ClassLoaders$ReplWrappingClassLoader.loadClass(ClassLoaders.scala:65)
at java.lang.ClassLoader.loadClass(ClassLoader.java:405)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370)
at java.util.ServiceLoader$LazyIterator.access$700(ServiceLoader.java:323)
at java.util.ServiceLoader$LazyIterator$2.run(ServiceLoader.java:407)
at java.security.AccessController.doPrivileged(Native Method)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:409)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:44)
at scala.collection.Iterator.foreach(Iterator.scala:941)
at scala.collection.Iterator.foreach$(Iterator.scala:941)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
at scala.collection.IterableLike.foreach(IterableLike.scala:74)
at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
at scala.collection.TraversableLike.filterImpl(TraversableLike.scala:255)
at scala.collection.TraversableLike.filterImpl$(TraversableLike.scala:249)
at scala.collection.AbstractTraversable.filterImpl(Traversable.scala:108)
at scala.collection.TraversableLike.filter(TraversableLike.scala:347)
at scala.collection.TraversableLike.filter$(TraversableLike.scala:347)
at scala.collection.AbstractTraversable.filter(Traversable.scala:108)
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:700)
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:784)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:317)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:251)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
at py4j.Gateway.invoke(Gateway.java:295)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:251)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.sources.v2.ReadSupport
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader.loadClass(ClassLoaders.scala:151)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 63 more
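
The missing class, org.apache.spark.sql.sources.v2.ReadSupport, belongs to the Spark 2.x DataSource V2 API that was removed in Spark 3.0, so a jar built against Spark 2.x on the driver classpath can trigger this failure while Spark's ServiceLoader scans registered data sources. One way to find which jar still bundles the old class is to scan the driver's jar directory from a notebook cell. This snippet is an illustrative sketch, not part of the original report; /databricks/jars is assumed to be the jar location on the cluster.

```python
# Sketch: list jars on the Databricks driver that bundle the removed
# Spark 2.x DataSource V2 class (the directory path is an assumption).
import glob
import zipfile

TARGET = 'org/apache/spark/sql/sources/v2/ReadSupport.class'

for jar in glob.glob('/databricks/jars/*.jar'):
    try:
        with zipfile.ZipFile(jar) as zf:
            if TARGET in zf.namelist():
                print(jar)  # this jar was built against the pre-3.0 API
    except zipfile.BadZipFile:
        pass  # skip files that are not valid jars
```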

pvchandu added the "? - Needs Triage" and "bug" labels on Sep 4, 2020
tgravescs (Collaborator) commented

@pvchandu thanks for trying the plugin and reporting the issue. I just tried the instructions again and they work fine for me.
I'm not exactly sure what is wrong, because the missing class is a pure Spark class, not a plugin class.

Did you pick the Databricks 7.0 ML runtime? Are you using AWS or Azure?
After you ran the generate-init-script.ipynb notebook, did you do step 5 to put the init.sh script into the cluster configuration and then restart the cluster?

sameerz removed the "? - Needs Triage" label on Sep 8, 2020
tgravescs (Collaborator) commented

One thing I would suggest is removing the init script from the cluster configuration and making sure the cluster starts up and runs fine without it. If it does, then there is probably a problem with the init script, and it is worth regenerating it; a sketch of the pattern the generator uses follows below.
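
For reference, the getting-started guide generates init.sh from a notebook that writes the script to DBFS with dbutils.fs.put. A minimal sketch of that pattern; the DBFS path and the <...> jar URLs below are placeholders, so use the exact values from generate-init-script.ipynb:

```python
# Minimal sketch of regenerating the init script inside a Databricks
# notebook. The path and <...> URLs are placeholders, not real locations.
dbutils.fs.put("dbfs:/databricks/init_scripts/init.sh", """
#!/bin/bash
# Download the RAPIDS Accelerator and cuDF jars onto the cluster classpath.
sudo wget -O /databricks/jars/rapids-4-spark.jar <RAPIDS_PLUGIN_JAR_URL>
sudo wget -O /databricks/jars/cudf.jar <CUDF_JAR_URL>
""", True)
```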

pvchandu (Author) commented Sep 9, 2020

@tgravescs, I tested this with NC6s_v3 as mentioned in the documentation, and it worked well.
But when I used NC12s_v3 or NC24s_v3 to create my cluster, it did not work. By the way, I am using Azure Databricks with the 7.0 ML DBR.

tgravescs (Collaborator) commented

We don't support nodes with multiple GPUs on Databricks right now. The plugin has a restriction that each executor gets exactly one GPU, and the last time I tried, Databricks did not support configuring multiple executors, each with one GPU, on a multi-GPU node. Normally in Apache Spark you would set spark.executor.resource.gpu.amount=1 to get one GPU per executor, but the last time I tried, that wasn't working on Databricks. Feel free to try it and see if anything has changed; a sketch of the standard configuration follows below.
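
In stock Apache Spark 3.0 the one-GPU-per-executor request is made with static configs at session or cluster launch. A minimal sketch, assuming the RAPIDS plugin jars are already on the classpath; the task amount shown is just an example, and, as noted above, these settings did not take effect on Databricks at the time:

```python
# Sketch of the stock Apache Spark 3.0 settings for one GPU per executor.
# These are static configs: set them at launch, not via spark.conf.set.
# On YARN or standalone, a GPU discovery script is also required.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .config('spark.plugins', 'com.nvidia.spark.SQLPlugin')  # enable the RAPIDS Accelerator
         .config('spark.executor.resource.gpu.amount', '1')      # one GPU per executor
         .config('spark.task.resource.gpu.amount', '0.25')       # example: 4 tasks share that GPU
         .getOrCreate())
```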

pvchandu (Author) commented Sep 9, 2020

That makes sense now. By default, each node is one executor, and we cannot change that in Databricks even today. Are there any plans to support multiple GPUs on a single node?

tgravescs (Collaborator) commented

We don't have any concrete plans, because on any other setup you would just split one node into multiple executors. I'll raise with the team that this is a limitation on Databricks.

pvchandu (Author) commented

Thanks, Thomas. This is pretty limiting in the Databricks environment, given that the majority of users are moving to Databricks. I added the following feedback for Databricks as well.

https://feedback.azure.com/forums/909463-azure-databricks/suggestions/41361244-support-rapids-plugin-with-multiple-gpu-nodes

I would appreciate it if you could collaborate with Databricks to figure this out.

tgravescs pushed a commit to tgravescs/spark-rapids that referenced this issue on Nov 30, 2023 (…IDIA#672)

Signed-off-by: spark-rapids automation <70000568+nvauto@users.noreply.github.com>
jlowe (Member) commented Jan 24, 2024

Closing this as the NoClassDefFoundError was resolved and the multiple GPUs per executor request is tracked by #1486.

jlowe closed this as completed on Jan 24, 2024