
Add maven compile/package plugin executions for Spark302 and Spark301 #3257

Merged: 45 commits, Aug 23, 2021

Conversation

tgravescs
Collaborator

@tgravescs tgravescs commented Aug 19, 2021

start of #3224

Initial PR that aims to leave the existing build as it is, but adds some profiles to build all the source against a particular version of Spark; in this case, profiles for spark301 and spark302 are added. You are meant to build against only a single version of Spark at a time. Eventually we will have build scripts to build each version independently and install them, and then the dist package will pull the installed versions by classifier.

Currently this is set up to build with 2 profiles: -Prelease302,with-classifier

The with-classifier profile changes the dependencies to look for the classifier versions of the jars, and the release302 profile sets the versions to build Spark 3.0.2. It was done this way to minimize the code needed in each pom; the with-classifier profile can be reused for all the Spark versions.

This creates jars with a classifier in each of the submodules, so the sql jars look like:

rapids-4-spark-sql_2.12-21.10.0-SNAPSHOT-spark302.jar
rapids-4-spark-sql_2.12-21.10.0-SNAPSHOT-spark302tests.jar

You can also specify a property instead of the profiles, like -Dbuildver={302/301}, and that will activate the profiles automatically.
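A minimal sketch of how such a profile and its property-based activation might look in the parent pom (the profile id matches the PR; the exact properties it sets, such as the spark.version value shown here, are assumptions for illustration):

```xml
<profile>
  <id>release302</id>
  <!-- Activated automatically when building with -Dbuildver=302 -->
  <activation>
    <property>
      <name>buildver</name>
      <value>302</value>
    </property>
  </activation>
  <properties>
    <!-- Illustrative: pin the Spark version this profile builds against -->
    <spark.version>3.0.2</spark.version>
  </properties>
</profile>
```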

The dist jar is the only one that stays without a classifier. We can discuss the integration test jar more as needed; right now we only deploy one version, which is OK as long as we don't have Spark-dependent API, which it likely doesn't since it uses public APIs for most things. So I can easily remove the classifier version there if we decide it's not needed.
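For reference, under the with-classifier profile a module would pull a sibling artifact by its classified coordinates, roughly like this (the coordinates mirror the jar name above; the exact placement in the poms is an assumption):

```xml
<dependency>
  <groupId>com.nvidia</groupId>
  <artifactId>rapids-4-spark-sql_2.12</artifactId>
  <version>21.10.0-SNAPSHOT</version>
  <!-- Selects rapids-4-spark-sql_2.12-21.10.0-SNAPSHOT-spark302.jar -->
  <classifier>spark302</classifier>
</dependency>
```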

The dist/pom.xml was a bit weird: for some reason I had to duplicate all the dependencies inside a profile to get them pulled into the final jar properly. I'll investigate this more, but for now copying them like this pulls everything, and the contents are the same between the version built with a classifier and the current version (except for the shim layers not pulled in, obviously).

We could go down to one profile, but that would just require duplicating code. I was looking for a way to activate multiple profiles with a single property, but didn't see a good way to do it without duplicating code again.

The other change here is copying all the spark301 shim layer code into spark302. I diffed them, and the only differences are in the package names and copyrights. Once we have all the builds using the new system, we can commonize what code we can from the shims.
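A sketch of the kind of check described above: confirming that two shim source files match once package declarations and copyright headers are ignored. The helper names and header patterns here are assumptions for illustration, not code from the plugin repository.

```python
def is_ignored(line: str) -> bool:
    """True for lines expected to differ between shim copies:
    package declarations and copyright headers."""
    stripped = line.strip()
    return stripped.startswith("package ") or "Copyright" in stripped

def differs_only_in_ignored(src_a: str, src_b: str) -> bool:
    """True if the two sources are identical after dropping ignored lines."""
    def strip(text: str) -> list[str]:
        return [ln for ln in text.splitlines() if not is_ignored(ln)]
    return strip(src_a) == strip(src_b)
```

Running this across the spark301 and spark302 trees would confirm the copies stay in sync apart from the expected per-version lines.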

@tgravescs
Collaborator Author

build

Review threads (some since resolved or outdated) on: dist/pom.xml, integration_tests/pom.xml, shims/pom.xml, shuffle-plugin/pom.xml
@tgravescs
Collaborator Author

Addressed review comments; you can now just build with mvn -Dbuildver={302/301}, and it will activate with-classifier and release302/301.

@tgravescs
Collaborator Author

build

1 similar comment
@tgravescs
Collaborator Author

build

@jlowe
Member

jlowe commented Aug 20, 2021

I think this is close. It looks like some code needs to be updated in the new 302 copies; hopefully we can come up with a better way to do this that significantly reduces the amount of duplicated code.

Also seeing this in the build log which would be good to fix:

13:23:22  [WARNING] 
13:23:22  [WARNING] Some problems were encountered while building the effective model for com.nvidia:rapids-4-spark-tools_2.12:jar:21.10.0-SNAPSHOT
13:23:22  [WARNING] 'build.plugins.plugin.version' for org.scala-tools:maven-scala-plugin is missing. @ line 84, column 21
13:23:22  [WARNING] 
13:23:22  [WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
13:23:22  [WARNING] 
13:23:22  [WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
13:23:22  [WARNING] 
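The "'build.plugins.plugin.version' ... is missing" warning is typically resolved by pinning the plugin version in the pom, roughly like this (the version shown is illustrative; treat the exact value as an assumption):

```xml
<plugin>
  <groupId>org.scala-tools</groupId>
  <artifactId>maven-scala-plugin</artifactId>
  <!-- Pinning the version removes the "version is missing" warning -->
  <version>2.15.2</version>
</plugin>
```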

@jlowe
Member

jlowe commented Aug 20, 2021

Hmm, looks like the tools module issue is unrelated; I'll fix it in a separate PR.

@abellina
Collaborator

The last CI failure seems relevant. Otherwise LGTM, and the process to add a new shim is straightforward. +1 on finding a way to remove code duplication, but as far as I understand that's a separate stage, and it could break the current build if done wholesale.

@tgravescs
Collaborator Author

tgravescs commented Aug 20, 2021

Yeah, it's weird; it's like the 3.0.2 shim isn't picking up the dependency on the sql plugin. It all works locally for me, though.

@tgravescs
Collaborator Author

build

@tgravescs
Collaborator Author

build

@tgravescs
Collaborator Author

Note: I'm trying to debug the pre-merge build here, so I'll end up reverting these last few commits. The failure doesn't make any sense, and I can't reproduce it locally.

@tgravescs
Collaborator Author

build

@tgravescs
Collaborator Author

tgravescs commented Aug 20, 2021

OK, the issue is that some 301 shim layer code got checked in, and I need to copy it to 302 and upmerge.

@tgravescs
Collaborator Author

build

1 similar comment
@pxLi
Collaborator

pxLi commented Aug 23, 2021

build

@pxLi
Collaborator

pxLi commented Aug 23, 2021

The last CI failure was due to one java gateway (gw1) crashing; it passed on retry.

@tgravescs
Collaborator Author

@jlowe could you take another look? I think the main updates since you last looked were to the shim layer, where I upmerged and copied the code to the 302 shim.

@tgravescs tgravescs merged commit 7c4c6a9 into NVIDIA:branch-21.10 Aug 23, 2021
Labels: build (Related to CI / CD or cleanly building)
4 participants