Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add in reporting of time taken to transition plan to GPU #3315

Merged
merged 4 commits into from
Sep 8, 2021

Conversation

revans2
Copy link
Collaborator

@revans2 revans2 commented Aug 26, 2021

@jlowe asked me earlier about how long it takes to take a CPU plan and convert it into a GPU enabled plan. I had dug into this early on, but I have not done anything with it recently. Sadly there is no good way to report this type of a metric query wide. Nor is there a good way to measure how long it takes compared to how long Spark takes in total to parse and optimize the query. But this should provide users with a little bit of knowledge if they ask for it.

I also cleaned up a few small code warnings while I was at it.

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
@revans2 revans2 added the task Work required that improves the product but is not user facing label Aug 26, 2021
@revans2 revans2 added this to the Aug 16 - Aug 27 milestone Aug 26, 2021
@revans2 revans2 self-assigned this Aug 26, 2021
@revans2
Copy link
Collaborator Author

revans2 commented Aug 26, 2021

build

Copy link
Member

@jlowe jlowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we update the FAQ or other documentation to mention this feature?

val ret = applyOverrides(updatedPlan, conf)
val end = System.nanoTime()
if (conf.shouldExplain) {
val timeMs = (end - start) / 1000000.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider NANOSECONDS.toMillis(end - start)

Also consider a utility function to measure duration to avoid duplicating code

def withDurationLog[T](block: => T, conf: RapidsConf, format: String): T 

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started off using NANOSECONDS.toMillis, but it returns a long, not a double. If I wanted just millisecond precision I would have measured the values in milliseconds to begin with. I also looked at MILLISECONDS.toNanos(1) to produce the magic number, but decided against it. Happy to move back to that if you think it is better.

I also thought about writing a utility, but the code is only in 2 places and it is 6 lines of code in each place. So best case I would save 3 lines of code total. 12 lines as it is now vs 6 lines for the utility body + 1 line for the utility def + 2 lines to call the utility from each place. It just didn't feel like it was worth it at this time. But for code cleanliness if you want me to I will do it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point about fractional millis.

It's easily conceivable that we want to instrument more and more durations in code and this PR already needs it twice. Thus I'd favor a util method making it easy.

@revans2
Copy link
Collaborator Author

revans2 commented Aug 26, 2021

build

@revans2
Copy link
Collaborator Author

revans2 commented Aug 26, 2021

build

jlowe
jlowe previously approved these changes Aug 26, 2021
gerashegalov
gerashegalov previously approved these changes Aug 26, 2021
@revans2
Copy link
Collaborator Author

revans2 commented Aug 26, 2021

build

@revans2
Copy link
Collaborator Author

revans2 commented Aug 26, 2021

The previous build failed with what appeared to be running out of host memory.

@jlowe
Copy link
Member

jlowe commented Aug 31, 2021

Need to resolve the merge conflict otherwise this looks ready to go.

@revans2 revans2 dismissed stale reviews from gerashegalov and jlowe via 78c7208 September 7, 2021 21:48
@revans2
Copy link
Collaborator Author

revans2 commented Sep 7, 2021

build

@revans2
Copy link
Collaborator Author

revans2 commented Sep 7, 2021

Sorry I had to upmerge

@revans2 revans2 merged commit e6cea73 into NVIDIA:branch-21.10 Sep 8, 2021
@revans2 revans2 deleted the report_gpu_translation_time branch September 8, 2021 12:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
task Work required that improves the product but is not user facing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants