Add in application timeline to profiling tool #2760

Merged
merged 5 commits into from
Jun 23, 2021

revans2
Collaborator

@revans2 revans2 commented Jun 21, 2021

This adds application timeline generation to the profiling tool. It helps in understanding what was running and where, so you can tell whether a stage was long because it was preempted by another stage, because there was skew, etc. This really helped me see the high startup overhead, even though it is not reflected in benchmark numbers because the benchmarks start on the SQL query, not when setting up the data.

I used this to help debug some data/time skew issues.

[Image: example]

@revans2 revans2 added the task Work required that improves the product but is not user facing label Jun 21, 2021
@revans2 revans2 added this to the June 21 - July 2 milestone Jun 21, 2021
@revans2 revans2 self-assigned this Jun 21, 2021
@revans2
Collaborator Author

revans2 commented Jun 21, 2021

build

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
@revans2
Collaborator Author

revans2 commented Jun 21, 2021

build

@tgravescs
Collaborator

It would be nice to have some tests, perhaps unit tests that verify the slots, as well as negative tests: what does this do with truncated log files, etc.?

@revans2
Collaborator Author

revans2 commented Jun 21, 2021

@tgravescs I added in a test and moved away from using SQL to gather the data for the timeline.

@revans2
Collaborator Author

revans2 commented Jun 21, 2021

build

Collaborator

@tgravescs tgravescs left a comment

We also need to update the tools/README; right now it's just manually updated when we add new options.

scheduleCallback: (A, Int) => Unit,
errorOnMissingSlot: Boolean,
slotsFreeUntil: mutable.Buffer[Long]): Unit = {
toSchedule.toSeq.sortWith{
Collaborator

Suggested change
toSchedule.toSeq.sortWith{
toSchedule.toSeq.sortWith {
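
For readers without the diff open, the kind of slot assignment this hunk appears to belong to can be sketched roughly as below. This is only an illustration and not the code from this PR: the parameter names `scheduleCallback`, `errorOnMissingSlot`, and `slotsFreeUntil` come from the hunk above, while `TimedItem`, `SlotSketch`, `assignSlots`, the sort by start time, and the fallback of adding a slot when none is free are all assumptions made for the example.

```scala
import scala.collection.mutable

// Hypothetical stand-in for whatever is being scheduled (for example a task).
final case class TimedItem(name: String, start: Long, end: Long)

object SlotSketch {
  /**
   * Greedy slot assignment: items are visited in start-time order and each one
   * goes into the lowest-numbered slot that is already free at its start time.
   */
  def assignSlots(
      toSchedule: Iterable[TimedItem],
      scheduleCallback: (TimedItem, Int) => Unit,
      errorOnMissingSlot: Boolean,
      slotsFreeUntil: mutable.Buffer[Long]): Unit = {
    toSchedule.toSeq.sortBy(_.start).foreach { item =>
      // indexWhere returns -1 when no existing slot is free in time.
      var slot = slotsFreeUntil.indexWhere(_ <= item.start)
      if (slot < 0) {
        if (errorOnMissingSlot) {
          throw new IllegalStateException(s"No free slot for ${item.name}")
        }
        // Assumed fallback: grow the number of slots instead of failing.
        slotsFreeUntil += 0L
        slot = slotsFreeUntil.length - 1
      }
      // Mark the slot busy until this item ends, then report the placement.
      slotsFreeUntil(slot) = item.end
      scheduleCallback(item, slot)
    }
  }
}
```

In this sketch, `slotsFreeUntil` holds one entry per slot (for example per executor core), so a caller that knows the slot count can pass a pre-sized buffer with `errorOnMissingSlot = true` and treat an overflow as a sign of a bad or truncated log.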

// TITLE
// EXEC(s) | TASK TIMING
// STAGES | STAGE TIMING
// STAGE RANGES | STAGE RANGE TIMING
Collaborator

Perhaps we should clarify what "stage range" means versus just "stages". From just looking at the output graph, it was unclear to me until I read the code.

@revans2
Collaborator Author

revans2 commented Jun 22, 2021

build

@revans2
Collaborator Author

revans2 commented Jun 22, 2021

@tgravescs I think I have addressed all of your comments.

@tgravescs tgravescs merged commit 5a251e7 into NVIDIA:branch-21.08 Jun 23, 2021
@revans2 revans2 deleted the timeline branch June 23, 2021 14:10