Fail tasks exceeding no-workers-timeout #8806

Merged
merged 21 commits into from
Aug 2, 2024
11 changes: 5 additions & 6 deletions distributed/distributed-schema.yaml
@@ -81,16 +81,15 @@ properties:
- string
- "null"
description: |
-Shut down the scheduler after this duration if there are pending tasks,
-but no workers that can process them. This can either mean that there are
-no workers running at all, or that there are idle workers but they've been
-excluded through worker or resource restrictions.
+Timeout for tasks in an unrunnable state.
+
+If task remains unrunnable for longer than this, it fails. A task is considered unrunnable IFF
+it has no pending dependencies, and the task has restrictions that are not satisfied by
+any available worker or no workers are running at all.

In adaptive clusters, this timeout must be set to be safely higher than
the time it takes for workers to spin up.
-
-Works in conjunction with idle-timeout.

work-stealing:
type: boolean
description: |
2 changes: 1 addition & 1 deletion distributed/distributed.yaml
@@ -18,7 +18,7 @@ distributed:
# after they have been removed from the scheduler
events-cleanup-delay: 1h
idle-timeout: null # Shut down after this duration, like "1h" or "30 minutes"
-no-workers-timeout: null # Shut down if there are tasks but no workers to process them
+no-workers-timeout: null # If a task remains unrunnable for longer than this, it fails.
work-stealing: True # workers should steal tasks from each other
work-stealing-interval: 100ms # Callback time for work stealing
worker-saturation: 1.1 # Send this fraction of nthreads root tasks to workers
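
The new `no-workers-timeout` description defines "unrunnable" fairly precisely, so a small model may help readers check their understanding. The sketch below is illustrative only and is not the scheduler's implementation: `Task`, `is_unrunnable`, and `should_fail` are hypothetical names, workers are reduced to plain dicts of available resources, and only resource restrictions are modelled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    """Hypothetical stand-in for a scheduler task (not distributed's TaskState)."""
    key: str
    pending_dependencies: set = field(default_factory=set)
    # Resources the task requires, e.g. {"GPU": 1}; empty means unrestricted.
    required_resources: dict = field(default_factory=dict)
    # Time at which the task was first observed as unrunnable, or None.
    unrunnable_since: Optional[float] = None


def is_unrunnable(task, workers):
    """Return True if the task is unrunnable in the sense of the description.

    `workers` is a list of dicts mapping resource name -> available amount
    (an assumption made for this sketch).
    """
    if task.pending_dependencies:
        return False  # still waiting on dependencies, so not unrunnable
    if not workers:
        return True  # no workers are running at all
    # Unrunnable if no available worker satisfies every restriction.
    return not any(
        all(w.get(name, 0) >= amount
            for name, amount in task.required_resources.items())
        for w in workers
    )


def should_fail(task, workers, now, timeout):
    """Return True once the task has been unrunnable for longer than `timeout`.

    `timeout` is in seconds; None disables the check, mirroring the
    `null` default in distributed.yaml.
    """
    if timeout is None or not is_unrunnable(task, workers):
        task.unrunnable_since = None  # reset the clock once runnable again
        return False
    if task.unrunnable_since is None:
        task.unrunnable_since = now
    return now - task.unrunnable_since > timeout
```

In this model a task that needs a GPU on a CPU-only cluster starts its clock the first time it is checked and fails once the timeout has elapsed, while the clock resets as soon as a suitable worker appears. That reset is why the description asks adaptive clusters to set the timeout safely above worker spin-up time.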