add cluster.wait_for_workers (called by client.wait_for_workers if client.cluster exists) #6346

Closed
dchudz opened this issue May 14, 2022 · 1 comment · Fixed by #6700
Labels: feature (Something is missing); good first issue (Clearly described and easy to accomplish. Good for beginners to the project.); p2 (Affects more than a few users but doesn't prevent core functions)

Comments

dchudz (Contributor) commented May 14, 2022

Currently, users call client.wait_for_workers to wait for a certain number of workers. A common pattern is:

cluster.scale(100)
client.wait_for_workers(100)

Cluster managers (like Coiled) would like to add custom handling that gives users useful information about what is happening (especially failures) while waiting for workers. At the moment, this isn't possible.

Proposal to make this possible:

  • add cluster.wait_for_workers (with logic similar to the existing client.wait_for_workers)
  • client.wait_for_workers should call cluster.wait_for_workers if there's a cluster object.

Then cluster managers can override cluster.wait_for_workers to provide custom handling, which will be called whether the user calls cluster.wait_for_workers or client.wait_for_workers.
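To make the proposal concrete, here is a minimal sketch of the delegation in plain Python. The classes SketchCluster, VerboseCluster, and SketchClient are hypothetical stand-ins invented for illustration, not the distributed API, and the polling loop is only an assumption about what "logic similar to the existing client.wait_for_workers" might look like.

```python
import time


class SketchCluster:
    """Hypothetical stand-in for a cluster manager; not the distributed API."""

    def __init__(self):
        self.n_running = 0  # number of workers that have started

    def scale(self, n):
        # A real cluster starts workers asynchronously; here they appear at once.
        self.n_running = n

    def wait_for_workers(self, n_workers, timeout=None):
        # Poll until enough workers are running or the timeout expires.
        deadline = None if timeout is None else time.monotonic() + timeout
        while self.n_running < n_workers:
            if deadline is not None and time.monotonic() > deadline:
                raise TimeoutError(
                    f"only {self.n_running}/{n_workers} workers arrived"
                )
            time.sleep(0.01)


class VerboseCluster(SketchCluster):
    """A cluster manager that overrides the hook to add custom reporting."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def wait_for_workers(self, n_workers, timeout=None):
        self.messages.append(f"waiting for {n_workers} workers")
        super().wait_for_workers(n_workers, timeout)
        self.messages.append("all workers arrived")


class SketchClient:
    """Hypothetical stand-in for a client that may hold a cluster object."""

    def __init__(self, cluster=None):
        self.cluster = cluster

    def wait_for_workers(self, n_workers, timeout=None):
        if self.cluster is not None:
            # Delegate so the cluster manager's override always runs.
            return self.cluster.wait_for_workers(n_workers, timeout)
        raise NotImplementedError("scheduler-based waiting omitted from sketch")


cluster = VerboseCluster()
client = SketchClient(cluster)
cluster.scale(100)
client.wait_for_workers(100, timeout=1)
print(cluster.messages)
```

With this shape, the custom reporting runs whether the user calls the method on the client or on the cluster.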

This suggestion arose from discussion with @jacobtomlinson in #6041.

fjetter added the labels good first issue, p2, and feature on May 16, 2022
idorrington92 mentioned this issue on Jul 9, 2022
consideRatio (Contributor) commented Dec 25, 2023

  • client.wait_for_workers should call cluster.wait_for_workers if there's a cluster object.

This should only apply if there is a cluster object that implements such a function, right? I think this led to a failure in dask_gateway, which hasn't yet added support for this. It can add support, but I think it is unexpectedly broken until it does.

This was reported in dask/dask-gateway#782.


Should dask/distributed be updated to handle a cluster object that does not implement this new feature (besides updating dask-gateway to support it)?
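One possible shape for such a guard, as a sketch only: the client delegates when the cluster object has a callable wait_for_workers and otherwise falls back to its own logic. LegacyCluster, ModernCluster, GuardedClient, and _wait_via_scheduler are hypothetical names made up for this example and do not reflect how dask/distributed or dask-gateway are actually implemented.

```python
class LegacyCluster:
    """Hypothetical cluster manager that predates cluster.wait_for_workers."""


class ModernCluster:
    """Hypothetical cluster manager that implements the new method."""

    def wait_for_workers(self, n_workers, timeout=None):
        return f"cluster waited for {n_workers}"


class GuardedClient:
    """Hypothetical client that tolerates clusters without the new method."""

    def __init__(self, cluster=None):
        self.cluster = cluster

    def _wait_via_scheduler(self, n_workers, timeout=None):
        # Placeholder for the client's original scheduler-polling logic.
        return f"client waited for {n_workers}"

    def wait_for_workers(self, n_workers, timeout=None):
        # getattr with a default also covers the case where cluster is None.
        method = getattr(self.cluster, "wait_for_workers", None)
        if callable(method):
            return method(n_workers, timeout)
        return self._wait_via_scheduler(n_workers, timeout)


print(GuardedClient(ModernCluster()).wait_for_workers(3))
print(GuardedClient(LegacyCluster()).wait_for_workers(3))
print(GuardedClient().wait_for_workers(3))
```

The trade-off is that a cluster manager without the method silently gets the old behaviour instead of an error, which seems to be what existing dask_gateway users would expect.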
