Add documentation about how horovod works #35

Closed
pescap opened this issue Mar 14, 2022 · 6 comments
Assignees
Labels
HPC TID on HPC stale

Comments

@pescap
Owner

pescap commented Mar 14, 2022

Here.

It could be interesting to add a bibliography, drawing on both academic papers and blog posts (Medium, etc.).

Try to find the best articles for understanding Horovod: how it works technically and how it is structured.

I suggest listing the references, for example:

Documentation

Meet Horovod: Uber’s Open Source Distributed Deep Learning Framework for TensorFlow
Add a summary of the key points, or of what you learned or understood from this article

Distributed Deep Learning with Horovod
summary

and so on.

You can start with these references (and add your notes gradually):

https://towardsdatascience.com/distributed-deep-learning-with-horovod-2d1eea004cb2
https://arxiv.org/pdf/1706.02677.pdf
Horovod Article: https://arxiv.org/pdf/1802.05799.pdf
Uber presentation: https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9321-distributed-deep-learning-with-horovod.pdf

Add more references if you find more.
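
As a first note on the Horovod article listed above: it builds gradient exchange on ring-allreduce, where each worker only talks to its neighbour in a ring. The sketch below simulates that idea in plain Python, with no Horovod, MPI, or NCCL involved. The function name `ring_allreduce` and the list-of-lists layout are assumptions made for illustration, not Horovod's API.

```python
def ring_allreduce(worker_data):
    """Simulate ring-allreduce (sum) over equal-length lists, one list per worker.

    Illustrative sketch only: real implementations exchange buffers over the
    network; here the "messages" are plain Python lists.
    """
    n = len(worker_data)
    length = len(worker_data[0])
    # Split every buffer into n contiguous chunks (some may be empty).
    bounds = [(i * length) // n for i in range(n + 1)]
    bufs = [list(v) for v in worker_data]  # copy so the inputs are not modified

    def chunk(idx):
        return range(bounds[idx], bounds[idx + 1])

    # Phase 1 (scatter-reduce): after n - 1 steps, worker r holds the
    # complete sum for chunk (r + 1) % n.
    for step in range(n - 1):
        # Build all messages first so every worker sends "simultaneously".
        msgs = []
        for r in range(n):
            c = (r - step) % n
            msgs.append((c, [bufs[r][i] for i in chunk(c)]))
        for r in range(n):
            dst = (r + 1) % n
            c, payload = msgs[r]
            for i, val in zip(chunk(c), payload):
                bufs[dst][i] += val

    # Phase 2 (allgather): pass the completed chunks around the ring so
    # every worker ends up with every fully reduced chunk.
    for step in range(n - 1):
        msgs = []
        for r in range(n):
            c = (r + 1 - step) % n
            msgs.append((c, [bufs[r][i] for i in chunk(c)]))
        for r in range(n):
            dst = (r + 1) % n
            c, payload = msgs[r]
            for i, val in zip(chunk(c), payload):
                bufs[dst][i] = val

    return bufs


# Three simulated workers, each holding a "gradient" of length 4.
grads = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
result = ring_allreduce(grads)
print(result[0])  # [111, 222, 333, 444]
```

Each worker sends 2 × (n − 1) messages of roughly 1/n of the buffer, so the traffic per worker stays close to constant as workers are added. Dividing the result by the number of workers would give the averaged gradient used in data-parallel training.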

Important. Read: lululxvi/deepxde#39

@pescap pescap added the HPC TID on HPC label Mar 14, 2022
@Setol21 Setol21 added this to the Documentation 0.1 milestone Mar 15, 2022
@pescap
Owner Author

pescap commented Mar 22, 2022

Horovod accelerated PINNs: NeuralSolvers.
Read the paper, comment on it, and understand the code. We aim to implement the same ideas with DeepXDE.

TensorDiffEq also supports MirroredStrategy: https://github.com/tensordiffeq/TensorDiffEq

Read: https://eng.uber.com/michelangelo-machine-learning-platform/

@Ehaw04
Collaborator

Ehaw04 commented Mar 22, 2022

[Uber Open Summit 2018] Horovod: Distributed Deep Learning in 5 Lines of Python:

https://www.youtube.com/watch?v=4y0TDK3KoCA&t=900s&ab_channel=UberEngineering
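
For reference, the "5 lines" in the title are usually taken to be the five changes marked below. This is a rough sketch assuming TensorFlow 2 with the `horovod.tensorflow.keras` module; the function name `train_with_horovod`, the toy model, and the random data are hypothetical and only there to make the steps concrete.

```python
def train_with_horovod(epochs=1):
    """Hypothetical toy training run; needs TensorFlow and Horovod installed."""
    # Imports sit inside the function so this file can still be loaded
    # on a machine where Horovod is not installed.
    import numpy as np
    import tensorflow as tf
    import horovod.tensorflow.keras as hvd

    # 1. Initialise Horovod (one process per GPU or per worker).
    hvd.init()

    # 2. Pin each process to a single GPU, if any are present.
    gpus = tf.config.list_physical_devices("GPU")
    if gpus:
        tf.config.set_visible_devices(gpus[hvd.local_rank()], "GPU")

    # Toy regression problem (assumption, for illustration only).
    x = np.random.rand(256, 4).astype("float32")
    y = x.sum(axis=1, keepdims=True)
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])

    # 3. Scale the learning rate by the number of workers.
    opt = tf.keras.optimizers.SGD(learning_rate=0.01 * hvd.size())

    # 4. Wrap the optimizer so gradients are averaged across workers.
    opt = hvd.DistributedOptimizer(opt)
    model.compile(optimizer=opt, loss="mse")

    # 5. Broadcast the initial weights from rank 0 so all workers start equal.
    callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]

    model.fit(
        x, y,
        batch_size=32,
        epochs=epochs,
        callbacks=callbacks,
        verbose=1 if hvd.rank() == 0 else 0,  # only rank 0 prints progress
    )
    return model
```

A script that calls `train_with_horovod()` would typically be launched with something like `horovodrun -np 4 python train.py`, where `train.py` is a placeholder file name.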

@github-actions

This issue is stale because it has been open for 7 days with no activity.

@github-actions github-actions bot added the stale label Mar 30, 2022
@github-actions

github-actions bot commented Apr 7, 2022

This issue was closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this as completed Apr 7, 2022
@pescap
Owner Author

pescap commented Apr 7, 2022

#58

@pescap pescap reopened this Apr 7, 2022
@github-actions

This issue was closed because it has been inactive for 7 days since being marked as stale.
