[RELAY][VM] Enable heterogeneous execution for Relay VM #6337
Conversation
I had a look, mostly at the Python sources.
Yay! I'm so excited for this! I'll do a deep dive today. There are a number of tests in tests/python/relay/dyn that skip running on GPU while waiting for this feature, e.g. https://github.com/apache/incubator-tvm/blob/942c90ba7a7b9bccf6d9bce43808aba2bd6c9787/tests/python/relay/dyn/test_dynamic_op_level3.py#L30-L31 Do you want to enable those as part of this PR? Or I can do it in a second PR.
@mbrookhart Thanks for the reminder. I just enabled all the dynamic op tests except for level 6, because topk has a problem on GPU for which I already have a TODO in test_any. We need to look into it later.
A few nitpicks. I'd like to see a little more documentation on the passes; I'm not sure I fully understand what you're doing just from looking at the code. But overall it looks really good, I'm excited!
Halfway through the PR. Will come back and review the rest.
lgtm
Thanks @zhiics @mbrookhart @leandron @jwfromm
Sorry for my delay! I've been out the last few days moving. Anyway, looking over what has changed since my last review, I'm happy to give it a post-merge approval; looks great! Thanks @icemelon9 and @zhiics. I'll start enabling the dynamic tests on GPU and work on fixing anything that fails (including topk).
* vm heterogeneous execution
* context analysis on module
* fix profiler
* fix memory plan
* add more unification
* add serialization
* add gpu tests for test_adt
* cache visited functions
* path compression
* C++ context analysis
* remove python context analysis
* add tests
* clean
* lint
* fix
* enable gpu test for dynamic namespace
* remove GetParamsContext
* fix comments and add doc for context analysis
* cache context
* cache allocator
* rebase and fix comments
Currently, dynamic models can only be executed on CPU. GPU execution is not allowed for these models because they contain shape functions that perform runtime type inference. These functions may contain various control logic to derive the shape of a tensor at runtime, and they are never compute intensive, so they are designed to execute on CPU. As a result, we must use the CPU to run these functions even when the rest of the model is targeted at another device. This PR enables heterogeneous execution for the Relay VM to support dynamic models on devices other than CPU.
More specifically, it includes the following changes:
Follow-up PRs will fix/add schedules for some ops to enable GPU execution for BERT and TF object detection models.
cc @icemelon9 @jroesch @mbrookhart @wweic
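The commit list mentions a context analysis pass with unification and path compression. Below is a toy sketch (not TVM's actual implementation, and all names here are hypothetical) of that union-find style idea: every value gets a device domain, ops that must share a device unify the domains of their inputs and outputs, and shape functions are pinned to CPU.

```python
# Toy union-find sketch of device/context analysis. This is an
# illustration of the technique only; TVM's real pass works on
# Relay IR and has many more cases.

class DeviceDomain:
    """A unification variable, optionally bound to a concrete device."""
    def __init__(self, device=None):
        self.device = device   # e.g. "cpu" or "gpu"; None = unconstrained
        self.parent = self     # union-find parent pointer

def find(d):
    # Walk to the root, compressing the path along the way.
    while d.parent is not d:
        d.parent = d.parent.parent
        d = d.parent
    return d

def unify(a, b):
    ra, rb = find(a), find(b)
    if ra is rb:
        return ra
    if ra.device is not None and rb.device is not None and ra.device != rb.device:
        raise ValueError(f"device conflict: {ra.device} vs {rb.device}")
    # Keep a bound domain (if any) as the new root so the constraint wins.
    root, child = (ra, rb) if ra.device is not None else (rb, ra)
    child.parent = root
    return root

# Example: out = add(x, y) is placed on GPU, so its inputs unify
# with the output's domain; the op's shape function stays on CPU.
x, y, out = DeviceDomain(), DeviceDomain(), DeviceDomain("gpu")
unify(x, out)
unify(y, out)
shape_func = DeviceDomain("cpu")   # shape functions are CPU-only

print(find(x).device)           # -> gpu
print(find(shape_func).device)  # -> cpu
```

The key property is that unconstrained domains inherit a device from any bound domain they are unified with, while two differently bound domains signal a placement conflict.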