
[TOPI] sparse_dense Op sparse_data input added #6889

Merged: 9 commits into apache:main from the sparse-32 branch on Dec 15, 2020

Conversation

@ANSHUMAN87 (Contributor)

Change Summary:
The current sparse_dense Op in TOPI supports only the weight as a sparse input.
This PR adds support for the data to be the sparse input as well.
So either operand can be sparse, with the other one dense.

This is a follow up PR from #6685
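
For anyone skimming the thread, here is a minimal NumPy/SciPy stand-in for the two flavors being discussed (illustration only, not the TOPI kernels; the shapes, the 10% density, and the "multiply by the transposed second operand" convention are assumptions drawn from the discussion below):

```python
import numpy as np
import scipy.sparse as sp

M, K, N = 64, 32, 16
S = sp.random(M, K, density=0.1, format="csr", dtype="float32")  # the sparse operand
D = np.random.rand(N, K).astype("float32")                       # the dense operand

# Existing flavor: dense data x sparse weight^T
out_weight_sparse = D @ S.toarray().T   # shape (N, M)

# New flavor in this PR: sparse data x dense weight^T
out_data_sparse = S.toarray() @ D.T     # shape (M, N)

print(out_weight_sparse.shape, out_data_sparse.shape)
```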

cc @tkonolige, @siju-samuel!

@ANSHUMAN87 force-pushed the sparse-32 branch 2 times, most recently from c3d8874 to 06f9911 on November 9, 2020 18:17
@tkonolige (Contributor)

I'm not sure that the correct course of action is to add a flag to sparse_dense to support AB^T with B sparse. This makes all the implementations of sparse_dense confusing because they now have to check this flag and use a compute/schedule depending on if it is enabled or not. I'd instead prefer to make a new op called dense_sparse that does AB^T with B sparse.

Alternatively, I don't really see a reason for supporting AB^T with B sparse directly. Instead, when we convert from a frontend to tvm, we can just insert the correct transposes. In a lot of ways this is a better choice because we do not need to reimplement the same operators and the operators prefer the data to be in this format. I think this is the best choice.
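
For context on the transpose alternative: since (X·Yᵀ)ᵀ = Y·Xᵀ, either orientation of the product can be rewritten in terms of the other, and (as clarified later in the thread) the extra transpose ends up on the dense output. A tiny NumPy check of that identity, purely illustrative:

```python
import numpy as np

X = np.random.rand(8, 4).astype("float32")
Y = np.random.rand(6, 4).astype("float32")

direct = X @ Y.T              # one orientation of the op
rewritten = (Y @ X.T).T       # the other orientation plus an output transpose

assert np.allclose(direct, rewritten)
```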

@ANSHUMAN87 (Contributor, Author) commented Nov 9, 2020

> I'm not sure that the correct course of action is to add a flag to sparse_dense to support AB^T with B sparse. This makes all the implementations of sparse_dense confusing because they now have to check this flag and use a compute/schedule depending on if it is enabled or not. I'd instead prefer to make a new op called dense_sparse that does AB^T with B sparse.
>
> Alternatively, I don't really see a reason for supporting AB^T with B sparse directly. Instead, when we convert from a frontend to tvm, we can just insert the correct transposes. In a lot of ways this is a better choice because we do not need to reimplement the same operators and the operators prefer the data to be in this format. I think this is the best choice.

Thanks @tkonolige for the response. I had a similar dilemma at first, but later decided to reuse the existing Op in order to share a small portion of code and keep the concept under one Op. That said, I am open to a 'dense_sparse' Op as well (though that name would arguably fit the existing Op 🙂). I also don't think there is much impact on the schedule side between these two flavors of the Op.

As for reusing the existing Op with a transpose, I felt we would be adding extra overhead every time, although it may not be much for smaller matrices.
Please let me know your thoughts on this. Thanks!

@tkonolige (Contributor)

I think it is still possible to reuse code with a separate op, but maybe to a lesser extent.

I'm not sure how much overhead the transposes will add to the code. We don't have to pay the cost of transposing the sparse matrix because that can be done at compile time (the sparse matrix is usually a weight). Could you maybe do some benchmarking and see if the duplication of code is worth it from a performance standpoint?

@ANSHUMAN87 (Contributor, Author)

Sure, I will check how much overhead is added when using the existing Op with a transpose.

@ANSHUMAN87 (Contributor, Author)

Hi @tkonolige, below is the benchmark data I have obtained for 4 different input dimensions.

NOTE: Here "with Transpose" means using the existing sparse_dense Op with an additional Transpose layer,
and "without Transpose" means using the new flavor of the Op implemented in the current PR.
All the data is collected with iteration=3000 and repeat=3.

Case 1 (sparse_input = [1024, 512], dense_input = [512, 4096]): [benchmark results attached as image]

Case 2 (sparse_input = [1024, 512], dense_input = [512, 1024]): [benchmark results attached as image]

Case 3 (sparse_input = [2048, 512], dense_input = [512, 2048]): [benchmark results attached as image]

Case 4 (sparse_input = [4096, 512], dense_input = [512, 4096]): [benchmark results attached as image]

Clearly, as the dimensions increase, the improvement becomes significant. So I think we should keep the new flavor of the Op implemented in the current PR.
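
(For reference, here is a rough SciPy-level analogue of this comparison, in case anyone wants to reproduce the shape of the experiment. The 5% density, the timeit setup, and the fact that only the extra output-transpose copy is modelled are my assumptions; the numbers above were measured with TVM at iteration=3000, repeat=3.)

```python
import timeit

import numpy as np
import scipy.sparse as sp

# Case 4 shapes from above; 5% density is an assumed value (not stated in the thread).
M, K, N = 4096, 512, 4096
S = sp.random(M, K, density=0.05, format="csr", dtype="float32")  # sparse_input
D = np.random.rand(K, N).astype("float32")                        # dense_input

def without_transpose():
    # new flavor: the sparse operand stays on the left, no extra layer
    return S.dot(D)

def with_transpose():
    # reuse-existing-op path: same multiply plus a materialised transpose of the
    # dense output (only this extra copy is being modelled here)
    return np.ascontiguousarray(S.dot(D).T)

for name, fn in [("without transpose", without_transpose),
                 ("with transpose", with_transpose)]:
    t = timeit.timeit(fn, number=30)
    print(f"{name}: {t / 30 * 1e3:.2f} ms")
```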

@tkonolige (Contributor)

Just to check, you're only transposing the dense matrix? Also, what is the density of the sparse matrix?

I'm curious, could you do a benchmark with a smaller batch size (dense_input=[16, 1024])? And could you also test on cuda? And if you have time, could you give a breakdown of time spent in the matrix multiply vs time spent in the transpose?

From these numbers, it does look like it is worthwhile to do this. It just means a lot more sparse kernels to write.

@ANSHUMAN87 (Contributor, Author) commented Nov 13, 2020

> Just to check, you're only transposing the dense matrix? Also, what is the density of the sparse matrix?
>
> I'm curious, could you do a benchmark with a smaller batch size (dense_input=[16, 1024])? And could you also test on cuda? And if you have time, could you give a breakdown of time spent in the matrix multiply vs time spent in the transpose?
>
> From these numbers, it does look like it is worthwhile to do this. It just means a lot more sparse kernels to write.

Yes, the additional transpose is done on the final output from the sparse_dense Op, which is a dense tensor.
CUDA scheduling currently has issues which I am working on; maybe I will handle it in the next PR.
However, I think a CUDA benchmark is not needed to conclude the benefit of the new Op added in this PR.

A lower-dimension case (sparse_input = [4096, 1024], dense_input = [1024, 16]), without transpose: [benchmark results attached as image]

With transpose: [benchmark results attached as image]

With this, I think we can start the code review.

@ANSHUMAN87 (Contributor, Author)

Gentle ping @tkonolige!
I am not sure which other official TVM reviewers or committers are interested in sparse. If you are aware of anyone, please feel free to invite them. TIA!

@tkonolige (Contributor) left a review:

I'm still not sold on this approach. But maybe someone else can chime in. If we do go with the sparse_data flag, I think we should rename it to something like sparse_rhs or sparse_b. Or we could do the same thing as tf and have adjoint_a and adjoint_b.
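
(For reference, the TensorFlow convention mentioned here is a single op with adjoint flags rather than separate sparse×dense / dense×sparse ops. The sketch below is from memory of tf.sparse.sparse_dense_matmul, so treat the exact signature as an assumption and check the TF docs.)

```python
import tensorflow as tf

sp_a = tf.sparse.from_dense(tf.constant([[1.0, 0.0], [0.0, 2.0]]))
b = tf.constant([[3.0, 4.0], [5.0, 6.0]])

y1 = tf.sparse.sparse_dense_matmul(sp_a, b)                  # A @ B
y2 = tf.sparse.sparse_dense_matmul(sp_a, b, adjoint_b=True)  # A @ B^T
```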

Review threads (outdated, resolved): src/relay/op/nn/sparse.cc, python/tvm/topi/nn/sparse.py (2), python/tvm/relay/op/nn/nn.py.
@ANSHUMAN87 (Contributor, Author)

> But maybe someone else can chime in.

Thanks @tkonolige for your feedback. I believe the performance stats make a clear case for opting for a new Op here.
However, if you have any concerns, you can always let me know, either here or offline, and we can discuss in more detail to reach a more convincing conclusion.

@tqchen: Tristan is helping here with his valuable review efforts, but I think we need a third opinion (possibly from an official reviewer or committer) to help move the PR to the next level. As I am not sure who might be interested in sparse changes, could you help tag a few people here? TIA!

@tkonolige (Contributor)

Sorry, I meant that I'm not sure about the flag approach vs having a separate operator for sparse x dense vs dense x sparse. From your benchmarks, it does look like we need some way of doing dense x sparse directly.

@ANSHUMAN87 (Contributor, Author)

> @tqchen: Tristan is helping here with his valuable review efforts, but I think we need a third opinion (possibly from an official reviewer or committer) to help move the PR to the next level. As I am not sure who might be interested in sparse changes, could you help tag a few people here? TIA!

Gentle ping @tqchen !

Review threads (outdated, resolved): python/tvm/relay/op/nn/nn.py (2), python/tvm/topi/nn/sparse.py (4).
@ANSHUMAN87 (Contributor, Author)

@tkonolige: All your comments are addressed. Would you please check and approve, so that we can proceed with this PR?

@tkonolige (Contributor) left a review:

Looks good. I've left a couple comments. I'm approving, but I'm not a committer, so you'll need someone else to approve too.

Could you also add a test to tests/python/topi/python/test_topi_sparse.py with sparse_lhs=True?
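
(A sketch of the NumPy/SciPy reference side such a test could compare the TOPI output against; the shapes, density, and the assumed sparse_lhs=True semantics (sparse data times dense weight^T) are my reading of this PR, and the real test would follow the existing cases in tests/python/topi/python/test_topi_sparse.py.)

```python
import numpy as np
import scipy.sparse as sp

M, K, N = 128, 64, 32
sparse_data = sp.random(M, K, density=0.1, format="bsr", dtype="float32")
dense_weight = np.random.rand(N, K).astype("float32")

# Expected output for the sparse_lhs=True flavor: sparse data times dense weight^T.
expected = sparse_data.toarray() @ dense_weight.T   # shape (M, N)
# np.testing.assert_allclose(topi_result, expected, rtol=1e-5)  # TOPI side omitted here
```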

Review threads (outdated, resolved): python/tvm/relay/op/nn/nn.py, python/tvm/topi/cuda/sparse.py.
@ANSHUMAN87 (Contributor, Author)

> Could you also add a test to tests/python/topi/python/test_topi_sparse.py with sparse_lhs=True?

Sorry for such a delayed response. I have now addressed your remaining comments too!

@ANSHUMAN87 (Contributor, Author) commented Dec 8, 2020

@tqchen, @jroesch, @FrozenGene, @junrushao1994: Would you please help move the PR forward? TIA!

@FrozenGene (Member)

> @tqchen, @jroesch, @FrozenGene, @junrushao1994: Would you please help move the PR forward? TIA!

Just saw your mention; sorry for the late reply. I will do one round of review tomorrow.

@FrozenGene (Member) left a review:

Generally LGTM. @antinucleon, could you help do one round of review, as you have done some work on sparse?

Review thread (resolved): include/tvm/relay/attrs/nn.h.
@ANSHUMAN87 (Contributor, Author)

Gentle ping @antinucleon, @FrozenGene, @tqchen, @jroesch!
Unless there are any further comments, I think we can proceed with the merge, as we have approvals from @tkonolige and @FrozenGene!
I have a lot to improve upon after this merge, partially dependent on this PR!

@FrozenGene merged commit 862655b into apache:main on Dec 15, 2020
@FrozenGene (Member)

@ANSHUMAN87 please go ahead

@ANSHUMAN87 (Contributor, Author)

> @ANSHUMAN87 please go ahead

Thanks a lot @FrozenGene, @tkonolige!

TusharKanekiDey pushed a commit to TusharKanekiDey/tvm that referenced this pull request Jan 20, 2021
* [TOPI] sparse_dense op sparse_data input added

* [1] clang issue resolved

* [2] python format resolved

* [3] lint error resolved

* [4] Review comments handled

* [5] Lint error resolved

* [6] Review comments handled

* [7] Review comments handled

* [8] Review comments handled
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Jan 21, 2021
electriclilies pushed a commit to electriclilies/tvm that referenced this pull request Feb 18, 2021