Make sure reductions benefit from sparsity #244

dkarrasch · 2022-09-01T12:25:16Z

Fixes #237.

src/sparsematrix.jl

SobhanMP · 2022-09-01T17:12:24Z

maybe also for vectors?

Co-authored-by: Sobhan Mohammadpour <sobhan.mohammadpour@umontreal.ca>

dkarrasch · 2022-09-01T17:18:26Z

I took the AdjOrTrans stuff from your PR, not sure why the coauthorship doesn't show up correctly.

So, sum, all, any etc. over the entire array are all instances of mapreduce, so things need to be managed at a lower level. In fact, I believe they are handled quite well already. One could think about specializing even more and have mapreduce(..., typeof(|), ...) and mapreduce(..., typeof(&), ...), which correspond to any and all, respectively, be computed via the difference between length and nnz or something.

EDIT: I don't think that will work, because nonzeros(x) can still contain stored zeros.
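To illustrate the EDIT above, here is a minimal sketch (not code from this PR) of why `nnz` alone cannot answer `any`: stored entries may hold zeros, so a non-empty sparsity pattern does not imply a nonzero value. The matrix `A` and the variable names are made up for the example.

```julia
using SparseArrays

# 3x3 Bool matrix with two stored `true` entries on the diagonal.
A = sparse([1, 2], [1, 2], [true, true], 3, 3)

# Overwrite the stored values in place; the sparsity pattern is untouched,
# so both entries remain stored even though they are now `false`.
nonzeros(A) .= false

stored  = nnz(A)    # number of stored entries, stored zeros included
truthy  = count(A)  # number of entries that are actually `true`
has_any = any(A)    # what `any` has to report

# `dropzeros!` removes stored zeros from the structure.
dropzeros!(A)
stored_after = nnz(A)

@show stored truthy has_any stored_after
```

So a shortcut such as `nnz(A) > 0` would only be valid after something like `dropzeros!`, which a reduction should not apply to its argument.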

SobhanMP · 2022-09-01T17:21:01Z

the only thing bugging me is any on a matrix

julia> using Revise, SparseArrays, LinearAlgebra
       for f in [sum, any, all],
           t in [Int, Float64, Bool],
           a in [
               sprand(t, 100000, 100000, 0.00000001),
               sprand(t, 100000, 0.0001),
           ]
           f != sum && t != Bool && continue
           @show f
           f(a)
           f(transpose(a))
           f(a.nzval)
           @time f(a)
           @time f(transpose(a))
           @time f(a.nzval)
       end
[ Info: Precompiling SparseArrays [3f01184e-e22b-5df5-ae63-d93ebab69eaf]
f = sum
  0.000005 seconds (1 allocation: 16 bytes)
  0.000006 seconds (2 allocations: 64 bytes)
  0.000007 seconds (2 allocations: 64 bytes)
f = sum
  0.000003 seconds (1 allocation: 16 bytes)
  0.000003 seconds (2 allocations: 48 bytes)
  0.000003 seconds (2 allocations: 48 bytes)
f = sum
  0.000004 seconds (1 allocation: 16 bytes)
  0.000002 seconds (2 allocations: 64 bytes)
  0.000004 seconds (2 allocations: 64 bytes)
f = sum
  0.000001 seconds (1 allocation: 16 bytes)
  0.000002 seconds (2 allocations: 48 bytes)
  0.000002 seconds (2 allocations: 48 bytes)
f = sum
  0.000004 seconds
  0.000005 seconds (1 allocation: 48 bytes)
  0.000005 seconds (1 allocation: 48 bytes)
f = sum
  0.000239 seconds
  0.000232 seconds (1 allocation: 32 bytes)
  0.000003 seconds (1 allocation: 32 bytes)
f = any
  0.067989 seconds
  0.223590 seconds (1 allocation: 48 bytes)
  0.000019 seconds (1 allocation: 48 bytes)
f = any
  0.000007 seconds
  0.000006 seconds (1 allocation: 32 bytes)
  0.000001 seconds (1 allocation: 32 bytes)
f = all
  0.000002 seconds
  0.000003 seconds (1 allocation: 48 bytes)
  0.000002 seconds (1 allocation: 48 bytes)
f = all
  0.000003 seconds
  0.000003 seconds (1 allocation: 32 bytes)
  0.000002 seconds (1 allocation: 32 bytes)

i agree it's better to handle things at a lower level

i think it's fine?

dkarrasch · 2022-09-01T17:26:21Z

> i think it's fine?

I thought it should show the portrait. 😄

Are the timings without compilation?

SobhanMP · 2022-09-01T17:26:43Z

i think so (updated)

codecov-commenter · 2022-09-01T17:36:02Z

Codecov Report

Merging #244 (170a811) into main (dfcc48a) will increase coverage by 0.22%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #244      +/-   ##
==========================================
+ Coverage   91.82%   92.05%   +0.22%     
==========================================
  Files          12       12              
  Lines        7307     7314       +7     
==========================================
+ Hits         6710     6733      +23     
+ Misses        597      581      -16

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/sparsematrix.jl | `95.37% <100.00%> (+0.67%)` | ⬆️ |
| src/sparsevector.jl | `95.14% <100.00%> (+0.01%)` | ⬆️ |

dkarrasch · 2022-09-01T19:22:49Z

Wonderful benchmark! The issue with any (and all, though less visible because it drops out early) is that

https://github.com/JuliaLang/julia/blob/71131c97cb00483597fcd357625c054693171aab/base/reduce.jl#L1210-L1221

has precedence over

https://github.com/JuliaLang/julia/blob/71131c97cb00483597fcd357625c054693171aab/base/reducedim.jl#L1010-L1025

see line 1023, because the latter is completely generic. So we need to hook in here to avoid that iterator-based implementation and redirect to mapreduce.
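As a rough sketch of the equivalence such a redirect relies on (this is not the hook added by the PR): `any` over a Bool array agrees with `mapreduce(identity, |, ...)` and `all` agrees with `mapreduce(identity, &, ...)`. The helper names `any_via_reduce` and `all_via_reduce` and the three test matrices are hypothetical, chosen only for illustration.

```julia
using SparseArrays

# Hypothetical helpers: express `any`/`all` as plain mapreduce calls.
any_via_reduce(X) = mapreduce(identity, |, X)
all_via_reduce(X) = mapreduce(identity, &, X)

A = sparse([2], [3], [true], 4, 5)      # a single stored `true`
Z = spzeros(Bool, 4, 5)                 # no stored entries at all
F = sparse([1, 2, 1, 2], [1, 1, 2, 2],  # every entry stored and `true`
           [true, true, true, true], 2, 2)

for X in (A, Z, F)
    # The mapreduce formulation must give the same answer as the
    # iterator-based `any`/`all`.
    @assert any_via_reduce(X) == any(X)
    @assert all_via_reduce(X) == all(X)
end
```

The sketch only checks that the results agree; whether the `mapreduce` path is actually faster depends on which methods are installed for the array type.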

dkarrasch · 2022-09-01T19:50:56Z

Could you double check your benchmark? I still can't get a local dev version running.

SobhanMP · 2022-09-01T20:02:22Z

nvm, give me a sec.

SobhanMP · 2022-09-01T20:04:29Z

yup looks good

julia> using Revise, SparseArrays, LinearAlgebra
       for f in [sum, any, all],
           t in [Int, Float64, Bool],
           a in [
               sprand(t, 100000, 100000, 0.00000001),
               sprand(t, 100000, 0.0001),
           ]
           f != sum && t != Bool && continue
           println("\n\n")
           @show f, typeof(a)
           f(a)
           f(transpose(a))
           f(a.nzval)
       
           @time f(a)
           # f(view(a, axes(a)...))
           # @time f(view(a, axes(a)...))
           @time f(transpose(a))
           @time f(a.nzval)
       end
[ Info: Precompiling SparseArrays [3f01184e-e22b-5df5-ae63-d93ebab69eaf]



(f, typeof(a)) = (sum, SparseMatrixCSC{Int64, Int64})
  0.000004 seconds (1 allocation: 16 bytes)
  0.000007 seconds (2 allocations: 64 bytes)
  0.000008 seconds (2 allocations: 64 bytes)



(f, typeof(a)) = (sum, SparseVector{Int64, Int64})
  0.000005 seconds (1 allocation: 16 bytes)
  0.000004 seconds (2 allocations: 48 bytes)
  0.000005 seconds (2 allocations: 48 bytes)



(f, typeof(a)) = (sum, SparseMatrixCSC{Float64, Int64})
  0.000005 seconds (1 allocation: 16 bytes)
  0.000004 seconds (2 allocations: 64 bytes)
  0.000005 seconds (2 allocations: 64 bytes)



(f, typeof(a)) = (sum, SparseVector{Float64, Int64})
  0.000006 seconds (1 allocation: 16 bytes)
  0.000004 seconds (2 allocations: 48 bytes)
  0.000005 seconds (2 allocations: 48 bytes)



(f, typeof(a)) = (sum, SparseMatrixCSC{Bool, Int64})
  0.000004 seconds
  0.000003 seconds (1 allocation: 48 bytes)
  0.000004 seconds (1 allocation: 48 bytes)



(f, typeof(a)) = (sum, SparseVector{Bool, Int64})
  0.000004 seconds
  0.000004 seconds (1 allocation: 32 bytes)
  0.000004 seconds (1 allocation: 32 bytes)



(f, typeof(a)) = (any, SparseMatrixCSC{Bool, Int64})
  0.000012 seconds (2 allocations: 64 bytes)
  0.000007 seconds (3 allocations: 112 bytes)
  0.000003 seconds (1 allocation: 48 bytes)



(f, typeof(a)) = (any, SparseVector{Bool, Int64})
  0.000003 seconds
  0.000003 seconds (1 allocation: 32 bytes)
  0.000001 seconds (1 allocation: 32 bytes)



(f, typeof(a)) = (all, SparseMatrixCSC{Bool, Int64})
  0.000007 seconds (2 allocations: 64 bytes)
  0.000004 seconds (3 allocations: 112 bytes)
  0.000002 seconds (1 allocation: 48 bytes)



(f, typeof(a)) = (all, SparseVector{Bool, Int64})
  0.000004 seconds
  0.000004 seconds (1 allocation: 32 bytes)
  0.000003 seconds (1 allocation: 32 bytes)

src/sparsematrix.jl

src/sparsevector.jl

dkarrasch · 2022-09-06T14:46:21Z

@SobhanMP Could you please check that performance is good, without the transpose cases? If yes, then I think this is ready to go.

dkarrasch · 2022-09-06T16:23:59Z

I realized we already have

https://github.com/JuliaLang/julia/blob/fa3981bf83a016e2fb48f51204ccbf9d8d66397c/stdlib/LinearAlgebra/src/adjtrans.jl#L382-L385

so the desired adjoint/transpose behavior should already be included. So, even if we don't have tests for which specific code route should be taken, we should test that reduction over adjoints of sparse matrices is fast.
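A minimal sketch of what such a test could start from, assuming small hand-built matrices `A` and `B` (hypothetical, not taken from the PR's test suite). It only checks that reductions over the wrappers agree with reductions over the parent; a timing assertion is left out because any threshold would be machine-dependent.

```julia
using SparseArrays, LinearAlgebra, Test

# Integer matrix, so the sum does not depend on the reduction order.
A = sparse([1, 3, 2], [1, 2, 4], [5, -2, 7], 3, 4)
# Bool matrix with a single stored `true`.
B = sparse([2], [3], [true], 4, 5)

@testset "reductions over adjoint/transpose of sparse matrices" begin
    @test sum(transpose(A)) == sum(A)
    @test sum(A') == sum(A)
    @test any(transpose(B)) == any(B)
    @test any(B') == any(B)
    @test all(transpose(B)) == all(B)
    @test all(B') == all(B)
end
```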

dkarrasch · 2022-09-16T10:40:40Z

Together with JuliaLang/julia#46605, all benchmarks run within nanoseconds and plain sparse arrays and their transpose take pretty much the same amount of time. Let's go with this.

This patch updates SparseArrays. In particular it contains JuliaSparse/SparseArrays.jl#260 which is necessary to make progress in #46759.

All changes:

- Fix ambiguities with Base. (JuliaSparse/SparseArrays.jl#268)
- add == for vectors (JuliaSparse/SparseArrays.jl#248)
- add undef initializers (JuliaSparse/SparseArrays.jl#263)
- Make sure reductions benefit from sparsity (JuliaSparse/SparseArrays.jl#244)
- Remove fkeep! from the documentation (JuliaSparse/SparseArrays.jl#261)
- Fix direction of circshift (JuliaSparse/SparseArrays.jl#260)
- Fix `vcat` of sparse vectors with numbers (JuliaSparse/SparseArrays.jl#253)
- decrement should always return a vector (JuliaSparse/SparseArrays.jl#241)
- change order of arguments in fkeep, fix bug with fixed elements (JuliaSparse/SparseArrays.jl#240)
- Sparse matrix/vectors with fixed sparsity pattern. (JuliaSparse/SparseArrays.jl#201)

Add count w/o predicate

934a681

fredrikekre reviewed Sep 1, 2022

src/sparsematrix.jl (outdated)

dkarrasch added 3 commits September 1, 2022 15:38

lower hook, handle adjortrans

8d240da

use more fallbacks

422fddd

add iszero for sparse vectors

c1a0a23

handle mapreduce for AdjOrTrans

6484bbd

Co-authored-by: Sobhan Mohammadpour <sobhan.mohammadpour@umontreal.ca>

dkarrasch force-pushed the dk/count branch from 4d97410 to 6484bbd Compare September 1, 2022 17:13

SobhanMP mentioned this pull request Sep 1, 2022

add faster sum/any/all/iszero #243

Closed

fix performance of all and any

9d17f12

SobhanMP reviewed Sep 1, 2022

src/sparsematrix.jl

SobhanMP reviewed Sep 1, 2022

src/sparsevector.jl (outdated)

dkarrasch added 2 commits September 2, 2022 11:13

rm AdjOrTrans handling

33dec50

same with _any and _all

6a3bc99

SobhanMP mentioned this pull request Sep 2, 2022

Fix mapreduce on AdjOrTrans JuliaLang/julia#46605

Merged

dkarrasch added 2 commits September 5, 2022 19:54

more tests

3406762

improve coverage

3cd18db

dkarrasch changed the title ~~Add count w/o predicate~~ Make sure reductions benefit from sparsity Sep 6, 2022

dkarrasch added 3 commits September 6, 2022 10:48

fix copy-paste errors

ae1f7ef

fix test?

1db38ba

fix all-zero case

170a811

dkarrasch merged commit 0d63db0 into main Sep 16, 2022

dkarrasch deleted the dk/count branch September 16, 2022 10:42

fredrikekre mentioned this pull request Sep 22, 2022

Update SparseArrays dependency JuliaLang/julia#46790

Merged
