Support matrix multiplication (Continue #93) #146

Open · wants to merge 2 commits into master
Conversation

@jishnub (Member) commented Sep 14, 2020

This PR continues #93, along the lines of strategy 2 mentioned here, i.e. the axes of the two terms have to be compatible. The following are now possible:

julia> a = [1 2; 3 4];

julia> oa = OffsetArray(a, (2, 2))
2×2 OffsetArray(::Array{Int64,2}, 3:4, 3:4) with eltype Int64 with indices 3:4×3:4:
 1  2
 3  4

julia> oa * oa
2×2 OffsetArray(::Array{Int64,2}, 3:4, 3:4) with eltype Int64 with indices 3:4×3:4:
  7  10
 15  22

julia> v = [5, 6];

julia> ov = OffsetVector(v, (2,));

julia> oa * ov
2-element OffsetArray(::Array{Int64,1}, 3:4) with eltype Int64 with indices 3:4:
 17
 39

julia> oa2 = OffsetArray(a, (0, 0));

julia> oa2 * a
2×2 Array{Int64,2}:
  7  10
 15  22

julia> oa2 * v
2-element OffsetArray(::Array{Int64,1}, 1:2) with eltype Int64 with indices 1:2:
 17
 39

julia> v' * oa2
1×2 Adjoint{Int64,Array{Int64,1}}:
 23  34

julia> ov' * ov
61

Still WIP; this doesn't play well with Adjoint and Transpose yet.

To-do:

  • oa' * ov
  • oa' * oa

Solving these might be possible only with an explicit dependence on LinearAlgebra. I guess that's ok?

This is actually somewhat less ambitious than the original PR, as products with arbitrary matrices are not handled here and default to LinearAlgebra.
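For reference, here is a minimal sketch of the strategy-2 approach, using an illustrative function name rather than the actual * methods in the diff (offset_matmul is hypothetical, and the real definitions cover more combinations of argument types):

using OffsetArrays

# Illustrative sketch only: the inner axes must agree, and the result carries the
# outer offsets. Not the exact code added in this PR.
function offset_matmul(A::OffsetArray{<:Any,2}, B::OffsetArray{<:Any,2})
    axes(A, 2) == axes(B, 1) ||
        throw(DimensionMismatch("axes(A, 2) must match axes(B, 1)"))
    C = parent(A) * parent(B)                      # multiply the parents (assumed 1-based)
    return OffsetArray(C, axes(A, 1), axes(B, 2))  # re-apply the outer offsets
end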

@codecov (bot) commented Sep 14, 2020

Codecov Report

Merging #146 into master will increase coverage by 0.63%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #146      +/-   ##
==========================================
+ Coverage   93.60%   94.23%   +0.63%     
==========================================
  Files           2        2              
  Lines         250      260      +10     
==========================================
+ Hits          234      245      +11     
+ Misses         16       15       -1     
Impacted Files Coverage Δ
src/OffsetArrays.jl 94.90% <100.00%> (+0.73%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@johnnychen94 (Member) left a comment

The additional overhead comes from #144; I think we may need a way to opt out of the overflow check.

> Solving these might be possible only with an explicit dependence on LinearAlgebra. I guess that's ok?

This looks okay to me.

Ryan Bennink and others added 2 commits September 15, 2020 09:37
- New methods for * for combinations of OffsetArray and AbstractArray
  Unfortunately this introduces method ambiguities with Base.
- As an alternative, tried defining a new promote_rule, but somehow it doesn't get invoked.

New convenience constructors
- Wrap any (non-offset) array as an OffsetArray with standard indices
- Remove layer of indirection when constructing with explicit offsets

Cosmetic edits to constructor section for improved readability
@jishnub (Member, Author) commented Sep 15, 2020

I'm starting to think that we should not solve the matrix multiplications involving adjoint and transpose here. These should really be solved in LinearAlgebra. The reason some of these work at all is that LinearAlgebra already supports them.

In any case, I'm not too sure how to approach arbitrarily nested OffsetArrays with respect to dispatch.

@johnnychen94 (Member) commented Sep 15, 2020

I have the same feeling that this kind of support (including the methods you've added) should be done upstream, in Base or LinearAlgebra. Supporting these operations in OffsetArrays is like playing whack-a-mole.

I would like to get @mcabbott involved for some advice.

@mcabbott commented:

Yes, it would be nice if this worked. My impression is that * and mul! are teetering right on the edge of what's possible with multiple dispatch without going crazy, so trying to make this work by adding methods might be too fragile. I haven't looked, but could it be done somewhere lower down? Keep everything generic, down to similar(..., axes(A,1), ...), and then check once somewhere inside mul! before passing on down to gemm! and friends?

@johnnychen94 (Member) commented Sep 15, 2020

I think what's missing here is a protocol in LinearAlgebra for dealing generically with arrays that have offsets. Currently it only disables such arrays.

The one I just made is:

# LinearAlgebra
offsetted(A::AbstractArray, offsets) = A
unoffsetted(A::AbstractArray) = A, ntuple(_->0, ndims(A))

# OffsetArrays

# this isn't right because it overrides the function definition above
offsetted(A::AbstractArray{T, N}, offsets) where {T, N} = OffsetArray{T, N, typeof(A)}(A, offsets)
unoffsetted(A::OffsetArray) = A.parent, A.offsets

The name is perhaps not good, but you get the idea.

For a simple experiment with 2×2 matrices:

# https://github.com/JuliaLang/julia/blob/8bdf5695e25e77d53f8934d9f135858d174e30d1/stdlib/LinearAlgebra/src/matmul.jl#L914-L950

+ offsetted(A::AbstractArray, offsets) = A
+ unoffsetted(A::AbstractArray) = A, ntuple(_->0, ndims(A))
+ 
 function matmul2x2!(C::AbstractMatrix, tA, tB, A::AbstractMatrix, B::AbstractMatrix,
                     _add::MulAddMul = MulAddMul())
+     A, Ao = unoffsetted(A)
+     B, Bo = unoffsetted(B)
+ 
+     if !(Ao[2] == Bo[1])
+         throw(ArgumentError("offset doesn't match"))
+     end
-  require_one_based_indexing(C, A, B)

 ...
-      C
+     offsetted(C, (Ao[1], Bo[2]))
end

# OffsetArrays
+ import LinearAlgebra
+ LinearAlgebra.offsetted(A::AbstractArray{T, N}, offsets) where {T, N} = OffsetArray{T, N, typeof(A)}(A, offsets)
+ LinearAlgebra.unoffsetted(A::OffsetArray) = A.parent, A.offsets

and it seems to work pretty well:

julia> ao = ones(0:1, 0:1)
2×2 OffsetArray(::Matrix{Float64}, 0:1, 0:1) with eltype Float64 with indices 0:1×0:1:
 1.0  1.0
 1.0  1.0

julia> ao * ao
2×2 OffsetArray(::Matrix{Float64}, 0:1, 0:1) with eltype Float64 with indices 0:1×0:1:
 2.0  2.0
 2.0  2.0

I'm not very sure whether this goes to the BLAS call for large sizes; performance might be an issue. This is just a simple experiment, and I haven't checked that yet.

Any ideas on how to properly dispatch the offsetted protocol?

@mcabbott commented:

Can't this be simpler though? mul! simply has to check that axes(A,2) == axes(B,1) etc, and then strip everything off (I guess with some unoffset function which OffsetArrays can extend), before proceeding as usual. No problem if only A is an OffsetArray, as long as OneTo(10) == 1:10 etc.

And * makes a new array before calling mul!, which it can do with something like C = similar(A, axes(A,1), axes(B,2)) etc, instead of size. This results in C::OffsetArray, which is sure to pass mul!'s checks. It never has to put a wrapper back on the result.

At least for ordinary Arrays, I guess StaticArrays's * doesn't go to mul!. Not sure about sparse.
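A rough sketch of that plan, assuming OffsetArrays.no_offset_view as the stripping function and an illustrative top-level name (this is not LinearAlgebra's actual code):

using LinearAlgebra
using OffsetArrays

# Sketch: allocate the output from axes rather than sizes, check the axes once,
# then strip the offsets and let the existing 1-based machinery do the work.
function offset_aware_matmul(A::AbstractMatrix, B::AbstractMatrix)
    axes(A, 2) == axes(B, 1) ||
        throw(DimensionMismatch("axes(A, 2) must match axes(B, 1)"))
    T = Base.promote_op(*, eltype(A), eltype(B))
    C = similar(A, T, axes(A, 1), axes(B, 2))   # an OffsetArray if A has offsets
    mul!(OffsetArrays.no_offset_view(C),        # writes into C's memory via a 1-based view
         OffsetArrays.no_offset_view(A),
         OffsetArrays.no_offset_view(B))
    return C
end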

@johnnychen94 (Member) commented Sep 16, 2020

A lot of code in LinearAlgebra assumes 1-based indexing. Only stripping the offsets off might not be sufficient, and it could take a lot more effort to make things correct.

For example,

# https://github.com/JuliaLang/julia/blob/8bdf5695e25e77d53f8934d9f135858d174e30d1/stdlib/LinearAlgebra/src/matmul.jl#L927-L928
        A11 = copy(A[1,1]'); A12 = copy(A[2,1]')
        A21 = copy(A[1,2]'); A22 = copy(A[2,2]')
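To make the failure mode concrete, here is a small illustrative example (not from the PR): literal 1-based indices on an offset array either address a different element than intended or fall out of bounds.

using OffsetArrays

A = OffsetArray([1 2; 3 4], 0:1, 0:1)   # axes are 0:1 × 0:1

A[0, 0]    # == 1, the actual first element
A[1, 1]    # == 4, the last element, not the first
# A[2, 2]  # would throw a BoundsError: 2 is outside the axis 0:1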

@timholy (Member) commented Sep 19, 2020

I really don't think this should be fixed here. It just needs someone to go through LinearAlgebra and start generalizing the 1-based assumptions. I'm not saying that's a small job (there's a reason it hasn't been done yet), but it's the only reasonable path to get to where we want to be.

@mcabbott commented:

Here's a sketch of what I was trying to say above:

https://gist.github.com/mcabbott/03225572217660aa7694f46caaa6e15f

There aren't so many calls to similar which need to be changed from size to axes. Anything with offsets will miss the StridedMatrix etc. methods of mul!, and can be caught just before it goes to generic_matmatmul!. OffsetArrays would just need to overload one function, remove_offset_indexing, and of course similar. I presume the if Base.has_offset_axes(C, A, B) check will get compiled away.

Is there a flaw in this plan?
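For concreteness, the OffsetArrays side of that plan could be as small as the following sketch (the name remove_offset_indexing comes from the gist; the method bodies here are illustrative, not the gist's code):

using OffsetArrays

remove_offset_indexing(A::AbstractArray) = A          # generic fallback (would live in LinearAlgebra)
remove_offset_indexing(A::OffsetArray)   = parent(A)  # the one OffsetArrays overload (assumes a 1-based parent)

# mul! would then check once that the axes of C, A, B agree whenever
# Base.has_offset_axes(C, A, B) is true, strip via remove_offset_indexing,
# and hand the 1-based arrays on to generic_matmatmul! as usual.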

@timholy (Member) commented Sep 19, 2020

That's not bad; it's certainly more generic than having each package that uses offset axes implement the same linear algebra functions over and over again. But fundamentally I don't see a need to "strip" unconventional axes; indeed, some day (perhaps when Array gets rewritten in pure Julia) maybe even Julia's own default arrays will support offset axes. In such circumstances there may not be an outer container to strip away: in that case https://gist.github.com/mcabbott/03225572217660aa7694f46caaa6e15f#file-offset_matmul-jl-L71 may return arrays of the same type as the inputs, and then how do you avoid a StackOverflowError?

To me it seems the sensible thing to do is start going through the multiplication routines and generalize them to use firstindex rather than 1, and lastindex instead of size(A, n). Then you're designing algorithms based on behavior rather than on types, which is usually more generalizable across changes in implementation elsewhere in the ecosystem.
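As a small illustration of that style (a naive triple loop, not a proposal for LinearAlgebra's tuned kernels), code written against axes handles offset and 1-based inputs alike:

# Naive sketch: everything is expressed via axes, so no 1-based assumption remains.
function generic_matmul(A::AbstractMatrix, B::AbstractMatrix)
    axes(A, 2) == axes(B, 1) ||
        throw(DimensionMismatch("inner axes of A and B must match"))
    T = Base.promote_op(*, eltype(A), eltype(B))
    C = fill!(similar(A, T, axes(A, 1), axes(B, 2)), zero(T))
    for j in axes(B, 2), k in axes(A, 2), i in axes(A, 1)
        C[i, j] += A[i, k] * B[k, j]
    end
    return C
end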

@mcabbott commented Sep 19, 2020

Thanks for having a look. I guess this is an attempt to change as little as possible. It wouldn't preclude doing a better matmul3x3! later... but wouldn't you still need something like this unwrapping story before handing pointers to gemm!, for the cases when you can?

The contract for remove_offset_indexing is that it has to return an array for which has_offset_axes is false. It can come with a warning sign... But as long as you don't define a method for this, it should fall back to the same test and the same error as you get now:

using AxisArrays # a wrapper with no remove_offset_indexing method
aa = AxisArray(rand(2,2), x='a':'b', y=10:11)
@which aa' * aa # AbstractArray, no special methods, no special axes

ao = AxisArray(zeros(0:1, 3:4), x='a':'b', y=10:11) # with indices 0:1×3:4
ao' * ao # ArgumentError: offset arrays are not supported

It's a bit awkward that there's no simple way here for the OffsetArrays method to be used when OffsetArray is not the outermost wrapper. Maybe there's a better design?

If mul! called something like remove_offsets(A, axes(A)), then perhaps OffsetArrays could instead overload remove_offsets(::Any, ::Tuple{IdOffsetRange}) and apply the inverse wrapper... that seems a bit crazy, and it won't dispatch to ::StridedMatrix etc.

@timholy (Member) commented Sep 19, 2020

> unwrapping story before handing pointers to gemm!, for cases when you can?

Since gemm! only needs pointers, you still don't have to explicitly strip the axes, as long as there is an appropriate pointer method.

@mcabbott commented:

I guess I was thinking of the pointer as being the maximally unwrapped form, just the memory (& the eltype). Before you pass that along, you'd need some logic to be sure they are all in the same convention. I suppose that is why pointer isn't defined for OffsetArrays, nor for Adjoints of complex arrays?

@timholy (Member) commented Sep 20, 2020

Well, mostly I think there hasn't been any reason to have it so far...because matrix multiplication hasn't been supported.

But again, "maximally unwrapped" is a property of our current implementations, not a generic property.

struct ZeroIndexedSVector{L,T} <: AbstractVector{T}
    data::RawBuffer{UInt8}                # RawBuffer needs to be implemented in Julia (gc management) but is coming some day...
end
Base.axes(::ZeroIndexedSVector{L}) where L = (ZeroRange(L),)  # https://github.com/JuliaArrays/CustomUnitRanges.jl/blob/master/src/ZeroRange.jl

does not have a wrapper you can strip---instead you have to wrap it with something else.

@mcabbott commented:

OK, good example. It's a bit ugly if remove_offset_indexing(::ZeroIndexedSVector) adds an OffsetArray wrapper, just to feed the existing generic_matmul! something it understands.

The other objection to this design is that, with my AxisArray example above, it ought to be able to tell what to do just from seeing Base.axes(ao). If mul! knows how to send OffsetArrays to gemm!, then it should know here. But I don't see a simple way to make that work without unwrapping. You'd need something like pointer and strides to be defined for the wrapper type, so the package still has to co-operate. And you'd seemingly have to re-write all the code which decides whether or not to call BLAS, which right now dispatches on StridedMatrix etc. I guess this leads to JuliaLang/julia#25558 (ArrayLayouts.jl) and JuliaLang/julia#30432 (strides) etc.

@timholy (Member) commented Sep 20, 2020

pointer and strides can be implemented. Often both strided and non-strided arrays can be wrapped by any given type, but if you defer to the child's implementation then it all works out naturally. A good example is SubArray: strides(V.parent) might throw an error if the parent itself isn't strided. OffsetArrays can mimic this design.
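A sketch of what mimicking that design might look like (illustrative; whether and exactly how OffsetArrays defines these is a separate decision):

using OffsetArrays

# Defer to the parent: the calls succeed exactly when the parent itself supports
# them, and throw otherwise, much like SubArray computing strides from its parent.
Base.strides(A::OffsetArray) = strides(parent(A))
Base.pointer(A::OffsetArray, i::Integer = 1) = pointer(parent(A), i)
Base.unsafe_convert(::Type{Ptr{T}}, A::OffsetArray{T}) where {T} =
    Base.unsafe_convert(Ptr{T}, parent(A))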

@mcabbott commented:

Yes to strides-of-the-parent, and JuliaLang/julia#30432 seems like a very neat mechanism to let this pass information about non-stridedness too.

Not sure I'm volunteering to re-write mul! in terms of that just yet, though! But xref FluxML/NNlib.jl#191, which is an attempt to do this for batched_mul!.

@simeonschaub commented:

What's the current state of this? It would be really nice if matrix multiplication just worked with offset SArrays. Does this actually break cases that worked before?

@timholy (Member) commented Nov 3, 2020

I'm pretty firmly of the opinion that most of the work needs to be done in LinearAlgebra and/or Base, not here. I know folks probably think the magic of OffsetArrays mostly happens here, but that's not really true: most of the underlying work happened in Base (plus about a gazillion bugfix PRs submitted by lots of other people after that).

@simeonschaub commented:

Oh yes, I definitely agree with you! This PR in its current form still looks like a worthwhile improvement to me, though I might have missed something in the discussions. Or does this cause method ambiguities, even without trying to support Transpose and Adjoint?

@timholy (Member) commented Nov 4, 2020

If it doesn't break anything now, and its later removal doesn't break anything later, then we could use this as a bandaid. It's a bandaid that may slow progress on the right solution, though. So it's worth asking whether this PR is in our best interests, long-term.

I won't stand in the way if folks want to merge this.

@johnnychen94 (Member) commented:

Adding such a bandaid can easily cause unexpected performance regressions for BLAS operations that don't get enough specializations. The fallback non-BLAS implementations are usually quite slow, and identifying such a regression isn't usually a trivial task; I'd rather avoid this whack-a-mole patch.
