
WIP: add views founded on cartesian indexing #8235

Closed · wants to merge 1 commit
Conversation

@timholy (Sponsor, Member) commented Sep 4, 2014

Here's a demo of what a View type founded on cartesian indexing would look like. This is a complete implementation of the constructor and scalar indexing (i.e., V[1,2,3,4] returning a scalar rather than an array); the only big missing piece is creating a view from a view. As has been discussed elsewhere, for some AbstractArray types (notably BitArrays) you'd want specialized implementations of certain functions.

Very briefly, the way these work is dirt simple:

A = rand(10, 10)
B = subview(A, 2:4, 3:5)

Given this, B[1,3] simply gets directly translated into A[2,5]. The big advantage is that this works for any AbstractArray. You won't get optimal performance for BitArrays, but SparseMatrices should work perfectly well straight out of the box, as well as many, many other types of AbstractArrays.
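To make the translation concrete, here is a minimal standalone sketch of that mapping (getindex_demo and indexes are illustrative names, not part of this PR):

# Each index into the view is passed through the corresponding stored range:
getindex_demo(A, indexes, i, j) = A[indexes[1][i], indexes[2][j]]

A = rand(10, 10)
indexes = (2:4, 3:5)                         # what subview(A, 2:4, 3:5) stores
getindex_demo(A, indexes, 1, 3) == A[2, 5]   # true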

If you look at the code, you'll see that the large majority of functions are one-liners. The one exception is that if you say B[4], then it has to do linear indexing, so it generates the equivalent of

i1 = 4
i2, i1 = divrem(i1 - 1, size(B, 1))   # zero-based row/column decomposition
A[2 + i1, 3 + i2]                     # offsets are first(2:4) and first(3:5)

But notice there are no loops here; everything is generated specifically for the problem at hand.
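The same mapping can be checked with ind2sub (a quick sketch, still assuming B = subview(A, 2:4, 3:5) from above):

i1, i2 = ind2sub((3, 3), 4)     # (1, 2): cartesian position of B[4] within B
(2:4)[i1], (3:5)[i2]            # (2, 4): the corresponding indexes into A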

The main issue is that to support both sub and slice (does ArrayViews.jl even support slices?), you need a lot of methods. In the timings below, ncreate is the number of constructors generated (covering both subview and sliceview), and nindex is the number of indexing methods (covering both getindex and setindex!).

# Generate methods up to 4 dimensions
julia> @time reload("ArrayViewsAPL")
Warning: replacing module ArrayViewsAPL
ncreate = 30
nindex = 150
elapsed time: 3.632765916 seconds (22582744 bytes allocated)

# Generate methods up to 5 dimensions
julia> @time reload("ArrayViewsAPL")
Warning: replacing module ArrayViewsAPL
ncreate = 62
nindex = 372
elapsed time: 9.111417113 seconds (47277988 bytes allocated)

# Generate methods up to 6 dimensions
julia> @time reload("ArrayViewsAPL")
Warning: replacing module ArrayViewsAPL
ncreate = 126
nindex = 882
elapsed time: 25.07280445 seconds (115817664 bytes allocated, 1.17% gc time)

# Generate methods up to 7 dimensions
julia> @time reload("ArrayViewsAPL")
Warning: replacing module ArrayViewsAPL
ncreate = 254
nindex = 2032
elapsed time: 77.621801799 seconds (324388508 bytes allocated, 0.99% gc time)

# Generate methods up to 8 dimensions
julia> @time reload("ArrayViewsAPL")
Warning: replacing module ArrayViewsAPL
ncreate = 510
nindex = 4590
elapsed time: 261.558635945 seconds (991729360 bytes allocated, 1.01% gc time)

The reason I was holding out for stagedfunctions is that you could support arbitrary dimensionality without waiting 5 minutes for the file to finish getting through code lowering. But if people are eager to move forward with this, perhaps we shouldn't wait.
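For reference, here is a rough sketch of how a stagedfunction could cover the subview case (one stored index per parent dimension) for any dimensionality. It assumes the 0.4-dev stagedfunction keyword and fields named parent and indexes; it is not the code in this PR:

# Sketch only; in practice this would extend Base.getindex.
stagedfunction getindex{T,N}(V::View{T,N}, I::Int...)
    length(I) == N || return :(throw(BoundsError()))
    # generate V.parent[V.indexes[1][I[1]], ..., V.indexes[N][I[N]]]
    exprs = [:(V.indexes[$d][I[$d]]) for d = 1:N]
    :(V.parent[$(exprs...)])
end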

@andreasnoack (Member)

Thank you for preparing this pull request. I have just tried it, and one problem seems to be that

julia> typeof(A)
Array{Float64,2}

julia> super(typeof(Base.Views.subview(A, 1:2,1:2)))
AbstractArray{Float64,2}

and not DenseArray as it would be in ArrayViews. Hence you might want to add

View{T,N,<:DenseArray{T,N},NTuple}

to the StridedArray typeunion.

I also think that the ability to handle views from views properly is important for views to work well for linear algebra, which is what I am mostly concerned about.
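As a rough illustration of what view-of-view support has to do: compose the stored index maps so the result indexes straight into the original parent. Ranges compose directly (a sketch only; this PR does not implement it yet):

A = rand(10, 10)
B_indexes = (2:4, 3:5)                               # what subview(A, 2:4, 3:5) stores
C_indexes = (B_indexes[1][2:3], B_indexes[2][1:2])   # indexes for a view of that view
C_indexes == (3:4, 3:4)                              # true: both refer straight into A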

@timholy (Sponsor, Member, Author) commented Sep 4, 2014

Adding that declaration would preclude making a view of a SparseMatrixCSC, so I'd be pretty reluctant to do that. Would a more satisfying solution be typealias StridedArray{T,N,A<:DenseArray} Union(DenseArray{T,N}, View{T,N,A})?
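To illustrate the reluctance about restricting the parent to DenseArray: with the cartesian approach, a view of a sparse matrix already works purely by index translation (a usage sketch assuming this PR's Base.Views.subview):

S = sprand(10, 10, 0.1)                  # a SparseMatrixCSC, not a DenseArray
V = Base.Views.subview(S, 2:4, 3:5)
V[1, 3] == S[2, 5]                       # true: scalar indexing just translates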

@timholy (Sponsor, Member, Author) commented Sep 4, 2014

Also meant to add: I completely agree that being able to create views-of-views is essential. The main purpose here is to round out the menu of options: (1) ArrayViews, (2) this (ArrayViewsAPL without the stagedfunctions), (3) ArrayViewsAPL. I felt this was far enough along to get some feedback and clarity about which of these we want. (And they are not mutually exclusive; we could combine 1 with either 2 or 3. But there are some disadvantages there, too; which do you create by default?)

@timholy (Sponsor, Member, Author) commented Sep 4, 2014

Cross-linking to other issues: #8176, #5556, #7941, #5949. Probably others too.

@simonster (Member)

This is interesting, and impressively compact. Any idea what performance is like vs. ArrayViews.jl?

As I sit here running code that is spending a ridiculous amount of time in GC, mostly because of allocation in array view creation, I'm also wondering whether we should store the indexes/dims in n-field immutables instead of tuples until we have more efficient tuple storage, so that they can be stored inline. At least then there would be fewer objects to keep track of, although it's still a problem that the view itself has to be heap-allocated because of the reference to the underlying array.
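For concreteness, the kind of layout this suggests might look like the following (names are illustrative only, and not from the PR):

# Fixed fields instead of a tuple, so the isbits ranges can be stored inline
# until tuples get efficient storage:
immutable Indexes2{I1,I2}
    i1::I1
    i2::I2
end

# The view itself still heap-allocates because of the parent reference:
immutable View2{T,A<:AbstractArray,I1,I2} <: AbstractArray{T,2}
    parent::A
    indexes::Indexes2{I1,I2}
end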

@timholy (Sponsor, Member, Author) commented Sep 4, 2014

I actually didn't do any benchmarking/optimization before pushing this. But surprisingly, it looks almost indistinguishable (I'm a bit puzzled, frankly):

using Base.Cartesian
import ArrayViews

@ngenerate N Float64 function sum_cart{T,N}(A::AbstractArray{T,N}, n)
    s = 0.0
    for j = 1:n
        @nloops N i A begin
            @inbounds s += @nref N A i
        end
    end
    s
end

function sum_linear(A, n)
    s = 0.0
    for j = 1:n
        for i = 1:length(A)
            s += A[i]
        end
    end
    s
end

function create_v(A, n, index1, index2)
    B = Base.Views.subview(A, index1, index2)
    for i = 1:n
        B = Base.Views.subview(A, index1, index2)
    end
    B
end

function create_av(A, n, index1, index2)
    B = ArrayViews.view(A, index1, index2)
    for i = 1:n
        B = ArrayViews.view(A, index1, index2)
    end
    B
end


A = rand(1000,10000);
println("Contiguous")
B = create_v(A, 1, 1:1000, 1:10000);
gc(); @time B = create_v(A, 10^5, 1:1000, 1:10000);
C = create_av(A, 1, 1:1000, 1:10000);
gc(); @time B = create_av(A, 10^5, 1:1000, 1:10000);
sum_cart(A, 1)
@time sum_cart(A, 100)
sum_cart(B, 1)
@time sum_cart(B, 100)
sum_cart(C, 1)
@time sum_cart(C, 100)
sum_linear(A, 1)
@time sum_linear(A, 10)
sum_linear(B, 1)
@time sum_linear(B, 10)
sum_linear(C, 1)
@time sum_linear(C, 10)

println("Strided")
B = create_v(A, 1, 1:3:1000, 1:10000);
gc(); @time B = create_v(A, 10^5, 1:3:1000, 1:10000);
C = create_av(A, 1, 1:3:1000, 1:10000);
gc(); @time B = create_av(A, 10^5, 1:3:1000, 1:10000);
sum_cart(A, 1)
@time sum_cart(A, 100)
sum_cart(B, 1)
@time sum_cart(B, 100)
sum_cart(C, 1)
@time sum_cart(C, 100)
sum_linear(A, 1)
@time sum_linear(A, 10)
sum_linear(B, 1)
@time sum_linear(B, 10)
sum_linear(C, 1)
@time sum_linear(C, 10)

Results:

Contiguous
elapsed time: 0.009442188 seconds (17600320 bytes allocated)   # View (this)
elapsed time: 0.009511905 seconds (16000304 bytes allocated)   # ArrayViews
elapsed time: 1.3197334 seconds (96 bytes allocated)                   # sum_cart(Array)
elapsed time: 2.502015224 seconds (96 bytes allocated)               # sum_cart(View)
elapsed time: 2.502180468 seconds (96 bytes allocated)               # sum_cart(ArrayView)
elapsed time: 0.140489961 seconds (96 bytes allocated)               # sum_linear(Array)
elapsed time: 2.54033012 seconds (96 bytes allocated)                 # sum_linear(View)
elapsed time: 2.541821435 seconds (96 bytes allocated)               # sum_linear(ArrayView)
Strided
elapsed time: 0.009395409 seconds (16800320 bytes allocated)   # View (this)
elapsed time: 0.009779217 seconds (14400296 bytes allocated)   # ArrayViews
elapsed time: 1.312894481 seconds (96 bytes allocated)               # sum_cart(Array)
elapsed time: 1.444001719 seconds (96 bytes allocated)               # sum_cart(View)
elapsed time: 1.526083467 seconds (96 bytes allocated)               # sum_cart(ArrayView)
elapsed time: 0.14095664 seconds (96 bytes allocated)                 # sum_linear(Array)
elapsed time: 0.975470814 seconds (96 bytes allocated)               # sum_linear(View)
elapsed time: 0.951245512 seconds (96 bytes allocated)               # sum_linear(ArrayView)

@timholy (Sponsor, Member, Author) commented Sep 4, 2014

Ah, right; if I were using the ContiguousView properly, I would have used (:, :) rather than (1:1000, 1:10000) for the indexing (even though those two are conceptually equivalent). If I do that, the performance of ArrayViews on that one benchmark (sum_linear with the ContiguousView) improves a lot:

elapsed time: 0.184469448 seconds (96 bytes allocated)

but all others remain the same.
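For reference, the corrected contiguous benchmark just swaps colons in for the explicit ranges (reusing create_av and sum_linear from the script above):

C = create_av(A, 1, :, :)     # ArrayViews now builds a ContiguousView
sum_linear(C, 1)
@time sum_linear(C, 10)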

One could, of course, augment View with a contiguity parameter in the same way as ArrayViews. However, I think we'll get rather more bang if we just finish #6437 and not worry about linear indexing.

@andreasnoack (Member)

> Adding that declaration would preclude making a view of a SparseMatrixCSC, so I'd be pretty reluctant to do that. Would a more satisfying solution be typealias StridedArray{T,N,A<:DenseArray} Union(DenseArray{T,N}, View{T,N,A})?

I don't understand this comment. Your suggestion is pretty much what I thought I suggested. I don't want to restrict View to DenseArrays in general.

Have you done experiments with moving the inclusion of views.jl up in sysimg.jl? I have tried a few. To have indexing use views in general, it seems that views.jl should be loaded before array.jl. However, at that point no string functionality has been defined yet, and views.jl uses strings.

@timholy (Sponsor, Member, Author) commented Sep 9, 2014

> Your suggestion is pretty much what I thought I suggested. I don't want to restrict View to DenseArrays in general.

Oh, shoot, I totally missed the second part of your statement. The part in code is strikingly similar to the declaration of the View type. Of course you were correct.

And no, I haven't tried seriously integrating this. Mostly I'm interested in finding out what people think of the tradeoff between:

  • compilation times get worse as maximum(DIMS) increases (4-5 minutes if DIMS == 1:8)
  • more restrictive behavior as maximum(DIMS) decreases (we can't create views of any array of dimensionality larger than maximum(DIMS))

stagedfunctions fix both of these. But if folks are OK with the compromises here, I can add the code to create views-of-views (that's a must-have) and look seriously at merging this.

@andreasnoack (Member)

Adding five minutes to compilation time is probably not acceptable and neither is different behavior for different values of DIMS in terms of viewing and copying. Maybe the best solution is to experiment on this branch and see how far we can get with building the system image when getindex returns Views. Would a possible solution be to special case the version for one dimension in a way that doesn't use strings such that it can go before array.jl in sysimg.jl?

@timholy (Sponsor, Member, Author) commented Sep 25, 2014

Now that we have staged functions, I'm redoing this.

@timholy closed this Sep 25, 2014
@timholy mentioned this pull request Dec 19, 2014
@tkelman deleted the teh/views branch August 16, 2017