Arrays with Missing Hunks #180

r-barnes · 2018-10-01T18:03:07Z

Is it feasible that DArray could support distributed arrays in which pieces are missing?

For instance, say I have the following:

X X    X
X X X X
X X

where each X represents a dense portion of an array and the blanks represent a portion which can be modeled as representing a single value throughout.

In my mind, DArray has some sort of address table and, when it does a lookup, it could note the gap and return an appropriately formatted response using the constant value.

The text was updated successfully, but these errors were encountered:

vchuravy · 2018-10-01T18:22:48Z

Hm yes, that seems feasible. Sounds like you would want to use something like a https://github.com/JuliaArrays/FillArrays.jl, the only problem is that we don't support heterogeneous chunk-types. #170 as an example is actually about ensuring that the chunk-types by construction are consistent.

It might be feasible to do promote to a joint super-type, but that might break other things
like similar. So yes feasible in theory, but not yet in practice.

r-barnes · 2018-10-01T19:16:02Z

Just for reference: for the data I'm currently working with it's the difference between 700 GB and 2 TB of RAM.

FillArrays looks pretty perfect. I wonder if you have suggestions of places I could look in the codebase if I wanted to try hacking on this?

vchuravy · 2018-10-01T19:24:16Z

#170 modifies exactly the core constructor that you would want to modify. Once you have a DArray that is constructed in the way you want it comes with the challenge of making sure that all the operations you are interested in work.

r-barnes · 2018-10-01T19:56:18Z

I'll take a dig.

There might be a cheap-hack approach, though: having multiple sections of the DArray reference the same underlying memory:

@everywhere using DistributedArrays
r1  = @spawnat 2 zeros(4,4) 
r2  = @spawnat 3 zeros(4,4) 
r3  = @spawnat 4 rand(4,4)
ras = [r1 r2; r3 r3]
D   = DArray(ras)

Unfortunately, my experiments in #183 make me nervous about this.

vchuravy · 2018-10-01T20:27:30Z

Hm yes I suspect that there might be quite some functions written with the assumption that each chunk will be on a different processor, even though it clearly doesn't need to be assumed.

r-barnes · 2018-10-01T21:16:01Z

The following seems to work for me:

@everywhere using DistributedArrays
@everywhere using FillArrays
r1  = @spawnat 2 FillArrays.Zeros(4,4) 
r2  = @spawnat 3 FillArrays.Zeros(4,4) 
r3  = @spawnat 4 rand(4,4)
r5  = @spawnat 5 rand(4,4)
ras = [r1 r3; r5 r2]
D   = DArray(ras)

[@fetchfrom p typeof(D[:L]) for p in workers()]
[@fetchfrom p eltype(D[:L]) for p in workers()]

And this can be done even with #170 in place.

vchuravy · 2018-10-01T21:33:30Z

Hm that fails for me on current master:

D   = DArray(ras)
ERROR: MethodError: no method matching Zeros{Float64,2,Tuple{Int64,Int64}}(::Array{Float64,2})
Stacktrace:
 [1] empty_localpart(::Type, ::Int64, ::Type) at /home/vchuravy/.julia/dev/DistributedArrays/src/darray.jl:66
 [2] macro expansion at ./task.jl:264 [inlined]
 [3] macro expansion at /home/vchuravy/.julia/dev/DistributedArrays/src/darray.jl:84 [inlined]
 [4] macro expansion at ./task.jl:244 [inlined]
 [5] DArray(::Tuple{Int64,Int64}, ::Array{Future,2}, ::Tuple{Int64,Int64}, ::Array{Int64,2}, ::Array{Tuple{UnitRange{Int64},UnitRange{Int64}},2}, ::Array{Array{Int64,1},1}) at /home/vchuravy/.julia/dev/DistributedArrays/src/darray.jl:82
 [6] macro expansion at ./task.jl:266 [inlined]
 [7] macro expansion at /home/vchuravy/.julia/dev/DistributedArrays/src/darray.jl:194 [inlined]
 [8] macro expansion at ./task.jl:244 [inlined]
 [9] DArray(::Array{Future,2}) at /home/vchuravy/.julia/dev/DistributedArrays/src/darray.jl:192
 [10] top-level scope at none:0

vchuravy · 2018-10-01T21:40:36Z

Ah it works on #175..., but we need a method to select the "right" array-type. Since right now it could happen that it chooses a different T as a primary eltype and I think the process local version of this have diverged...

r-barnes · 2018-10-01T22:03:28Z

I'm running [aaf54ef3] DistributedArrays v0.5.1 right now.

(Is there a good development cycle for working from the master branch of a local repo?)

Isn't eltype the type of the data held by an array? It seems as though that could be enforced constant across the entire DArray, even if the containers holding the elements differ.

r-barnes · 2018-10-01T22:11:08Z

Is it possible to make the type more general to encompass the possibility of submatrices of different types?

vchuravy · 2018-10-01T22:53:35Z

(Is there a good development cycle for working from the master branch of a local repo?)

I generally use ]dev for that which can also take a local path

Isn't eltype the type of the data held by an array? It seems as though that could be enforced constant across the entire DArray, even if the containers holding the elements differ.

Yes the eltype being consitent is even more important and that is an invariant we need to uphold.

Is it possible to make the type more general to encompass the possibility of submatrices of different types?

There are two alternatives:

a = Fill(1.0, 2, 2)
b = zeros(2, 2)

AT = Union{typeof(a), typeof(b)}

and then teach DArray to reason about this union (immediate question would be what is similar(AT))

Or promote_type(typeof(a), typeof(b)) which is an AbstractArray{Float64, 2} so that could work quite well.

vchuravy mentioned this issue Oct 1, 2018

[WIP/DNM]make similar produce the right chunktype #175

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Arrays with Missing Hunks #180

Arrays with Missing Hunks #180

r-barnes commented Oct 1, 2018

vchuravy commented Oct 1, 2018

r-barnes commented Oct 1, 2018

vchuravy commented Oct 1, 2018

r-barnes commented Oct 1, 2018 •

edited

Loading

vchuravy commented Oct 1, 2018

r-barnes commented Oct 1, 2018 •

edited

Loading

vchuravy commented Oct 1, 2018

vchuravy commented Oct 1, 2018

r-barnes commented Oct 1, 2018

r-barnes commented Oct 1, 2018

vchuravy commented Oct 1, 2018 •

edited

Loading

Arrays with Missing Hunks #180

Arrays with Missing Hunks #180

Comments

r-barnes commented Oct 1, 2018

vchuravy commented Oct 1, 2018

r-barnes commented Oct 1, 2018

vchuravy commented Oct 1, 2018

r-barnes commented Oct 1, 2018 • edited Loading

vchuravy commented Oct 1, 2018

r-barnes commented Oct 1, 2018 • edited Loading

vchuravy commented Oct 1, 2018

vchuravy commented Oct 1, 2018

r-barnes commented Oct 1, 2018

r-barnes commented Oct 1, 2018

vchuravy commented Oct 1, 2018 • edited Loading

r-barnes commented Oct 1, 2018 •

edited

Loading

r-barnes commented Oct 1, 2018 •

edited

Loading

vchuravy commented Oct 1, 2018 •

edited

Loading