Arrays with Missing Hunks #180
Hm yes, that seems feasible. Sounds like you would want to use something like FillArrays.jl (https://github.com/JuliaArrays/FillArrays.jl); the only problem is that we don't support heterogeneous chunk types. #170, as an example, is actually about ensuring that the chunk types are consistent by construction. It might be feasible to promote to a joint super-type, but that might break other things. |
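To make the "joint super-type" idea concrete, here is a minimal sketch in plain Julia. It does not use DistributedArrays.jl or FillArrays.jl: `ConstBlock` is a hypothetical stand-in for a lazy constant chunk, and computing the joint type with `typejoin` is an assumption made for illustration, not something the package is known to do.

```julia
# Hypothetical stand-in for a lazy constant chunk (FillArrays.Zeros would play this role).
struct ConstBlock{T} <: AbstractMatrix{T}
    value::T
    dims::Tuple{Int,Int}
end
Base.size(B::ConstBlock) = B.dims
function Base.getindex(B::ConstBlock, i::Int, j::Int)
    @boundscheck checkbounds(B, i, j)
    return B.value
end

# Two chunks with the same eltype but different concrete array types.
chunks = Any[rand(4, 4), ConstBlock(0.0, (4, 4))]

# Join the concrete chunk types into one common supertype.
J = reduce(typejoin, map(typeof, chunks))

println("joint chunk type: ", J)
println("concrete? ", isconcretetype(J))
```

The joined type is abstract, so any code that relies on a single concrete chunk type (for dispatch or for type-stable access to the local part) would need to be revisited, which is presumably one way this could "break other things".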
Just for reference: for the data I'm currently working with it's the difference between 700 GB and 2 TB of RAM. FillArrays looks pretty perfect. I wonder if you have suggestions of places I could look in the codebase if I wanted to try hacking on this? |
#170 modifies exactly the core constructor that you would want to modify. Once you have a DArray that is constructed the way you want, the remaining challenge is making sure that all the operations you are interested in work. |
I'll take a dig. There might be a cheap-hack approach, though: having multiple sections of the DArray reference the same underlying memory:
# Assumes worker processes 2, 3 and 4 are available (e.g. Julia started with `julia -p 3`).
@everywhere using DistributedArrays
r1 = @spawnat 2 zeros(4,4)
r2 = @spawnat 3 zeros(4,4)
r3 = @spawnat 4 rand(4,4)
# r3 is deliberately used twice, so two chunks point at the same remote array.
ras = [r1 r2; r3 r3]
D = DArray(ras)
Unfortunately, my experiments in #183 make me nervous about this. |
Hm yes, I suspect that quite a few functions were written with the assumption that each chunk lives on a different processor, even though nothing actually requires that. |
The following seems to work for me:
# Assumes worker processes 2 through 5 are available (e.g. Julia started with `julia -p 4`).
@everywhere using DistributedArrays
@everywhere using FillArrays
r1 = @spawnat 2 FillArrays.Zeros(4,4)
r2 = @spawnat 3 FillArrays.Zeros(4,4)
r3 = @spawnat 4 rand(4,4)
r5 = @spawnat 5 rand(4,4)
ras = [r1 r3; r5 r2]
D = DArray(ras)
# Inspect the type and eltype of the local part on every worker.
[@fetchfrom p typeof(D[:L]) for p in workers()]
[@fetchfrom p eltype(D[:L]) for p in workers()]
And this can be done even with #170 in place. |
Hm that fails for me on current master:
|
Ah, it works on #175, but we need a method to select the "right" array type. Right now it could choose a different T as the primary eltype, and I think the process-local versions of this have diverged... |
I'm running (Is there a good development cycle for working from the master branch of a local repo?) Isn't |
Is it possible to make the type more general to encompass the possibility of submatrices of different types? |
I generally use
Yes, the eltype being consistent is even more important, and that is an invariant we need to uphold.
There are two alternatives:
and then teach Or |
Is it feasible that DArray could support distributed arrays in which pieces are missing?
For instance, say I have the following:
where each X represents a dense portion of an array and the blanks represent a portion which can be modeled as holding a single value throughout.
In my mind, DArray has some sort of address table and, when it does a lookup, it could note the gap and return an appropriately formatted response using the constant value.
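The lookup described above can be sketched in a few lines of single-process Julia. This is only an illustration of the idea, not how DArray works internally: the `GappyBlocks` type, its field names, and the uniform block size are all assumptions introduced for this example.

```julia
# Hypothetical illustration only; not part of DistributedArrays.jl.
struct GappyBlocks{T} <: AbstractMatrix{T}
    blocks::Dict{Tuple{Int,Int},Matrix{T}}  # "address table": block index => dense data
    blocksize::Tuple{Int,Int}               # size of every block (assumed uniform)
    grid::Tuple{Int,Int}                    # number of blocks along each dimension
    gapvalue::T                             # constant returned for any missing block
end

Base.size(A::GappyBlocks) = A.blocksize .* A.grid

function Base.getindex(A::GappyBlocks, i::Int, j::Int)
    @boundscheck checkbounds(A, i, j)
    # Split the global index into (block index, index within the block), both zero-based.
    bi, li = divrem(i - 1, A.blocksize[1])
    bj, lj = divrem(j - 1, A.blocksize[2])
    block = get(A.blocks, (bi + 1, bj + 1), nothing)
    # A missing table entry is a gap: answer with the constant instead of stored data.
    return block === nothing ? A.gapvalue : block[li + 1, lj + 1]
end

# Dense blocks on the diagonal, gaps (zeros) everywhere else.
A = GappyBlocks(Dict((1, 1) => ones(2, 2), (2, 2) => ones(2, 2)), (2, 2), (2, 2), 0.0)
println(A[1, 1], " ", A[1, 3], " ", size(A))
```

Only the two dense blocks occupy memory here; the off-diagonal blocks cost nothing beyond the single stored `gapvalue`, which is the saving the question is after.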