Codecov Report

```
@@           Coverage Diff           @@
##           master      #52   +/-   ##
=======================================
  Coverage   93.33%   93.33%
=======================================
  Files           6        6
  Lines          90       90
=======================================
  Hits           84       84
  Misses          6        6
```

Continue to review full report at Codecov.
```diff
@@ -116,7 +116,7 @@ function (a::DeepONet)(x::AbstractArray, y::AbstractVecOrMat)
     However, we perform the transformations by the NNs always in the first dim
     so we need to adjust (i.e. transpose) one of the inputs,
     which we do on the branch input here =#
-    return Array(branch(x)') * trunk(y)
+    return branch(x)' * trunk(y)
```
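For reference, a shape sketch of what this combining step computes (the latent width 30, batch size 2, and 16 query points are illustrative numbers, not taken from the diff):

```julia
using LinearAlgebra

branch_out = rand(Float32, 30, 2)   # branch(x): (latent, batch)
trunk_out  = rand(Float32, 30, 16)  # trunk(y):  (latent, n_points)

# The adjoint lines the latent dimensions up for an inner product,
# giving one output value per (batch, query point) pair.
out = branch_out' * trunk_out
@assert size(out) == (2, 16)
```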
This change would lead to an allocation (and hence can affect the speed of the forward pass), since `typeof(x')` will be `LinearAlgebra.Adjoint{Float64, Matrix{Float64}}`, which isn't a concrete type; by wrapping it in `Array()` we make it a concrete matrix type. The wrapper was introduced recently in #45.
```julia
julia> using LinearAlgebra

julia> isconcretetype(LinearAlgebra.Adjoint{Float64, Matrix{Float64}})
true
```
What you did in #45 was to bring parametric datatypes to `DeepONet`, which is helpful. But here, the function `(a::DeepONet)(x::AbstractArray, y::AbstractVecOrMat)` is dispatched on the types of `x` and `y`, which are concrete, and the return type likewise depends on the dispatched types, so there is no type-instability problem 😃
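A quick way to sanity-check this on CPU (a minimal sketch; the layer sizes mirror the example further down, but the inputs are random placeholders):

```julia
using Flux, NeuralOperators, Test

model = DeepONet((16, 22, 30), (2, 16, 24, 30), σ, tanh)
x = rand(Float32, 16, 2)  # sensor values:   (m, batch)
y = rand(Float32, 2, 16)  # query locations: (dims, n_points)

# `@inferred` throws if the return type cannot be inferred concretely,
# so passing silently confirms type stability for these argument types.
@inferred model(x, y)
```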
Use of `Array` blocks training on CUDA, and this is the root cause of this example failing to run on CUDA.

> This change would lead to allocation (and hence, can affect the speed of forward pass)

That is not true; on the contrary, `Array` allocates new memory and slows down the forward pass in this model. Removing `Array` should benefit this model.
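To see the allocation point on CPU in isolation (a minimal sketch with made-up sizes, not a benchmark of the actual model):

```julia
using BenchmarkTools, LinearAlgebra

A = rand(Float32, 30, 64)
B = rand(Float32, 30, 128)

# `A'` is a zero-copy lazy wrapper; `A' * B` dispatches straight to BLAS.
@btime $A' * $B;

# `Array(A')` materializes the transposed copy first, adding an extra
# allocation on every forward pass.
@btime Array($A') * $B;
```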
Oh, that might be because I was looking at the CPU performance. Anyway, there wasn't a huge difference either way, and if it causes the example to fail on GPU, then it should surely be removed 😊
```diff
-function train_don()
-    # if has_cuda()
-    #     @info "CUDA is on"
-    #     device = gpu
-    #     CUDA.allowscalar(false)
-    # else
+function train_don(; n=300, cuda=true, learning_rate=0.001, epochs=400)
+    if cuda && has_cuda()
+        @info "Training on GPU"
+        device = gpu
+    else
+        @info "Training on CPU"
```
Follow the same style?
I keep the example style close to the Flux style, or the style used in model-zoo. It should be consistent.
OK, then I'll make other examples follow the same style after this PR is merged.
Looks good to me
Type stable:

```julia
julia> using NeuralOperators
julia> using Flux
julia> using CUDA
julia> batch_size = 2
2
julia> a = [0.83541104, 0.83479851, 0.83404712, 0.83315711, 0.83212979, 0.83096755,
0.82967374, 0.82825263, 0.82670928, 0.82504949, 0.82327962, 0.82140651,
0.81943734, 0.81737952, 0.8152405, 0.81302771];
julia> a = repeat(a, outer=(1, batch_size)) |> gpu;
julia> sensors = collect(range(0, 1, length=16)');
julia> sensors = repeat(sensors, outer=(batch_size, 1)) |> gpu;
julia> model = DeepONet((16, 22, 30), (2, 16, 24, 30), σ, tanh;
init_branch=Flux.glorot_normal, bias_trunk=false) |> gpu
DeepONet with
branch net: (Chain(Dense(16, 22, σ), Dense(22, 30, σ)))
Trunk net: (Chain(Dense(2, 16, tanh; bias=false), Dense(16, 24, tanh; bias=false), Dense(24, 30, tanh; bias=false)))
julia> y = model(a, sensors);
julia> @code_warntype model(a, sensors)
MethodInstance for (::DeepONet{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}, Chain{Tuple{Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}}}})(::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
from (a::DeepONet)(x::AbstractArray, y::AbstractVecOrMat) in NeuralOperators at /media/yuehhua/Workbench/workspace/NeuralOperators.jl/src/DeepONet.jl:111
Arguments
a::DeepONet{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}, Chain{Tuple{Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}}}}
x::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
y::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
Locals
trunk::Chain{Tuple{Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}}}
branch::Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}
Body::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
1 ─ %1 = Base.getproperty(a, :branch_net)::Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}
│ %2 = Base.getproperty(a, :trunk_net)::Chain{Tuple{Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}}}
│ (branch = %1)
│ (trunk = %2)
│ %5 = (branch)(x)::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
│ %6 = NeuralOperators.:var"'"(%5)::LinearAlgebra.Adjoint{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}
│ %7 = (trunk)(y)::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
│ %8 = (%6 * %7)::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
└── return %8
```
Resolves #49. @Abhishek-1Bhatt, you could take a look.