This repository has been archived by the owner on Sep 28, 2024. It is now read-only.

Fix DeepONet for CUDA #52

Merged

merged 1 commit into from Mar 9, 2022

Conversation

yuehhua
Collaborator

@yuehhua yuehhua commented Mar 8, 2022

Resolves #49. @Abhishek-1Bhatt, could you take a look?

@yuehhua yuehhua requested a review from foldfelis March 8, 2022 15:43
@codecov

codecov bot commented Mar 8, 2022

Codecov Report

Merging #52 (65e0749) into master (cf8f4bd) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master      #52   +/-   ##
=======================================
  Coverage   93.33%   93.33%           
=======================================
  Files           6        6           
  Lines          90       90           
=======================================
  Hits           84       84           
  Misses          6        6           
Impacted Files Coverage Δ
src/DeepONet.jl 60.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cf8f4bd...65e0749. Read the comment docs.

@@ -116,7 +116,7 @@ function (a::DeepONet)(x::AbstractArray, y::AbstractVecOrMat)
However, we perform the transformations by the NNs always in the first dim
so we need to adjust (i.e. transpose) one of the inputs,
which we do on the branch input here =#
-    return Array(branch(x)') * trunk(y)
+    return branch(x)' * trunk(y)
Contributor


This change would lead to allocation (and hence can affect the speed of the forward pass), as typeof(x') will be LinearAlgebra.Adjoint{Float64, Matrix{Float64}}, which isn't a concrete type; by wrapping Array() around it we make it a concrete matrix type. It was introduced recently in #45.

Contributor


julia> using LinearAlgebra

julia> isconcretetype(LinearAlgebra.Adjoint{Float64, Matrix{Float64}})
true

Contributor


What you did in #45 was to bring parametric data types to DeepONet, which is helpful. But here, the function function (a::DeepONet)(x::AbstractArray, y::AbstractVecOrMat) is dispatched on the types of x and y, which are concrete. And the return type also depends on the dispatched types, so there is no type instability problem 😃

Collaborator Author


The use of Array blocks training on CUDA, and this is the root cause of why this example could not run on CUDA.

This change would lead to allocation (and hence, can affect the speed of forward pass)

What you said is not true; on the contrary, Array allocates new memory and slows down the forward pass in this model. Taking off Array should benefit this model.
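For illustration, a minimal CPU-only sketch (with plain matrices standing in for the branch and trunk outputs; the shapes follow the example below, and the variable names are hypothetical): the lazy adjoint multiply gives the same result as the Array-wrapped version while skipping the extra copy, and on the GPU, Array would additionally pull a CuArray back to host memory.

```julia
using LinearAlgebra

# Stand-ins for branch(x) and trunk(y) outputs:
# 30 latent features, batch of 2, 16 evaluation points.
B = rand(Float32, 30, 2)    # stand-in for branch(x)
T = rand(Float32, 30, 16)   # stand-in for trunk(y)

# B' is a lazy Adjoint wrapper: no data is copied, and its type is concrete.
@assert isconcretetype(typeof(B'))

out_lazy  = B' * T          # the new forward pass: multiply through the wrapper
out_eager = Array(B') * T   # the old version: materializes a copy of B' first

@assert out_lazy ≈ out_eager  # identical results, one fewer allocation
```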

Contributor


Oh, that might be because I was looking at the CPU performance. Anyway, there wasn't a huge difference either way, and if it is causing the example to fail on GPU, then it should surely be removed 😊

Comment on lines -1 to +6
-function train_don()
-    # if has_cuda()
-    #     @info "CUDA is on"
-    #     device = gpu
-    #     CUDA.allowscalar(false)
-    # else
+function train_don(; n=300, cuda=true, learning_rate=0.001, epochs=400)
+    if cuda && has_cuda()
+        @info "Training on GPU"
+        device = gpu
+    else
+        @info "Training on CPU"
Contributor


Follow the same style?

Collaborator Author


I keep the example style close to the Flux style, or the style used in model-zoo. It should be consistent.

Contributor


OK, then I'll make other examples follow the same style after this PR is merged.

@foldfelis
Contributor

Looks good to me

@foldfelis foldfelis merged commit 93cfd2d into SciML:master Mar 9, 2022
@yuehhua
Collaborator Author

yuehhua commented Mar 9, 2022

The forward pass is now type stable on CUDA:

julia> using NeuralOperators

julia> using Flux

julia> using CUDA

julia> batch_size = 2
2

julia> a = [0.83541104, 0.83479851, 0.83404712, 0.83315711, 0.83212979, 0.83096755,
                    0.82967374, 0.82825263, 0.82670928, 0.82504949, 0.82327962, 0.82140651,
                    0.81943734, 0.81737952, 0.8152405, 0.81302771];

julia> a = repeat(a, outer=(1, batch_size)) |> gpu;

julia> sensors = collect(range(0, 1, length=16)');

julia> sensors = repeat(sensors, outer=(batch_size, 1)) |> gpu;

julia> model = DeepONet((16, 22, 30), (2, 16, 24, 30), σ, tanh;
                   init_branch=Flux.glorot_normal, bias_trunk=false) |> gpu
DeepONet with
branch net: (Chain(Dense(16, 22, σ), Dense(22, 30, σ)))
Trunk net: (Chain(Dense(2, 16, tanh; bias=false), Dense(16, 24, tanh; bias=false), Dense(24, 30, tanh; bias=false)))

julia> y = model(a, sensors);

julia> @code_warntype model(a, sensors)
MethodInstance for (::DeepONet{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}, Chain{Tuple{Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}}}})(::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
  from (a::DeepONet)(x::AbstractArray, y::AbstractVecOrMat) in NeuralOperators at /media/yuehhua/Workbench/workspace/NeuralOperators.jl/src/DeepONet.jl:111
Arguments
  a::DeepONet{Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}, Chain{Tuple{Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}}}}
  x::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
  y::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
Locals
  trunk::Chain{Tuple{Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}}}
  branch::Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}
Body::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
1 ─ %1 = Base.getproperty(a, :branch_net)::Chain{Tuple{Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(σ), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}
│   %2 = Base.getproperty(a, :trunk_net)::Chain{Tuple{Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}, Dense{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Flux.Zeros}}}
│        (branch = %1)
│        (trunk = %2)
│   %5 = (branch)(x)::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
│   %6 = NeuralOperators.:var"'"(%5)::LinearAlgebra.Adjoint{Float32, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}
│   %7 = (trunk)(y)::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
│   %8 = (%6 * %7)::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
└──      return %8

@yuehhua yuehhua deleted the deeponet branch March 17, 2022 03:34
Successfully merging this pull request may close these issues.

Burgers example for DeepONet is not working
3 participants