Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix up Euclidean distance at the origin #787

Merged
merged 11 commits into from
Sep 11, 2020
Merged

Conversation

willtebbutt
Copy link
Member

@willtebbutt willtebbutt commented Sep 10, 2020

(Hopefully) fixes some Euclidean distance issues when the inputs are close together.

Fixes JuliaGaussianProcesses/KernelFunctions.jl#166

@molet could you try this out and see whether or not it fixes your issue?

I've introduced FiniteDifferences as a test dep because Zygote's own finite differencing isn't sufficiently accurate in this case. I'm not completely sure why.

src/lib/distances.jl Outdated Show resolved Hide resolved
src/lib/distances.jl Outdated Show resolved Hide resolved
@molet
Copy link

molet commented Sep 10, 2020

Fixes JuliaGaussianProcesses/KernelFunctions.jl#166

@molet could you try this out and see whether or not it fixes your issue?

Yes, it will fix the issue above:

julia> x = rand(1, 10)
julia> k_se(θ) = TransformedKernel(SEKernel(), ScaleTransform(θ))
julia> k_exp(θ) = TransformedKernel(ExponentialKernel(), ScaleTransform(θ))
julia> θ = 2.0

julia> for k in [k_se, k_exp]
           obj(θ) = sum(kernelmatrix(k(θ), x, x[:, 1:1]; obsdim = 2))
           println((obj(θ), Zygote.gradient(obj, θ)))
       end
(8.579178644842253, (-1.1450487011851709,))
(7.600532275422401, (-0.9108241610417317,))

@willtebbutt
Copy link
Member Author

@DhairyaLGandhi I think this looks good now. It's already gone through a round of review with @molet who found the problem in the first place and agrees this fixes it. I'm going to be unavailable for the next week or so, so would appreciate if you could handle this from here -- feel free to push to this branch directly if you feel that there are more changes necessary.

test/runtests.jl Outdated
@testset "Interface" begin
include("interface.jl")
end
# @testset "Interface" begin
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess these need to be uncommented?

src/lib/distances.jl Outdated Show resolved Hide resolved
@@ -59,26 +59,30 @@ end
@adjoint function colwise(s::Euclidean, x::AbstractMatrix, y::AbstractMatrix)
d = colwise(s, x, y)
return d, function (Δ::AbstractVector)
x̄ = (Δ ./ d)' .* (x .- y)
x̄ = (Δ ./ max.(d, eps(eltype(d))))' .* (x .- y)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can do a similar trick here

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this would work:

.* map(di->ifelse(iszero(di), di, inv(di)), d))' .* (x .- y)

Copy link
Member

@oxinabox oxinabox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can generalize https://github.com/FluxML/Zygote.jl/pull/787/files#r487137860
then that would be cool.
If not then i think it should be done there still.

Don't forget to uncomment out the tests before merging.

willtebbutt and others added 3 commits September 11, 2020 17:11
@willtebbutt
Copy link
Member Author

bors r+

@bors
Copy link
Contributor

bors bot commented Sep 11, 2020

Build succeeded:

@bors bors bot merged commit ed366f3 into master Sep 11, 2020
δ = sqrt(sum(abs2, D))
function euclidean(Δ::Real)
x̄ = (Δ / δ) .* D
x̄ = ifelse(iszero(δ), D, (Δ / δ) .* D)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this mathematically incorrect if x !== y? E.g., in the 1D case the derivative of the Euclidean distance x -> |x - a| with respect to x is not defined for x = a since the limit from the left and right do not agree (they are -1 and 1, so also not close to 0).

I understand that this might be unsatisfying for practical applications, also due to floating point issues, but it seems mathematically actually the NaN value resulting from the previous implementation makes sense if x !== y.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we tend to use a subgradient-based convention that if gradient limit from the left has opposite sign to the gradient approaching from the right.
Or more generally: if 0 is an element of the subgradient,
then we say the gradient is zero.

This has the benificial effect of ensuring that the gradient is considered to be zero at local minima,
which from perspective of doing gradient based optimisation is what you want.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see 👍 Thanks for the explanation!

@oxinabox oxinabox deleted the wct/euclidean-distance-fix branch September 11, 2020 17:46
@devmotion
Copy link
Collaborator

It seems there is no new release yet? Would it be possible to make one?

@oxinabox
Copy link
Member

Will is on holidays, I guess he forgot to make one.
I will make one now

@oxinabox
Copy link
Member

Ah Dhairya is ahead of me
ed366f3#commitcomment-42336366

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Conditioning Euclidean distance
4 participants