Leonenko-Prozanto-Savani differential entropy estimator for Shannon, Rényi and Tsallis entropies #310

Merged 8 commits on Aug 28, 2023 (diff shows changes from 7 commits).
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -16,6 +16,8 @@ Further additions to the library in v3:
- New entropy definition: identification entropy (`Identification`).
- Minor documentation fixes.
- `GaussianCDFEncoding` now can be used with vector-valued inputs.
- New `LeonenkoProzantoSavani` differential entropy estimator. Works with `Shannon`,
`Renyi` and `Tsallis` entropies.

### Bug fixes

17 changes: 16 additions & 1 deletion docs/refs.bib
@@ -832,4 +832,19 @@ @article{Amigó2018
pages={813},
year={2018},
publisher={MDPI}
}
}

@article{LeonenkoProzantoSavani2008,
author = {Nikolai Leonenko and Luc Pronzato and Vippal Savani},
title = {A class of Rényi information estimators for multidimensional densities},
volume = {36},
journal = {The Annals of Statistics},
number = {5},
publisher = {Institute of Mathematical Statistics},
pages = {2153 -- 2182},
abstract = {A class of estimators of the Rényi and Tsallis entropies of an unknown distribution f in ℝm is presented. These estimators are based on the kth nearest-neighbor distances computed from a sample of N i.i.d. vectors with distribution f. We show that entropies of any order q, including Shannon’s entropy, can be estimated consistently with minimal assumptions on f. Moreover, we show that it is straightforward to extend the nearest-neighbor method to estimate the statistical distance between two distributions using one i.i.d. sample from each.},
keywords = {Entropy estimation, estimation of divergence, estimation of statistical distance, Havrda–Charvát entropy, nearest-neighbor distances, Rényi entropy, Tsallis entropy},
year = {2008},
doi = {10.1214/07-AOS539},
url = {https://doi.org/10.1214/07-AOS539}
}
132 changes: 129 additions & 3 deletions docs/src/examples.md
@@ -65,9 +65,11 @@ kldivergence(py, px)

## Differential entropy: estimator comparison

### Shannon entropy

Here, we compare how the nearest neighbor differential entropy estimators
([`Kraskov`](@ref), [`KozachenkoLeonenko`](@ref), [`Zhu`](@ref), [`ZhuSingh`](@ref), etc.)
converge towards the true entropy value for increasing time series length.
converge towards the true [`Shannon`](@ref) entropy value for increasing time series length.

ComplexityMeasures.jl also provides entropy estimators based on
[order statistics](https://en.wikipedia.org/wiki/Order_statistic). These estimators
@@ -100,7 +102,8 @@ knn_estimators = [
Gao(ent; k = 3, corrected = false, w),
Gao(ent; k = 3, corrected = true, w),
Goria(ent; k = 3, w),
Lord(ent; k = 20, w) # more neighbors for accurate ellipsoid estimation
Lord(ent; k = 20, w), # more neighbors for accurate ellipsoid estimation
LeonenkoProzantoSavani(ent; k = 3),
]

# Test each estimator `nreps` times over time series of varying length.
@@ -137,7 +140,7 @@ end
# -------------
fig = Figure(resolution = (700, 11 * 200))
labels_knn = ["KozachenkoLeonenko", "Kraskov", "Zhu", "ZhuSingh", "Gao (not corrected)",
"Gao (corrected)", "Goria", "Lord"]
"Gao (corrected)", "Goria", "Lord", "LeonenkoProzantoSavani"]
labels_os = ["Vasicek", "Ebrahimi", "AlizadehArghami", "Correa"]

for (i, e) in enumerate(knn_estimators)
@@ -169,6 +172,129 @@ fig
All estimators approach the true differential entropy, but those based on order statistics
are negatively biased for small sample sizes.

### Rényi entropy

Here, we see how the [`LeonenkoProzantoSavani`](@ref) estimator approaches the known
target [`Renyi`](@ref) entropy of a multivariate normal distribution
for increasing time series length. We'll consider the Rényi entropy with `q = 2`.
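
The helper function defined in the example below relies on the closed-form Rényi entropy of a
`D`-dimensional Gaussian, stated here for reference (with natural logarithms):

```math
H_q[\mathcal{N}(\mu, \Sigma)] = H[\mathcal{N}(\mu, \Sigma)] - \frac{D}{2}\left(1 + \frac{\ln q}{1 - q}\right),
```

where ``H[\mathcal{N}(\mu, \Sigma)]`` is the Shannon differential entropy of the same distribution.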

```@example MAIN

using ComplexityMeasures
import ComplexityMeasures: information # we're overriding this function in the example
using CairoMakie, Statistics
using Distributions: MvNormal
import Distributions.entropy as dentropy
using Random
rng = MersenneTwister(1234)

"""
information(e::Renyi, 𝒩::MvNormal; base = 2)

Compute the analytical value of the `Renyi` entropy for a multivariate normal distribution.
"""
function information(e::Renyi, 𝒩::MvNormal; base = 2)
q = e.q
if q ≈ 1.0
h = dentropy(𝒩)
else
Σ = 𝒩.Σ
D = length(𝒩.μ)
h = dentropy(𝒩) - (D / 2) * (1 + log(q) / (1 - q))
end
return convert_logunit(h, ℯ, base)
end

nreps = 30
Ns = [100:100:500; 1000:1000:5000]
def = Renyi(q = 2, base = 2)

𝒩 = MvNormal([-1, 1], [1, 0.5])
h_true = information(def, 𝒩; base = 2)

# Estimate `nreps` times for each time series length

hs = [zeros(nreps) for N in Ns]
for (i, N) in enumerate(Ns)
for j = 1:nreps
pts = StateSpaceSet(transpose(rand(rng, 𝒩, N)))
hs[i][j] = information(LeonenkoProzantoSavani(def; k = 5), pts)
end
end

# We plot the mean and standard deviation of the estimator against the true value
hs_mean, hs_stdev = mean.(hs), std.(hs)

fig = Figure()
ax = Axis(fig[1, 1]; ylabel = "h (bits)")
lines!(ax, Ns, hs_mean; color = Cycled(1), label = "LeonenkoProzantoSavani")
band!(ax, Ns, hs_mean .+ hs_stdev, hs_mean .- hs_stdev,
alpha = 0.5, color = (Main.COLORS[1], 0.5))
hlines!(ax, [h_true], color = :black, linewidth = 5, linestyle = :dash)
axislegend()
fig
```

### Tsallis entropy

Here, we see how the [`LeonenkoProzantoSavani`](@ref) estimator approaches the known
target [`Tsallis`](@ref) entropy of a multivariate normal distribution
for increasing time series length. We'll consider the Tsallis entropy with `q = 2`.
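
The analytical target value is obtained from the Rényi entropy through the standard relation
(Eq. 15 in Nielsen & Nock, 2011), with natural logarithms:

```math
H_q^{T} = \frac{e^{(1 - q)\, H_q^{R}} - 1}{1 - q},
```

which is exactly what the helper function below computes before converting to the requested base.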

```@example MAIN
using ComplexityMeasures
import ComplexityMeasures: information # we're overriding this function in the example
using CairoMakie, Statistics
using Distributions: MvNormal
import Distributions.entropy as dentropy
using Random
rng = MersenneTwister(1234)

"""
information(e::Tsallis, 𝒩::MvNormal; base = 2)

Compute the analytical value of the `Tsallis` entropy for a multivariate normal distribution.
"""
function information(e::Tsallis, 𝒩::MvNormal; base = 2)
q = e.q
Σ = 𝒩.Σ
D = length(𝒩.μ)
# uses the function from the example above
hr = information(Renyi(q = q), 𝒩; base = ℯ) # stick with natural log, convert after
h = (exp((1 - q) * hr) - 1) / (1 - q)
return convert_logunit(h, ℯ, base)
end

nreps = 30
Ns = [100:100:500; 1000:1000:5000]
def = Tsallis(q = 2, base = 2)

𝒩 = MvNormal([-1, 1], [1, 0.5])
h_true = information(def, 𝒩; base = 2)

# Estimate `nreps` times for each time series length

hs = [zeros(nreps) for N in Ns]
for (i, N) in enumerate(Ns)
for j = 1:nreps
pts = StateSpaceSet(transpose(rand(rng, 𝒩, N)))
hs[i][j] = information(LeonenkoProzantoSavani(def; k = 5), pts)
end
end

# We plot the mean and standard deviation of the estimator against the true value
hs_mean, hs_stdev = mean.(hs), std.(hs)

fig = Figure()
ax = Axis(fig[1, 1]; ylabel = "h (bits)")
lines!(ax, Ns, hs_mean; color = Cycled(1), label = "LeonenkoProzantoSavani")
band!(ax, Ns, hs_mean .+ hs_stdev, hs_mean .- hs_stdev,
alpha = 0.5, color = (Main.COLORS[1], 0.5))
hlines!(ax, [h_true], color = :black, linewidth = 5, linestyle = :dash)
axislegend()
fig
```

## Discrete entropy: permutation entropy

This example plots permutation entropy for time series of the chaotic logistic map. Entropy estimates using [`WeightedOrdinalPatterns`](@ref)
1 change: 1 addition & 0 deletions docs/src/information_measures.md
@@ -62,6 +62,7 @@ ZhuSingh
Gao
Goria
Lord
LeonenkoProzantoSavani
Vasicek
AlizadehArghami
Ebrahimi
1 change: 1 addition & 0 deletions src/core/information_measures.jl
@@ -186,5 +186,6 @@ See [`information`](@ref) for usage.
- [`Correa`](@ref).
- [`Vasicek`](@ref).
- [`Ebrahimi`](@ref).
- [`LeonenkoProzantoSavani`](@ref).
"""
abstract type DifferentialInfoEstimator{I <: InformationMeasure} <: InformationMeasureEstimator{I} end
95 changes: 95 additions & 0 deletions LeonenkoProzantoSavani.jl (new file)
@@ -0,0 +1,95 @@
using SpecialFunctions: gamma
using Neighborhood: bulksearch
using Neighborhood: Euclidean, Theiler

export LeonenkoProzantoSavani

"""
LeonenkoProzantoSavani <: DifferentialInfoEstimator
LeonenkoProzantoSavani(definition = Shannon(); k = 2, w = 0)

The `LeonenkoProzantoSavani` estimator [LeonenkoProzantoSavani2008](@cite)
computes the [`Shannon`](@ref), [`Renyi`](@ref), or
[`Tsallis`](@ref) differential [`information`](@ref) of a multi-dimensional
[`StateSpaceSet`](@ref), with logarithms to the `base` specified in `definition`.

## Description

The estimator uses `k`-th nearest-neighbor searches. `w` is the Theiler window, which
determines if temporal neighbors are excluded during neighbor searches (defaults to `0`,
meaning that only the point itself is excluded when searching for neighbours).

For details, see [LeonenkoProzantoSavani2008](@citet).
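
## Example

A minimal usage sketch (the exact value depends on the random sample and on `k`):

```julia
using ComplexityMeasures
using Random; rng = MersenneTwister(1234)

x = randn(rng, 10_000) # sample from a standard normal
est = LeonenkoProzantoSavani(Shannon(base = 2); k = 4)
information(est, x) # differential Shannon entropy estimate, in bits
```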
"""
struct LeonenkoProzantoSavani{I <: InformationMeasure} <: DifferentialInfoEstimator{I}
definition::I
k::Int
w::Int

function LeonenkoProzantoSavani(definition::I, k, w) where I
if !(I <: Shannon || (I <: Renyi) || I <: Tsallis)
s = "`definition` must be either a `Shannon`, `Renyi` or `Tsallis` instance."
throw(ArgumentError(s))
end
if k <= 1
throw(ArgumentError("`k` must be ≥ 2. Got k=$k."))
end

new{I}(definition, k, w)
end
end
function LeonenkoProzantoSavani(definition = Shannon(); k = 2, w = 0)
return LeonenkoProzantoSavani(definition, k, w)
end

function information(est::LeonenkoProzantoSavani, x::AbstractVector{<:Real})
return information(est, StateSpaceSet(x))
end

function information(est::LeonenkoProzantoSavani{<:Shannon}, x::AbstractStateSpaceSet)
h = Î(1.0, est, x) # measured in nats
return convert_logunit(h, ℯ, est.definition.base)
end

function information(est::LeonenkoProzantoSavani{<:Renyi}, x::AbstractStateSpaceSet)
q = est.definition.q
base = est.definition.base

if q ≈ 1.0
h = Î(q, est, x) # measured in nats
else
h = log(Î(q, est, x)) / (1 - q) # measured in nats
end
return convert_logunit(h, ℯ, base)
end

function information(est::LeonenkoProzantoSavani{<:Tsallis}, x::AbstractStateSpaceSet)
q = est.definition.q
base = est.definition.base

if q ≈ 1.0
h = Î(q, est, x) # measured in nats
else
h = (Î(q, est, x) - 1) / (1 - q) # measured in nats
end
return convert_logunit(h, ℯ, base)
end

# TODO: this gives nan??
# Use notation from original paper
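# Î(q, est, x) estimates, in nats, the integral ∫ f(x)^q dx for q ≠ 1 (eqs. 3.1 & 3.2),
# and the Shannon differential entropy itself for q ≈ 1 (eqs. 3.9 & 3.10). The
# `information` methods above then map the q ≠ 1 case to Rényi/Tsallis entropies.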
function Î(q, est::LeonenkoProzantoSavani, x::AbstractStateSpaceSet{D}) where D
(; k, w) = est
N = length(x)
Vₘ = ball_volume(D)
Cₖ = (gamma(k) / gamma(k + 1 - q))^(1 / (1 - q))
tree = KDTree(x, Euclidean())
idxs, ds = bulksearch(tree, x, NeighborNumber(k), Theiler(w))
if q ≈ 1.0 # equations 3.9 & 3.10 in Leonenko et al. (2008)
h = (1 / N) * sum(log.(ξᵢ_shannon(last(dᵢ), Vₘ, N, D, k) for dᵢ in ds))
else # equations 3.1 & 3.2 in Leonenko et al. (2008)
h = (1 / N) * sum(ξᵢ_renyi_tsallis(last(dᵢ), Cₖ, Vₘ, N, D)^(1 - q) for dᵢ in ds)
end
return h
end
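# The per-point quantities ξᵢ from Leonenko et al. (2008): dᵢ is the distance from the
# i-th point to its k-th nearest neighbor, Vₘ is the volume of the D-dimensional unit
# ball, and Cₖ is a bias-correction factor that depends on k and q.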
ξᵢ_renyi_tsallis(dᵢ, Cₖ, Vₘ, N::Int, D::Int) = (N - 1) * Cₖ * Vₘ * (dᵢ)^D
ξᵢ_shannon(dᵢ, Vₘ, N::Int, D::Int, k) = (N - 1) * exp(-digamma(k)) * Vₘ * (dᵢ)^D
@@ -25,3 +25,4 @@ include("ZhuSingh.jl")
include("Gao.jl")
include("Goria.jl")
include("Lord.jl")
include("LeonenkoProzantoSavani.jl")
1 change: 1 addition & 0 deletions test/infomeasures/estimators/estimators.jl
@@ -20,3 +20,4 @@ testfile("ebrahimi.jl")
testfile("vasicek.jl")
testfile("gao.jl")
testfile("lord.jl")
testfile("leonenkoprozantosavani.jl")
85 changes: 85 additions & 0 deletions test/infomeasures/estimators/leonenkoprozantosavani.jl
@@ -0,0 +1,85 @@
using ComplexityMeasures, Test
using ComplexityMeasures: convert_logunit
import ComplexityMeasures: information
using Random
rng = Xoshiro(1234)

# Constructors
@test LeonenkoProzantoSavani(Shannon()) isa LeonenkoProzantoSavani{<:Shannon}

@test_throws ArgumentError LeonenkoProzantoSavani(Kaniadakis())
@test_throws ArgumentError LeonenkoProzantoSavani(k = 1)

# -------------------------------------------------------------------------------------
# Check if the estimator converges to true values for some distributions with
# analytically derivable entropy.
# -------------------------------------------------------------------------------------
# The entropy (to any base b) of a uniform distribution on [0, 1] is log_b(1 - 0) = 0.
U = 0.00
npts = 100000
pts = rand(rng, npts)
ea = information(LeonenkoProzantoSavani(Shannon(base = 2), k = 5), pts)
@test U - max(0.01, U*0.03) ≤ ea ≤ U + max(0.01, U*0.03)

# Entropy with natural log of 𝒩(0, 1) is 0.5*ln(2π) + 0.5.
N = 0.5*log(2π) + 0.5
N_base3 = ComplexityMeasures.convert_logunit(N, ℯ, 3)
npts = 100000

pts = randn(rng, npts)
ea_n3_s = information(LeonenkoProzantoSavani(Shannon(base = 3), k = 5), pts)
ea_n3_r = information(LeonenkoProzantoSavani(Renyi(base = 3, q = 1), k = 5), pts)
ea_n3_t = information(LeonenkoProzantoSavani(Tsallis(base = 3, q = 1), k = 5), pts)
@test ea_n3_r ≈ ea_n3_t ≈ ea_n3_s

# Check that we're less than 10% off target
@test abs(N_base3 - ea_n3_s)/N_base3 ≤ 0.10

# -------------------------------------------------------------------------------------
# Renyi entropy.
# ------------------------------------------------------------------------------------
using Distributions: MvNormal
import Distributions.entropy as dentropy
function information(e::Renyi, 𝒩::MvNormal; base = 2)
q = e.q
if q ≈ 1.0
h = dentropy(𝒩)
else
Σ = 𝒩.Σ
D = length(𝒩.μ)
h = dentropy(𝒩) - (D / 2) * (1 + log(q) / (1 - q))
end
return convert_logunit(h, ℯ, base)
end

# We know the analytical expression for the Rényi entropy of a multivariate normal.
# It is implemented in the function above.
𝒩 = MvNormal([0, 1], [1, 1])
h_true = information(Renyi(q = 2), 𝒩, base = 2)
𝒩pts = StateSpaceSet(transpose(rand(rng, 𝒩, npts)))
h_estimated = information(LeonenkoProzantoSavani(Renyi(q = 2, base = 2), k = 10), 𝒩pts)

# Check that we're less than 10% off target
@test abs(h_estimated - h_true)/h_true < 0.1

# -------------------------------------------------------------------------------------
# Tsallis entropy.
# ------------------------------------------------------------------------------------
# Eq. 15 in Nielsen & Nock (2011); https://arxiv.org/pdf/1105.3259.pdf
function information(e::Tsallis, 𝒩::MvNormal; base = 2)
q = e.q
Σ = 𝒩.Σ
D = length(𝒩.μ)
hr = information(Renyi(q = q), 𝒩; base = ℯ) # stick with natural log, convert after
h = (exp((1 - q) * hr) - 1) / (1 - q)
return convert_logunit(h, ℯ, base)
end

# We know the analytical expression for the Tsallis entropy of a multivariate normal.
# It is implemented in the function above.
𝒩 = MvNormal([0, 1], [1, 1])
h_true = information(Tsallis(q = 2), 𝒩, base = 2)
𝒩pts = StateSpaceSet(transpose(rand(rng, 𝒩, npts)))
h_estimated = information(LeonenkoProzantoSavani(Tsallis(q = 2, base = 2), k = 10), 𝒩pts)

# Check that we're less than 10% off target
@test abs(h_estimated - h_true)/h_true < 0.1