Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

analytical form for convolution and bind of two Normal distributions #226

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

nignatiadis
Copy link
Contributor

@nignatiadis nignatiadis commented Aug 21, 2022

Hi Chad,

I am following up here on our discussion from Zulip. I followed your advice therein. Let me explain a couple more points of this PR:

  1. It defines convolve as a new combinator. The reason is that for this problem, the Normal->Normal bind is really just the convolution of two normals.

  2. In Zulip you suggested using mean(d), and std(d) to make the code more generic. Currently these would throw a method error, so I had to add some code for the calculation of mean and std for Affine(ly transformed) measures.

  3. Compared to the Zulip discussion, I also added code to dispatch when σ (of the Normal kernel) is fixed, but possibly unequal to 1. This relates to a conversation we had a while ago about constraints in kernels/likelihoods.

One possible concern (though not a concern for my use case) is the following:

julia> Normal(0,1)   MeasureTheory.kernel(Normal{(,)}, μ=identity)
Normal= 0.0, σ = 1.41421)

julia> Normal(0,1)   MeasureTheory.kernel(Normal{(,)})
MeasureBase.Bind{Normal{(, ), Tuple{Int64, Int64}}, ParameterizedTransitionKernel{Type{Normal}, ComposedFunction{typeof(params), MeasureBase.var"#f#8"{Normal, (,)}}, (,), Tuple{MeasureBase.var"#54#56"{Int64}}}}(
    Normal= 0, σ = 1),
    ParameterizedTransitionKernel(Normal, MeasureBase.params  MeasureBase.var"#f#8"{Normal, (,)}(), (μ = MeasureBase.var"#54#56"{Int64}(1),)))

Let me know what you think!

@github-actions
Copy link
Contributor

Package name latest stable
Mitosis.jl
Soss.jl

@codecov
Copy link

codecov bot commented Aug 21, 2022

Codecov Report

Merging #226 (06b8933) into master (6a42ef6) will increase coverage by 0.37%.
The diff coverage is 66.66%.

@@            Coverage Diff             @@
##           master     #226      +/-   ##
==========================================
+ Coverage   44.60%   44.98%   +0.37%     
==========================================
  Files          42       44       +2     
  Lines        1307     1325      +18     
==========================================
+ Hits          583      596      +13     
- Misses        724      729       +5     
Impacted Files Coverage Δ
src/MeasureTheory.jl 50.00% <ø> (ø)
src/combinators/convolve.jl 0.00% <0.00%> (ø)
src/combinators/affine.jl 53.88% <85.71%> (+1.28%) ⬆️
src/parameterized/pairwise/normal_normal.jl 100.00% <100.00%> (ø)
src/distproxy.jl 50.00% <0.00%> (+16.66%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@github-actions
Copy link
Contributor

Package name latest stable
Mitosis.jl
Soss.jl

@cscherrer
Copy link
Collaborator

Thanks @nignatiadis !

I see how the result in the normal case is equivalent to a convolution, but more generally I think bind builds a compound distribution. So for us, maybe a compound measure?

There's a nice overview here. Eventually (though not necessarily this PR, unless you want to) we should have all of the examples in that article as unit tests :)

@nignatiadis
Copy link
Contributor Author

Yes exactly! A lot of my research is about such compound distributions. I plan to add a few more to this PR (though not all in the Wikipedia list).

@nignatiadis
Copy link
Contributor Author

Also, I had never heard the term bind used before for this operation, but then noticed that bind in the package had exactly the semantics I was interested in. Another way to think about this is as a composition operation on Markov kernels (see e.g. the following, as well as the section just above it).

In my papers I typically just call it the marginal distribution (where marginal refers to the fact, that we start with a joint distribution on (X,Y), with samples generated as: X~μ, Y~k(X), for a measure μ and a kernel k) but then are only interested in the marginal of Y. [Calling this a marginal distribution in this package would cause ambiguity.]

@cscherrer
Copy link
Collaborator

Some background on bind...

In functional programming, a monad is a type constructor m that has two operations:

unit :: a -> m a
bind :: m a -> (a -> m b) -> m b

bind is usually written as >>=, though that's awkward to write and Julia doesn't allow it. For a quick workaround I used , though I'm not sure we should stay with that.

Monads are also required to satisfy the "monad laws":

unit(x) >>= f == f(x)
m >>= unit == m
m >>= (x -> (f(x) >>= g)) == (m >>= f) >>= g

For us,

  • m is $\mathcal{M}$, where $\mathcal{M}(X)$ is "measures on $X$" (mathematically -- we don't have this notation in code)
  • unit is Dirac
  • bind is... well, curently it's bind, but maybe it should be compound or something more measure-specific?

@nignatiadis
Copy link
Contributor Author

This package also aims at a very similar task: https://github.com/filtron/MarkovKernels.jl

@cscherrer
Copy link
Collaborator

Good find! That looks very useful

@mschauer
Copy link
Member

mschauer commented Nov 7, 2022

@filtron

@filtron
Copy link

filtron commented Nov 9, 2022

@filtron

Hello. Can I be of assistance somehow?

@mschauer
Copy link
Member

mschauer commented Nov 9, 2022

I think you can be witness to us appreciating your package and possibly figure out if there is some problem we share

@filtron
Copy link

filtron commented Nov 14, 2022

Pardon my late reply. I am quite new to Julia so encouragement + feedback is much appreciated. Regarding discussing issues we might share, I feel like I am missing some context. We are both interested in convolving Normal distributions but it is not clear to me what the issue is beyond that?

@nignatiadis
Copy link
Contributor Author

@filtron: Thank you for your response! I was very excited to see your MarkovKernels.jl package; thanks for writing it!

Here's the reason I initially tagged your package in this thread:
MeasureTheory.jl and associated packages (e.g., MeasureBase.jl) seek to provide abstractions to work with measures and reason about them in a rigorous way in Julia, see e.g., the following article. One abstraction in these packages is that of Markov kernels and likelihoods, but part of that abstraction is still not set in stone 100% (I think; Chad and Moritz please correct me). Your package has these abstractions as well, along with very useful functionality, and it would be awesome if it could be provided as part of MeasureTheory.jl or using the MeasureTheory.jl interface (which perhaps would require some changes to satisfy requirements underlying your package; a discussion here could be fruitful).

@filtron
Copy link

filtron commented Nov 17, 2022

I would be happy to help get any desired functionality from MarkovKernels.jl into MeasureTheory.jl. I suppose concretely what we are talking about is the concepts I call compose, marginalise, and invert for, by the looks of it, normal(kernels) in cholesky parametrization?

@mschauer
Copy link
Member

mschauer commented Nov 17, 2022

If you want, have a look https://arxiv.org/abs/2010.03509. We also end up with such primitives which describe Bayesian inference in tree models (you can formulate a state space model this way too):

Pullbacks of likelihoods

$$ h_k (x) = \int k(y\mid x) h(y) \mathrm{d} x,\quad (1) $$

for example the fusion step of the Kalman-filter is a form of step (1).

With that you add tilting kernels with likelihoods

$$ k^h(y \mid x) = \frac{k(y\mid x) h(y)}{h_k(x)}, \quad (2) $$

and marginalisation

$$ k^h (y) = \int k^h(y\mid x) \pi(x) \mathrm{d} x, \quad (3) $$

Together (1), (2) and (3) allow you to describe an RTS type smoother made of a backward filter with fusion step (1) and a forward smoother (2).

@filtron
Copy link

filtron commented Nov 30, 2022

I have been meaning to get back to you on this. Please don't mistake my tardiness for disinterest. Doing inference on trees by means of the h-transform is indeed an interesting development. However, not to come off as pedantic, but in my mind there are three ways of doing inference in partially observed Markov processes namely,

  1. Filter forward, predict backwards (i.e. RTS)
  2. Filter forward and "filter" backwards, correct forward filter with backward filter (i.e. two-filter smoother)
  3. "Filter" backward, predict forwards (i.e. Doob's h-transform)

Formally, these alternatives produce the same output. However, algorithmically they are quite different to me and offer different avenues for developing approximate inference methods. The above looks more like alternative 3 to me (I guess 1 & 2 would also have tree analogues).

In any case, I would like to re-iterate that I would be pleased to collaborate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants