
Modularise Constraints #386

Merged 107 commits into master on Jun 8, 2024
Conversation

@kellertuer (Member) commented May 4, 2024

This is a start at reworking constraints, to make them a bit more flexible and to address/resolve #185.

All three parts, the function $f$, the inequality constraints $g$, and the equality constraints $h$, are now stored in their own objectives internally. That way, they can be provided more flexibly – for example, with a Hessian for one of them.

Besides that, more details are now available on the co-domain of these functions; especially, in the case of a single function, the gradient can now also be specified to map into the tangent space of a power manifold.

One open point still to check is how the internal functions can be adapted to this in, hopefully, a non-breaking way.
A main challenge (and where I stopped today) is how to best model element access when a function that returns gradients (or a gradient of functions) is stored in an objective – and hence might be cached.
This might also be a nice improvement on (or a unification with) the Levenberg-Marquardt type objective.
One could maybe unify that into an objective of type VectorObjective.
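As an illustration of the restructuring described above, here is a minimal, self-contained Julia sketch with hypothetical, simplified struct names (the actual Manopt.jl types have more fields and type parameters):

```julia
# A simplified illustration of the idea: the constrained objective no longer
# stores raw functions f, g, h, but wraps each in its own (sub)objective,
# so that e.g. a Hessian could be attached to any of them individually.
struct SimpleObjective{F,G}  # hypothetical stand-in for a cost/gradient objective
    cost::F
    gradient::G
end

struct SimpleConstrainedObjective{O,GI,HE}  # hypothetical stand-in
    objective::O    # the objective for f (could also carry a Hessian)
    inequality::GI  # an objective for the inequality constraints g
    equality::HE    # an objective for the equality constraints h
end

f(p) = sum(abs2, p)
grad_f(p) = 2p
g(p) = [p[1] - 1.0]           # one inequality constraint, g(p) ≤ 0
grad_g(p) = [[1.0, 0.0]]
h(p) = [p[2]]                 # one equality constraint, h(p) = 0
grad_h(p) = [[0.0, 1.0]]

co = SimpleConstrainedObjective(
    SimpleObjective(f, grad_f),
    SimpleObjective(g, grad_g),
    SimpleObjective(h, grad_h),
)
co.objective.cost([1.0, 2.0])  # 5.0
```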

Comments on this idea welcome.

🛣️ Roadmap

  • 👨‍💻 rework the internal functions / accessors to the constraints
  • 👨‍💻 unification with the LeastSquaresObjective (VectorObjective?)
  • 👨‍💻 rewrite all constrained gradients and costs to have a positional range, e.g. the ALM and EPM gradients
  • 👨‍💻 rewrite so that the DefaultManifoldProblem passes in a range of nothing to trigger the default
  • 👨‍💻 refactor all get_gradients to get_gradient with Colon
  • 👨‍💻 deprecate the plural variant
  • 👨‍💻 refactor the ConstrainedManifoldObjective so that
    • 🔁 g and grad_g are internally stored as the new VectorialGradientFunction
    • 🔁 h and grad_h are internally stored as a new VectorialGradientFunction
  • 👨‍💻 adapt the high-level interfaces to have a range as well
  • 🔎 check how this would best interact with caching
  • ⚠️ in the best of cases, stay non-breaking
  • 📈 test coverage
  • 📚 documentation

@kellertuer (Member, Author)

This is conceptually quite an interesting PR. I just spent a bit of time sketching the new idea of a vectorial objective. I think it would reduce code duplication (between inequality constraint access, equality constraint access, and all of Levenberg-Marquardt) quite a bit.

Since I am not 100% familiar with LM, a bit of feedback on the ideas sketched here would be nice, @mateuszbaran: https://manoptjl.org/previews/PR386/plans/objective/#Manopt.VectorialGradientObjective
This is of course only sketched; for example, in the rendering I messed up the 1./2./3. numbering in the three types of representations for the single-cost-function implementation. But I hope those three (1., 1., and 1. ;)) capture the existing ones and the new one the issue refers to?

If this sounds good, I could start implementing all that. Note that the basis in LM is now even stored in the type, so it is only there if the type requires it (the CoefficientType)

@mateuszbaran (Member)

This is a nice direction of improvement. I don't quite get the difference between ComponentVectorialType and PowerManifoldVectorialType -- the second type essentially covers the first one as a special case?

Also, you should be careful to note that VectorialGradientObjective doesn't represent multi-objective optimization but optimization of $g(f(p))$ for some $g\colon \mathbb{R}^n \to \mathbb{R}$, where we might want to encode $g$ in VectorialGradientObjective. For example, it is a sum of squares for LM, a mean/sum for stochastic gradient descent, or some other function for robustified nonlinear least squares (see #332). It could also represent multi-objective optimization, but we don't currently have any such solvers in Manopt.
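As a plain-Julia illustration (function names here are hypothetical, not Manopt.jl types): the same vector-valued $f$ yields different single-objective problems depending on which outer $g$ is composed on top:

```julia
using Statistics: mean

# A vector-valued cost f: ℝ² → ℝ³, e.g. the residuals of a least-squares problem.
f(p) = [p[1] - 1, p[2] - 2, p[1] * p[2]]

# Different choices of the outer g: ℝ³ → ℝ turn the same f into
# different single-objective problems:
g_lm(v) = sum(abs2, v) / 2                    # sum of squares, as in Levenberg-Marquardt
g_mean(v) = mean(v)                           # mean, as in stochastic-gradient-style settings
g_robust(v) = sum(sqrt.(1 .+ abs2.(v)) .- 1)  # a robustified variant (cf. #332)

p = [1.0, 2.0]
g_lm(f(p))    # 2.0  (residuals [0, 0, 2] → (0 + 0 + 4)/2)
g_mean(f(p))  # ≈ 0.6667
```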

Note that the basis in LM is now even stored in the type, so it is only there if the type requires it (the CoefficientType)

That sounds like a good idea.

@kellertuer (Member, Author) commented May 5, 2024

Component is the old one, so basically a nested vector. I kept that because I think the power manifold one might (more often) require an actual power manifold, while the old vector variant did not.

And sure, one idea would be to have this also inside a VectorOptimisation problem later, but maybe we should then rethink the name if you feel that might be confusing.

I would maybe think that a MultiObjective would combine the idea here with the function g, that is, store both internally.

@mateuszbaran (Member)

Component is the old one, so basically a nested vector. I kept that because I think the power manifold one might (more often) require an actual power manifold, while the old vector variant did not.

I'm not sure if keeping that separation is actually useful. NestedPowerRepresentation is the same thing. We could have some specializations to avoid explicit construction of the power manifold if needed.

I would maybe think that a MultiObjective would combine the idea here with the function g, that is, store both internally.

Multi-objective optimization doesn't have the function g; it's single-objective optimization that needs it. VectorialGradientObjective doesn't currently specify a unique optimization problem, due to g being unknown.

@kellertuer (Member, Author)

I never want to specify said g in this PR, since the goal is really to only represent elements that map into $\mathbb{R}^n$, like the equality or inequality constraints, or the vectorial function in LM. At least the first two never have said g.

But wrapping this in a new Objective that provides g would be the way to go I think.

OK, doing just the power manifold thing should be fine as well, and we can omit the component one.

@mateuszbaran (Member)

I never want to specify said g in this PR, since the goal is really to only represent elements that map into $\mathbb{R}^n$, like the equality or inequality constraints, or the vectorial function in LM. At least the first two never have said g.

But wrapping this in a new Objective that provides g would be the way to go I think.

That's fine but then maybe let's use a name without objective in it?

@kellertuer (Member, Author)

Interesting idea. For a bit of background: this started when a student of mine said it might be nice to have a Hessian with the cost in the constrained objective.

So I encapsulated that, and instead of saving `f` and `grad_f` in the objective, it now (in this PR) stores an objective.
So the constrained objective is now an objective plus constraints.

So I would be fine with the idea that the constraints g and h are objectives (though vectorial) themselves. I also do not have enough experience in vectorial optimization to know whether they sometimes would really just have a vectorial cost. Maybe the g you use to get a number is something that is just parametrised in an objective and not a concrete function? Then VectorialObjective would indeed be fine and describe well what we have: a function (and derivative information) that maps into a vector space.

Well, I am also fine giving it another name, it is just that I struggle a bit with a good one for now. Do you have ideas for a name?

@mateuszbaran (Member)

Maybe just VectorialGradientFunction and VectorialFunction?

I also do not have enough experience in vectorial optimization to know whether they sometimes would really just have a vectorial cost.

Yes, that's what multi-objective optimization deals with. The goal is to explore the Pareto front. It is, in a way, equivalent to exploration of the impact of g on the result of single-objective optimization of the composite function $g(f(p))$.

maybe the g you use to get a number is something that is just parametrised in an objective and not a concrete function?

I don't understand, why wouldn't it be a concrete function?

@kellertuer (Member, Author)

I don't understand, why wouldn't it be a concrete function?

Maybe in some vector optimisation area I do not know? I do not know much.

But then the vectorial objective we have here is fine for vector optimisation as well; it is just that a vector objective needs a vectorial objective plus g.
Like the constrained objective needs an objective plus one or two vectorial objectives.

So I still neither see what would be wrong with the vectorial objective, nor do I have any other good name here.

@mateuszbaran (Member)

Maybe in some vector optimisation area I do not know? I do not know much.

Yes, it is fine for multi-objective/vector-valued optimization, but to me using it directly for anything else is confusing. VectorialGradientObjective sounds like something I'd only (or primarily) be using for multi-objective optimization. For single-objective optimization, VectorialGradientObjective is not a complete objective.

But then the vectorial objective we have here is fine for vector optimisation as well; it is just that a vector objective needs a vectorial objective plus g.

Using both names (vector objective and vectorial objective) for different things sounds confusing. Maybe one of these things could be named SplitObjective, for example?

@kellertuer (Member, Author) commented May 5, 2024

Though it is not yet used, once we go for vector optimisation I feel I want to keep VectorObjective for that.

Since we already discussed that this is in most cases (even for LM) just a part of the objective, we could call the type here VectorFunction? The only thing I do not like about this name is that it actually also contains the vector function's gradient ;)

edit: SplitObjective sounds too vague to me.

@mateuszbaran (Member)

OK, then what about VectorGradientFunction? It would be fine I think.

@kellertuer (Member, Author)

Sounds good. Will work on that tomorrow then. Thanks for the feedback and the discussions :)

@kellertuer (Member, Author)

I did the renaming and will now start to write the access functions (which will simplify the three existing access functions they replace quite a bit).
I noticed that I am now not sure whether VectorGradientFunction should be <:AbstractManifoldObjective or not. It behaves in many aspects like such a type, but usually requires one argument more (to access the entries) or returns a vector of things instead of just one thing. So maybe it should not even be an objective in type?
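To illustrate the "one argument more" point, here is a toy stand-in (all names hypothetical, not the actual Manopt.jl signatures):

```julia
# A toy, simplified stand-in for a vector-valued function with known dimension.
struct ToyVectorGradientFunction{F}  # hypothetical
    value::F         # p ↦ [f₁(p), …, fₙ(p)]
    dimension::Int
end

# Unlike a scalar objective's cost access, element access needs an extra
# argument: an index i, or a Colon for the whole vector of values.
get_value(vgf::ToyVectorGradientFunction, p, i::Integer) = vgf.value(p)[i]
get_value(vgf::ToyVectorGradientFunction, p, ::Colon) = vgf.value(p)

vgf = ToyVectorGradientFunction(p -> [p[1] - 1, p[2] - 2], 2)
get_value(vgf, [2.0, 2.0], 1)  # 1.0
get_value(vgf, [2.0, 2.0], :)  # [1.0, 0.0]
```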

@kellertuer (Member, Author)

A final thing to maybe consider: for the power manifold approach, one sometimes needs the power manifold itself to access the elements of a tangent vector (on the power manifold).

For now I just added the power representation type to the new PowerManifoldVectorialType. That would, however, mean one would often generate the power manifold just to access a component.
I do not have a better idea; storing the power manifold in the objective would be against the idea of splitting the objective (or, here, part of the objective) and the manifold.

@mateuszbaran (Member)

I noticed that I am now not sure whether VectorGradientFunction should be <:AbstractManifoldObjective or not. It behaves in many aspects like such a type, but usually requires one argument more (to access the entries) or returns a vector of things instead of just one thing. So maybe it should not even be an objective in type?

It could be an objective for a multi-objective optimization problem but I don't think we want to design that feature in this PR so maybe let's not make it <:AbstractManifoldObjective for now.

For now I just added the power representation type to the new PowerManifoldVectorialType. That would, however, mean one would often generate the power manifold just to access a component.
I do not have a better idea; storing the power manifold in the objective would be against the idea of splitting the objective (or, here, part of the objective) and the manifold.

I will think about it.

@kellertuer (Member, Author)

I agree on the first.

For the second I have two ideas:

  • we could either always generate PowerManifold(M, vgf.dimension); this keeps the distinction but always generates a power manifold
  • or save the power manifold, which makes the objective explicitly depend on the manifold (which until now we avoided).

I tend more to the first case, since I think the memory/time spent on creating that is not too bad – and the second would break quite a bit with the current model ideas.
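The first option can be sketched as follows, assuming the `PowerManifold` constructor and `X[N, i]` component access from Manifolds.jl, and with `n` standing in for the stored `vgf.dimension` field:

```julia
using Manifolds

M = Sphere(2)  # the base manifold of the problem
n = 3          # number of constraint components, i.e. vgf.dimension

# Option 1: generate the power manifold on the fly whenever a component
# of a gradient vector (a tangent vector on the power manifold) is accessed.
N = PowerManifold(M, NestedPowerRepresentation(), n)

p = [1.0, 0.0, 0.0]
P = fill(p, n)                         # a point on N (nested representation)
X = [[0.0, 1.0, 0.0] for _ in 1:n]     # a tangent vector on N

# Component access needs N itself, which option 1 would recreate each time:
X1 = X[N, 1]
```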

@kellertuer (Member, Author)

I illustrated my problem a bit with the get_gradients and get_gradients! functions.
I think the single gradient access functions are a bit easier; the Jacobian function might again need to create the power manifold (to access vector elements in order to convert them to coordinates).

mateuszbaran and others added 3 commits June 6, 2024 11:21
Co-authored-by: Mateusz Baran <mateuszbaran89@gmail.com>
@mateuszbaran (Member)

I did a whole bunch of minor improvements: fixing typos, adding type bounds, a couple of missing tests. Let me know if there is anything you don't like. By the way, it looks like using i and j for constraint indexing is a bit mixed. I converted some i to j but I wasn't particularly thorough.

@mateuszbaran (Member) left a comment


I don't have any more design comments here -- it looks solid. Just a bit of polishing. It should be possible to do further optimization work incrementally in the future.

Co-authored-by: Mateusz Baran <mateuszbaran89@gmail.com>
@kellertuer (Member, Author)

Thanks, having a solid design was my main goal. Performance is probably something we can look at in the future then.

@kellertuer (Member, Author)

By the way, it looks like using i and j for constraint indexing is a bit mixed. I converted some i to j but I wasn't particularly thorough.

Yes, I think I started with having i for inequality and j for equality and got that mixed up a bit later. Hope that is not too bad – if it is confusing we should make that more consistent, but for me it is also fine as is now, since they never mix anywhere.

@kellertuer (Member, Author)

I did a whole bunch of minor improvements: fixing typos, adding type bounds, a couple of missing tests. Let me know if there is anything you don't like.

Wow, thanks for fixing all those :) They all look like very good fixes; for bounds, I am sometimes not sure when they are useful and when we can skip them, so I trust your expertise here.

@mateuszbaran (Member)

Yes, I think I started with having i for inequality and j for equality and got that mixed up a bit later. Hope that is not too bad – if it is confusing we should make that more consistent, but for me it is also fine as is now, since they never mix anywhere.

That's not a big issue, but there were a few places where a docstring and the method it was attached to used different conventions, or the type signature in a docstring and the docstring text did. Anyway, that's a minor consistency issue.

Wow, thanks for fixing all those :) They all look like very good fixes; for bounds, I am sometimes not sure when they are useful and when we can skip them, so I trust your expertise here.

You're welcome :). I add bounds for type parameters mainly because then I know what to expect there. If a field is a function or a functor object, I prefer no bound. Otherwise I prefer to have a bound unless there is a good reason against it (for example one reason may be wrapping objects from weak dependencies).

@kellertuer kellertuer merged commit 2943e0d into master Jun 8, 2024
15 checks passed
@kellertuer kellertuer deleted the kellertuer/structural_rework_constraints branch June 28, 2024 18:25
Labels: documentation, enhancement, Ready-for-Review

Successfully merging this pull request may close these issues.

Representing gradient of constraints as tangents on a power manifold