
Add simple Schedulers #1434

Open · wants to merge 6 commits into master

Conversation

DhairyaLGandhi
Member

Simple design for scheduling learning rate decay using the Optimiser interface
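The idea, roughly, is that a schedule is just another optimiser rule that rescales the step and mutates its own hyperparameters, so it composes with the existing rules. A minimal sketch of that shape (the `ExpDecaySchedule` name and its fields are illustrative only, not the types introduced by this PR):

```julia
using Flux

# Minimal sketch only: `ExpDecaySchedule` is a hypothetical rule, not a type from this PR.
mutable struct ExpDecaySchedule
  eta::Float64     # current learning rate
  gamma::Float64   # multiplicative decay applied each time the schedule advances
end

function Flux.Optimise.apply!(o::ExpDecaySchedule, x, Δ)
  Δ .*= o.eta
  o.eta *= o.gamma   # naive: advances on every apply! call (see the discussion below)
  return Δ
end

# Composes with other rules like any optimiser:
opt = Flux.Optimise.Optimiser(ExpDecaySchedule(0.1, 0.99), Descent(1.0))
```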

@DhairyaLGandhi changed the title from Dg/multilr to Add simple Schedulers on Dec 21, 2020
@gxyd
Contributor

gxyd commented Jan 5, 2021

Is the intention here to also add the other schedulers that PyTorch implements?


* LambdaLR
* MultiplicativeLR (this PR?)
* StepLR
* MultiStepLR
* ExponentialLR
* CosineAnnealingLR
* ReduceLROnPlateau
* CyclicLR
* OneCycleLR
* CosineAnnealingWarmRestarts

@DhairyaLGandhi
Member Author

Happy to have help with this! Having said that, the design of the optimisers seems to lend itself well to adding schedulers. Note that we would want to keep a fairly shallow type hierarchy.

We probably want to discuss which features schedulers might need that fall naturally out of this design, and what the limitations are.

@DhairyaLGandhi
Member Author

I find the method for determining the current epoch number a bit janky, which I would love to improve upon as a start.

@ToucheSir
Member

I think this should be reconciled with https://github.com/darsnack/ParameterSchedulers.jl somehow. cc @darsnack and @lorenzoh

@darsnack
Member

darsnack commented Jan 5, 2021

ParameterSchedulers.jl should have most of PyTorch's functionality already implemented. The only ones that I'd want to double check are ReduceLROnPlateau and CosineAnnealingWarmRestarts. Probably these are just syntactic sugar composing the existing schedulers.

@CarloLucibello mentioned integrating ParameterSchedulers.jl into Flux. I would have already made a PR, but I ran head first into Flux's limited optimizer interface. The main issue is that apply! is called on a per-parameter basis, so it ends up getting called many times per iteration, meaning that any scheduler would need to know a priori how many times it will be called. This is computable, but hacky imo. You can't rely on overloading update!, since that only works for implicit parameters and breaks composability (i.e. Flux.Optimise.Optimiser doesn't invoke update! on each sub-optimizer, it invokes apply!).
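Schematically, the implicit-parameter path looks something like this (a simplified sketch, not the exact Flux source), which is why any state advanced inside apply! ticks once per parameter rather than once per iteration:

```julia
using Flux: Params
using Flux.Optimise: apply!

# Simplified sketch of Flux's implicit-parameter update path (not the exact source):
# apply! runs once per parameter array, so a scheduler that advances its state
# inside apply! advances length(xs) times per call to update!.
function update!(opt, xs::Params, gs)
  for x in xs
    gs[x] === nothing && continue
    x .-= apply!(opt, x, gs[x])   # apply! called for every parameter
  end
end
```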

I am currently writing up what a future interface should look like, and I will post it soon. I'll follow up with a PR to Flux that implements it based on the discussion that follows that post.

PS: This really only presents a problem for Flux.train!, where the training loop is inaccessible to the user.

@darsnack
Member

darsnack commented Jan 5, 2021

Looking at the source of this PR, it will hit the same issue that ScheduledOptim from ParameterSchedulers.jl did. o.current will end up getting updated more than once per iteration since apply! will be called once for every parameter. You could reconcile this by passing params(m) into the constructor and doing the scheduler logic only once per iteration, but that's too hacky.
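Concretely, that workaround would look something like this (ScheduledDecay and its fields are hypothetical names for illustration, not code from this PR or from ParameterSchedulers.jl):

```julia
using Flux

# Hypothetical illustration of the workaround described above: the rule is told up
# front how many parameters it will see, and only advances the schedule after the
# last one, i.e. once per iteration.
mutable struct ScheduledDecay
  eta::Float64
  decay::Float64
  nparams::Int   # e.g. length(params(m)), supplied at construction time
  seen::Int
end

ScheduledDecay(eta, decay, ps) = ScheduledDecay(eta, decay, length(ps), 0)

function Flux.Optimise.apply!(o::ScheduledDecay, x, Δ)
  Δ .*= o.eta                 # every parameter in this iteration uses the same eta
  o.seen += 1
  if o.seen == o.nparams      # all parameters seen: end of the iteration
    o.seen = 0
    o.eta *= o.decay          # advance the schedule exactly once per iteration
  end
  return Δ
end
```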

@DhairyaLGandhi
Member Author

> will end up getting updated more than once per iteration

I'm aware of the bug, which isn't difficult to fix, but this was more for design discussion than a final implementation. You'll notice the janky name :)

@darsnack
Member

darsnack commented Jan 5, 2021

Good to hear — a design discussion is exactly what I think is needed too.

@DhairyaLGandhi
Member Author

Do post your thoughts here, @darsnack; that's kind of the motivation for the PR.

The question on composition is that we can make it safe to call apply on a number of parameters together, but it's not very intuitive. To be clear, I want to move these over to Optimisers.jl now, so we have a unified interface. It's written with that in mind.

@darsnack
Member

darsnack commented Jan 5, 2021

> To be clear, I want to move these over to Optimisers.jl now, so we have a unified interface. It's written with that in mind.

Perfect, I was thinking the exact same thing.

@ToucheSir
Member

I do think the interaction between learning rate schedule and optimizer will be very different moving from the current Flux interface to the mostly stateless one in Optimisers.jl. For example, ScheduledOptim.update_func can no longer be a purely mutating function, and I wonder if a PyTorch-esque interface is the best way to go. I know I keep mentioning it, but it's illustrative to see what a mutation-free library like Optax does here.
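For a rough idea of what that could look like, here is a sketch of a mutation-free scheduled rule in that spirit; the init/apply names and signatures are illustrative assumptions, not the actual Optimisers.jl (or Optax) API. The schedule lives in the rule, while the step count lives in the state that gets threaded through.

```julia
# Rough sketch of a mutation-free scheduled rule, loosely in the spirit of the
# Optimisers.jl / Optax style mentioned above. `init` and `apply` are
# illustrative names, not the actual Optimisers.jl API.
struct ScheduledDescent{F}
  schedule::F               # maps an iteration count to a learning rate
end

init(o::ScheduledDescent, x) = (t = 0,)   # state carries the step count, not the rule

function apply(o::ScheduledDescent, state, x, dx)
  eta = o.schedule(state.t)
  return (t = state.t + 1,), eta .* dx    # return fresh state plus the scaled update
end

# Usage: opt = ScheduledDescent(t -> 0.1 * 0.99^t)
```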

@darsnack
Member

darsnack commented Jan 8, 2021

I posted my thoughts in a discussion. I'm interested to hear how people think the interface can be improved. Personally, I'm still unsatisfied with hyperparameter access, but I can't figure out anything better.
