Documentation: Polyak vs EMA #413

Open
dwaitbhatt opened this issue May 14, 2024 · 0 comments
dwaitbhatt commented May 14, 2024

The Spinning Up documentation for DDPG, TD3, and SAC describes the exponential moving average (EMA) of target network weights as Polyak averaging. This seems to be a misnomer: Polyak's paper (equation 12) uses an unweighted average of all past iterates, whereas an EMA weights recent iterates exponentially more heavily than older ones. The Adam paper also mentions EMA as an alternative to Polyak averaging (section 7.2). Since many students use the Spinning Up documentation to learn RL concepts, it would be good to add a clarification about this naming convention.
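
To make the difference concrete, here is a minimal sketch on scalar "iterates" rather than network weights. The function names are hypothetical and used only for illustration, and the EMA update form `target = rho * target + (1 - rho) * x` is my assumption about the usual target-network update, not something copied from the Spinning Up code.

```python
def polyak_average(iterates):
    """Unweighted (uniform) average of all iterates, computed as a running mean."""
    avg = 0.0
    for k, x in enumerate(iterates, start=1):
        # Incremental mean: after k steps, every iterate carries weight 1/k.
        avg += (x - avg) / k
    return avg


def ema(iterates, rho):
    """Exponential moving average, initialised at the first iterate (assumed)."""
    target = iterates[0]
    for x in iterates[1:]:
        # Assumed update form: target <- rho * target + (1 - rho) * x
        target = rho * target + (1 - rho) * x
    return target


def ema_weights(n, rho):
    """Effective weight the EMA above places on each of n iterates, oldest first."""
    # The initial iterate keeps weight rho^(n-1); iterate i >= 1 gets
    # (1 - rho) * rho^(n-1-i), so newer iterates weigh exponentially more.
    return [rho ** (n - 1)] + [(1 - rho) * rho ** (n - 1 - i) for i in range(1, n)]


def uniform_weights(n):
    """Weights used by the unweighted average: 1/n on every iterate."""
    return [1.0 / n] * n
```

With iterates `[0.0, 0.0, 0.0, 10.0]` and `rho = 0.5`, the unweighted average treats all four values equally, while the EMA puts half of its total weight on the most recent one, so the two can give quite different results on the same sequence.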

Here is a similar discussion on a Keras issue.
