v1.4.0: Agents define their own rewards #125
stephane-caron announced in Announcements
This release brings a great deal of fixes and build system improvements. Notably, agents can now be run interchangeably from either Python or Bazel. The `Reward` class in environments was also refactored: agents can now define their own task-specific rewards. The standing reward is now part of the PPO balancer, while at the environment level the default is simply a survival reward (+1 at each non-failing step). Thanks to @boragokbakan for contributing to this release 👍
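To illustrate the idea of agent-defined rewards, here is a minimal sketch. The class and method names below (`get_reward`, the observation keys, the `pitch_weight` parameter) are assumptions for illustration, not the actual API of this release; we define a stand-in base class rather than importing the real one.

```python
class Reward:
    """Stand-in for the environment-level reward interface (hypothetical)."""

    def get_reward(self, observation: dict) -> float:
        # Environment default: survival reward, +1 at each non-failing step.
        return 1.0


class StandingReward(Reward):
    """Hypothetical task-specific reward an agent could define for itself.

    Penalizes deviation of the base pitch from upright.
    """

    def __init__(self, pitch_weight: float = 1.0):
        self.pitch_weight = pitch_weight

    def get_reward(self, observation: dict) -> float:
        pitch = observation.get("pitch", 0.0)  # assumed observation key
        return 1.0 - self.pitch_weight * pitch**2
```

An agent would then pass its own `Reward` instance to its environment instead of relying on the default survival reward.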
Added

- `--show` CLI argument to the wheel balancer's Bullet target
- `Reward` class for rewards

Changed

- `main.py`
- Moved `StandingReward` to the PPO balancer
- `utils.raspi` function call
- `realtime` submodule in favor of `raspi`

Fixed

- `vcgencheck`
This discussion was created from the release v1.4.0.