This repository contains the code for the NeurIPS 2021 submission "Local policy search with Bayesian optimization".
Code for the paper "Policy Optimization as Online Learning with Mediator Feedback".
This repo implements the REINFORCE algorithm for solving the CartPole-v1 environment from the Gymnasium library, using Python 3.8 and PyTorch 2.0.1.
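For orientation, the core REINFORCE update (reward-weighted gradient of the log-policy) can be sketched in a few lines. This is a minimal stand-in on a two-armed bandit with a softmax policy, not the repository's CartPole/PyTorch code; the arm payoffs, learning rate, and step count below are illustrative assumptions.

```python
import math
import random

def softmax(prefs):
    """Convert action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def reinforce(steps=3000, lr=0.1, seed=0):
    """REINFORCE on a toy 2-armed bandit (hypothetical setup for illustration)."""
    rng = random.Random(seed)
    prefs = [0.0, 0.0]        # policy parameters (action preferences)
    true_means = [0.2, 0.8]   # assumed payoff probabilities; arm 1 is better
    for _ in range(steps):
        probs = softmax(prefs)
        a = 0 if rng.random() < probs[0] else 1      # sample an action
        r = 1.0 if rng.random() < true_means[a] else 0.0  # sample a reward
        # REINFORCE: grad of log pi(a) under softmax is (indicator - probs)
        for i in range(2):
            grad = (1.0 if i == a else 0.0) - probs[i]
            prefs[i] += lr * r * grad
    return softmax(prefs)

probs = reinforce()
```

After training, the policy should place most of its probability on the higher-paying arm; the same update rule, with a neural network and per-episode returns, is what a CartPole REINFORCE implementation applies.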
An implementation of reinforcement learning for CartPole-v0 via policy optimization.
Codebase to fully reproduce the results of "No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO" (Moalla et al. 2024). Uses TorchRL and provides extensive tools for studying representation dynamics in policy optimization.
Policy Optimization with Penalized Point Probability Distance: an Alternative to Proximal Policy Optimization
Code for the paper "Policy Optimization in RLHF: The Impact of Out-of-preference Data".
Mirror Descent Policy Optimization
Model-based Policy Gradients
Implementation of a deep reinforcement learning algorithm, Proximal Policy Optimization (SOTA), on a continuous-action-space OpenAI Gym environment (Box2D/CarRacing-v0).
Multi-Agent Constrained Policy Optimisation (MACPO; MAPPO-L).