
Reward Uncertainty for Exploration in Preference-based Reinforcement Learning (RUNE)

Code implementation of Reward Uncertainty for Exploration in Preference-based Reinforcement Learning (RUNE), with scripts to reproduce the paper's experiments. This codebase is largely derived from B-Pref, with modifications.

Install

conda env create -f conda_env.yml
pip install -e .[docs,tests,extra]
cd custom_dmcontrol
pip install -e .
cd custom_dmc2gym
pip install -e .
pip install git+https://github.com/rlworkgroup/metaworld.git@master#egg=metaworld
pip install pybullet

Instructions

The RUNE algorithm is implemented in train_PEBBLE_explore.py (based on PEBBLE) and train_PrefPPO_explore.py (based on PrefPPO). The default hyperparameters used in the paper are included in the config files (config/) and training scripts (scripts/).
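For context when reading train_PEBBLE_explore.py and train_PrefPPO_explore.py: RUNE shapes the training reward with an exploration bonus given by the disagreement of an ensemble of learned reward models. The sketch below is illustrative only (RewardEnsemble and rune_reward are hypothetical names, not the repository's classes) and assumes the ensemble is trained on preference labels elsewhere:

import torch
import torch.nn as nn

class RewardEnsemble(nn.Module):
    # Illustrative ensemble of reward models r_psi(s, a); the actual
    # reward model code lives in this repository. This sketch only
    # shows how ensemble disagreement yields an exploration bonus.
    def __init__(self, obs_dim, act_dim, n_members=3, hidden=256):
        super().__init__()
        self.members = nn.ModuleList([
            nn.Sequential(
                nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )
            for _ in range(n_members)
        ])

    def forward(self, obs, act):
        x = torch.cat([obs, act], dim=-1)
        # Stack member predictions: shape (n_members, batch, 1)
        return torch.stack([m(x) for m in self.members])

def rune_reward(ensemble, obs, act, beta):
    # Total reward = ensemble mean (exploitation) plus
    # beta * ensemble std (reward uncertainty as exploration bonus).
    with torch.no_grad():
        preds = ensemble(obs, act)   # (n_members, batch, 1)
        r_hat = preds.mean(dim=0)    # estimated extrinsic reward
        bonus = preds.std(dim=0)     # disagreement across members
    return r_hat + beta * bonus

In the paper, the bonus weight beta is decayed over the course of training, so that exploration dominates early and the learned reward dominates later.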

The experiments in Table 1 can be reproduced with the following scripts:

PEBBLE and PEBBLE + RUNE:

./scripts/[env_name]/[max_budget]/run_PEBBLE_rune.sh [date: yyyy-mm-dd]
./scripts/[env_name]/[max_budget]/run_PEBBLE.sh [date: yyyy-mm-dd]

PrefPPO and PrefPPO + RUNE:

./scripts/[env_name]/[max_budget]/run_PrefPPO_rune.sh [date: yyyy-mm-dd]
./scripts/[env_name]/[max_budget]/run_PrefPPO.sh [date: yyyy-mm-dd]
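For example, to run PrefPPO with RUNE (the environment and budget directory names below are illustrative; use the directories that actually exist under scripts/ in your checkout):

./scripts/walker_walk/1000/run_PrefPPO_rune.sh 2022-01-01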
