ConstrainedRL4LMs

This codebase contains the implementation of constrained RLHF from the paper Confronting Reward Model Overoptimization with Constrained RLHF (Moskovitz et al., 2023).

Currently, the code is a bit hacky: it supports only composite reward models formed from a combination of METEOR and intent-matching rewards, but we plan to provide a more general implementation soon. A rough sketch of the constrained-composite-reward idea is given below.
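To illustrate the idea (a minimal sketch, not this repository's actual API), a constrained composite reward can maximize one reward signal while holding another above a threshold via a Lagrange multiplier updated by dual gradient steps. All class and scorer names below are hypothetical stand-ins:

```python
# Minimal sketch of a constrained composite reward: maximize a primary reward
# (e.g. METEOR) subject to an auxiliary reward (e.g. intent matching) staying
# above a threshold. All names here are hypothetical, not the repo's API.

class ConstrainedCompositeReward:
    def __init__(self, primary_fn, constraint_fn, threshold, dual_lr=0.01):
        self.primary_fn = primary_fn        # reward to maximize
        self.constraint_fn = constraint_fn  # reward to keep >= threshold
        self.threshold = threshold
        self.dual_lr = dual_lr
        self.lagrange = 0.0                 # Lagrange multiplier, kept >= 0

    def __call__(self, prompt, generation):
        # Lagrangian reward: primary plus multiplier-weighted constraint slack.
        r_main = self.primary_fn(prompt, generation)
        r_con = self.constraint_fn(prompt, generation)
        return r_main + self.lagrange * (r_con - self.threshold)

    def update_multiplier(self, avg_constraint_reward):
        # Dual step: increase the multiplier while the constraint is violated,
        # decay it toward zero once the constraint is satisfied.
        slack = avg_constraint_reward - self.threshold
        self.lagrange = max(0.0, self.lagrange - self.dual_lr * slack)


if __name__ == "__main__":
    # Toy scorers standing in for METEOR / intent matching.
    reward = ConstrainedCompositeReward(
        primary_fn=lambda p, g: len(set(p.split()) & set(g.split()))
        / max(len(g.split()), 1),
        constraint_fn=lambda p, g: float("intent" in g),
        threshold=0.5,
    )
    print(reward("match this intent", "we match the intent here"))
    reward.update_multiplier(avg_constraint_reward=0.3)  # violated -> multiplier grows
```

In an actual training loop, a multiplier update like this would run per batch alongside the policy (e.g. PPO) update.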

The codebase is built on the excellent RL4LMs repository: https://github.com/allenai/RL4LMs.

Constrained approaches can be run via the command:

```
python scripts/training/train_text_generation.py --config_path scripts/training/task_configs/dialog/gpt2_constrained_ppo.yml --log_to_wandb --seed=0 --experiment_name=<exp_name>
```

More documentation coming soon! In the meantime, please check out the documentation for RL4LMs as a guide. If you find this code useful, please consider citing our paper. Thank you!

```
@misc{moskovitz2023confronting,
      title={Confronting Reward Model Overoptimization with Constrained RLHF},
      author={Ted Moskovitz and Aaditya K. Singh and DJ Strouse and Tuomas Sandholm and Ruslan Salakhutdinov and Anca D. Dragan and Stephen McAleer},
      year={2023},
      eprint={2310.04373},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```
