[TOC]

  1. Title: Secrets of RLHF in Large Language Models Part I: PPO
  2. Author: Rui Zheng et al.
  3. Publish Year: 18 Jul 2023
  4. Review Date: Mon, Jan 22, 2024
  5. url: arXiv:2307.04964v2

Summary of paper

Motivation

Contribution

Some key terms

RLHF limitation

RLHF in one paragraph

Helpfulness

Alignment metrics

Results


Summary

The paper summarizes a number of implementation details for PPO training.