[TOC]

  1. Title: Reward Engineering for Generating Semi-Structured Explanation
  2. Author: Jiuzhou Han et. al.
  3. Publish Year: EACL2024
  4. Review Date: Thu, Jun 20, 2024
  5. url: https://github.com/Jiuzhouh/Reward-Engineering-for-Generating-SEG

Summary of paper

Motivation

Contribution

  • the objective is to equip moderately-sized LMs with the ability to not only provide answers but also generate structured explanations

Some key terms

Intro

the author talked about some background on Cui et al. incorporate a generative pre-training mechanism over synthetic graphs by aligning inputs pairs of text-graph to improve the model’s capability in generating semi-structured explanation.

  • the author first stated that they utilise SFT as teh de-facto solution. We then turn our focus to RLHF as a mechanism to further align the explanations with ground-truth on top of SFT.

Related work

the author mention in details about the benchmark they use – ExplaGraph and COPA-SSE

Results

Experiments

the author first talked about the dataset size and said that “The detailed descriptions of all evaluation metrics are provided in Appendix A.”

Human evaluation

image-20240620163853924