Zhiwei He Improving Machine Translation Use Quality Estimation as a Reward Model 2024

Title: Improving Machine Translation Use Quality Estimation as a Reward Model 2024
Author: Zhiwei He et al.
Publish Year: 23 Jan 2024
Review Date: Sun, Jan 28, 2024
url: arXiv:2401.12873v1

Summary of paper

Contribution

In this research, the authors explore using Quality Estimation (QE) models as reward models for improving translation quality through human feedback. They note that while QE has shown promise in aligning with human evaluations, there is a risk of overoptimization, where translations receive high rewards despite declining quality....

<span title='2024-01-28 22:53:41 +1100 AEDT'>January 28, 2024</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;285 words&nbsp;·&nbsp;Sukai Huang