Borja_ibarz Reward Learning From Human Preferences and Demonstrations in Atari 2018
[TOC]
Title: Reward learning from human preferences and demonstrations in Atari
Author: Borja Ibarz et al.
Publish Year: 2018
Review Date: Nov 2021
Summary of paper
The authors propose a method that uses human experts' annotations, rather than the extrinsic reward from the environment, to guide reinforcement learning...
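
To make the idea of replacing the environment reward with human annotations more concrete, here is a minimal sketch. It assumes the annotations are pairwise preferences between two trajectory segments, and that a learned reward model is scored with a Bradley-Terry style cross-entropy loss. The function names `preference_probability` and `preference_loss` are hypothetical, the per-step rewards are plain Python lists standing in for a reward model's outputs, and this is an illustration rather than the authors' implementation.

```python
import math


def preference_probability(rewards_a, rewards_b):
    """Modelled probability that segment A is preferred over segment B.

    rewards_a / rewards_b are the predicted per-step rewards of each segment
    (assumed here to come from some learned reward model).
    """
    diff = sum(rewards_a) - sum(rewards_b)
    # Numerically stable logistic function of the summed-reward difference.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def preference_loss(rewards_a, rewards_b, label):
    """Cross-entropy between the modelled preference and the human label.

    label is 1.0 if the annotator preferred A, 0.0 if they preferred B,
    and 0.5 if they judged the two segments equal (an assumed convention).
    """
    p = preference_probability(rewards_a, rewards_b)
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(p + eps) + (1.0 - label) * math.log(1.0 - p + eps))


if __name__ == "__main__":
    seg_a = [0.9, 1.1, 1.0]  # predicted rewards for segment A
    seg_b = [0.1, 0.0, 0.2]  # predicted rewards for segment B
    print(preference_probability(seg_a, seg_b))
    print(preference_loss(seg_a, seg_b, label=1.0))
```

Minimising this loss over many labelled pairs pushes the reward model to assign higher total reward to the segments the annotator preferred. The learned reward can then be given to a standard reinforcement learning agent in place of the game score.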