Xuezhou_zhang Robust Policy Gradient Against Strong Data Corruption 2021

Title: Robust Policy Gradient Against Strong Data Corruption. Author: Xuezhou Zhang et al. Publish Year: 2021. Review Date: Tue, Dec 27, 2022. Summary of paper. Abstract. Contribution: the authors use an SVD-denoising technique to identify and remove possible reward perturbations, which yields a robust RL algorithm. Limitation: the approach only handles attack perturbations that are not consistent (i.e. not stealthy). Some key terms: Policy gradient methods...

<span title='2022-12-27 20:35:10 +1100 AEDT'>December 27, 2022</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;317 words&nbsp;·&nbsp;Sukai Huang