Xuezhou_zhang Robust Policy Gradient Against Strong Data Corruption 2021

[TOC] Title: Robust Policy Gradient Against Strong Data Corruption Author: Xuezhou Zhang et. al. Publish Year: 2021 Review Date: Tue, Dec 27, 2022 Summary of paper Abstract Contribution the author utilised a SVD-denoising technique to identify and remove the possible reward perturbations this approach gives a robust RL algorithm Limitation This approach only solve the attack perturbation that is not consistent. (i.e. not stealthy) Some key terms Policy gradient methods ...

December 27, 2022 · 2 min · 317 words · Sukai Huang

Kiarash_banihashem Defense Against Reward Poisoning Attacks in Reinforcement Learning 2021

[TOC] Title: Defense Against Reward Poisoning Attacks in Reinforcement Learning Author: Kiarash Banihashem et. al. Publish Year: 20 Jun 2021 Review Date: Tue, Dec 27, 2022 Summary of paper Motivation our goal is to design agents that are robust against such attacks in terms of the worst-case utility w.r.t. the true unpoisoned rewards while computing their policies under the poisoned rewards. Contribution we formalise this reasoning and characterize the utility of our novel framework for designing defense policies. In summary, the key contributions include ...

December 27, 2022 · 2 min · 303 words · Sukai Huang