[TOC]
- Title: Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning
- Author: Yinglun Xu et al.
- Publish Year: 30 May 2022
- Review Date: Tue, Dec 27, 2022
Summary of paper
Motivation
- we study data poisoning attacks on online deep reinforcement learning (DRL), where the attacker is oblivious to the learning algorithm used by the agent and does not necessarily have full knowledge of the environment.
- we instantiate our framework to construct several attacks that corrupt the rewards for only a small fraction of the total training timesteps and make the agent learn a low-performing policy (a sketch of this setup follows this list)
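A minimal sketch of this threat model, under assumptions of my own: an environment wrapper that corrupts the reward on at most a bounded number of timesteps. The class name, the Gymnasium-style 5-tuple `step` API, and the knobs `budget`, `delta`, and `corruption_rate` are illustrative choices, not the paper's exact construction.

```python
import random


class PoisonedRewardWrapper:
    """Reward-poisoning wrapper (illustrative sketch, not the paper's attack).

    Corrupts the reward on at most `budget` timesteps, attacking each step
    with probability `corruption_rate`, so only a small fraction of the
    training rewards are touched. Assumes a Gymnasium-style step API.
    """

    def __init__(self, env, budget, delta, corruption_rate, seed=0):
        self.env = env
        self.budget = budget                  # max timesteps the attacker may corrupt
        self.delta = delta                    # per-step perturbation magnitude
        self.corruption_rate = corruption_rate
        self.rng = random.Random(seed)

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        if self.budget > 0 and self.rng.random() < self.corruption_rate:
            reward -= self.delta              # push the reward in the attacker's chosen direction
            self.budget -= 1                  # spend one unit of the attack budget
        return obs, reward, terminated, truncated, info
```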
Contribution
- results show that the reward attacks efficiently poison agent learning under a variety of SOTA DRL algorithms such as DQN and PPO
- our attacks work on model-free DRL algorithms for all popular learning paradigms, and only assume that the learning algorithm is efficient.
- a large enough reward perturbation in the right direction is able to disrupt the DRL algorithm (a toy demonstration follows this list).
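A toy demonstration of the last point, under my own assumptions: a two-armed bandit with a simple epsilon-greedy learner standing in for a DRL agent. Subtracting a large enough `delta` from the good arm's reward (the "right direction") flips which arm the learner prefers. None of the constants come from the paper.

```python
import random


def true_reward(arm):
    # Arm 0 pays 1.0 on average, arm 1 pays 0.0; arm 0 is the "good" action.
    return 1.0 if arm == 0 else 0.0


def run(delta, steps=5000, eps=0.1, lr=0.1, seed=0):
    rng = random.Random(seed)
    q = [0.0, 0.0]
    for _ in range(steps):
        # Epsilon-greedy action selection over the two arms.
        arm = rng.randrange(2) if rng.random() < eps else max((0, 1), key=lambda a: q[a])
        r = true_reward(arm)
        if arm == 0:
            r -= delta                        # poison only the good arm's reward
        q[arm] += lr * (r - q[arm])           # incremental value update
    return q


print(run(delta=0.0))  # unpoisoned: learner ranks arm 0 best
print(run(delta=2.0))  # delta exceeds the reward gap: arm 1 now looks best
```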
Limitation
- this research assumes the attacker has a limited budget. But if they prove that even a budget-limited attack can disrupt DRL, then our consistent false-positive rewards can certainly disrupt the algorithm as well
Some key terms
LPE (learned policy evasion) attack
- make all policies of good performance appear bad to the learning agent. Intuitively, high-performing policies should share similar behaviour, since there is usually a general strategy for behaving well in the environment.
- therefore, if the attacker can make the actions corresponding to such behaviour look bad, then all the good policies will appear bad to the agent (see the sketch below).
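A minimal sketch of the LPE idea, assuming the attacker holds a reference high-performing policy `pi_good` (my name, not the paper's): whenever the agent takes the action that policy would take, the observed reward is lowered, so the shared "good" behaviour looks bad.

```python
class LPEAttack:
    """Sketch of a learned-policy-evasion style attack (illustrative).

    `pi_good` is an assumed reference high-performing policy
    (callable: state -> action). Actions that agree with it get
    their rewards pushed down, making good behaviour appear bad.
    """

    def __init__(self, pi_good, budget, delta):
        self.pi_good = pi_good
        self.budget = budget
        self.delta = delta

    def poison(self, state, action, reward):
        if self.budget > 0 and action == self.pi_good(state):
            self.budget -= 1
            return reward - self.delta  # the good action now looks bad
        return reward
```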
Uniformly at random (UR) attack
- a random attack that corrupts rewards at timesteps chosen uniformly at random, spending the entire attack budget (see the sketch below).
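A sketch of a UR-style baseline under my own assumptions (applied over a recorded reward sequence, with a fixed perturbation `delta`): the corrupted timesteps are drawn uniformly at random and the whole budget is spent.

```python
import random


def ur_attack(rewards, budget, delta, seed=0):
    """Corrupt `budget` timesteps chosen uniformly at random (illustrative)."""
    rng = random.Random(seed)
    poisoned = list(rewards)
    # Draw distinct timesteps uniformly at random until the budget is exhausted.
    for t in rng.sample(range(len(poisoned)), k=min(budget, len(poisoned))):
        poisoned[t] -= delta
    return poisoned
```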
Potential future work
- we can gain some insight from the theoretical analysis of their attack methods, which is based on certain assumptions about the efficiency of the DRL algorithm and yields several insightful implications