[TOC]

  1. Title: Reward Delay Attacks on Deep Reinforcement Learning
  2. Author: Anindya Sarkar et al.
  3. Publish Date: 8 Sep 2022
  4. Review Date: Mon, Dec 26, 2022

Summary of paper

Motivation

  • we present novel attacks targeting Q-learning that exploit a vulnerability entailed by the synchrony assumption common in RL (see "Synchrony" below), delaying the reward signal for a limited time period.
  • We evaluate the efficacy of the proposed attacks through a series of experiments.

Contribution

  • our first observation is that reward-delay attacks are extremely effective when the adversary's goal is simply to minimise reward.
  • we find that the mitigation we consider remains insufficient to ensure robustness against attacks that delay rewards while preserving their order.

Conclusion

  • Our results thus suggest that even a relatively short delay in the reward signal can lead DQN learning to be entirely ineffective.
  • Our empirical findings suggest that it is possible to induce a sub-optimal policy by strategically reshuffling the true reward sequence. Even randomly shuffling rewards within relatively short time intervals is already sufficient to cause learning failure (sketched below).
  • The reward delay attack also has a disastrous effect on DQN learning, implying that the DRL process can be easily disrupted when the reward channel is corrupted.
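
As a concrete illustration of the random-shuffle observation above, a tiny sketch (my own, not the paper's code) of permuting the true reward sequence inside short windows:

```python
import random


def shuffle_rewards_in_windows(rewards, window):
    """Randomly permute a reward sequence inside consecutive windows of
    length `window`; per the note above, even this unstrategic corruption
    can be enough to derail DQN when windows are short."""
    shuffled = list(rewards)
    for start in range(0, len(shuffled), window):
        block = shuffled[start:start + window]
        random.shuffle(block)
        shuffled[start:start + window] = block
    return shuffled
```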

Some key terms

Synchrony

  • Our attack exploits a common assumption of synchrony in reinforcement learning algorithms. Specifically, we assume that the adversary can delay rewards by a bounded number of time steps (for example, by scheduling the task computing the reward at time t after the task computing the reward at time t+k, for some integer k >= 0).
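
A minimal, hypothetical sketch of such a man-in-the-middle on the reward channel; the class name, interface, and the uniformly random delay choice are illustrative assumptions, not the paper's attack algorithm:

```python
import random
from collections import deque


class RewardDelayAdversary:
    """Buffer each true reward and release it up to `max_delay` steps late,
    so the learner pairs stale rewards with current (s, a, s') tuples.
    Releasing in FIFO order also preserves the original reward ordering."""

    def __init__(self, max_delay: int):
        self.max_delay = max_delay
        self.step = 0
        self.pending = deque()  # (release_step, reward), in arrival order

    def intercept(self, reward: float) -> float:
        # Choose how long to withhold this reward (uniformly at random here;
        # the paper's attacks choose delays strategically).
        release_at = self.step + random.randint(0, self.max_delay)
        self.pending.append((release_at, reward))

        # Release the oldest buffered reward whose delay has expired;
        # if none is due, the learner observes a reward of 0 this step.
        observed = 0.0
        if self.pending and self.pending[0][0] <= self.step:
            observed = self.pending.popleft()[1]

        self.step += 1
        return observed
```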

reward shifting attacks

  • the adversary can only drop rewards or shift them a bounded number of steps into the future; one way to formalise this is sketched below.
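
One way to write this constraint down (my own notation, not necessarily the paper's): for each step $t$ the attacker picks either a delivery step within the budget $\delta$ or drops the reward,

$$
\sigma(t) \in \{t, t+1, \dots, t+\delta\} \cup \{\bot\}, \qquad
\tilde{r}_{t'} =
\begin{cases}
r_t & \text{if } \sigma(t) = t', \\
0 & \text{if no reward is scheduled at } t',
\end{cases}
$$

where $\sigma(t) = \bot$ means the reward produced at step $t$ is dropped, and (as an assumption of this sketch) at most one reward is delivered per step.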

untargeted attack

  • simply aim to minimise the reward obtained by the learned policy
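
In symbols (again my own notation, not the paper's), the untargeted attacker searches over drop/delay strategies with budget $\delta$ for the one that makes the eventually learned policy as bad as possible:

$$
\min_{\sigma \in \Sigma_\delta} J(\pi_\sigma), \qquad
J(\pi) = \mathbb{E}_{\pi}\Big[\sum_{t} \gamma^{t} r_t\Big],
$$

where $\pi_\sigma$ is the policy DQN learns when trained on rewards perturbed by $\sigma$, and $\Sigma_\delta$ is the set of strategies that respect the maximum delay $\delta$.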

Untargeted reward delay attacks

  • we investigate the efficacy of the untargeted reward delay attacks as we vary $\delta$, the maximum delay we can add to a reward (i.e., how far in the past the reward paired with a given DQN update can originate); a toy sketch of such a sweep follows this list.
  • what is surprising, however, is that the resulting gain in attack efficacy is extremely slight, even though we doubled the amount of time the reward can be delayed.
  • Our results thus suggest that even a relatively short delay in the reward signal can lead DQN learning to be entirely ineffective.
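
A hedged sketch of how such a $\delta$-sweep might be reproduced with the adversary class above; the Gym-style wrapper, the environment name, and the delay values are placeholders, not the paper's experimental setup:

```python
import gymnasium as gym  # assumed Gym-style API; any comparable interface works


class DelayedRewardEnv(gym.Wrapper):
    """Expose only the corrupted reward stream produced by the
    RewardDelayAdversary sketched earlier; observations and dynamics
    are left untouched."""

    def __init__(self, env, max_delay: int):
        super().__init__(env)
        self.max_delay = max_delay
        self.adversary = RewardDelayAdversary(max_delay=max_delay)

    def reset(self, **kwargs):
        # Fresh adversary each episode so buffered rewards do not leak across episodes.
        self.adversary = RewardDelayAdversary(max_delay=self.max_delay)
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        # The agent trains on the delayed reward, never the true one.
        return obs, self.adversary.intercept(reward), terminated, truncated, info


# Illustrative sweep: some maximum delay and double that delay.
# for delta in (2, 4):
#     env = DelayedRewardEnv(gym.make("CartPole-v1"), max_delay=delta)
#     ... train DQN on env, then evaluate the learned greedy policy's return ...
```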