[TOC]

  1. Title: Disturbing Reinforcement Learning Agents With Corrupted Rewards
  2. Author: Ruben Majadas et al.
  3. Publish Year: Feb 2021
  4. Review Date: Sat, Dec 17, 2022

Summary of paper

Motivation

  • recent works have shown how the performance of RL algorithms degrades under the influence of even soft changes to the reward function.
  • However, little work has studied how sensitive learning is to these disturbances, depending on the aggressiveness of the attack and the exploration strategy used by the learner.
  • the paper focuses on a subclass of MDPs: episodic, stochastic, goal-only reward MDPs

Contribution

  • it demonstrates that smoothly crafted adversarial rewards can mislead the learner
  • policies learned with low exploration probability values are more robust to corrupted rewards (see the epsilon-greedy sketch after this list)
    • (though this conclusion seems valid only for the proposed experiment setting)
  • the agent is completely lost with attack probabilities higher than p = 0.4
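
As a rough illustration of what "exploration probability" refers to here, below is a minimal epsilon-greedy action-selection sketch; these notes do not name the exact learner, so the tabular Q-values and the epsilon value are assumptions.

```python
import random

def epsilon_greedy_action(q_values, epsilon=0.1, rng=random):
    """Pick a random action with probability `epsilon`, else the greedy one.

    A lower `epsilon` means the agent follows its current value estimates
    more often; the paper's finding is that such low-exploration learners
    end up with policies that are more robust to corrupted rewards
    (at least in the experimental setting used there).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                     # explore
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit
```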

Some key terms

deterministic goal-only reward MDP

  • a reward is only given if a goal state is reached.
  • many decision tasks are modelled with goal-only rewards, from classical control problems such as Mountain Car or Cart Pole
  • the goal-only reward setting is a special case of the sparse-reward setting (see the sketch below)
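
A minimal sketch of a goal-only reward for an episodic task; the Mountain Car-style state and the threshold/reward values are illustrative assumptions, not taken from the paper.

```python
def goal_only_reward(state, goal_position=0.5):
    """Return a non-zero reward only when the goal state is reached.

    `state` is assumed to be a (position, velocity) pair as in Mountain Car
    (illustrative choice); every non-goal transition yields zero reward,
    which is what makes the signal sparse.
    """
    position, _velocity = state
    reached_goal = position >= goal_position
    return (1.0 if reached_goal else 0.0), reached_goal
```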

goal of adversary

  • produce the maximum deterioration in the learner's policy while performing the minimum number of attacks (see the sketch below).
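
The attack studied is a probabilistic sign flip of the true reward (see the limitations below). A minimal sketch under that reading; the function name and the use of Python's `random` module are my assumptions.

```python
import random

def corrupt_reward(true_reward, attack_probability=0.4, rng=random):
    """With probability `attack_probability`, flip the sign of the true reward.

    With probability 1 - attack_probability the reward passes through
    unchanged, so a stealthy adversary attacks rarely while still
    degrading the learner's policy.
    """
    if rng.random() < attack_probability:
        return -true_reward
    return true_reward
```

In a goal-only reward task this means that, occasionally, reaching the goal is punished rather than rewarded, which is the behaviour criticised in the limitations below.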

Major comments

citation

  1. attacks on the reward function have received very little attention
    1. ref: Majadas, Rubén, Javier García, and Fernando Fernández. “Disturbing reinforcement learning agents with corrupted rewards.” arXiv preprint arXiv:2102.06587 (2021).
    2. ref: Wang, Jingkang, Yang Liu, and Bo Li. “Reinforcement learning with perturbed rewards.” arXiv preprint arXiv:1810.01032 (2018).
  2. there are no studies of how sensitive the learning process is to the aggressiveness of reward perturbations and the exploration strategy.
    1. ref: Majadas, Rubén, Javier García, and Fernando Fernández. “Disturbing reinforcement learning agents with corrupted rewards.” arXiv preprint arXiv:2102.06587 (2021).

limitations of the setting

  • the work only considers small, low-dimensional state spaces
  • it does not consider sparse-reward settings with high-dimensional, large state spaces
  • the perturbation is purely probability-based and does not address false-positive rewards
  • the perturbation is limited to flipping the sign of the true rewards
  • A fixed attack probability was used in the experiments to test how reward attacks affect agents with varying exploration rates.
  • in effect, the experiment punishes the good movements that lead the agent to the goal.