[TOC]

  1. Title: Towards Closing the Sim to Real Gap in Collaborative Multi Robot Deep Reinforcement Learning
  2. Author: Wenshuai Zhao et. al.
  3. Publish Year: 2020
  4. Review Date: Sun, Dec 25, 2022

Summary of paper

Motivation

  • we introduce the effect of sensing, calibration, and accuracy mismatches in distributed reinforcement learning
  • we discuss on how both the different types of perturbations and how the number of agents experiencing those perturbations affect the collaborative learning effort

Contribution

  • This is, to the best of our knowledge, the first work exploring the limitation of PPO in multi-robot systems when considering that different robots might be exposed to different environment where their sensors or actuators have induced errors

  • with the conclusion of this work, we set the initial point for future work on designing and developing methods to achieve robust reinforcement learning on the presence of real-world perturbation that might differ within a multi-robot system.

Some key terms

Deep RL

  • DRL algorithms work on a trial and error basis, where an agent interacts with its environment and receives a reward based on the performance.
  • there are DRL approaches that rely on multiple agents to parallelise the learning process or explore a wider variety of experiences.

PPO

image-20221225172333949

conclusion

  • for a fixed small magnitude in the perturbation, the agents still converge on a policy that works for both subsets (the original and the perturbed)
  • Among disturbances in the model’s input (sensing, reward) and output (actuation, action) -> the disturbances in the ability of the robots to actuate properly have had a comparatively worse effect than those in their ability to sense to the position of object accurately

limitation

  • though the empirical analysis showed that RL algorithm can still converge under the reward disturbance, the experiment is not conducted in the sparse reward environment.

Major comments

citation

  • PPO has been identifies ad one of the most robust approaches against reward perturbation.
    • ref: Zhao, Wenshuai, et al. “Towards closing the sim-to-real gap in collaborative multi-robot deep reinforcement learning.” 2020 5th International Conference on Robotics and Automation Engineering (ICRAE). IEEE, 2020.
    • ref: Wang, Jingkang, Yang Liu, and Bo Li. “Reinforcement learning with perturbed rewards.” Proceedings of the AAAI conference on artificial intelligence. Vol. 34. No. 04. 2020.