[TOC]

  1. Title: Sample Factory: Asynchronous RL at Very High FPS
  2. Author: Alex Petrenko
  3. Publish Year: Oct, 2020
  4. Review Date: Sun, Sep 25, 2022

Summary of paper

Motivation

Identifying performance bottlenecks

  1. RL involves three workloads:

    1. environment simulation
    2. inference
    3. backpropagation
    • Overall performance is limited by the slowest workload.
    • In existing methods (A2C/PPO/IMPALA), the workloads depend on one another, so each one waits on the others and system resources are under-utilised.
  2. Existing high-throughput methods focus on distributed training, which introduces substantial overhead such as network communication and serialisation.

    • e.g., Ray & RLlib <==> Redis/Plasma, SEED RL <==> gRPC, Catalyst <==> MongoDB
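The bottleneck argument in point 1 can be illustrated with a toy timing model. This is only a sketch: the function names and the millisecond figures are illustrative assumptions, not measurements from the paper. It compares a coupled loop, where simulation, inference and backpropagation run one after another, with an idealised decoupled pipeline whose throughput is bounded by the slowest workload.

```python
def synchronous_fps(frames, t_sim_ms, t_inf_ms, t_backprop_ms):
    """Coupled workloads: each iteration pays for all three stages in sequence."""
    total_ms = t_sim_ms + t_inf_ms + t_backprop_ms
    return frames * 1000 / total_ms


def asynchronous_fps(frames, t_sim_ms, t_inf_ms, t_backprop_ms):
    """Idealised decoupled pipeline: stages overlap, so the slowest stage
    sets the pace. Communication overhead is ignored (an assumption)."""
    slowest_ms = max(t_sim_ms, t_inf_ms, t_backprop_ms)
    return frames * 1000 / slowest_ms


if __name__ == "__main__":
    # Hypothetical numbers: 1000 frames per iteration,
    # 50 ms simulation, 20 ms inference, 30 ms backpropagation.
    print(synchronous_fps(1000, 50, 20, 30))
    print(asynchronous_fps(1000, 50, 20, 30))
```

Under these assumed timings the decoupled pipeline is twice as fast, and further gains would have to come from speeding up simulation, the slowest stage.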

Contribution


Some key terms

Double-buffered sampling
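A rough timing sketch of the idea, assuming the scheme as it is commonly described: a rollout worker splits its environments into two groups and steps one group on the CPU while the other group waits for actions from the policy worker. The function names, the single-core worker, and the assumption that inference requests never queue are all hypothetical simplifications.

```python
def single_buffer_cycle_ms(t_sim_half_ms, t_inf_ms):
    """Step both halves of the environments, then idle while waiting for actions."""
    return 2 * t_sim_half_ms + t_inf_ms


def double_buffer_cycle_ms(t_sim_half_ms, t_inf_ms):
    """Step one half while the other half's observations are being inferred.
    The wait is hidden whenever inference is no slower than stepping one half."""
    return t_sim_half_ms + max(t_sim_half_ms, t_inf_ms)


def cpu_utilisation(t_sim_half_ms, cycle_ms):
    """Fraction of a cycle the worker spends simulating rather than waiting."""
    return 2 * t_sim_half_ms / cycle_ms


if __name__ == "__main__":
    # Hypothetical numbers: 10 ms to step half the environments, 6 ms inference.
    single = single_buffer_cycle_ms(10, 6)
    double = double_buffer_cycle_ms(10, 6)
    print(single, cpu_utilisation(10, single))
    print(double, cpu_utilisation(10, double))
```

In this model the worker stays fully busy as long as inference latency does not exceed the time needed to step one group. Once inference is slower than that, part of the wait becomes visible again.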

Resolving bottleneck #2 (communication)


Good things about the paper (one paragraph)

Major comments

Minor comments

Incomprehension

Potential future work