[TOC]

  1. Title: Prioritised Experience Replay
  2. Author: Neuralnet.ai
  3. Publish Date: 25 Feb, 2016
  4. Review Date: Thu, Jun 2, 2022

https://www.neuralnet.ai/a-brief-overview-of-rank-based-prioritized-experience-replay/

Replay memory is essential in RL

Replay memory has been deployed to great success in both value-based and policy-gradient reinforcement learning algorithms. The reasons for this success cut right to the heart of reinforcement learning: replay memory simultaneously addresses two outstanding problems in the field.

  1. By shuffling the dataset and sampling historic experience at random, we obtain independent, uncorrelated inputs, which is important for deep neural network training. This is precisely what underpins the Markov property of the system. (In probability theory and statistics, the Markov property refers to the memoryless property of a stochastic process.) A minimal sketch of such a buffer follows this list.
  2. We revisit and pay attention to historic experience, in the hope that the agent learns something generalisable.
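
As a point of reference, here is a minimal sketch of a plain replay buffer with uniform random sampling, the default described above. The class and method names are my own illustrative choices, not code from the post.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size memory of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity=100_000):
        self.memory = deque(maxlen=capacity)  # oldest transitions are dropped when full

    def store(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling: breaks up the temporal correlation between
        # consecutive transitions before they reach the network.
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)
```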

Improvement Direction

We can improve how we sample the agent’s memories. The default is to sample them uniformly at random, which works, but leaves much to be desired.
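
In the rank-based scheme the post reviews (following Schaul et al.’s prioritised experience replay), transition $i$ is instead sampled with probability determined by its rank when the memory is sorted by priority, typically the magnitude of the TD error defined below:

$$ p_i = \frac{1}{\text{rank}(i)}, \qquad P(i) = \frac{p_i^{\alpha}}{\sum_k p_k^{\alpha}} $$

where $\alpha$ controls how strongly prioritisation is applied; $\alpha = 0$ recovers uniform sampling.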

Input aliasing to be improved

One issue that can be improved upon is that neural networks introduce a form of aliasing into the problem: similar states produce similar outputs, so repeatedly sampling near-duplicate experiences adds little new information.

So the question becomes: would the agent learn more from sampling totally distinct experiences?

A natural measure of how much an experience has left to teach the agent is the temporal-difference (TD) error:

$$ \delta_t = r_t + \gamma\, Q_{\text{target}}\!\left(s_{t+1}, \operatorname*{argmax}_a Q(s_{t+1}, a)\right) - Q(s_t, a_t) $$
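
To make this concrete, here is a minimal sketch (not the post’s implementation) of computing the TD error for a batch using a target network, and turning the resulting priorities into rank-based sampling probabilities. Function names, array shapes, and the example numbers are illustrative assumptions.

```python
import numpy as np

def td_errors(rewards, q_next_online, q_next_target, q_current, actions, gamma=0.99):
    """Double-DQN-style TD error delta_t for a batch of transitions.

    rewards:        (B,)   rewards r_t
    q_next_online:  (B, A) Q(s_{t+1}, .) from the online network (action selection)
    q_next_target:  (B, A) Q_target(s_{t+1}, .) from the target network (evaluation)
    q_current:      (B, A) Q(s_t, .) from the online network
    actions:        (B,)   actions a_t actually taken
    (Terminal-state handling is omitted for brevity.)
    """
    batch = np.arange(len(rewards))
    best_next = np.argmax(q_next_online, axis=1)                 # argmax_a Q(s_{t+1}, a)
    target = rewards + gamma * q_next_target[batch, best_next]   # r_t + gamma * Q_target(...)
    return target - q_current[batch, actions]                    # delta_t

def rank_based_probabilities(priorities, alpha=0.7):
    """Convert priorities (e.g. |delta_t|) into rank-based sampling probabilities."""
    order = np.argsort(-np.asarray(priorities))        # indices sorted by descending priority
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, len(priorities) + 1)   # rank 1 = highest priority
    p = (1.0 / ranks) ** alpha
    return p / p.sum()

# Illustrative usage with made-up numbers:
deltas = td_errors(
    rewards=np.array([1.0, 0.0]),
    q_next_online=np.array([[0.2, 0.5], [0.1, 0.3]]),
    q_next_target=np.array([[0.25, 0.45], [0.05, 0.35]]),
    q_current=np.array([[0.4, 0.1], [0.2, 0.2]]),
    actions=np.array([0, 1]),
)
probs = rank_based_probabilities(np.abs(deltas))
```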