Ilya Kostrikov Offline RL With Implicit Q Learning 2021

Title: Offline Reinforcement Learning with Implicit Q-learning
Author: Ilya Kostrikov et al.
Publish Year: 2021
Review Date: Mar 2022

Summary of paper

Motivation: conflict in offline reinforcement learning

Offline reinforcement learning requires reconciling two conflicting aims: learning a policy that improves over the behaviour policy (the old policy that collected the dataset) while at the same time minimizing the deviation from the behaviour policy so as to avoid errors due to distributional shift (e....

<span title='2022-03-22 19:01:49 +1100 AEDT'>March 22, 2022</span>&nbsp;·&nbsp;4 min&nbsp;·&nbsp;Sukai Huang