Proximal Policy Optimisation Explained Blog

Title: Proximal Policy Optimisation Explained Blog. Author: Xiao-Yang Liu; DI engine. Publish Year: May 4, 2021. Review Date: Mon, Dec 26, 2022.

Highly recommended reading: https://lilianweng.github.io/posts/2018-04-08-policy-gradient/ and https://zhuanlan.zhihu.com/p/487754664

Difference between on-policy and off-policy: On-policy algorithms update the policy network using transitions generated by the current policy network. The critic network then makes a more accurate value prediction for the current policy network in common environments. Off-policy algorithms allow the current policy network to be updated using transitions from old policies...

December 26, 2022 · 1 min · 196 words · Sukai Huang
architecture

Alex_petrenko Sample Factory Asynchronous RL at Very High FPS 2020

Title: Sample Factory: Asynchronous RL at Very High FPS. Author: Alex Petrenko. Publish Year: Oct, 2020. Review Date: Sun, Sep 25, 2022.

Summary of paper

Motivation: identifying performance bottlenecks. RL involves three workloads: environment simulation, inference, and backpropagation. Overall performance depends on the slowest workload. In existing methods (A2C/PPO/IMPALA) the computational workloads are dependent on one another, which leads to under-utilisation of system resources. Existing high-throughput methods focus on distributed training and therefore introduce a lot of overhead, such as networking, serialisation, etc...

September 25, 2022 · 1 min · 154 words · Sukai Huang
multimodal framework

Sergios_karagiannakos Vision Language Models Towards Multimodal DL 2022

Title: Vision Language Models Towards Multimodal Deep Learning. Author: Sergios Karagiannakos. Publish Year: 03 Mar 2022. Review Date: Tue, Aug 9, 2022.

https://theaisummer.com/vision-language-models/

August 9, 2022 · 1 min · 24 words · Sukai Huang