Giuseppe_de_giacomo Foundations for Restraining Bolts Rl With Ltl 2019

Title: Foundations for Restraining Bolts: Reinforcement Learning with LTLf/LDLf Restraining Specifications. Author: Giuseppe De Giacomo et al. Publish Year: 2019. Review Date: Mar 2022. Summary of paper: The authors investigated the concept of a “restraining bolt” that can control the behaviour of learning agents. Essentially, the bolt controls an RL agent by providing it with additional rewards. Although this method is essentially the same as reward shaping (providing additional rewards to the agent), the contribution of this paper is...

<span title='2022-03-04 12:12:57 +1100 AEDT'>March 4, 2022</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;Sukai Huang

Pashootan_vaezipoor Ltl2action Generalising Ltl Instructions for Multi Task Rl 2021

Title: LTL2Action: Generalizing LTL Instructions for Multi-Task RL. Author: Pashootan Vaezipoor et al. Publish Year: 2021. Review Date: March 2022. Summary of paper: Motivation: The authors addressed the problem of teaching a deep reinforcement learning agent to follow instructions in multi-task environments. Instructions are expressed in a well-known formal language, linear temporal logic (LTL). Limitation of the vanilla MDP: temporal constraints cannot be expressed as rewards in the MDP setting, and thus modular policies and similar approaches are not able to obtain maximum rewards....

<span title='2022-03-01 20:53:10 +1100 AEDT'>March 1, 2022</span>&nbsp;·&nbsp;3 min&nbsp;·&nbsp;Sukai Huang

Roma_patel Learning to Ground Language Temporal Logical Form 2019

Title: Learning to Ground Language to Temporal Logical Form. Author: Roma Patel et al. Publish Year: 2019. Review Date: Feb 2022. Summary of paper: Motivation: Natural language commands often exhibit sequential (temporal) constraints, e.g., “go through the kitchen and then into the living room”. But these constraints cannot be expressed in the reward of the Markov Decision Process setting (see this paper). Therefore, the authors proposed to ground language to Linear Temporal Logic (LTL) and then map from LTL expressions to action sequences....

<span title='2022-02-28 21:40:53 +1100 AEDT'>February 28, 2022</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;Sukai Huang