Mariana_learning Generative Models With Goal Conditioned Reinforcement Learning 2023

Title: Learning Generative Models With Goal Conditioned Reinforcement Learning
Author: Mariana Vargas Vieyra et al.
Publish Year: 26 Mar 2023
Review Date: Thu, Mar 30, 2023
url: https://arxiv.org/abs/2303.14811

Summary of paper

Contribution: We present a novel framework for learning generative models with goal-conditioned reinforcement learning. We define two agents, a goal-conditioned agent (GC-agent) and a supervised agent (S-agent). Given a user-input initial state, the GC-agent learns to reconstruct the training set; in this context, the elements of the training set are the goals. During training, the S-agent learns to imitate the GC-agent while remaining agnostic of the goals. At inference, we generate new samples with the S-agent.

Some key terms: Goal-Conditioned Reinforcement Learning (GCRL) framework ...

March 30, 2023 · 2 min · 325 words · Sukai Huang

Itsugun_cho Deep RL With Hierarchical Action Exploration for Dialogue Generation 2023

Title: Deep RL With Hierarchical Action Exploration for Dialogue Generation
Author: Itsugun Cho et al.
Publish Year: 22 Mar 2023
Review Date: Thu, Mar 30, 2023
url: https://arxiv.org/pdf/2303.13465v1.pdf

Summary of paper

Motivation: Approximate dynamic programming applied to dialogue generation involves policy improvement with action sampling. However, this practice is inefficient for reinforcement learning because the eligible (high action value) responses are very sparse, and the greedy policy sustained by random sampling is flabby.

Contribution: This paper shows, both theoretically and experimentally, that the performance of a dialogue policy is positively correlated with the sampling size. We introduce a novel dual-granularity Q-function to alleviate this limitation by exploring the most promising response category to intervene in the sampling.

Some key terms: limitation of the maximum likelihood estimation (MLE) objective for the probability distribution of responses ...

March 30, 2023 · 2 min · 358 words · Sukai Huang