
  1. Title: Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation
  2. Author: Xin Wang et al.
  3. Publish Year: 2019
  4. Review Date: Wed, Jan 18, 2023

Summary of paper


Motivation

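Vision-Language Navigation (VLN) presents three challenges that the paper targets: cross-modal grounding, ill-posed feedback, and generalization to unseen environments.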

Implementation

  1. The agent infers which sub-instruction to focus on and where to look at each step, which effectively splits a long instruction automatically.
  2. A matching critic evaluates an executed path by the probability of reconstructing the original instruction from it: P(original instruction | executed trajectory).
    1. Cycle reconstruction: we take P(target trajectory | instruction) = 1 for the ground-truth pair, and we want to measure P(original instruction | executed trajectory).
    2. This improves interpretability, since you can now see how the agent was interpreting the instruction.
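
The matching-critic score can be illustrated with a minimal sketch. It assumes a hypothetical speaker model has already produced, for each token of the original instruction, the probability of that token given the executed path. The function names and the toy probabilities below are illustrative assumptions, not the paper's implementation.

```python
import math

def instruction_log_prob(token_probs):
    """Sum of per-token log-probabilities of the original instruction,
    as assigned by a (hypothetical) speaker model conditioned on the path."""
    return sum(math.log(p) for p in token_probs)

def matching_score(token_probs):
    """P(original instruction | executed trajectory), factorized as a
    product of per-token probabilities (computed in log space)."""
    return math.exp(instruction_log_prob(token_probs))

# Toy per-token probabilities (made up for illustration only).
on_path = [0.9, 0.8, 0.9, 0.7]    # path that follows the instruction
off_path = [0.4, 0.2, 0.3, 0.1]   # path that drifts away from it

score_on = matching_score(on_path)
score_off = matching_score(off_path)
print(score_on > score_off)
```

A path that follows the instruction makes the instruction easier to reconstruct, so it receives the higher score; such a score could then serve as an extra reward signal alongside the task reward.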