[TOC]

  1. Title: Augmenting Transformers with KNN-based composite memory for dialog
  2. Author: Angela Fan et al.
  3. Publish Year: 2021
  4. Review Date: Apr 2022

Summary of paper

Motivation

The authors propose augmenting a generative Transformer network with a KNN-based Information Fetching (KIF) module.

Each KIF module learns a read operation to access fixed external knowledge (e.g., Wikipedia).

The authors demonstrate the effectiveness of this approach by identifying the relevant knowledge required for knowledgeable but engaging dialog, drawing from Wikipedia, images, and human-written dialog utterances.

Drawback of previous work

Many existing approaches attend over all memory slots, which is computationally intensive and becomes less effective as the memory grows.

Advantages of KNN read operation

In the proposed method, KNN search is computationally efficient and scalable.

We can thus scale easily to larger memories by learning only the KNN-based read operation that identifies relevant information in the memory.
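The idea behind the KNN read can be sketched as follows: retrieve the top-k memory slots nearest to the query encoding, then attend only over those k slots instead of the full memory. This is a minimal NumPy sketch, not the paper's implementation; the array sizes and the inner-product similarity are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
memory = rng.normal(size=(10_000, 64))  # external memory encodings (hypothetical size)
query = rng.normal(size=(64,))          # encoding of the current dialog context

k = 5
# KNN scoring by inner product (cosine-like if vectors are normalised)
scores = memory @ query
topk = np.argpartition(-scores, k)[:k]  # indices of the k nearest slots

# Attention restricted to the retrieved slots: softmax over the k scores only
weights = np.exp(scores[topk] - scores[topk].max())
weights /= weights.sum()
read_vector = weights @ memory[topk]    # weighted sum of retrieved slots

print(topk.shape, read_vector.shape)    # (5,) (64,)
```

The full-attention alternative would softmax over all 10,000 scores; here the softmax and weighted sum touch only k slots, which is what makes the read cheap as the memory grows.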


The procedure

Some key terms

dialog modelling

This is a challenging task: information must be flexibly retrieved and incorporated to maintain the topic and flow of the conversation.

training after stabilisation

After a certain number of training steps, the model also begins passing backpropagation gradients to the encoding module.
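One common way to implement this kind of delayed gradient flow is to detach the encoder output from the autograd graph until a warm-up step count is reached. This is a hedged PyTorch sketch; the function and `warmup_steps` name are assumptions, not the paper's code.

```python
import torch

warmup_steps = 1000  # assumed threshold; the paper's schedule may differ

def read_with_schedule(context_encoding: torch.Tensor, step: int) -> torch.Tensor:
    """Block gradients into the encoder before `warmup_steps`, allow them after."""
    if step < warmup_steps:
        return context_encoding.detach()  # encoder receives no gradient via this path
    return context_encoding               # later: gradients flow back to the encoder
```

During early training only the read operation is learned against a frozen encoding; once the retrieval has stabilised, the encoder is fine-tuned end to end.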

Minor comments

The faiss library allows KNN search to be run easily on GPUs.

Potential future work

It would be worth trying this approach in the NetHack environment.