[TOC]

  1. Title: Improving language models by retrieving from trillions of tokens
  2. Author: Sebastian Borgeaud et al.
  3. Publish Year: Feb 2022
  4. Review Date: Mar 2022

Summary of paper

Motivation

in order to decrease the size of language models, this work suggests retrieval from a large text database as a complementary path to scaling language models.

they equip models with the ability to directly access a large dataset to perform prediction – a semi-parametric approach.

how do they do it

first, construct a key-value database, where values store raw chunks of text tokens and keys are frozen BERT embeddings.
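
a rough sketch of how such a database could be built (my own illustration, not the paper's code): `embed_chunk` stands in for the frozen BERT embedder, and scikit-learn's `NearestNeighbors` stands in for the large-scale approximate nearest-neighbour index the paper uses over trillions of tokens.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

EMBED_DIM = 768  # hidden size of BERT-base

def embed_chunk(tokens):
    """Stand-in for the frozen BERT embedder used to compute keys.
    Hypothetical: returns a deterministic pseudo-random vector per chunk
    so the sketch runs without downloading a model."""
    rng = np.random.default_rng(abs(hash(" ".join(tokens))) % (2**32))
    return rng.standard_normal(EMBED_DIM).astype(np.float32)

def build_database(corpus, chunk_len=64):
    """Cut every document into fixed-size chunks; keys are frozen embeddings,
    values are the raw chunk together with its continuation (the next chunk)."""
    keys, values = [], []
    for doc in corpus:
        chunks = [doc[i:i + chunk_len] for i in range(0, len(doc), chunk_len)]
        for i, chunk in enumerate(chunks):
            continuation = chunks[i + 1] if i + 1 < len(chunks) else []
            keys.append(embed_chunk(chunk))
            values.append((chunk, continuation))
    index = NearestNeighbors(metric="euclidean").fit(np.stack(keys))
    return index, values
```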

second, each training input sequence is split into chunks, which are augmented with their k nearest neighbours retrieved from the database.
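
continuing the sketch above (same hypothetical `embed_chunk` and index), the per-chunk retrieval step could look like:

```python
def retrieve_neighbours(sequence, index, values, chunk_len=64, k=2):
    """Split one training sequence into chunks and fetch, for each chunk,
    the k nearest database entries (neighbour chunk + its continuation)."""
    chunks = [sequence[i:i + chunk_len] for i in range(0, len(sequence), chunk_len)]
    augmented = []
    for chunk in chunks:
        query = embed_chunk(chunk).reshape(1, -1)
        _, idx = index.kneighbors(query, n_neighbors=k)
        neighbours = [values[j] for j in idx[0]]
        augmented.append((chunk, neighbours))
    return augmented

# toy usage:
# index, values = build_database([list("abcdefghijklmnopqrstuvwxyz")], chunk_len=4)
# pairs = retrieve_neighbours(list("hello world"), index, values, chunk_len=4, k=2)
```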

finally, an encoder-decoder architecture integrates the retrieved chunks into the model's predictions.
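
a minimal PyTorch sketch of the integration idea, not the paper's exact chunked cross-attention (causal masks, layer norms and the chunk-wise alignment are omitted): retrieved neighbour tokens go through a small encoder, and the decoder's chunk states cross-attend into the encoded neighbours.

```python
import torch
import torch.nn as nn

class RetrievalBlock(nn.Module):
    """Toy decoder block: self-attention over the input chunk, then
    cross-attention into encoded retrieved neighbours (illustrative only)."""
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.neighbour_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
            num_layers=2,
        )
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))

    def forward(self, chunk_hidden, neighbour_embeds):
        # chunk_hidden: (batch, chunk_len, d_model)
        # neighbour_embeds: (batch, k * neighbour_len, d_model), neighbours flattened
        h, _ = self.self_attn(chunk_hidden, chunk_hidden, chunk_hidden)
        enc = self.neighbour_encoder(neighbour_embeds)
        h2, _ = self.cross_attn(h, enc, enc)  # decoder attends to retrieved text
        return h + h2 + self.ffn(h + h2)

# x = torch.randn(1, 64, 256)       # one chunk of hidden states
# n = torch.randn(1, 2 * 128, 256)  # 2 retrieved neighbours, flattened
# out = RetrievalBlock()(x, n)      # (1, 64, 256)
```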


Algorithm

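the screenshot of the algorithm did not survive; as a rough paraphrase (my own, not the paper's pseudocode), the pieces above combine roughly like this, with each chunk attending to the neighbours retrieved for the previous chunk so the model stays autoregressive (`embed_tokens` is a hypothetical token embedder):

```python
def retro_forward(sequence, index, values, block, embed_tokens, chunk_len=64, k=2):
    """Rough end-to-end pass: retrieve neighbours per chunk, then let each
    chunk's states attend to the encoded neighbours retrieved for the
    previous chunk, so retrieval cannot leak the tokens being predicted."""
    augmented = retrieve_neighbours(sequence, index, values, chunk_len, k)
    hidden = []
    for i, (chunk, _) in enumerate(augmented):
        h = embed_tokens(chunk)        # (1, chunk_len, d_model), hypothetical embedder
        if i == 0:
            hidden.append(h)           # first chunk sees no retrieved context
            continue
        _, prev_neigh = augmented[i - 1]
        neigh_tokens = [t for (n, cont) in prev_neigh for t in n + cont]
        hidden.append(block(h, embed_tokens(neigh_tokens)))  # block = RetrievalBlock above
    return hidden
```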

Some key terms

retrieval database

a key-value database: keys are frozen BERT embeddings of text chunks, values are the raw chunks together with their continuations

Potential future work

it looks like we do already have access to such a very large text database

also, the database's inputs and outputs are both text sequences, which may not be directly useful for language-assisted RL