Deepmind Flamingo a Visual Language Model for Few Shot Learning 2022
[TOC] Title: Flamingo: a Visual Language Model for Few-Shot Learning Author: Jean-Baptiste Alayrac et. al. Publish Year: Apr 2022 Review Date: May 2022 Summary of paper Flamingo architecture Pretrained vision encoder: from pixels to features the model’s vision encoder is a pretrained Normalizer-Free ResNet (NFNet) they pretrain the vision encoder using a contrastive objective on their datasets of image and text pairs, using the two term contrastive loss from paper “Learning Transferable Visual Models From Natural Language Supervision” ...