[TOC]

  1. Title: Improved Baselines With Visual Instruction Tuning
  2. Author: Haotian Liu et al.
  3. Publish Date: Oct 5, 2023
  4. Review Date: Sun, Oct 8, 2023
  5. url: https://arxiv.org/pdf/2310.03744.pdf

Summary of paper

Motivation

Contribution

Some key terms

Improvement one: MLP cross-modal connector

Improvement two: Incorporating academic-task-related data such as VQA

Background

instruction-following LMM

existing limitation

MLP Vision-Language Connector
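
The paper replaces the single linear projection used in the original LLaVA with a two-layer MLP that maps vision-encoder patch features into the language model's embedding space. The pure-Python toy below is only a sketch of that Linear, GELU, Linear shape. The dimensions, the `make_linear` initializer, and all function names are assumptions chosen for illustration, not the authors' code.

```python
import math
import random


def gelu(x):
    # Exact GELU, written with the Gaussian error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def make_linear(in_dim, out_dim, rng):
    # Hypothetical initializer: small uniform weights and a zero bias.
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
              for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def linear(x, layer):
    # Compute y = W x + b for a single feature vector x.
    weight, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def mlp_connector(patch_features, fc1, fc2):
    # Apply Linear -> GELU -> Linear to each patch feature independently,
    # producing one visual token per image patch.
    tokens = []
    for feat in patch_features:
        hidden = [gelu(h) for h in linear(feat, fc1)]
        tokens.append(linear(hidden, fc2))
    return tokens


rng = random.Random(0)
# Toy sizes for illustration only (assumed, far smaller than real models).
VISION_DIM, LLM_DIM, NUM_PATCHES = 8, 16, 4

fc1 = make_linear(VISION_DIM, LLM_DIM, rng)
fc2 = make_linear(LLM_DIM, LLM_DIM, rng)
patches = [[rng.uniform(-1.0, 1.0) for _ in range(VISION_DIM)]
           for _ in range(NUM_PATCHES)]

visual_tokens = mlp_connector(patches, fc1, fc2)
print(len(visual_tokens), len(visual_tokens[0]))  # prints "4 16"
```

Each patch feature becomes one token of the language model's embedding width, so the resulting sequence can be concatenated with the text token embeddings before it is fed to the LLM.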