[TOC]

  1. Title: On the Integration of Self-Attention and Convolution
  2. Author: Xuran Pan et al.
  3. Publish Year: 2022 (IEEE)
  4. Review Date: Thu, Apr 25, 2024
  5. url: https://arxiv.org/abs/2111.14556

Summary of paper

Motivation

  • There exists a strong underlying relation between convolution and self-attention: both operations can be decomposed into 1×1 projections followed by a feature-aggregation stage, as sketched below.
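
A minimal PyTorch sketch of that shared structure (illustrative only, not the paper's code): both branches start from the same 1×1 projections (Stage I), then diverge into attention aggregation versus shift-and-sum aggregation (Stage II). The three-shift `conv_aggregate` is a simplification of the full k² shifts a real k×k kernel would need.

```python
import torch
import torch.nn as nn

class SharedProjection(nn.Module):
    """Stage I: 1x1 convolutions shared by both branches."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.ModuleList([nn.Conv2d(dim, dim, 1) for _ in range(3)])

    def forward(self, x):
        return [p(x) for p in self.proj]  # q, k, v (or projected feature maps)

def attention_aggregate(q, k, v):
    """Stage II, attention branch: weight v by softmax(q k^T)."""
    b, c, h, w = q.shape
    q = q.flatten(2).transpose(1, 2)               # (b, hw, c)
    k = k.flatten(2)                               # (b, c, hw)
    v = v.flatten(2).transpose(1, 2)               # (b, hw, c)
    attn = torch.softmax(q @ k / c ** 0.5, dim=-1)
    return (attn @ v).transpose(1, 2).reshape(b, c, h, w)

def conv_aggregate(feats, shifts=((-1, -1), (0, 0), (1, 1))):
    """Stage II, convolution branch: shift each projected map and sum.
    torch.roll wraps at borders; a real conv would zero-pad (simplification)."""
    out = torch.zeros_like(feats[0])
    for f, (dy, dx) in zip(feats, shifts):
        out = out + torch.roll(f, shifts=(dy, dx), dims=(2, 3))
    return out

x = torch.randn(1, 8, 16, 16)
q, k, v = SharedProjection(8)(x)
y = attention_aggregate(q, k, v) + conv_aggregate([q, k, v])
print(y.shape)  # torch.Size([1, 8, 16, 16])
```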

Convolutional NN

  • Convolutional networks use convolution kernels to extract local features and have become the most powerful and conventional technique for various vision tasks; a minimal illustration follows.
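
As a concrete, hand-rolled illustration (not from the paper), a fixed 3×3 Sobel-style kernel responds only to its 3×3 neighborhood, which is exactly the locality this bullet refers to:

```python
import torch
import torch.nn as nn

# A 3x3 kernel slides over the image and sees only a local neighborhood.
edge_detector = nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)
with torch.no_grad():
    # Hand-set Sobel-like weights for horizontal edges (illustrative values).
    edge_detector.weight.copy_(torch.tensor(
        [[[[-1., -2., -1.],
           [ 0.,  0.,  0.],
           [ 1.,  2.,  1.]]]]))

img = torch.randn(1, 1, 28, 28)   # (batch, channel, H, W)
features = edge_detector(img)     # padding=1 keeps the spatial size
print(features.shape)             # torch.Size([1, 1, 28, 28])
```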

Self-attention only

  • Recently, the Vision Transformer (ViT) showed that, given enough data, we can treat an image as a sequence of 256 tokens and leverage Transformer models to achieve competitive results in image recognition (see the tokenization sketch below).
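
A sketch of the tokenization step (ViT implements patch embedding as a strided convolution; the width 768 is the ViT-Base value, assumed here): a 256×256 image cut into 16×16 patches yields exactly 256 tokens, matching the count quoted above.

```python
import torch
import torch.nn as nn

img = torch.randn(1, 3, 256, 256)                 # RGB image
# One 16x16, stride-16 conv both cuts patches and linearly projects them.
patch_embed = nn.Conv2d(3, 768, kernel_size=16, stride=16)
tokens = patch_embed(img).flatten(2).transpose(1, 2)
print(tokens.shape)  # torch.Size([1, 256, 768]) -> 256 tokens of width 768
```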

Attention enhanced convolution

  • Multiple previously proposed attention mechanisms over images suggest that attention can overcome the locality limitation of convolutional networks, as in the sketch below.
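
A minimal sketch of one such augmentation, in the spirit of attention-augmented convolution (Bello et al., 2019); the module below is a hypothetical simplification in which half the output channels come from a local conv branch and half from a global self-attention branch:

```python
import torch
import torch.nn as nn

class AugmentedConv(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.conv = nn.Conv2d(dim, dim // 2, 3, padding=1)   # local branch
        self.to_tokens = nn.Conv2d(dim, dim // 2, 1)         # 1x1 projection
        self.attn = nn.MultiheadAttention(dim // 2, heads, batch_first=True)

    def forward(self, x):
        b, c, h, w = x.shape
        local = self.conv(x)                                 # local features
        t = self.to_tokens(x).flatten(2).transpose(1, 2)     # (b, hw, dim//2)
        global_feat, _ = self.attn(t, t, t)                  # global features
        global_feat = global_feat.transpose(1, 2).reshape(b, -1, h, w)
        return torch.cat([local, global_feat], dim=1)        # half local, half global

y = AugmentedConv(64)(torch.randn(2, 64, 8, 8))
print(y.shape)  # torch.Size([2, 64, 8, 8])
```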

Convolution enhanced attention

  • Some researchers focus on complementing transformer models with convolution operations to introduce additional inductive biases.
    • For example, adding convolutions at the early stages yields more stable training, as sketched below.
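
A sketch of such an early convolutional stem, in the spirit of "Early Convolutions Help Transformers See Better" (Xiao et al., 2021); the channel widths are assumed for illustration:

```python
import torch
import torch.nn as nn

# Four stride-2 convs give the same overall stride 16 as a single
# 16x16 patchify layer, but with a smoother, more trainable start.
conv_stem = nn.Sequential(
    nn.Conv2d(3, 48, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(48, 96, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(96, 192, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(192, 384, 3, stride=2, padding=1),
)
tokens = conv_stem(torch.randn(1, 3, 224, 224)).flatten(2).transpose(1, 2)
print(tokens.shape)  # torch.Size([1, 196, 384]) -> fed to transformer blocks
```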