Qinghao_hitea Hierarchical Temporal Aware Video Language Pre Training 2022

[TOC] Title: Hierarchical Temporal Aware Video Language Pre Training Author: Qinghao Ye, Fei Huang et. al. Publish Year: 30 Dec 2022 Review Date: Thu, Apr 6, 2023 url: https://arxiv.org/pdf/2212.14546.pdf Summary of paper Motivation most previous methods directly inherit or adapt typical image-language pre-training paradigms to video-language pretraining, thus not fully exploiting the unique characteristic of video, i.e., temporal. Contribution this paper, the two novel pretraining tasks for modeling cross-modal alignment between moments and texts as well as the temporal relations of video-text pairs....

<span title='2023-04-06 10:02:22 +0800 +0800'>April 6, 2023</span>&nbsp;·&nbsp;2 min&nbsp;·&nbsp;411 words&nbsp;·&nbsp;Sukai Huang