Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2312.17432
Cited By
Video Understanding with Large Language Models: A Survey
29 December 2023
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
Teng Wang
Daoan Zhang
Jie An
Jingyang Lin
Rongyi Zhu
A. Vosoughi
Chao Huang
Zeliang Zhang
Feng Zheng
Jianguo Zhang
Ping Luo
Jiebo Luo
Chenliang Xu
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Video Understanding with Large Language Models: A Survey"
8 / 58 papers shown
Title
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
212
682
0
13 Oct 2021
VidTr: Video Transformer Without Convolutions
Yanyi Zhang
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Biagio Brattoli
Hao Chen
I. Marsic
Joseph Tighe
ViT
116
178
0
23 Apr 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
272
1,939
0
09 Feb 2021
Generic Event Boundary Detection: A Benchmark for Event Segmentation
Mike Zheng Shou
Stan Weixian Lei
Weiyao Wang
Deepti Ghadiyaram
Matt Feiszli
VOS
61
70
0
26 Jan 2021
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei
Licheng Yu
Tamara L. Berg
Mohit Bansal
103
268
0
24 Jan 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020
ECO: Efficient Convolutional Network for Online Video Understanding
Mohammadreza Zolfaghari
Kamaljeet Singh
Thomas Brox
111
495
0
24 Apr 2018
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
261
10,106
0
16 Nov 2016
Previous
1
2