1812.05038
Long-Term Feature Banks for Detailed Video Understanding
12 December 2018
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krahenbuhl
Ross B. Girshick
Papers citing
"Long-Term Feature Banks for Detailed Video Understanding"
50 / 315 papers shown
Title
HAKE: A Knowledge Engine Foundation for Human Activity Understanding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Yong-Lu Li
Xinpeng Liu
Xiaoqian Wu
Yizhuo Li
Zuoyu Qiu
Liang Xu
Yue Xu
Haoshu Fang
Cewu Lu
179
45
0
14 Feb 2022
OWL (Observe, Watch, Listen): Audiovisual Temporal Context for Localizing Actions in Egocentric Videos
Merey Ramazanova
Victor Escorcia
Fabian Caba Heilbron
Chen Zhao
Guohao Li
188
4
0
10 Feb 2022
A Coding Framework and Benchmark towards Compressed Video Understanding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Yuan Tian
Guo Lu
Manwen Liao
Guangtao Zhai
Lixing Chen
Zhiyong Gao
170
4
0
06 Feb 2022
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition
Computer Vision and Pattern Recognition (CVPR), 2022
Chao-Yuan Wu
Yanghao Li
K. Mangalam
Haoqi Fan
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
471
244
0
20 Jan 2022
Continual Transformers: Redundancy-Free Attention for Online Inference
International Conference on Learning Representations (ICLR), 2022
Lukas Hedegaard
Arian Bakhtiarnia
Alexandros Iosifidis
CLL
368
14
0
17 Jan 2022
Video Transformers: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
442
137
0
16 Jan 2022
Hand-Object Interaction Reasoning
Advanced Video and Signal Based Surveillance (AVSS), 2022
Jian Ma
Dima Damen
173
8
0
13 Jan 2022
Multiview Transformers for Video Recognition
Computer Vision and Pattern Recognition (CVPR), 2022
Shen Yan
Xuehan Xiong
Anurag Arnab
Zhichao Lu
Mi Zhang
Chen Sun
Cordelia Schmid
ViT
398
265
0
12 Jan 2022
ACGNet: Action Complement Graph Network for Weakly-supervised Temporal Action Localization
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zichen Yang
Jie Qin
Di Huang
132
67
0
21 Dec 2021
Distillation of Human-Object Interaction Contexts for Action Recognition
Muna Almushyti
Frederick W. Li
274
4
0
17 Dec 2021
SVIP: Sequence VerIfication for Procedures in Videos
Yichen Qian
Weixin Luo
Dongze Lian
Xu Tang
P. Zhao
Shenghua Gao
ViT
292
23
0
13 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chao-Yuan Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
451
834
0
02 Dec 2021
Stacked Temporal Attention: Improving First-person Action Recognition by Emphasizing Discriminative Clips
Lijin Yang
Yifei Huang
Yusuke Sugano
Yoichi Sato
176
6
0
02 Dec 2021
Exploring Segment-level Semantics for Online Phase Recognition from Surgical Videos
IEEE Transactions on Medical Imaging (IEEE TMI), 2021
Xinpeng Ding
Xiaomeng Li
314
45
0
22 Nov 2021
Revisiting spatio-temporal layouts for compositional action recognition
British Machine Vision Conference (BMVC), 2021
Gorjan Radevski
Marie-Francine Moens
Tinne Tuytelaars
208
29
0
02 Nov 2021
With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition
British Machine Vision Conference (BMVC), 2021
Evangelos Kazakos
Jaesung Huh
Arsha Nagrani
Andrew Zisserman
Dima Damen
EgoV
281
54
0
01 Nov 2021
Temporal-attentive Covariance Pooling Networks for Video Recognition
Zilin Gao
Qilong Wang
Bingbing Zhang
Q. Hu
P. Li
288
28
0
27 Oct 2021
Leveraging Local Temporal Information for Multimodal Scene Classification
Saurabh Sahu
Palash Goyal
ViT
81
0
0
26 Oct 2021
Domain Generalization through Audio-Visual Relative Norm Alignment in First Person Action Recognition
M. Planamente
Chiara Plizzari
Emanuele Alberti
Barbara Caputo
EgoV
237
48
0
19 Oct 2021
LSTC: Boosting Atomic Action Detection with Long-Short-Term Context
Yuxi Li
Boshen Zhang
Jian Li
Yabiao Wang
Weiyao Lin
Chengjie Wang
Jilin Li
Feiyue Huang
145
5
0
19 Oct 2021
Object-Region Video Transformers
Roei Herzig
Elad Ben-Avraham
K. Mangalam
Amir Bar
Gal Chechik
Anna Rohrbach
Trevor Darrell
Amir Globerson
ViT
373
97
0
13 Oct 2021
Deep Learning-based Action Detection in Untrimmed Videos: A Survey
Elahe Vahdani
Yingli Tian
349
84
0
30 Sep 2021
Efficient Global-Local Memory for Real-time Instrument Segmentation of Robotic Surgical Video
Jiacheng Wang
Yueming Jin
Liansheng Wang
Shuntian Cai
Pheng-Ann Heng
Jing Qin
163
21
0
28 Sep 2021
ActionCLIP: A New Paradigm for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Yong Liu
VLM
372
462
0
17 Sep 2021
Is First Person Vision Challenging for Object Tracking?
Matteo Dunnhofer
Antonino Furnari
G. Farinella
C. Micheloni
250
25
0
31 Aug 2021
Mining Contextual Information Beyond Image for Semantic Segmentation
IEEE International Conference on Computer Vision (ICCV), 2021
Zhenchao Jin
Tao Gong
Dongdong Yu
Qi Chu
Jian Wang
Changhu Wang
Jie Shao
207
93
0
26 Aug 2021
Identity-aware Graph Memory Network for Action Detection
ACM Multimedia (ACM MM), 2021
Jingcheng Ni
Jie Qin
Di Huang
175
10
0
26 Aug 2021
Temporal Action Segmentation with High-level Complex Activity Labels
Guodong Ding
Angela Yao
221
21
0
15 Aug 2021
Focus on the Positives: Self-Supervised Learning for Biodiversity Monitoring
IEEE International Conference on Computer Vision (ICCV), 2021
Omiros Pantazis
Gabriel J. Brostow
Kate E. Jones
Oisin Mac Aodha
SSL
137
31
0
14 Aug 2021
Video Contrastive Learning with Global Context
Haofei Kuang
Yi Zhu
Zhi-Li Zhang
Xinyu Li
Joseph Tighe
Sören Schwertfeger
C. Stachniss
Mu Li
SSL
AI4TS
223
64
0
05 Aug 2021
Predicting the Future from First Person (Egocentric) Vision: A Survey
Computer Vision and Image Understanding (CVIU), 2021
Ivan Rodin
Antonino Furnari
Dimitrios Mavroeidis
G. Farinella
EgoV
200
51
0
28 Jul 2021
Transferable Knowledge-Based Multi-Granularity Aggregation Network for Temporal Action Localization: Submission to ActivityNet Challenge 2021
Haisheng Su
Peiqin Zhuang
Yukun Li
Dongliang Wang
Weihao Gan
Wei Wu
Yu Qiao
115
1
0
27 Jul 2021
Human-like Relational Models for Activity Recognition in Video
J. Chrol-Cannon
Andrew Gilbert
R. Lazic
Adithya Madhusoodanan
Frank Guerin
BDL
149
1
0
12 Jul 2021
Review of Video Predictive Understanding: Early Action Recognition and Future Action Prediction
He Zhao
Richard P. Wildes
213
13
0
11 Jul 2021
Long Short-Term Transformer for Online Action Detection
Neural Information Processing Systems (NeurIPS), 2021
Mingze Xu
Yuanjun Xiong
Hao Chen
Xinyu Li
Wei Xia
Zhuowen Tu
Stefano Soatto
ViT
266
169
0
07 Jul 2021
Spatio-Temporal Context for Action Detection
Manuel Sarmiento Calderó
David Varas
Elisenda Bou
171
2
0
29 Jun 2021
Towards Long-Form Video Understanding
Computer Vision and Pattern Recognition (CVPR), 2021
Chao-Yuan Wu
Philipp Krahenbuhl
VLM
ViT
310
193
0
21 Jun 2021
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
Michael S. Ryoo
A. Piergiovanni
Anurag Arnab
Mostafa Dehghani
A. Angelova
ViT
577
154
0
21 Jun 2021
Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021
Martine Toering
Ioannis Gatopoulos
M. Stol
Vincent Tao Hu
SSL
248
13
0
18 Jun 2021
JRDB-Act: A Large-scale Dataset for Spatio-temporal Action, Social Group and Activity Detection
Mahsa Ehsanpour
F. Saleh
Silvio Savarese
Ian Reid
Hamid Rezatofighi
228
59
0
16 Jun 2021
Relation Modeling in Spatio-Temporal Action Localization
Yutong Feng
Jianwen Jiang
Ziyuan Huang
Zhiwu Qing
Xiang Wang
Shiwei Zhang
Mingqian Tang
Yue Gao
178
11
0
15 Jun 2021
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Neural Information Processing Systems (NeurIPS), 2021
Mandela Patrick
Dylan Campbell
Yuki M. Asano
Ishan Misra
Florian Metze
Christoph Feichtenhofer
Andrea Vedaldi
João F. Henriques
264
339
0
09 Jun 2021
Semi-Supervised 3D Hand-Object Poses Estimation with Interactions in Time
Computer Vision and Pattern Recognition (CVPR), 2021
Shao-Wei Liu
Hanwen Jiang
Jiarui Xu
Sifei Liu
Xiaolong Wang
3DH
231
191
0
09 Jun 2021
Towards Training Stronger Video Vision Transformers for EPIC-KITCHENS-100 Action Recognition
Ziyuan Huang
Zhiwu Qing
Xiang Wang
Yutong Feng
Shiwei Zhang
Jianwen Jiang
Zhurong Xia
Mingqian Tang
Nong Sang
M. Ang
ViT
125
13
0
09 Jun 2021
Anticipative Video Transformer
IEEE International Conference on Computer Vision (ICCV), 2021
Rohit Girdhar
Kristen Grauman
ViT
290
249
0
03 Jun 2021
Cross-Domain First Person Audio-Visual Action Recognition through Relative Norm Alignment
M. Planamente
Chiara Plizzari
Emanuele Alberti
Barbara Caputo
EgoV
207
13
0
03 Jun 2021
ST-HOI: A Spatial-Temporal Baseline for Human-Object Interaction Detection in Videos
Meng-Jiun Chiou
Chun-Yu Liao
Li-Wei Wang
Roger Zimmermann
Jiashi Feng
214
30
0
25 May 2021
MultiSports: A Multi-Person Video Dataset of Spatio-Temporally Localized Sports Actions
IEEE International Conference on Computer Vision (ICCV), 2021
Shouqing Yang
Lei Chen
Runyu He
Zhenzhi Wang
Gangshan Wu
Limin Wang
253
126
0
16 May 2021
Not All Memories are Created Equal: Learning to Forget by Expiring
International Conference on Machine Learning (ICML), 2021
Sainbayar Sukhbaatar
Da Ju
Spencer Poff
Stephen Roller
Arthur Szlam
Jason Weston
Angela Fan
CLL
239
36
0
13 May 2021
Few-Shot Video Object Detection
European Conference on Computer Vision (ECCV), 2021
Qi Fan
Chi-Keung Tang
Yu-Wing Tai
353
15
0
30 Apr 2021