1812.05038
Cited By
Long-Term Feature Banks for Detailed Video Understanding
12 December 2018
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krahenbuhl
Ross B. Girshick
Papers citing
"Long-Term Feature Banks for Detailed Video Understanding"
50 / 315 papers shown
VidTr: Video Transformer Without Convolutions
IEEE International Conference on Computer Vision (ICCV), 2021
Yanyi Zhang
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Biagio Brattoli
Hao Chen
I. Marsic
Joseph Tighe
ViT
418
217
0
23 Apr 2021
Multiscale Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
481
1,513
0
22 Apr 2021
H2O: Two Hands Manipulating Objects for First Person Interaction Recognition
IEEE International Conference on Computer Vision (ICCV), 2021
Taein Kwon
Bugra Tekin
Jan Stühmer
Federica Bogo
Marc Pollefeys
EgoV
375
234
0
22 Apr 2021
Temporal Query Networks for Fine-grained Video Understanding
Computer Vision and Pattern Recognition (CVPR), 2021
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
254
98
0
19 Apr 2021
Spatiotemporal Deformable Scene Graphs for Complex Activity Detection
British Machine Vision Conference (BMVC), 2021
Salman Khan
Fabio Cuzzolin
3DPC
238
5
0
16 Apr 2021
Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories
Computer Vision and Pattern Recognition (CVPR), 2021
Xitong Yang
Haoqi Fan
Lorenzo Torresani
L. Davis
Heng Wang
VLM
183
23
0
02 Apr 2021
Visual Semantic Role Labeling for Video Understanding
Computer Vision and Pattern Recognition (CVPR), 2021
Arka Sadhu
Tanmay Gupta
Mark Yatskar
Ram Nevatia
Aniruddha Kembhavi
VLM
290
88
0
02 Apr 2021
TubeR: Tubelet Transformer for Video Action Detection
Computer Vision and Pattern Recognition (CVPR), 2021
Jiaojiao Zhao
Yanyi Zhang
Xinyu Li
Hao Chen
Bing Shuai
...
Yuanjun Xiong
Davide Modolo
I. Marsic
Cees G. M. Snoek
Joseph Tighe
ViT
344
92
0
02 Apr 2021
Motion Guided Attention Fusion to Recognize Interactions from Videos
IEEE International Conference on Computer Vision (ICCV), 2021
Tae Soo Kim
Jonathan D. Jones
Gregory Hager
103
19
0
01 Apr 2021
Learning Representational Invariances for Data-Efficient Action Recognition
Computer Vision and Image Understanding (CVIU), 2021
Yuliang Zou
Jinwoo Choi
Qitong Wang
Jia-Bin Huang
312
45
0
30 Mar 2021
Temporal Memory Relation Network for Workflow Recognition from Surgical Video
IEEE Transactions on Medical Imaging (IEEE TMI), 2021
Yueming Jin
Yonghao Long
Cheng Chen
Zixu Zhao
Qi Dou
Pheng-Ann Heng
244
117
0
30 Mar 2021
Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation
Shuning Chang
Pichao Wang
F. Wang
Hao Li
Jiashi Feng
ViT
217
46
0
30 Mar 2021
ViViT: A Video Vision Transformer
IEEE International Conference on Computer Vision (ICCV), 2021
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
545
2,702
0
29 Mar 2021
Memory Enhanced Embedding Learning for Cross-Modal Video-Text Retrieval
Rui Zhao
Kecheng Zheng
Zhengjun Zha
Hongtao Xie
Jiebo Luo
138
3
0
29 Mar 2021
Unified Graph Structured Models for Video Understanding
IEEE International Conference on Computer Vision (ICCV), 2021
Anurag Arnab
Chen Sun
Cordelia Schmid
230
52
0
29 Mar 2021
Regular Polytope Networks
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
F. Pernici
Matteo Bruni
C. Baecchi
Marco Bertini
191
30
0
29 Mar 2021
On the hidden treasure of dialog in video question answering
IEEE International Conference on Computer Vision (ICCV), 2021
Deniz Engin
François Schnitzler
Ngoc Q. K. Duong
Yannis Avrithis
229
12
0
26 Mar 2021
Temporal Context Aggregation Network for Temporal Action Proposal Refinement
Computer Vision and Pattern Recognition (CVPR), 2021
Zhiwu Qing
Haisheng Su
Weihao Gan
Dongliang Wang
Wei Wu
Xiang Wang
Yu Qiao
Junjie Yan
Changxin Gao
Nong Sang
192
205
0
24 Mar 2021
Context-aware Biaffine Localizing Network for Temporal Sentence Grounding
Computer Vision and Pattern Recognition (CVPR), 2021
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
Yu Cheng
Wei Wei
Zichuan Xu
Yulai Xie
201
173
0
22 Mar 2021
PGT: A Progressive Method for Training Models on Long Videos
Computer Vision and Pattern Recognition (CVPR), 2021
Bo Pang
Gao Peng
Yizhuo Li
Cewu Lu
VLM
128
13
0
21 Mar 2021
Enhancing Transformer for Video Understanding Using Gated Multi-Level Attention and Temporal Adversarial Training
Saurabh Sahu
Palash Goyal
ViT
125
2
0
18 Mar 2021
ROAD: The ROad event Awareness Dataset for Autonomous Driving
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Gurkirt Singh
Stephen Akrigg
Manuele Di Maio
Valentina Fontana
Reza Javanmard Alitappeh
...
Salman Khan
S. Grazioso
Andrew Bradley
G. Gironimo
Fabio Cuzzolin
226
108
0
23 Feb 2021
Learning to Recognize Actions on Objects in Egocentric Video with Attention Dictionaries
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
EgoV
209
22
0
16 Feb 2021
Win-Fail Action Recognition
Paritosh Parmar
B. Morris
158
6
0
15 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
International Conference on Machine Learning (ICML), 2021
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
1.1K
2,648
0
09 Feb 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
783
475
0
01 Feb 2021
Discovering Multi-Label Actor-Action Association in a Weakly Supervised Setting
Asian Conference on Computer Vision (ACCV), 2021
Sovan Biswas
Juergen Gall
166
2
0
21 Jan 2021
Smoothed Gaussian Mixture Models for Video Classification and Recommendation
Sirjan Kafle
Aman Gupta
Xue Xia
A. Sankar
Xi Chen
Di Wen
Liang Zhang
110
0
0
17 Dec 2020
NUTA: Non-uniform Temporal Aggregation for Action Recognition
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Hao Chen
Joseph Tighe
ViT
120
17
0
15 Dec 2020
A Comprehensive Study of Deep Video Action Recognition
Yi Zhu
Xinyu Li
Chunhui Liu
Mohammadreza Zolfaghari
Yuanjun Xiong
Chongruo Wu
Zhi-Li Zhang
Joseph Tighe
R. Manmatha
Mu Li
VLM
AI4TS
283
210
0
11 Dec 2020
CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation
AAAI Conference on Artificial Intelligence (AAAI), 2020
Yang Fu
Linjie Yang
Ding Liu
Thomas S. Huang
Humphrey Shi
VOS
284
75
0
07 Dec 2020
SAFCAR: Structured Attention Fusion for Compositional Action Recognition
Tae Soo Kim
Gregory Hager
CoGe
174
10
0
03 Dec 2020
Recent Progress in Appearance-based Action Recognition
J. Humphreys
Zhe Chen
Dacheng Tao
170
0
0
25 Nov 2020
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks
Humam Alwassel
Silvio Giancola
Guohao Li
239
143
0
23 Nov 2020
Memory Optimization for Deep Networks
International Conference on Learning Representations (ICLR), 2020
Aashaka Shah
Chaoxia Wu
Jayashree Mohan
Vijay Chidambaram
Philipp Krahenbuhl
157
27
0
27 Oct 2020
Hierarchical Conditional Relation Networks for Multimodal Video Question Answering
International Journal of Computer Vision (IJCV), 2020
T. Le
Vuong Le
Svetha Venkatesh
T. Tran
BDL
356
28
0
18 Oct 2020
Pose And Joint-Aware Action Recognition
Anshul B. Shah
Shlok Kumar Mishra
Ankan Bansal
Jun-Cheng Chen
Ramalingam Chellappa
Abhinav Shrivastava
328
36
0
16 Oct 2020
Deep Sequence Learning for Video Anticipation: From Discrete and Deterministic to Continuous and Stochastic
S. Aliakbarian
AI4TS
128
0
0
09 Oct 2020
Dissected 3D CNNs: Temporal Skip Connections for Efficient Online Video Processing
Okan Kopuklu
Stefan Hormann
Fabian Herzog
Hakan Çevikalp
Gerhard Rigoll
3DPC
151
17
0
30 Sep 2020
Texture Memory-Augmented Deep Patch-Based Image Inpainting
Rui Xu
Minghao Guo
Yuan Liu
Xiaoxiao Li
Bolei Zhou
Chen Change Loy
3DV
245
47
0
28 Sep 2020
Multi-Label Activity Recognition using Activity-specific Features and Activity Correlations
Computer Vision and Pattern Recognition (CVPR), 2020
Yanyi Zhang
Xinyu Li
I. Marsic
HAI
157
28
0
16 Sep 2020
Online Spatiotemporal Action Detection and Prediction via Causal Representations
Gurkirt Singh
3DPC
CML
181
0
0
31 Aug 2020
A Prospective Study on Sequence-Driven Temporal Sampling and Ego-Motion Compensation for Action Recognition in the EPIC-Kitchens Dataset
Alejandro López-Cifuentes
Marcos Escudero-Viñolo
Jesús Bescós
EgoV
112
2
0
26 Aug 2020
Query Twice: Dual Mixture Attention Meta Learning for Video Summarization
Junyan Wang
Yang Bai
Yang Long
Bingzhang Hu
Z. Chai
Yu Guan
Xiaolin K. Wei
EgoV
208
21
0
19 Aug 2020
AssembleNet++: Assembling Modality Representations via Attention Connections
Michael S. Ryoo
A. Piergiovanni
Juhana Kangaspunta
A. Angelova
169
50
0
18 Aug 2020
Land Cover Classification from Remote Sensing Images Based on Multi-Scale Fully Convolutional Network
Geo-Spatial Information Science (GSIS), 2020
Rui Li
Shunyi Zheng
Chenxi Duan
Ce Zhang
325
121
0
01 Aug 2020
LEMMA: A Multi-view Dataset for Learning Multi-agent Multi-task Activities
European Conference on Computer Vision (ECCV), 2020
Baoxiong Jia
Yixin Chen
Siyuan Huang
Yixin Zhu
Song-Chun Zhu
146
64
0
31 Jul 2020
Directional Temporal Modeling for Action Recognition
Xinyu Li
Bing Shuai
Joseph Tighe
123
47
0
21 Jul 2020
Context-Aware RCNN: A Baseline for Action Detection in Videos
European Conference on Computer Vision (ECCV), 2020
Jianchao Wu
Zhanghui Kuang
Limin Wang
Wayne Zhang
Gangshan Wu
228
83
0
20 Jul 2020
Knowledge-Based Video Question Answering with Unsupervised Scene Descriptions
European Conference on Computer Vision (ECCV), 2020
Noa Garcia
Yuta Nakashima
250
35
0
17 Jul 2020