Long-Term Feature Banks for Detailed Video Understanding

12 December 2018

Chao-Yuan Wu

Christoph Feichtenhofer

Papers citing "Long-Term Feature Banks for Detailed Video Understanding"

50 / 315 papers shown

Multi-Task Learning of Object State Changes from Uncurated Videos

Tomáš Souček

Jean-Baptiste Alayrac

Antoine Miech

Ivan Laptev

Josef Sivic

194

24 Nov 2022

Discovering A Variety of Objects in Spatio-Temporal Human-Object Interactions

...

Haisheng Su

195

14 Nov 2022

End-to-end Transformer for Compressed Video Quality Enhancement

IEEE Transactions on Broadcasting (IEEE Trans. Broadcast.), 2022

193

25 Oct 2022

Holistic Interaction Transformer Network for Action Detection

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022

Gueter Josmy Faure

Min-Hung Chen

S. Lai

291

23 Oct 2022

YOWO-Plus: An Incremental Improvement

Jianhua Yang

ViT

130

20 Oct 2022

Grounded Video Situation Recognition

Neural Information Processing Systems (NeurIPS), 2022

Zeeshan Khan

C. V. Jawahar

Makarand Tapaswi

190

19 Oct 2022

Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning

Neural Information Processing Systems (NeurIPS), 2022

276

12 Oct 2022

ConTra: (Con)text (Tra)nsformer for Cross-Modal Video Retrieval

Asian Conference on Computer Vision (ACCV), 2022

A. Fragomeni

Michael Wray

Dima Damen

CLIP ViT

145

09 Oct 2022

Compressed Vision for Efficient Video Understanding

Asian Conference on Computer Vision (ACCV), 2022

119

06 Oct 2022

COPILOT: Human-Environment Collision Prediction and Localization from Egocentric Videos

IEEE International Conference on Computer Vision (ICCV), 2022

149

04 Oct 2022

Exploiting Instance-based Mixed Sampling via Auxiliary Source Domain Supervision for Domain-adaptive Action Detection

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022

Luc Van Gool

336

28 Sep 2022

Visual Object Tracking in First Person Vision

International Journal of Computer Vision (IJCV), 2022

235

27 Sep 2022

CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Chong-Wah Ngo

253

22 Sep 2022

MCIBI++: Soft Mining Contextual Information Beyond Image for Semantic Segmentation

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

370

09 Sep 2022

Spatio-Temporal Action Detection Under Large Motion

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022

Luc Van Gool

267

06 Sep 2022

A comprehensive survey on recent deep learning-based methods applied to surgical data

Mansoor Ali

Rafael Martinez Garcia Peña

Gilberto Ochoa-Ruiz

Sharib Ali

419

03 Sep 2022

Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition

European Conference on Computer Vision (ECCV), 2022

197

03 Sep 2022

A Circular Window-based Cascade Transformer for Online Action Detection

191

30 Aug 2022

Identifying Auxiliary or Adversarial Tasks Using Necessary Condition Analysis for Adversarial Multi-task Video Understanding

Stephen Su

Sam Kwong

Qingyu Zhao

De-An Huang

Juan Carlos Niebles

Ehsan Adeli

175

22 Aug 2022

EgoEnv: Human-centric environment representations from egocentric video

Neural Information Processing Systems (NeurIPS), 2022

Tushar Nagarajan

Santhosh Kumar Ramakrishnan

Ruta Desai

James M. Hillis

Kristen Grauman

EgoV

296

22 Jul 2022

Is an Object-Centric Video Representation Beneficial for Transfer?

Asian Conference on Computer Vision (ACCV), 2022

346

20 Jul 2022

ViGAT: Bottom-up event recognition and explanation in video using factorized graph attention network

IEEE Access, 2022

Nikolaos Gkalelis

Dimitrios Daskalakis

Vasileios Mezaris

204

20 Jul 2022

Learning Sequence Representations by Non-local Recurrent Neural Memory

International Journal of Computer Vision (IJCV), 2022

295

20 Jul 2022

Learning from Label Relationships in Human Affect

ACM Multimedia (ACM MM), 2022

Niki Maria Foteinopoulou

Ioannis Patras

CVBM

189

12 Jul 2022

Beyond Transfer Learning: Co-finetuning for Action Localisation

262

08 Jul 2022

Explore Spatio-temporal Aggregation for Insubstantial Object Detection: Benchmark Dataset and Baseline

Computer Vision and Pattern Recognition (CVPR), 2022

Yibo Wang

Xun Cao

206

23 Jun 2022

One-stage Action Detection Transformer

112

21 Jun 2022

It's Time for Artistic Correspondence in Music and Video

Computer Vision and Pattern Recognition (CVPR), 2022

Dídac Surís

Carl Vondrick

Bryan C. Russell

Justin Salamon

151

14 Jun 2022

A Simple and Efficient Pipeline to Build an End-to-End Spatial-Temporal Action Detector

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022

273

07 Jun 2022

Revisiting the "Video" in Video-Language Understanding

Computer Vision and Pattern Recognition (CVPR), 2022

S. Buch

Cristobal Eyzaguirre

Adrien Gaidon

Jiajun Wu

L. Fei-Fei

Juan Carlos Niebles

213

202

03 Jun 2022

A CLIP-Hitchhiker's Guide to Long Video Retrieval

418

17 May 2022

Retrieval-Enhanced Machine Learning

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022

165

02 May 2022

The Wisdom of Crowds: Temporal Progressive Attention for Early Action Prediction

Computer Vision and Pattern Recognition (CVPR), 2022

Alexandros Stergiou

Dima Damen

AI4TS EgoV EDL

171

28 Apr 2022

Temporal Relevance Analysis for Video Action Models

165

25 Apr 2022

A Multi-Person Video Dataset Annotation Method of Spatio-Temporally Actions

Fan Yang

240

21 Apr 2022

THORN: Temporal Human-Object Relation Network for Action Recognition

International Conference on Pattern Recognition (ICPR), 2022

169

20 Apr 2022

LaMemo: Language Modeling with Look-Ahead Memory

North American Chapter of the Association for Computational Linguistics (NAACL), 2022

164

15 Apr 2022

SOS! Self-supervised Learning Over Sets Of Handled Objects In Egocentric Action Recognition

European Conference on Computer Vision (ECCV), 2022

228

10 Apr 2022

E^2TAD: An Energy-Efficient Tracking-based Action Detector

...

432

09 Apr 2022

Hierarchical Self-supervised Representation Learning for Movie Understanding

Computer Vision and Pattern Recognition (CVPR), 2022

198

06 Apr 2022

Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency

Computer Vision and Pattern Recognition (CVPR), 2022

242

06 Apr 2022

TALLFormer: Temporal Action Localization with a Long-memory Transformer

European Conference on Computer Vision (ECCV), 2022

Feng Cheng

Gedas Bertasius

ViT

321

119

04 Apr 2022

Exploiting Temporal Relations on Radar Perception for Autonomous Driving

Computer Vision and Pattern Recognition (CVPR), 2022

276

03 Apr 2022

A-ACT: Action Anticipation through Cycle Transformations

Akash Gupta

Jingen Liu

Liefeng Bo

Amit K. Roy-Chowdhury

Tao Mei

209

02 Apr 2022

MeMOT: Multi-Object Tracking with Memory

Computer Vision and Pattern Recognition (CVPR), 2022

321

213

31 Mar 2022

Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models

Computer Vision and Pattern Recognition (CVPR), 2022

137

31 Mar 2022

Global Tracking Transformers

Computer Vision and Pattern Recognition (CVPR), 2022

282

174

24 Mar 2022

Point3D: tracking actions as moving points with 3D CNNs

British Machine Vision Conference (BMVC), 2022

Shentong Mo

Jingfei Xia

Xiaoqing Ellen Tan

Bhiksha Raj

3DPC

252

20 Mar 2022

Local-Global Context Aware Transformer for Language-Guided Video Segmentation

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

322

101

18 Mar 2022

Gate-Shift-Fuse for Video Action Recognition

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Swathikiran Sudhakaran

Sergio Escalera

Oswald Lanz

267

16 Mar 2022