Temporal Alignment Networks for Long-term Video

Computer Vision and Pattern Recognition (CVPR), 2022

6 April 2022

Papers citing "Temporal Alignment Networks for Long-term Video"

23 / 73 papers shown

Dual-view Curricular Optimal Transport for Cross-lingual Cross-modal Retrieval

IEEE Transactions on Image Processing (IEEE TIP), 2023

Meng Han

Meng Wang

207

11 Sep 2023

Opening the Vocabulary of Egocentric Actions

Neural Information Processing Systems (NeurIPS), 2023

Angela Yao

303

22 Aug 2023

EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding

Neural Information Processing Systems (NeurIPS), 2023

K. Mangalam

Raiymbek Akshulakov

Jitendra Malik

399

495

17 Aug 2023

Video-Mined Task Graphs for Keystep Recognition in Instructional Videos

Neural Information Processing Systems (NeurIPS), 2023

Kumar Ashutosh

Santhosh Kumar Ramakrishnan

Triantafyllos Afouras

Kristen Grauman

299

17 Jul 2023

Learning to Ground Instructional Articles in Videos through Narrations

IEEE International Conference on Computer Vision (ICCV), 2023

E. Mavroudi

Triantafyllos Afouras

Lorenzo Torresani

217

06 Jun 2023

StepFormer: Self-supervised Step Discovery and Localization in Instructional Videos

Computer Vision and Pattern Recognition (CVPR), 2023

Nikita Dvornik

Isma Hadji

Ran Zhang

Konstantinos G. Derpanis

Animesh Garg

Richard P. Wildes

Allan D. Jepson

192

26 Apr 2023

LASER: A Neuro-Symbolic Framework for Learning Spatial-Temporal Scene Graphs with Weak Supervision

International Conference on Learning Representations (ICLR), 2023

Jiani Huang

Ziyang Li

Mayur Naik

Ser-Nam Lim

667

15 Apr 2023

Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting

Computer Vision and Pattern Recognition (CVPR), 2023

Salman Khan

225

110

06 Apr 2023

Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations

Computer Vision and Pattern Recognition (CVPR), 2023

236

31 Mar 2023

What, when, and where? -- Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions

Computer Vision and Pattern Recognition (CVPR), 2023

351

29 Mar 2023

Aligning Step-by-Step Instructional Diagrams to Video Demonstrations

Computer Vision and Pattern Recognition (CVPR), 2023

Jiahao Zhang

A. Cherian

Yanbin Liu

Yizhak Ben-Shabat

Cristian Rodriguez-Opazo

Stephen Gould

224

24 Mar 2023

Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos

Computer Vision and Pattern Recognition (CVPR), 2023

274

22 Mar 2023

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning

Computer Vision and Pattern Recognition (CVPR), 2023

497

325

27 Feb 2023

What You Say Is What You Show: Visual Narration Detection in Instructional Videos

357

05 Jan 2023

Test of Time: Instilling Video-Language Models with a Sense of Time

Computer Vision and Pattern Recognition (CVPR), 2023

Piyush Bagad

Makarand Tapaswi

Cees G. M. Snoek

463

05 Jan 2023

Learning Video Representations from Large Language Models

Computer Vision and Pattern Recognition (CVPR), 2022

306

229

08 Dec 2022

Temporal Action Segmentation: An Analysis of Modern Techniques

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Guodong Ding

Fadime Sener

Angela Yao

602

116

19 Oct 2022

Turbo Training with Token Dropout

British Machine Vision Conference (BMVC), 2022

211

10 Oct 2022

Multimodal Learning with Transformers: A Survey

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

529

836

13 Jun 2022

A CLIP-Hitchhiker's Guide to Long Video Retrieval

418

17 May 2022

Prompting Visual-Language Models for Efficient Video Understanding

362

459

08 Dec 2021

Unsupervised Learning of Visual Features by Contrasting Cluster Assignments

1.2K

4,653

17 Jun 2020

Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation

Computer Vision and Pattern Recognition (CVPR), 2020

461

142

05 Mar 2020