Fine-grained Iterative Attention Network for Temporal Language Localization in Videos

6 August 2020

Papers citing "Fine-grained Iterative Attention Network for Temporal Language Localization in Videos"

Dual Learning with Dynamic Knowledge Distillation and Soft Alignment for Partially Relevant Video Retrieval

167

14 Oct 2025

FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting

354

29 Sep 2025

Denoise-then-Retrieve: Text-Conditioned Video Denoising for Video Moment Retrieval

International Joint Conference on Artificial Intelligence (IJCAI), 2025

198

15 Aug 2025

A Flexible and Scalable Framework for Video Moment Search

Chongzhi Zhang

Xizhou Zhu

Aixin Sun

139

10 Jan 2025

VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval

391

02 Dec 2024

Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning

International Conference on Computational Linguistics (COLING), 2024

Xiaoye Qu

Jiashuo Sun

Wei Wei

Yu Cheng

MLLM LRM

309

30 Aug 2024

Context-Enhanced Video Moment Retrieval with Large Language Models

Bo Liu

324

21 May 2024

Bias-Conflict Sample Synthesis and Adversarial Removal Debias Strategy for Temporal Sentence Grounding in Video

AAAI Conference on Artificial Intelligence (AAAI), 2024

351

15 Jan 2024

Cross-modal Contrastive Learning with Asymmetric Co-attention Network for Video Moment Retrieval

315

12 Dec 2023

Unified Multi-modal Unsupervised Representation Learning for Skeleton-based Action Understanding

ACM Multimedia (ACM MM), 2023

Meng Wang

328

06 Nov 2023

Learning Temporal Sentence Grounding From Narrated EgoVideos

British Machine Vision Conference (BMVC), 2023

Kevin Flanagan

Dima Damen

Michael Wray

243

26 Oct 2023

Exploring Iterative Refinement with Diffusion Models for Video Grounding

IEEE International Conference on Multimedia and Expo (ICME), 2023

330

26 Oct 2023

DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023

Henghao Zhao

Kevin Qinghong Lin

Rui Yan

Zechao Li

VGen DiffM

417

29 Aug 2023

Temporal Sentence Grounding in Streaming Videos

ACM Multimedia (ACM MM), 2023

301

14 Aug 2023

Knowing Where to Focus: Event-aware Transformer for Video Grounding

IEEE International Conference on Computer Vision (ICCV), 2023

362

14 Aug 2023

A Survey on Video Moment Localization

ACM Computing Surveys (ACM CSUR), 2022

Meng Wang

398

13 Jun 2023

MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer

IEEE International Joint Conference on Neural Network (IJCNN), 2023

Yang Li

293

29 Apr 2023

Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos

IEEE International Conference on Computer Vision (ICCV), 2023

Yuxin Peng

251

15 Mar 2023

Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Jiahao Zhu

...

Lichao Sun

234

02 Jan 2023

Video-Guided Curriculum Learning for Spoken Video Grounding

ACM Multimedia (ACM MM), 2022

Zhou Zhao

193

01 Sep 2022

Hierarchical Local-Global Transformer for Temporal Sentence Grounding

IEEE Transactions on Multimedia (IEEE TMM), 2022

346

31 Aug 2022

PRVR: Partially Relevant Video Retrieval

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

334

26 Aug 2022

Reducing the Vision and Language Bias for Temporal Sentence Grounding

ACM Multimedia (ACM MM), 2022

Daizong Liu

Xiaoye Qu

Wei Hu

290

27 Jul 2022

Multi-Attention Network for Compressed Video Referring Object Segmentation

ACM Multimedia (ACM MM), 2022

Guorong Li

258

26 Jul 2022

You Need to Read Again: Multi-granularity Perception Network for Moment Retrieval in Videos

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022

250

25 May 2022

Entity-aware and Motion-aware Transformers for Language-driven Action Localization in Videos

International Joint Conference on Artificial Intelligence (IJCAI), 2022

Shuo Yang

Xinxiao Wu

266

12 May 2022

Video Moment Retrieval from Text Queries via Single Frame Annotation

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022

324

20 Apr 2022

Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding

321

04 Apr 2022

Towards Visual-Prompt Temporal Answering Grounding in Medical Instructional Video

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

813

13 Mar 2022

Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding

IEEE Transactions on Multimedia (IEEE TMM), 2022

248

06 Mar 2022

Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos

500

25 Jan 2022

Temporal Sentence Grounding in Videos: A Survey and Future Directions

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

471

20 Jan 2022

Unsupervised Temporal Video Grounding with Deep Semantic Clustering

AAAI Conference on Artificial Intelligence (AAAI), 2022

302

14 Jan 2022

Exploring Motion and Appearance Information for Temporal Sentence Grounding

AAAI Conference on Artificial Intelligence (AAAI), 2022

Daizong Liu

Xiaoye Qu

Pan Zhou

Yang Liu

219

03 Jan 2022

Memory-Guided Semantic Learning Network for Temporal Sentence Grounding

AAAI Conference on Artificial Intelligence (AAAI), 2022

374

03 Jan 2022

Hierarchical Deep Residual Reasoning for Temporal Moment Localization

ACM Multimedia Asia (MA), 2021

226

31 Oct 2021

CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval

Zhijian Hou

Chong-Wah Ngo

W. Chan

153

21 Sep 2021

A Survey on Temporal Sentence Grounding in Videos

389

16 Sep 2021

Progressively Guide to Attend: An Iterative Alignment Framework for Temporal Sentence Grounding

Daizong Liu

Xiaoye Qu

Pan Zhou

235

14 Sep 2021

Adaptive Proposal Generation Network for Temporal Sentence Localization in Videos

253

14 Sep 2021

Fine-Grained Fashion Similarity Prediction by Attribute-Specific Embedding Learning

IEEE Transactions on Image Processing (TIP), 2021

336

06 Apr 2021

A Survey on Natural Language Video Localization

274

01 Apr 2021

Context-aware Biaffine Localizing Network for Temporal Sentence Grounding

Computer Vision and Pattern Recognition (CVPR), 2021

258

180

22 Mar 2021

Natural Language Video Localization: A Revisit in Span-based Question Answering Framework

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

493

107

26 Feb 2021

Progressive Localization Networks for Language-based Moment Localization

318

02 Feb 2021

HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020

784

549

01 May 2020

VIOLIN: A Large-Scale Dataset for Video-and-Language Inference

Computer Vision and Pattern Recognition (CVPR), 2020

330

25 Mar 2020

Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019

Wei Liu

298

278

31 Oct 2019

Attention on Attention for Image Captioning

IEEE International Conference on Computer Vision (ICCV), 2019

506

988

19 Aug 2019

Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2019

Zhu Zhang

Zhijie Lin

Zhou Zhao

Zhenxin Xiao

293

232

06 Jun 2019