Fine-grained Iterative Attention Network for Temporal Language Localization in Videos

6 August 2020
Xiaoye Qu
Peng Tang
Zhikang Zhou
Yu Cheng
Jianfeng Dong
Pan Zhou

Papers citing "Fine-grained Iterative Attention Network for Temporal Language Localization in Videos"

50 / 81 papers shown
Dual Learning with Dynamic Knowledge Distillation and Soft Alignment for Partially Relevant Video Retrieval
Jianfeng Dong
Lei Huang
Daizong Liu
Xianke Chen
Xun Yang
Changting Lin
Xun Wang
Meng Wang
167
0
0
14 Oct 2025
FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting
Zefeng He
Xiaoye Qu
Yafu Li
Siyuan Huang
Daizong Liu
Yu Cheng
OffRL VLM LRM
354
11
0
29 Sep 2025
Denoise-then-Retrieve: Text-Conditioned Video Denoising for Video Moment Retrieval
International Joint Conference on Artificial Intelligence (IJCAI), 2025
Weijia Liu
Jiuxin Cao
Bo Miao
Zhiheng Fu
Xuelin Zhu
Jiawei Ge
Bo Liu
Mehwish Nasim
Lin Wang
DiffM VGen
198
0
0
15 Aug 2025
A Flexible and Scalable Framework for Video Moment Search
Chongzhi Zhang
Xizhou Zhu
Aixin Sun
139
0
0
10 Jan 2025
VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval
Dhiman Paul
Md Rizwan Parvez
Nabeel Mohammed
Shafin Rahman
VGen
391
5
0
02 Dec 2024
Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
International Conference on Computational Linguistics (COLING), 2024
Xiaoye Qu
Jiashuo Sun
Wei Wei
Yu Cheng
MLLM LRM
309
25
0
30 Aug 2024
Context-Enhanced Video Moment Retrieval with Large Language Models
Weijia Liu
Bo Miao
Jiuxin Cao
Xueling Zhu
Bo Liu
Mehwish Nasim
Lin Wang
324
14
0
21 May 2024
Bias-Conflict Sample Synthesis and Adversarial Removal Debias Strategy for Temporal Sentence Grounding in Video
AAAI Conference on Artificial Intelligence (AAAI), 2024
Zhaobo Qi
Yibo Yuan
Xiaowen Ruan
Shuhui Wang
Weigang Zhang
Qingming Huang
351
16
0
15 Jan 2024
Cross-modal Contrastive Learning with Asymmetric Co-attention Network for Video Moment Retrieval
Love Panta
Prashant Shrestha
Brabeem Sapkota
Amrita Bhattarai
Suresh Manandhar
Anand Kumar Sah
315
7
0
12 Dec 2023
Unified Multi-modal Unsupervised Representation Learning for Skeleton-based Action Understanding
ACM Multimedia (ACM MM), 2023
Shengkai Sun
Daizong Liu
Jianfeng Dong
Xiaoye Qu
Junyu Gao
Xun Yang
Xun Wang
Meng Wang
OffRL
328
31
0
06 Nov 2023
Learning Temporal Sentence Grounding From Narrated EgoVideos
British Machine Vision Conference (BMVC), 2023
Kevin Flanagan
Dima Damen
Michael Wray
243
3
0
26 Oct 2023
Exploring Iterative Refinement with Diffusion Models for Video Grounding
IEEE International Conference on Multimedia and Expo (ICME), 2023
Xiao Liang
Tao Shi
Yaoyuan Liang
Te Tao
Shao-Lun Huang
DiffM
330
2
0
26 Oct 2023
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Henghao Zhao
Kevin Qinghong Lin
Rui Yan
Zechao Li
VGen DiffM
417
11
0
29 Aug 2023
Temporal Sentence Grounding in Streaming Videos
ACM Multimedia (ACM MM), 2023
Tian Gan
Xiao Wang
Yan Sun
Yue Yu
Qingpei Guo
Liqiang Nie
301
9
0
14 Aug 2023
Knowing Where to Focus: Event-aware Transformer for Video Grounding
IEEE International Conference on Computer Vision (ICCV), 2023
Jinhyun Jang
Jungin Park
Jin-Hwa Kim
Hyeongjun Kwon
Kwanghoon Sohn
362
99
0
14 Aug 2023
A Survey on Video Moment Localization
ACM Computing Surveys (ACM CSUR), 2022
Meng Liu
Liqiang Nie
Yunxiao Wang
Meng Wang
Yong Rui
398
41
0
13 Jun 2023
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer
IEEE International Joint Conference on Neural Network (IJCNN), 2023
Yifang Xu
Yunzhuo Sun
Yang Li
Yilei Shi
Xiaoxia Zhu
S. Du
ViT
293
52
0
29 Apr 2023
Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos
IEEE International Conference on Computer Vision (ICCV), 2023
Yulin Pan
Xiangteng He
Biao Gong
Yiliang Lv
Yujun Shen
Yuxin Peng
Deli Zhao
251
25
0
15 Mar 2023
Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jiahao Zhu
Daizong Liu
Pan Zhou
Xing Di
Yu Cheng
...
Wenzheng Xu
Zichuan Xu
Yao Wan
Lichao Sun
Zeyu Xiong
234
35
0
02 Jan 2023
Video-Guided Curriculum Learning for Spoken Video Grounding
ACM Multimedia (ACM MM), 2022
Yan Xia
Zhou Zhao
Shangwei Ye
Yang Zhao
Haoyuan Li
Yi Ren
193
12
0
01 Sep 2022
Hierarchical Local-Global Transformer for Temporal Sentence Grounding
IEEE Transactions on Multimedia (IEEE TMM), 2022
Xiang Fang
Daizong Liu
Pan Zhou
Zichuan Xu
Rui Li
346
58
0
31 Aug 2022
PRVR: Partially Relevant Video Retrieval
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Jianfeng Dong
Xianke Chen
Minsong Zhang
Xun Yang
Shujie Chen
Xirong Li
Xun Wang
334
49
0
26 Aug 2022
Reducing the Vision and Language Bias for Temporal Sentence Grounding
ACM Multimedia (ACM MM), 2022
Daizong Liu
Xiaoye Qu
Wei Hu
290
64
0
27 Jul 2022
Multi-Attention Network for Compressed Video Referring Object Segmentation
ACM Multimedia (ACM MM), 2022
Weidong Chen
Dexiang Hong
Yuankai Qi
Zhenjun Han
Shuhui Wang
Laiyun Qing
Qingming Huang
Guorong Li
VOS
258
58
0
26 Jul 2022
You Need to Read Again: Multi-granularity Perception Network for Moment Retrieval in Videos
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022
Xin Sun
Xinyu Wang
Jialin Gao
Qiong Liu
Xiaoping Zhou
250
45
0
25 May 2022
Entity-aware and Motion-aware Transformers for Language-driven Action Localization in Videos
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Shuo Yang
Xinxiao Wu
266
22
0
12 May 2022
Video Moment Retrieval from Text Queries via Single Frame Annotation
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022
Ran Cui
Tianwen Qian
Pai Peng
E. Daskalaki
Yue Yu
Xiao-Wei Guo
Huyang Sun
Yu-Gang Jiang
324
46
0
20 Apr 2022
Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding
Ziyue Wu
Junyu Gao
Shucheng Huang
Changsheng Xu
321
6
0
04 Apr 2022
Towards Visual-Prompt Temporal Answering Grounding in Medical Instructional Video
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Bin Li
Yixuan Weng
Bin Sun
Shutao Li
813
68
0
13 Mar 2022
Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding
IEEE Transactions on Multimedia (IEEE TMM), 2022
Daizong Liu
Xiang Fang
Wei Hu
Pan Zhou
248
47
0
06 Mar 2022
Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos
Sangmin Woo
Jinyoung Park
Inyong Koo
Sumin Lee
Minki Jeong
Changick Kim
500
6
0
25 Jan 2022
Temporal Sentence Grounding in Videos: A Survey and Future Directions
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
3DGS
471
57
0
20 Jan 2022
Unsupervised Temporal Video Grounding with Deep Semantic Clustering
AAAI Conference on Artificial Intelligence (AAAI), 2022
Daizong Liu
Xiaoye Qu
Yinzhen Wang
Xing Di
Kai Zou
Yu Cheng
Zichuan Xu
Pan Zhou
302
52
0
14 Jan 2022
Exploring Motion and Appearance Information for Temporal Sentence Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2022
Daizong Liu
Xiaoye Qu
Pan Zhou
Yang Liu
219
49
0
03 Jan 2022
Memory-Guided Semantic Learning Network for Temporal Sentence Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2022
Daizong Liu
Xiaoye Qu
Xing Di
Yu Cheng
Zichuan Xu
Pan Zhou
374
80
0
03 Jan 2022
Hierarchical Deep Residual Reasoning for Temporal Moment Localization
ACM Multimedia Asia (MA), 2021
Ziyang Ma
Xianjing Han
Xuemeng Song
Yiran Cui
Liqiang Nie
226
10
0
31 Oct 2021
CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval
Zhijian Hou
Chong-Wah Ngo
W. Chan
153
59
0
21 Sep 2021
A Survey on Temporal Sentence Grounding in Videos
Xiaohan Lan
Yitian Yuan
Xin Eric Wang
Zhi Wang
Wenwu Zhu
389
59
0
16 Sep 2021
Progressively Guide to Attend: An Iterative Alignment Framework for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Pan Zhou
235
53
0
14 Sep 2021
Adaptive Proposal Generation Network for Temporal Sentence Localization in Videos
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
253
66
0
14 Sep 2021
Fine-Grained Fashion Similarity Prediction by Attribute-Specific Embedding Learning
IEEE Transactions on Image Processing (TIP), 2021
Jianfeng Dong
Zhe Ma
Xiaofeng Mao
Xun Yang
Yuan He
Richang Hong
S. Ji
OOD
336
46
0
06 Apr 2021
A Survey on Natural Language Video Localization
Xinfang Liu
Xiushan Nie
Zhifang Tan
Jie Guo
Yilong Yin
274
9
0
01 Apr 2021
Context-aware Biaffine Localizing Network for Temporal Sentence Grounding
Computer Vision and Pattern Recognition (CVPR), 2021
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
Yu Cheng
Wei Wei
Zichuan Xu
Yulai Xie
258
180
0
22 Mar 2021
Natural Language Video Localization: A Revisit in Span-based Question Answering Framework
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Hao Zhang
Aixin Sun
Wei Jing
Liangli Zhen
Qiufeng Wang
Rick Siow Mong Goh
493
107
0
26 Feb 2021
Progressive Localization Networks for Language-based Moment Localization
Qi Zheng
Jianfeng Dong
Xiaoye Qu
Xun Yang
Yabing Wang
Pan Zhou
Baolong Liu
Xun Wang
318
39
0
02 Feb 2021
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM VLM OffRL AI4TS
784
549
0
01 May 2020
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
Computer Vision and Pattern Recognition (CVPR), 2020
J. Liu
Wenhu Chen
Yu Cheng
Zhe Gan
Licheng Yu
Yiming Yang
Jingjing Liu
MLLM VGen
330
75
0
25 Mar 2020
Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
Yitian Yuan
Lin Ma
Jingwen Wang
Wei Liu
Wenwu Zhu
298
278
0
31 Oct 2019
Attention on Attention for Image Captioning
IEEE International Conference on Computer Vision (ICCV), 2019
Lun Huang
Wenmin Wang
Jie Chen
Xiao-Yong Wei
506
988
0
19 Aug 2019
Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2019
Zhu Zhang
Zhijie Lin
Zhou Zhao
Zhenxin Xiao
293
232
0
06 Jun 2019