Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2019
6 June 2019
Zhu Zhang
Zhijie Lin
Zhou Zhao
Zhenxin Xiao

Papers citing "Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos"

50 / 87 papers shown
When One Moment Isn't Enough: Multi-Moment Retrieval with Cross-Moment Interactions
Zhuo Cao
Heming Du
Bingqing Zhang
Xin Yu
Xue Li
Sen Wang
159
1
0
20 Oct 2025
Enrich and Detect: Video Temporal Grounding with Multimodal LLMs
Shraman Pramanick
E. Mavroudi
Yale Song
Rama Chellappa
Lorenzo Torresani
Triantafyllos Afouras
272
3
0
19 Oct 2025
OVG-HQ: Online Video Grounding with Hybrid-modal Queries
Runhao Zeng
Jiaqi Mao
Minghao Lai
Minh Hieu Phan
Yanjie Dong
Wei Wang
Qi Chen
Xiping Hu
187
0
0
16 Aug 2025
MomentMix Augmentation with Length-Aware DETR for Temporally Robust Moment Retrieval
Sangkwon Park
Jiho Choi
Kyungjune Baek
Hyunjung Shim
346
0
0
30 Dec 2024
FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Zhuo Cao
Bingqing Zhang
Heming Du
Xin Yu
Xue Li
Sen Wang
389
20
0
18 Dec 2024
Vid-Morp: Video Moment Retrieval Pretraining from Unlabeled Videos in the Wild
Peijun Bao
Chenqi Kong
Zihao Shao
Boon Poh Ng
Meng Hwa Er
Alex C. Kot
353
4
0
01 Dec 2024
Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval
Yiyang Jiang
Wengyu Zhang
Xu-Lu Zhang
Xiaoyong Wei
Chang Wen Chen
Qing Li
420
31
0
21 Jul 2024
Context-Enhanced Video Moment Retrieval with Large Language Models
Weijia Liu
Bo Miao
Jiuxin Cao
Xueling Zhu
Bo Liu
Mehwish Nasim
Lin Wang
324
14
0
21 May 2024
Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization
Chongzhi Zhang
Mingyuan Zhang
Zhiyang Teng
Jiayi Li
Xizhou Zhu
Lewei Lu
Ziwei Liu
Aixin Sun
DiffMVGen
198
1
0
16 Jan 2024
Bias-Conflict Sample Synthesis and Adversarial Removal Debias Strategy for Temporal Sentence Grounding in Video
AAAI Conference on Artificial Intelligence (AAAI), 2024
Zhaobo Qi
Yibo Yuan
Xiaowen Ruan
Shuhui Wang
Weigang Zhang
Qingming Huang
351
16
0
15 Jan 2024
Multi-Modal Domain Adaptation Across Video Scenes for Temporal Video Grounding
Haifeng Huang
Yang Zhao
Zehan Wang
Yan Xia
Zhou Zhao
302
1
0
21 Dec 2023
BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos
European Conference on Computer Vision (ECCV), 2023
Pilhyeon Lee
Hyeran Byun
378
32
0
30 Nov 2023
Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding
WonJun Moon
Sangeek Hyun
Subeen Lee
Jae-Pil Heo
482
20
0
15 Nov 2023
Exploring Iterative Refinement with Diffusion Models for Video Grounding
IEEE International Conference on Multimedia and Expo (ICME), 2023
Xiao Liang
Tao Shi
Yaoyuan Liang
Te Tao
Shao-Lun Huang
DiffM
330
2
0
26 Oct 2023
Knowing Where to Focus: Event-aware Transformer for Video Grounding
IEEE International Conference on Computer Vision (ICCV), 2023
Jinhyun Jang
Jungin Park
Jin-Hwa Kim
Hyeongjun Kwon
Kwanghoon Sohn
362
99
0
14 Aug 2023
ViGT: Proposal-free Video Grounding with Learnable Token in Transformer
Science China Information Sciences (Sci China Inf Sci), 2023
Kun Li
Dan Guo
Meng Wang
ViT
177
68
0
11 Aug 2023
Encode-Store-Retrieve: Enhancing Memory Augmentation through Language-Encoded Egocentric Perception
International Symposium on Mixed and Augmented Reality (ISMAR), 2023
Junxiao Shen
John J. Dudley
Per Ola Kristensson
RALM
141
1
0
10 Aug 2023
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory
IEEE International Conference on Computer Vision (ICCV), 2023
Hongxiang Li
Meng Cao
Xuxin Cheng
Yaowei Li
Zhihong Zhu
Yuexian Zou
433
32
0
26 Jul 2023
A Survey on Video Moment Localization
ACM Computing Surveys (ACM CSUR), 2022
Meng Liu
Liqiang Nie
Yunxiao Wang
Meng Wang
Yong Rui
398
41
0
13 Jun 2023
Transform-Equivariant Consistency Learning for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
Zichuan Xu
Yining Qi
Xing Di
Weining Lu
Yu Cheng
327
12
0
06 May 2023
Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos
IEEE International Conference on Computer Vision (ICCV), 2023
Yulin Pan
Xiangteng He
Biao Gong
Yiliang Lv
Yujun Shen
Yuxin Peng
Deli Zhao
252
25
0
15 Mar 2023
You Can Ground Earlier than See: An Effective and Efficient Pipeline for Temporal Sentence Grounding in Compressed Videos
Computer Vision and Pattern Recognition (CVPR), 2023
Xiang Fang
Daizong Liu
Pan Zhou
Guoshun Nan
267
56
0
14 Mar 2023
Generation-Guided Multi-Level Unified Network for Video Grounding
Xingyi Cheng
Xiangyu Wu
Dong Shen
Hezheng Lin
Fan Yang
277
0
0
14 Mar 2023
Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos
Teng Wang
Jinrui Zhang
Feng Zheng
Wenhao Jiang
Ran Cheng
Ping Luo
VLM
315
15
0
11 Mar 2023
Jointly Visual- and Semantic-Aware Graph Memory Networks for Temporal Sentence Localization in Videos
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Daizong Liu
Pan Zhou
VOS
346
7
0
02 Mar 2023
Tracking Objects and Activities with Attention for Temporal Sentence Grounding
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Zeyu Xiong
Daizong Liu
Pan Zhou
Jiahao Zhu
336
5
0
21 Feb 2023
Constraint and Union for Partially-Supervised Temporal Sentence Grounding
Chen Ju
Haicheng Wang
Jinxian Liu
Chaofan Ma
Ya Zhang
Peisen Zhao
Jianlong Chang
Qi Tian
217
18
0
20 Feb 2023
Hypotheses Tree Building for One-Shot Temporal Sentence Localization
AAAI Conference on Artificial Intelligence (AAAI), 2023
Daizong Liu
Xiang Fang
Pan Zhou
Xing Di
Weining Lu
Yu Cheng
272
29
0
05 Jan 2023
Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jiahao Zhu
Daizong Liu
Pan Zhou
Xing Di
Yu Cheng
...
Wenzheng Xu
Zichuan Xu
Yao Wan
Lichao Sun
Zeyu Xiong
234
35
0
02 Jan 2023
MRTNet: Multi-Resolution Temporal Network for Video Sentence Grounding
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Wei Ji
Long Chen
Yin-wei Wei
Yiming Wu
Tat-Seng Chua
AI4TS
204
24
0
26 Dec 2022
FedVMR: A New Federated Learning method for Video Moment Retrieval
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yan Wang
Xin Luo
Zhen-Duo Chen
P. Zhang
Meng Liu
Xin-Shun Xu
FedML
226
3
0
28 Oct 2022
Fine-grained Semantic Alignment Network for Weakly Supervised Temporal Language Grounding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Yuechen Wang
Wen-gang Zhou
Houqiang Li
AI4TS
214
16
0
21 Oct 2022
Multi-Modal Cross-Domain Alignment Network for Video Moment Retrieval
IEEE Transactions on Multimedia (IEEE TMM), 2022
Xiang Fang
Daizong Liu
Pan Zhou
Yuchong Hu
477
65
0
23 Sep 2022
Video-Guided Curriculum Learning for Spoken Video Grounding
ACM Multimedia (ACM MM), 2022
Yan Xia
Zhou Zhao
Shangwei Ye
Yang Zhao
Haoyuan Li
Yi Ren
193
12
0
01 Sep 2022
Hierarchical Local-Global Transformer for Temporal Sentence Grounding
IEEE Transactions on Multimedia (IEEE TMM), 2022
Xiang Fang
Daizong Liu
Pan Zhou
Zichuan Xu
Rui Li
346
58
0
31 Aug 2022
Can Shuffling Video Benefit Temporal Bias Problem: A Novel Training Framework for Temporal Grounding
European Conference on Computer Vision (ECCV), 2022
Jiachang Hao
Haifeng Sun
Pengfei Ren
Jingyu Wang
Q. Qi
J. Liao
334
36
0
29 Jul 2022
Reducing the Vision and Language Bias for Temporal Sentence Grounding
ACM Multimedia (ACM MM), 2022
Daizong Liu
Xiaoye Qu
Wei Hu
291
64
0
27 Jul 2022
Skimming, Locating, then Perusing: A Human-Like Framework for Natural Language Video Localization
ACM Multimedia (ACM MM), 2022
Daizong Liu
Wei Hu
247
43
0
27 Jul 2022
You Need to Read Again: Multi-granularity Perception Network for Moment Retrieval in Videos
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022
Xin Sun
Xinyu Wang
Jialin Gao
Qiong Liu
Xiaoping Zhou
250
45
0
25 May 2022
Video Moment Retrieval from Text Queries via Single Frame Annotation
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022
Ran Cui
Tianwen Qian
Pai Peng
E. Daskalaki
Yue Yu
Xiao-Wei Guo
Huyang Sun
Yu-Gang Jiang
325
47
0
20 Apr 2022
Position-aware Location Regression Network for Temporal Video Grounding
Advanced Video and Signal Based Surveillance (AVSS), 2021
Sunoh Kim
Kimin Yun
J. Choi
228
5
0
12 Apr 2022
Learning Commonsense-aware Moment-Text Alignment for Fast Video Temporal Grounding
Ziyue Wu
Junyu Gao
Shucheng Huang
Changsheng Xu
321
6
0
04 Apr 2022
TubeDETR: Spatio-Temporal Video Grounding with Transformers
Computer Vision and Pattern Recognition (CVPR), 2022
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
ViT
382
127
0
30 Mar 2022
End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Meng Li
Tianbao Wang
Haoyu Zhang
Shengyu Zhang
Zhou Zhao
...
Wenming Tan
Jin Wang
Peng Wang
Shi Pu
Leilei Gan
342
46
0
15 Mar 2022
A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach
Xiaohan Lan
Yitian Yuan
Xin Eric Wang
Long Chen
Zhi Wang
Lin Ma
Wenwu Zhu
CML
215
20
0
10 Mar 2022
Multi-Scale Self-Contrastive Learning with Hard Negative Mining for Weakly-Supervised Query-based Video Grounding
Shentong Mo
Daizong Liu
Wei Hu
SSL
167
8
0
08 Mar 2022
Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding
IEEE Transactions on Multimedia (IEEE TMM), 2022
Daizong Liu
Xiang Fang
Wei Hu
Pan Zhou
248
47
0
06 Mar 2022
Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos
Sangmin Woo
Jinyoung Park
Inyong Koo
Sumin Lee
Minki Jeong
Changick Kim
500
6
0
25 Jan 2022
Temporal Sentence Grounding in Videos: A Survey and Future Directions
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
3DGS
471
59
0
20 Jan 2022
Learning Sample Importance for Cross-Scenario Video Temporal Grounding
International Conference on Multimedia Retrieval (ICMR), 2022
P. Bao
Yadong Mu
182
14
0
08 Jan 2022