ResearchTrend.AI
Local-Global Video-Text Interactions for Temporal Grounding

16 April 2020
Jonghwan Mun
Minsu Cho
Bohyung Han

Papers citing "Local-Global Video-Text Interactions for Temporal Grounding"

50 / 52 papers shown
Object-Shot Enhanced Grounding Network for Egocentric Video
Yisen Feng
Haoyu Zhang
Meng Liu
Weili Guan
Liqiang Nie
41
0
0
07 May 2025
Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
Xin Gu
Yaojie Shen
Chenxi Luo
Tiejian Luo
Yan Huang
Yuewei Lin
Heng Fan
L. Zhang
63
1
0
16 Feb 2025
On the Consistency of Video Large Language Models in Temporal Comprehension
Minjoon Jung
Junbin Xiao
Byoung-Tak Zhang
Angela Yao
87
2
0
20 Nov 2024
AutoTVG: A New Vision-language Pre-training Paradigm for Temporal Video Grounding
Xing Zhang
Jiaxi Gu
Haoyu Zhao
Shicong Wang
Hang Xu
Renjing Pei
Songcen Xu
Zuxuan Wu
Yu-Gang Jiang
43
0
0
11 Jun 2024
Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection
Jin Yang
Ping Wei
Huan Li
Ziyang Ren
48
8
0
14 Apr 2024
Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding
Jingjing Hu
Dan Guo
Kun Li
Zhan Si
Xun Yang
Xiaojun Chang
Meng Wang
59
3
0
21 Mar 2024
Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding
Chaolei Tan
Jian-Huang Lai
Wei-Shi Zheng
Jianfang Hu
AI4TS
41
5
0
18 Mar 2024
Cross-modal Contrastive Learning with Asymmetric Co-attention Network for Video Moment Retrieval
Love Panta
Prashant Shrestha
Brabeem Sapkota
Amrita Bhattarai
Suresh Manandhar
Anand Kumar Sah
25
3
0
12 Dec 2023
EtC: Temporal Boundary Expand then Clarify for Weakly Supervised Video Grounding with Multimodal Large Language Model
Guozhang Li
Xinpeng Ding
De-Chun Cheng
Jie Li
Nannan Wang
Xinbo Gao
34
1
0
05 Dec 2023
End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames
Shuming Liu
Chen-Lin Zhang
Chen Zhao
Bernard Ghanem
33
25
0
28 Nov 2023
UnLoc: A Unified Framework for Video Localization Tasks
Shengjia Yan
Xuehan Xiong
Arsha Nagrani
Anurag Arnab
Zhonghao Wang
Weina Ge
David A. Ross
Cordelia Schmid
31
53
0
21 Aug 2023
A Survey on Video Moment Localization
Meng Liu
Liqiang Nie
Yunxiao Wang
Meng Wang
Yong Rui
27
28
0
13 Jun 2023
Generation-Guided Multi-Level Unified Network for Video Grounding
Xingyi Cheng
Xiangyu Wu
Dong Shen
Hezheng Lin
Fan Yang
21
0
0
14 Mar 2023
Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding
Jiahao Zhu
Daizong Liu
Pan Zhou
Xing Di
Yu Cheng
...
Wenzheng Xu
Zichuan Xu
Yao Wan
Lichao Sun
Zeyu Xiong
27
18
0
02 Jan 2023
Soft-Landing Strategy for Alleviating the Task Discrepancy Problem in Temporal Action Localization Tasks
Hyolim Kang
Hanjung Kim
Joungbin An
Minsu Cho
Seon Joo Kim
25
5
0
11 Nov 2022
Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval
Minjoon Jung
Seongho Choi
Joo-Kyung Kim
Jin-Hwa Kim
Byoung-Tak Zhang
34
7
0
23 Oct 2022
Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding
Erica K. Shimomoto
Edison Marrese-Taylor
Hiroya Takamura
Ichiro Kobayashi
Hideki Nakayama
Yusuke Miyao
27
7
0
26 Sep 2022
Hierarchical Local-Global Transformer for Temporal Sentence Grounding
Xiang Fang
Daizong Liu
Pan Zhou
Zichuan Xu
Rui Li
17
28
0
31 Aug 2022
Partially Relevant Video Retrieval
Jianfeng Dong
Xianke Chen
Minsong Zhang
Xun Yang
Shujie Chen
Xirong Li
Xun Wang
17
39
0
26 Aug 2022
Reducing the Vision and Language Bias for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Wei Hu
19
49
0
27 Jul 2022
Video Activity Localisation with Uncertainties in Temporal Boundary
Jiabo Huang
Hailin Jin
S. Gong
Yang Liu
19
23
0
26 Jun 2022
Entity-aware and Motion-aware Transformers for Language-driven Action Localization in Videos
Shuo Yang
Xinxiao Wu
25
15
0
12 May 2022
Contrastive Language-Action Pre-training for Temporal Localization
Mengmeng Xu
Erhan Gundogdu
Maksim Lapin
Bernard Ghanem
M. Donoser
Loris Bazzani
33
27
0
26 Apr 2022
Video Moment Retrieval from Text Queries via Single Frame Annotation
Ran Cui
Tianwen Qian
Pai Peng
E. Daskalaki
Jingjing Chen
Xiao-Wei Guo
Huyang Sun
Yu-Gang Jiang
17
35
0
20 Apr 2022
Position-aware Location Regression Network for Temporal Video Grounding
Sunoh Kim
Kimin Yun
J. Choi
22
4
0
12 Apr 2022
Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning
Juncheng Li
Junlin Xie
Long Qian
Linchao Zhu
Siliang Tang
Fei Wu
Yi Yang
Yueting Zhuang
X. Wang
36
73
0
24 Mar 2022
Multi-Scale Self-Contrastive Learning with Hard Negative Mining for Weakly-Supervised Query-based Video Grounding
Shentong Mo
Daizong Liu
Wei Hu
SSL
18
6
0
08 Mar 2022
Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding
Daizong Liu
Xiang Fang
Wei Hu
Pan Zhou
15
37
0
06 Mar 2022
Temporal Sentence Grounding in Videos: A Survey and Future Directions
Hao Zhang
Aixin Sun
Wei Jing
Joey Tianyi Zhou
3DGS
36
38
0
20 Jan 2022
Exploring Motion and Appearance Information for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Pan Zhou
Yang Liu
19
41
0
03 Jan 2022
MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions
Mattia Soldan
Alejandro Pardo
Juan Carlos León Alcázar
Fabian Caba Heilbron
Chen Zhao
Silvio Giancola
Bernard Ghanem
VGen
39
95
0
01 Dec 2021
Hierarchical Deep Residual Reasoning for Temporal Moment Localization
Ziyang Ma
Xianjing Han
Xuemeng Song
Yiran Cui
Liqiang Nie
13
9
0
31 Oct 2021
Self-supervised Learning for Semi-supervised Temporal Language Grounding
Fan Luo
Shaoxiang Chen
Jingjing Chen
Zuxuan Wu
Yu-Gang Jiang
VLM
51
11
0
23 Sep 2021
Natural Language Video Localization with Learnable Moment Proposals
Shaoning Xiao
Long Chen
Jian Shao
Yueting Zhuang
Jun Xiao
14
43
0
22 Sep 2021
A Survey on Temporal Sentence Grounding in Videos
Xiaohan Lan
Yitian Yuan
Xin Eric Wang
Zhi Wang
Wenwu Zhu
30
47
0
16 Sep 2021
Adaptive Proposal Generation Network for Temporal Sentence Localization in Videos
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
22
54
0
14 Sep 2021
Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding
Zhenzhi Wang
Limin Wang
Tao Wu
Tianhao Li
Gangshan Wu
AI4TS
28
116
0
10 Sep 2021
EVOQUER: Enhancing Temporal Grounding with Video-Pivoted BackQuery Generation
Yanjun Gao
Lulu Liu
Jason Wang
Xin Chen
Huayan Wang
Rui Zhang
28
1
0
10 Sep 2021
Zero-shot Natural Language Video Localization
Jinwoo Nam
Daechul Ahn
Dongyeop Kang
S. Ha
Jonghyun Choi
94
43
0
29 Aug 2021
Support-Set Based Cross-Supervision for Video Grounding
Xinpeng Ding
N. Wang
Shiwei Zhang
De-Chun Cheng
Xiaomeng Li
Ziyuan Huang
Mingqian Tang
Xinbo Gao
33
42
0
24 Aug 2021
Adaptive Hierarchical Graph Reasoning with Semantic Coherence for Video-and-Language Inference
Juncheng Li
Siliang Tang
Linchao Zhu
Haochen Shi
Xuanwen Huang
Fei Wu
Yi Yang
Yueting Zhuang
19
28
0
26 Jul 2021
Cross-Sentence Temporal and Semantic Relations in Video Activity Localisation
Jiabo Huang
Yang Liu
S. Gong
Hailin Jin
26
61
0
23 Jul 2021
End-to-end Multi-modal Video Temporal Grounding
Yi-Wen Chen
Yi-Hsuan Tsai
Ming-Hsuan Yang
11
51
0
12 Jul 2021
Interventional Video Grounding with Dual Contrastive Learning
Guoshun Nan
Rui Qiao
Yao Xiao
Jun Liu
Sicong Leng
H. Zhang
Wei Lu
23
144
0
21 Jun 2021
Parallel Attention Network with Sequence Matching for Video Grounding
Hao Zhang
Aixin Sun
Wei Jing
Liangli Zhen
Joey Tianyi Zhou
Rick Siow Mong Goh
18
40
0
18 May 2021
A Survey on Natural Language Video Localization
Xinfang Liu
Xiushan Nie
Zhifang Tan
Jie Guo
Yilong Yin
26
7
0
01 Apr 2021
Low-Fidelity End-to-End Video Encoder Pre-training for Temporal Action Localization
Mengmeng Xu
Juan-Manuel Perez-Rua
Xiatian Zhu
Bernard Ghanem
Brais Martinez
15
27
0
28 Mar 2021
Look Before you Speak: Visually Contextualized Utterances
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
21
66
0
10 Dec 2020
Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language
Songyang Zhang
Houwen Peng
Jianlong Fu
Yijuan Lu
Jiebo Luo
19
51
0
04 Dec 2020
VLG-Net: Video-Language Graph Matching Network for Video Grounding
Mattia Soldan
Mengmeng Xu
Sisi Qu
Jesper N. Tegnér
Bernard Ghanem
33
69
0
19 Nov 2020