Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning

Computer Vision and Pattern Recognition (CVPR), 2022
24 March 2022
Juncheng Li
Junlin Xie
Long Qian
Linchao Zhu
Siliang Tang
Leilei Gan
Yi Yang
Yueting Zhuang
Xinze Wang

Papers citing "Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning"

49 / 49 papers shown
TAG: A Simple Yet Effective Temporal-Aware Approach for Zero-Shot Video Temporal Grounding
Jin-Seop Lee
SungJoon Lee
Jaehan Ahn
YunSeok Choi
Jee-Hyong Lee
VLM
208
3
0
11 Aug 2025
Boosting Temporal Sentence Grounding via Causal Inference
Kefan Tang
Lihuo He
Jisheng Dang
Xinbo Gao
OOD, CML
320
1
0
07 Jul 2025
Multi-Sourced Compositional Generalization in Visual Question Answering
International Joint Conference on Artificial Intelligence (IJCAI), 2025
Chuanhao Li
Wenbo Ye
Zhen Li
Yuwei Wu
Yunde Jia
CoGe
297
0
0
29 May 2025
Object-Shot Enhanced Grounding Network for Egocentric Video
Computer Vision and Pattern Recognition (CVPR), 2025
Yisen Feng
Haoyu Zhang
Meng Liu
Weili Guan
Liqiang Nie
315
7
0
07 May 2025
SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation
Computer Vision and Pattern Recognition (CVPR), 2025
Hao Du
Bo Wu
Yan Lu
Zhendong Mao
267
2
0
08 Apr 2025
Multi-Granular Multimodal Clue Fusion for Meme Understanding
AAAI Conference on Artificial Intelligence (AAAI), 2025
Li Zheng
Hao Fei
Ting Dai
Zuquan Peng
Fei Li
Huisheng Ma
Chong Teng
Donghong Ji
358
10
0
16 Mar 2025
Consistency of Compositional Generalization across Multiple Levels
AAAI Conference on Artificial Intelligence (AAAI), 2024
Chuanhao Li
Zhen Li
Chenchen Jing
Xiaomeng Fan
Wenbo Ye
Yuwei Wu
Yunde Jia
CoGe
273
1
0
18 Dec 2024
Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning
AAAI Conference on Artificial Intelligence (AAAI), 2024
Shengqiong Wu
Hao Fei
Liangming Pan
William Yang Wang
Shuicheng Yan
Tat-Seng Chua
LRM
524
21
0
15 Dec 2024
Vid-Morp: Video Moment Retrieval Pretraining from Unlabeled Videos in the Wild
Peijun Bao
Chenqi Kong
Zihao Shao
Boon Poh Ng
Meng Hwa Er
Alex C. Kot
353
4
0
01 Dec 2024
VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding
Neural Information Processing Systems (NeurIPS), 2024
Houlun Chen
Xin Wang
Hong Chen
Zeyang Zhang
Wei Feng
Bin Huang
Jia Jia
Wenwu Zhu
VGen
381
10
0
11 Oct 2024
Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration
Neural Information Processing Systems (NeurIPS), 2024
Kaihang Pan
Zhaoyu Fan
Juncheng Li
Qifan Yu
Hao Fei
Siliang Tang
Richang Hong
Hanwang Zhang
Qianru Sun
KELM
413
18
0
30 Sep 2024
Training-free Video Temporal Grounding using Large-scale Pre-trained Models
European Conference on Computer Vision (ECCV), 2024
Minghang Zheng
Xinhao Cai
Qingchao Chen
Yuxin Peng
Yang Liu
275
28
0
29 Aug 2024
SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding
Zixu Cheng
Yujiang Pu
Shaogang Gong
Parisa Kordjamshidi
Yu Kong
AI4TS
344
3
0
06 Jul 2024
Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering
Zhaohe Liao
Jiangtong Li
Li Niu
Liqing Zhang
CoGe
246
15
0
03 Jul 2024
MLLM as Video Narrator: Mitigating Modality Imbalance in Video Moment Retrieval
Weitong Cai
Jiabo Huang
Shaogang Gong
Hailin Jin
Yang Liu
263
11
0
25 Jun 2024
AutoTVG: A New Vision-language Pre-training Paradigm for Temporal Video Grounding
Xing Zhang
Jiaxi Gu
Haoyu Zhao
Shicong Wang
Hang Xu
Renjing Pei
Songcen Xu
Zuxuan Wu
Yu-Gang Jiang
314
1
0
11 Jun 2024
Context-Enhanced Video Moment Retrieval with Large Language Models
Weijia Liu
Bo Miao
Jiuxin Cao
Xueling Zhu
Bo Liu
Mehwish Nasim
Lin Wang
324
14
0
21 May 2024
SnAG: Scalable and Accurate Video Grounding
Computer Vision and Pattern Recognition (CVPR), 2024
Fangzhou Mu
Sicheng Mo
Yin Li
410
34
0
02 Apr 2024
Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning
Long Qian
Juncheng Billy Li
Yu-hao Wu
Yaobo Ye
Hao Fei
Tat-Seng Chua
Yueting Zhuang
Siliang Tang
MLLM, LRM
428
114
0
18 Feb 2024
A Joint-Reasoning based Disease Q&A System
Prakash Chandra Sukhwal
Vaibhav Rajan
A. Kankanhalli
LM&MA, AI4MH
176
0
0
06 Jan 2024
TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection
AAAI Conference on Artificial Intelligence (AAAI), 2024
Hao Sun
Mingyao Zhou
Wenjing Chen
Wei Xie
PINN, 3DGS, ViT
319
79
0
04 Jan 2024
Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval
Zhihang Liu
Jun Li
Hongtao Xie
Nianzu Yang
Jiannan Ge
Sun-Ao Liu
Guoqing Jin
331
42
0
19 Dec 2023
De-fine: Decomposing and Refining Visual Programs with Auto-Feedback
ACM Multimedia (ACM MM), 2023
Minghe Gao
Juncheng Li
Hao Fei
Liang Pang
Wei Ji
Guoming Wang
Wenqiao Zhang
Siliang Tang
Yueting Zhuang
223
12
0
21 Nov 2023
Improving Vision Anomaly Detection with the Guidance of Language Modality
IEEE Transactions on Multimedia (IEEE TMM), 2023
Dong Chen
Kaihang Pan
Guoming Wang
Yueting Zhuang
Siliang Tang
216
8
0
04 Oct 2023
Zero-Shot Video Moment Retrieval from Frozen Vision-Language Models
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Dezhao Luo
Jiabo Huang
Shaogang Gong
Hailin Jin
Yang Liu
VLM
352
21
0
01 Sep 2023
I3: Intent-Introspective Retrieval Conditioned on Instructions
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023
Kaihang Pan
Juncheng Li
Wenjie Wang
Hao Fei
Hongye Song
Wei Ji
Jun Lin
Xiaozhong Liu
Tat-Seng Chua
Siliang Tang
353
7
0
19 Aug 2023
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions
International Conference on Learning Representations (ICLR), 2023
Juncheng Li
Kaihang Pan
Zhiqi Ge
Minghe Gao
Wei Ji
Wenqiao Zhang
Tat-Seng Chua
Siliang Tang
Hanwang Zhang
Yueting Zhuang
MLLM
403
92
0
08 Aug 2023
Keyword-Aware Relative Spatio-Temporal Graph Networks for Video Question Answering
IEEE Transactions on Multimedia (IEEE TMM), 2023
Yi Cheng
Hehe Fan
Dongyun Lin
Ying Sun
Mohan S. Kankanhalli
J. Lim
252
10
0
25 Jul 2023
No-frills Temporal Video Grounding: Multi-Scale Neighboring Attention and Zoom-in Boundary Detection
Tao Gui
S. Zheng
Qin Jin
285
2
0
20 Jul 2023
ICSVR: Investigating Compositional and Syntactic Understanding in Video Retrieval Models
Avinash Madasu
Vasudev Lal
CoGe
349
5
0
28 Jun 2023
Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Xiangnan Chen
Qianwen Xiao
Juncheng Li
Duo Dong
Jun Lin
Xiaozhong Liu
Siliang Tang
265
6
0
23 May 2023
Movie101: A New Movie Understanding Benchmark
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zihao Yue
Tao Gui
Anwen Hu
Liang Zhang
Ziheng Wang
Qin Jin
VGen
355
27
0
20 May 2023
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer
IEEE International Joint Conference on Neural Network (IJCNN), 2023
Yifang Xu
Yunzhuo Sun
Yang Li
Yilei Shi
Xiaoxia Zhu
S. Du
ViT
294
52
0
29 Apr 2023
Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World
IEEE International Conference on Computer Vision (ICCV), 2023
Qifan Yu
Juncheng Li
Yuehua Wu
Siliang Tang
Wei Ji
Yueting Zhuang
297
51
0
23 Mar 2023
Self-supervised Meta-Prompt Learning with Meta-Gradient Regularization for Few-shot Generalization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Kaihang Pan
Juncheng Billy Li
Hongye Song
Jun Lin
Xiaozhong Liu
Siliang Tang
OffRL
306
16
0
22 Mar 2023
Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models
IEEE International Conference on Computer Vision (ICCV), 2023
Juncheng Li
Minghe Gao
Longhui Wei
Siliang Tang
Wenqiao Zhang
Meng Li
Wei Ji
Qi Tian
Tat-Seng Chua
Yueting Zhuang
VLM, VPVLM
296
34
0
12 Mar 2023
Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
Computer Vision and Pattern Recognition (CVPR), 2023
Dezhao Luo
Jiabo Huang
S. Gong
Hailin Jin
Yang Liu
VGen
399
45
0
28 Feb 2023
Constraint and Union for Partially-Supervised Temporal Sentence Grounding
Chen Ju
Haicheng Wang
Jinxian Liu
Chaofan Ma
Ya Zhang
Peisen Zhao
Jianlong Chang
Qi Tian
217
18
0
20 Feb 2023
Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Juncheng Li
Siliang Tang
Linchao Zhu
Wenqiao Zhang
Yi Yang
Tat-Seng Chua
Fei Wu
Yueting Zhuang
BDL
267
25
0
22 Jan 2023
Exploiting Auxiliary Caption for Video Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2023
Hongxiang Li
Meng Cao
Xuxin Cheng
Zhihong Zhu
Yaowei Li
Yuexian Zou
353
16
0
15 Jan 2023
Test of Time: Instilling Video-Language Models with a Sense of Time
Computer Vision and Pattern Recognition (CVPR), 2023
Piyush Bagad
Makarand Tapaswi
Cees G. M. Snoek
555
51
0
05 Jan 2023
Language-free Training for Zero-shot Video Grounding
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Dahye Kim
Jungin Park
Jiyoung Lee
S. Park
Kwanghoon Sohn
247
32
0
24 Oct 2022
A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Chaoqi Chen
Yushuang Wu
Qiyuan Dai
Hong-Yu Zhou
Mutian Xu
Sibei Yang
Xiaoguang Han
Yizhou Yu
ViT, MedIm, AI4CE
447
146
0
27 Sep 2022
Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos
ACM Multimedia (ACM MM), 2022
Juncheng Billy Li
Junlin Xie
Linchao Zhu
Long Qian
Siliang Tang
...
Haochen Shi
Shengyu Zhang
Longhui Wei
Qi Tian
Yueting Zhuang
288
16
0
03 Aug 2022
BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval
Wenqiao Zhang
Jiannan Guo
Meng Li
Haochen Shi
Shengyu Zhang
Juncheng Li
Siliang Tang
Yueting Zhuang
273
6
0
09 Jul 2022
Collaborative Intelligence Orchestration: Inconsistency-Based Fusion of Semi-Supervised Learning and Active Learning
Knowledge Discovery and Data Mining (KDD), 2022
Jiannan Guo
Yangyang Kang
Yu Duan
Xiaozhong Liu
Siliang Tang
Wenqiao Zhang
Kun Kuang
Changlong Sun
Leilei Gan
201
4
0
07 Jun 2022
End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Meng Li
Tianbao Wang
Haoyu Zhang
Shengyu Zhang
Zhou Zhao
...
Wenming Tan
Jin Wang
Peng Wang
Shi Pu
Leilei Gan
342
46
0
15 Mar 2022
BoostMIS: Boosting Medical Image Semi-supervised Learning with Adaptive Pseudo Labeling and Informative Active Annotation
Computer Vision and Pattern Recognition (CVPR), 2022
Wenqiao Zhang
Lei Zhu
James Hallinan
A. Makmur
Shengyu Zhang
Qingpeng Cai
Beng Chin Ooi
424
126
0
04 Mar 2022
Temporal Sentence Grounding in Videos: A Survey and Future Directions
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Hao Zhang
Aixin Sun
Wei Jing
Qiufeng Wang
3DGS
471
59
0
20 Jan 2022