Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2203.13049
Cited By
Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning
24 March 2022
Juncheng Li
Junlin Xie
Long Qian
Linchao Zhu
Siliang Tang
Fei Wu
Yi Yang
Yueting Zhuang
X. Wang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning"
47 / 47 papers shown
Title
Object-Shot Enhanced Grounding Network for Egocentric Video
Yisen Feng
Haoyu Zhang
Meng Liu
Weili Guan
Liqiang Nie
36
0
0
07 May 2025
SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation
Hao Du
Bo Wu
Yan Lu
Zhendong Mao
22
0
0
08 Apr 2025
Multi-Granular Multimodal Clue Fusion for Meme Understanding
Li Zheng
Hao Fei
Ting Dai
Zuquan Peng
Fei Li
Huisheng Ma
Chong Teng
Donghong Ji
50
0
0
16 Mar 2025
Consistency of Compositional Generalization across Multiple Levels
Chuanhao Li
Zhen Li
Chenchen Jing
Xiaomeng Fan
Wenbo Ye
Yuwei Wu
Yunde Jia
CoGe
69
0
0
18 Dec 2024
Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning
Shengqiong Wu
Hao Fei
Liangming Pan
William Yang Wang
Shuicheng Yan
Tat-Seng Chua
LRM
59
1
0
15 Dec 2024
Vid-Morp: Video Moment Retrieval Pretraining from Unlabeled Videos in the Wild
Peijun Bao
Chenqi Kong
Zihao Shao
Boon Poh Ng
Meng Hwa Er
Alex C. Kot
54
2
0
01 Dec 2024
VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding
Houlun Chen
Xin Wang
Hong Chen
Zeyang Zhang
Wei Feng
Bin Huang
Jia Jia
Wenwu Zhu
VGen
25
3
0
11 Oct 2024
Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration
Kaihang Pan
Zhaoyu Fan
Juncheng Li
Qifan Yu
Hao Fei
Siliang Tang
Richang Hong
Hanwang Zhang
Qianru Sun
KELM
26
6
0
30 Sep 2024
Training-free Video Temporal Grounding using Large-scale Pre-trained Models
Minghang Zheng
Xinhao Cai
Qingchao Chen
Yuxin Peng
Yang Liu
32
4
0
29 Aug 2024
SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding
Zixu Cheng
Yujiang Pu
Shaogang Gong
Parisa Kordjamshidi
Yu Kong
AI4TS
17
0
0
06 Jul 2024
Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering
Zhaohe Liao
Jiangtong Li
Li Niu
Liqing Zhang
CoGe
30
3
0
03 Jul 2024
MLLM as Video Narrator: Mitigating Modality Imbalance in Video Moment Retrieval
Weitong Cai
Jiabo Huang
Shaogang Gong
Hailin Jin
Yang Liu
26
0
0
25 Jun 2024
AutoTVG: A New Vision-language Pre-training Paradigm for Temporal Video Grounding
Xing Zhang
Jiaxi Gu
Haoyu Zhao
Shicong Wang
Hang Xu
Renjing Pei
Songcen Xu
Zuxuan Wu
Yu-Gang Jiang
33
0
0
11 Jun 2024
Context-Enhanced Video Moment Retrieval with Large Language Models
Weijia Liu
Bo Miao
Jiuxin Cao
Xueling Zhu
Bo Liu
Mehwish Nasim
Ajmal Saeed Mian
24
2
0
21 May 2024
SnAG: Scalable and Accurate Video Grounding
Fangzhou Mu
Sicheng Mo
Yin Li
21
8
0
02 Apr 2024
Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning
Long Qian
Juncheng Billy Li
Yu-hao Wu
Yaobo Ye
Hao Fei
Tat-Seng Chua
Yueting Zhuang
Siliang Tang
MLLM
LRM
60
47
0
18 Feb 2024
A Joint-Reasoning based Disease Q&A System
Prakash Chandra Sukhwal
Vaibhav Rajan
A. Kankanhalli
LM&MA
AI4MH
10
0
0
06 Jan 2024
TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection
Hao Sun
Mingyao Zhou
Wenjing Chen
Wei Xie
PINN
3DGS
ViT
19
31
0
04 Jan 2024
Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval
Zhihang Liu
Jun Li
Hongtao Xie
Pandeng Li
Jiannan Ge
Sun-Ao Liu
Guoqing Jin
35
18
0
19 Dec 2023
De-fine: Decomposing and Refining Visual Programs with Auto-Feedback
Minghe Gao
Juncheng Li
Hao Fei
Liang Pang
Wei Ji
Guoming Wang
Wenqiao Zhang
Siliang Tang
Yueting Zhuang
14
8
0
21 Nov 2023
Improving Vision Anomaly Detection with the Guidance of Language Modality
Dong Chen
Kaihang Pan
Guoming Wang
Yueting Zhuang
Siliang Tang
12
3
0
04 Oct 2023
Zero-Shot Video Moment Retrieval from Frozen Vision-Language Models
Dezhao Luo
Jiabo Huang
Shaogang Gong
Hailin Jin
Yang Liu
VLM
14
6
0
01 Sep 2023
I3: Intent-Introspective Retrieval Conditioned on Instructions
Kaihang Pan
Juncheng Li
Wenjie Wang
Hao Fei
Hongye Song
Wei Ji
Jun Lin
Xiaozhong Liu
Tat-Seng Chua
Siliang Tang
31
5
0
19 Aug 2023
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions
Juncheng Li
Kaihang Pan
Zhiqi Ge
Minghe Gao
Wei Ji
Wenqiao Zhang
Tat-Seng Chua
Siliang Tang
Hanwang Zhang
Yueting Zhuang
MLLM
27
61
0
08 Aug 2023
Keyword-Aware Relative Spatio-Temporal Graph Networks for Video Question Answering
Yi Cheng
Hehe Fan
Dongyun Lin
Ying Sun
Mohan S. Kankanhalli
J. Lim
12
4
0
25 Jul 2023
No-frills Temporal Video Grounding: Multi-Scale Neighboring Attention and Zoom-in Boundary Detection
Qi Zhang
S. Zheng
Qin Jin
10
1
0
20 Jul 2023
ICSVR: Investigating Compositional and Syntactic Understanding in Video Retrieval Models
Avinash Madasu
Vasudev Lal
CoGe
29
3
0
28 Jun 2023
Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document
Xiangnan Chen
Qianwen Xiao
Juncheng Li
Duo Dong
Jun Lin
Xiaozhong Liu
Siliang Tang
32
5
0
23 May 2023
Movie101: A New Movie Understanding Benchmark
Zihao Yue
Qi Zhang
Anwen Hu
Liang Zhang
Ziheng Wang
Qin Jin
VGen
11
17
0
20 May 2023
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer
Yifang Xu
Yunzhuo Sun
Yang Li
Yilei Shi
Xiaoxia Zhu
S. Du
ViT
35
33
0
29 Apr 2023
Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World
Qifan Yu
Juncheng Li
Yuehua Wu
Siliang Tang
Wei Ji
Yueting Zhuang
25
33
0
23 Mar 2023
Self-supervised Meta-Prompt Learning with Meta-Gradient Regularization for Few-shot Generalization
Kaihang Pan
Juncheng Billy Li
Hongye Song
Jun Lin
Xiaozhong Liu
Siliang Tang
OffRL
22
10
0
22 Mar 2023
Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models
Juncheng Li
Minghe Gao
Longhui Wei
Siliang Tang
Wenqiao Zhang
Meng Li
Wei Ji
Qi Tian
Tat-Seng Chua
Yueting Zhuang
VLM
VPVLM
27
18
0
12 Mar 2023
Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
Dezhao Luo
Jiabo Huang
S. Gong
Hailin Jin
Yang Liu
VGen
14
28
0
28 Feb 2023
Constraint and Union for Partially-Supervised Temporal Sentence Grounding
Chen Ju
Haicheng Wang
Jinxian Liu
Chaofan Ma
Ya-Qin Zhang
Peisen Zhao
Jianlong Chang
Qi Tian
14
15
0
20 Feb 2023
Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding
Juncheng Li
Siliang Tang
Linchao Zhu
Wenqiao Zhang
Yi Yang
Tat-Seng Chua
Fei Wu
Y. Zhuang
BDL
11
14
0
22 Jan 2023
Exploiting Auxiliary Caption for Video Grounding
Hongxiang Li
Meng Cao
Xuxin Cheng
Zhihong Zhu
Yaowei Li
Yuexian Zou
14
10
0
15 Jan 2023
Test of Time: Instilling Video-Language Models with a Sense of Time
Piyush Bagad
Makarand Tapaswi
Cees G. M. Snoek
70
36
0
05 Jan 2023
Language-free Training for Zero-shot Video Grounding
Dahye Kim
Jungin Park
Jiyoung Lee
S. Park
K. Sohn
25
20
0
24 Oct 2022
A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective
Chaoqi Chen
Yushuang Wu
Qiyuan Dai
Hong-Yu Zhou
Mutian Xu
Sibei Yang
Xiaoguang Han
Yizhou Yu
ViT
MedIm
AI4CE
25
73
0
27 Sep 2022
Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos
Juncheng Billy Li
Junlin Xie
Linchao Zhu
Long Qian
Siliang Tang
...
Haochen Shi
Shengyu Zhang
Longhui Wei
Qi Tian
Yueting Zhuang
14
12
0
03 Aug 2022
BOSS: Bottom-up Cross-modal Semantic Composition with Hybrid Counterfactual Training for Robust Content-based Image Retrieval
Wenqiao Zhang
Jiannan Guo
Meng Li
Haochen Shi
Shengyu Zhang
Juncheng Li
Siliang Tang
Yueting Zhuang
44
6
0
09 Jul 2022
Collaborative Intelligence Orchestration: Inconsistency-Based Fusion of Semi-Supervised Learning and Active Learning
Jiannan Guo
Yangyang Kang
Yu Duan
Xiaozhong Liu
Siliang Tang
Wenqiao Zhang
Kun Kuang
Changlong Sun
Fei Wu
19
4
0
07 Jun 2022
End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding
Meng Li
Tianbao Wang
Haoyu Zhang
Shengyu Zhang
Zhou Zhao
...
Wenming Tan
Jin Wang
Peng Wang
Shi Pu
Fei Wu
13
45
0
15 Mar 2022
BoostMIS: Boosting Medical Image Semi-supervised Learning with Adaptive Pseudo Labeling and Informative Active Annotation
Wenqiao Zhang
Lei Zhu
James Hallinan
A. Makmur
Shengyu Zhang
Qingpeng Cai
Beng Chin Ooi
15
78
0
04 Mar 2022
Temporal Sentence Grounding in Videos: A Survey and Future Directions
Hao Zhang
Aixin Sun
Wei Jing
Joey Tianyi Zhou
3DGS
29
38
0
20 Jan 2022
Building machines that adapt and compute like brains
Brenden Lake
J. Tenenbaum
AI4CE
FedML
NAI
AILaw
243
893
0
11 Nov 2017
1