2008.01403
Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization
4 August 2020
Daizong Liu
Xiaoye Qu
Xiao-Yang Liu
Jianfeng Dong
Pan Zhou
Zichuan Xu
Papers citing
"Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization"
50 / 74 papers shown
Dual Learning with Dynamic Knowledge Distillation and Soft Alignment for Partially Relevant Video Retrieval
Jianfeng Dong
Lei Huang
Daizong Liu
Xianke Chen
Xun Yang
Changting Lin
Xun Wang
Meng Wang
169
0
0
14 Oct 2025
FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting
Zefeng He
Xiaoye Qu
Yafu Li
Siyuan Huang
Daizong Liu
Yu Cheng
OffRL
VLM
LRM
362
16
0
29 Sep 2025
ResidualViT for Efficient Temporally Dense Video Encoding
Mattia Soldan
Fabian Caba Heilbron
Bernard Ghanem
Josef Sivic
Bryan C. Russell
224
1
0
16 Sep 2025
Learning from Few Samples: A Novel Approach for High-Quality Malcode Generation
H. Ma
Daizong Liu
Xiaowen Cai
Pan Zhou
Yulai Xie
GAN
337
0
0
25 Aug 2025
First-place Solution for Streetscape Shop Sign Recognition Competition
Bin Wang
Li Jing
1.0K
1
0
06 Jan 2025
Activity-aware Human Mobility Prediction with Hierarchical Graph Attention Recurrent Network
Yihong Tang
Junlin He
Zhan Zhao
HAI
656
9
0
03 Jan 2025
FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Zhuo Cao
Bingqing Zhang
Heming Du
Xin Yu
Xue Li
Sen Wang
396
20
0
18 Dec 2024
Joint Top-Down and Bottom-Up Frameworks for 3D Visual Grounding
International Conference on Pattern Recognition (ICPR), 2024
Yang Liu
Daizong Liu
Wei Hu
3DPC
438
9
0
21 Oct 2024
Grounding is All You Need? Dual Temporal Grounding for Video Dialog
You Qin
Wei Ji
Xinze Lan
Hao Fei
Xun Yang
Dan Guo
Roger Zimmermann
Lizi Liao
VGen
354
2
0
08 Oct 2024
Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning
International Conference on Computational Linguistics (COLING), 2024
Xiaoye Qu
Jiashuo Sun
Wei Wei
Yu Cheng
MLLM
LRM
312
25
0
30 Aug 2024
Harmonizing Visual Text Comprehension and Generation
Zhen Zhao
Jingqun Tang
Binghong Wu
Chunhui Lin
Shubo Wei
Hao Liu
Xin Tan
Zhizhong Zhang
Can Huang
Yuan Xie
VLM
446
52
0
23 Jul 2024
Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval
Yiyang Jiang
Wengyu Zhang
Xu-Lu Zhang
Xiaoyong Wei
Chang Wen Chen
Qing Li
434
31
0
21 Jul 2024
Context-Enhanced Video Moment Retrieval with Large Language Models
Weijia Liu
Bo Miao
Jiuxin Cao
Xueling Zhu
Bo Liu
Mehwish Nasim
Lin Wang
327
14
0
21 May 2024
Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding
Jingjing Hu
Dan Guo
Kun Li
Zhan Si
Xun Yang
Xiaojun Chang
Meng Wang
377
23
0
21 Mar 2024
Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization
Chongzhi Zhang
Mingyuan Zhang
Zhiyang Teng
Jiayi Li
Xizhou Zhu
Lewei Lu
Ziwei Liu
Aixin Sun
DiffM
VGen
199
1
0
16 Jan 2024
Cross-modal Contrastive Learning with Asymmetric Co-attention Network for Video Moment Retrieval
Love Panta
Prashant Shrestha
Brabeem Sapkota
Amrita Bhattarai
Suresh Manandhar
Anand Kumar Sah
315
8
0
12 Dec 2023
Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding
WonJun Moon
Sangeek Hyun
Subeen Lee
Jae-Pil Heo
494
20
0
15 Nov 2023
Unified Multi-modal Unsupervised Representation Learning for Skeleton-based Action Understanding
ACM Multimedia (ACM MM), 2023
Shengkai Sun
Daizong Liu
Jianfeng Dong
Xiaoye Qu
Junyu Gao
Xun Yang
Xun Wang
Meng Wang
OffRL
329
31
0
06 Nov 2023
Exploring Iterative Refinement with Diffusion Models for Video Grounding
IEEE International Conference on Multimedia and Expo (ICME), 2023
Xiao Liang
Tao Shi
Yaoyuan Liang
Te Tao
Shao-Lun Huang
DiffM
333
2
0
26 Oct 2023
Dual-Path Temporal Map Optimization for Make-up Temporal Video Grounding
Multimedia Systems (MS), 2023
Jiaxiu Li
Kun Li
Jia Li
Guoliang Chen
Dan Guo
Meng Wang
293
3
0
12 Sep 2023
Dense Object Grounding in 3D Scenes
ACM Multimedia (ACM MM), 2023
Wencan Huang
Daizong Liu
Wei Hu
286
26
0
05 Sep 2023
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Henghao Zhao
Kevin Qinghong Lin
Rui Yan
Zechao Li
VGen
DiffM
435
11
0
29 Aug 2023
Knowing Where to Focus: Event-aware Transformer for Video Grounding
IEEE International Conference on Computer Vision (ICCV), 2023
Jinhyun Jang
Jungin Park
Jin-Hwa Kim
Hyeongjun Kwon
Kwanghoon Sohn
364
99
0
14 Aug 2023
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory
IEEE International Conference on Computer Vision (ICCV), 2023
Hongxiang Li
Meng Cao
Xuxin Cheng
Yaowei Li
Zhihong Zhu
Yuexian Zou
441
32
0
26 Jul 2023
No-frills Temporal Video Grounding: Multi-Scale Neighboring Attention and Zoom-in Boundary Detection
Tao Gui
S. Zheng
Qin Jin
288
2
0
20 Jul 2023
A Survey on Video Moment Localization
ACM Computing Surveys (ACM CSUR), 2022
Meng Liu
Liqiang Nie
Yunxiao Wang
Meng Wang
Yong Rui
402
42
0
13 Jun 2023
From Region to Patch: Attribute-Aware Foreground-Background Contrastive Learning for Fine-Grained Fashion Retrieval
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023
Jianfeng Dong
Xi Peng
Zhe Ma
Daizong Liu
Xiaoye Qu
Xun Yang
Jixiang Zhu
Baolong Liu
244
18
0
17 May 2023
Transform-Equivariant Consistency Learning for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
Zichuan Xu
Yining Qi
Xing Di
Weining Lu
Yu Cheng
327
12
0
06 May 2023
Boundary-Denoising for Video Activity Localization
International Conference on Learning Representations (ICLR), 2023
Mengmeng Xu
Mattia Soldan
Jialin Gao
Shuming Liu
Juan-Manuel Perez-Rua
Guohao Li
287
15
0
06 Apr 2023
You Can Ground Earlier than See: An Effective and Efficient Pipeline for Temporal Sentence Grounding in Compressed Videos
Computer Vision and Pattern Recognition (CVPR), 2023
Xiang Fang
Daizong Liu
Pan Zhou
Guoshun Nan
267
56
0
14 Mar 2023
Jointly Visual- and Semantic-Aware Graph Memory Networks for Temporal Sentence Localization in Videos
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Daizong Liu
Pan Zhou
VOS
347
7
0
02 Mar 2023
Tracking Objects and Activities with Attention for Temporal Sentence Grounding
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Zeyu Xiong
Daizong Liu
Pan Zhou
Jiahao Zhu
345
5
0
21 Feb 2023
Constraint and Union for Partially-Supervised Temporal Sentence Grounding
Chen Ju
Haicheng Wang
Jinxian Liu
Chaofan Ma
Ya Zhang
Peisen Zhao
Jianlong Chang
Qi Tian
227
18
0
20 Feb 2023
Exploiting Auxiliary Caption for Video Grounding
AAAI Conference on Artificial Intelligence (AAAI), 2023
Hongxiang Li
Meng Cao
Xuxin Cheng
Zhihong Zhu
Yaowei Li
Yuexian Zou
362
16
0
15 Jan 2023
Hypotheses Tree Building for One-Shot Temporal Sentence Localization
AAAI Conference on Artificial Intelligence (AAAI), 2023
Daizong Liu
Xiang Fang
Pan Zhou
Xing Di
Weining Lu
Yu Cheng
280
29
0
05 Jan 2023
Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jiahao Zhu
Daizong Liu
Pan Zhou
Xing Di
Yu Cheng
...
Wenzheng Xu
Zichuan Xu
Yao Wan
Lichao Sun
Zeyu Xiong
237
35
0
02 Jan 2023
MRTNet: Multi-Resolution Temporal Network for Video Sentence Grounding
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Wei Ji
Long Chen
Yin-wei Wei
Yiming Wu
Tat-Seng Chua
AI4TS
204
24
0
26 Dec 2022
Multi-Modal Cross-Domain Alignment Network for Video Moment Retrieval
IEEE Transactions on Multimedia (IEEE TMM), 2022
Xiang Fang
Daizong Liu
Pan Zhou
Yuchong Hu
492
67
0
23 Sep 2022
Hierarchical Local-Global Transformer for Temporal Sentence Grounding
IEEE Transactions on Multimedia (IEEE TMM), 2022
Xiang Fang
Daizong Liu
Pan Zhou
Zichuan Xu
Rui Li
349
58
0
31 Aug 2022
PRVR: Partially Relevant Video Retrieval
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Jianfeng Dong
Xianke Chen
Minsong Zhang
Xun Yang
Shujie Chen
Xirong Li
Xun Wang
340
49
0
26 Aug 2022
Can Shuffling Video Benefit Temporal Bias Problem: A Novel Training Framework for Temporal Grounding
European Conference on Computer Vision (ECCV), 2022
Jiachang Hao
Haifeng Sun
Pengfei Ren
Jingyu Wang
Q. Qi
J. Liao
336
36
0
29 Jul 2022
Reducing the Vision and Language Bias for Temporal Sentence Grounding
ACM Multimedia (ACM MM), 2022
Daizong Liu
Xiaoye Qu
Wei Hu
292
65
0
27 Jul 2022
Skimming, Locating, then Perusing: A Human-Like Framework for Natural Language Video Localization
ACM Multimedia (ACM MM), 2022
Daizong Liu
Wei Hu
250
43
0
27 Jul 2022
LocVTP: Video-Text Pre-training for Temporal Localization
European Conference on Computer Vision (ECCV), 2022
Meng Cao
Tianyu Yang
Junwu Weng
Can Zhang
Jue Wang
Yuexian Zou
228
72
0
21 Jul 2022
Gaussian Kernel-based Cross Modal Network for Spatio-Temporal Video Grounding
IEEE International Conference on Image Processing (ICIP), 2022
Zeyu Xiong
Daizong Liu
128
8
0
02 Jul 2022
You Need to Read Again: Multi-granularity Perception Network for Moment Retrieval in Videos
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022
Xin Sun
Xinyu Wang
Jialin Gao
Qiong Liu
Xiaoping Zhou
250
47
0
25 May 2022
Entity-aware and Motion-aware Transformers for Language-driven Action Localization in Videos
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Shuo Yang
Xinxiao Wu
276
23
0
12 May 2022
Towards Visual-Prompt Temporal Answering Grounding in Medical Instructional Video
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Bin Li
Yixuan Weng
Bin Sun
Shutao Li
826
68
0
13 Mar 2022
Multi-Scale Self-Contrastive Learning with Hard Negative Mining for Weakly-Supervised Query-based Video Grounding
Shentong Mo
Daizong Liu
Wei Hu
SSL
171
8
0
08 Mar 2022
Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding
IEEE Transactions on Multimedia (IEEE TMM), 2022
Daizong Liu
Xiang Fang
Wei Hu
Pan Zhou
248
47
0
06 Mar 2022