2303.09867
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
IEEE International Conference on Computer Vision (ICCV), 2023
17 March 2023
Peng Jin
Hao Li
Ze-Long Cheng
Kehan Li
Xiang Ji
Chang-rui Liu
Li-ming Yuan
Jie Chen
DiffM
VGen
Github (131★)
Papers citing
"DiffusionRet: Generative Text-Video Retrieval with Diffusion Model"
46 / 46 papers shown
Table Comprehension in Building Codes using Vision Language Models and Domain-Specific Fine-Tuning
Mohammad Aqib
Mohd Hamza
Ying Hei Chui
Qipei Mei
LMTD
309
0
0
23 Nov 2025
Reasoning Text-to-Video Retrieval via Digital Twin Video Representations and Large Language Models
Yiqing Shen
Chenxiao Fan
Chenjia Li
Mathias Unberath
VGen
LRM
194
0
0
15 Nov 2025
TCMA: Text-Conditioned Multi-granularity Alignment for Drone Cross-Modal Text-Video Retrieval
Zixu Zhao
Yang Zhan
VGen
AI4TS
125
0
0
11 Oct 2025
RePainter: Empowering E-commerce Object Removal via Spatial-matting Reinforcement Learning
Zipeng Guo
Lichen Ma
Xiaolong Fu
Gaojing Zhou
L. Yang
...
Zhen Chen
Yu Shi
Junshi Huang
Jason Li
Chao Gou
DiffM
90
0
0
09 Oct 2025
Hybrid-Tower: Fine-grained Pseudo-query Interaction and Generation for Text-to-Video Retrieval
Bangxiang Lan
Ruobing Xie
Ruixiang Zhao
Xingwu Sun
Zhanhui Kang
Gang Yang
Xirong Li
84
0
0
05 Sep 2025
Learning Partially-Decorrelated Common Spaces for Ad-hoc Video Search
Fan Hu
Zijie Xin
Xirong Li
106
0
0
04 Aug 2025
BiMa: Towards Biases Mitigation for Text-Video Retrieval via Scene Element Guidance
Huy Le
Nhat Chung
Tung Kieu
A. Nguyen
Ngan Le
364
1
0
04 Jun 2025
Semantic-Space-Intervened Diffusive Alignment for Visual Classification
International Joint Conference on Artificial Intelligence (IJCAI), 2025
Zixuan Li
Lei Meng
Guoqing Chao
Wei Wu
Xiaoshuo Yan
Yimeng Yang
Zhuang Qi
Xiangxu Meng
DiffM
322
0
0
09 May 2025
Prototypes are Balanced Units for Efficient and Effective Partially Relevant Video Retrieval
WonJun Moon
Cheol-Ho Cho
Woojin Jun
Minho Shim
Taeoh Kim
Inwoong Lee
Dongyoon Wee
Jae-Pil Heo
240
3
0
17 Apr 2025
DiffusionCom: Structure-Aware Multimodal Diffusion Model for Multimodal Knowledge Graph Completion
Wei Huang
M. Liang
Peining Li
Xu Hou
Yawen Li
Junping Du
Zhe Xue
Zeli Guan
DiffM
227
0
0
09 Apr 2025
Query Smarter, Trust Better? Exploring Search Behaviours for Verifying News Accuracy
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2025
David Elsweiler
Samy Ateia
Markus Bink
Gregor Donabauer
Marcos Fernández-Pichel
...
Udo Kruschwitz
David Losada
Bernd Ludwig
Selina Meyer
Noel Pascual Presa
163
0
0
07 Apr 2025
Leveraging Modality Tags for Enhanced Cross-Modal Video Retrieval
A. Fragomeni
Dima Damen
Michael Wray
414
1
0
02 Apr 2025
Generative Modeling of Class Probability for Multi-Modal Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2025
Jungkyoo Shin
Bumsoo Kim
Eunwoo Kim
359
2
0
21 Mar 2025
NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval
Computer Vision and Pattern Recognition (CVPR), 2025
Zengrong Lin
Zheng Wang
Tianwen Qian
Pan Mu
Sixian Chan
Cong Bai
191
2
0
13 Mar 2025
Continual Text-to-Video Retrieval with Frame Fusion and Task-Aware Routing
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2025
Zecheng Zhao
Zhi Chen
Zi-Rui Huang
S. Sadiq
Tong Chen
432
0
0
13 Mar 2025
MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval
AAAI Conference on Artificial Intelligence (AAAI), 2024
Haoran Tang
Meng Cao
Jinfa Huang
Ruyang Liu
Peng Jin
Ge Li
Xiaodan Liang
Mamba
334
8
0
24 Feb 2025
Learning Semantic Facial Descriptors for Accurate Face Animation
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Lei Zhu
Yuanqi Chen
Xiaohang Liu
Thomas H. Li
Ge Li
CVBM
108
0
0
29 Jan 2025
Unveiling Discrete Clues: Superior Healthcare Predictions for Rare Diseases
The Web Conference (WWW), 2025
Chuang Zhao
Hui Tang
Jiheng Zhang
Xiaomeng Li
211
2
0
23 Jan 2025
AE-NeRF: Augmenting Event-Based Neural Radiance Fields for Non-ideal Conditions and Larger Scene
AAAI Conference on Artificial Intelligence (AAAI), 2025
Chaoran Feng
Wangbo Yu
Xinhua Cheng
Zhenyu Tang
Junwu Zhang
Li Yuan
Yonghong Tian
238
16
0
08 Jan 2025
Hierarchical Banzhaf Interaction for General Video-Language Representation Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Peng Jin
Haoyang Li
Li Yuan
Shuicheng Yan
Jie Chen
363
4
0
31 Dec 2024
ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos
Computer Vision and Pattern Recognition (CVPR), 2024
Tanveer Hannan
Md. Mohaiminul Islam
Jindong Gu
Thomas Seidl
Gedas Bertasius
VLM
206
9
0
22 Nov 2024
MoH: Multi-Head Attention as Mixture-of-Head Attention
International Conference on Machine Learning (ICML), 2024
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
349
34
0
15 Oct 2024
DiffATR: Diffusion-based Generative Modeling for Audio-Text Retrieval
Interspeech (Interspeech), 2024
Yifei Xin
Xuxin Cheng
Zhihong Zhu
Xusheng Yang
Yuexian Zou
DiffM
290
7
0
16 Sep 2024
TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval
International Conference on Learning Representations (ICLR), 2024
Leqi Shen
Tianxiang Hao
Tao He
Sicheng Zhao
Pengzhang Liu
Yongjun Bao
Guiguang Ding
404
32
0
02 Sep 2024
Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
Peng Jin
Hao Li
Ze-Long Cheng
Kehan Li
Runyi Yu
Yu Xie
Xiangyang Ji
Li-ming Yuan
Jie Chen
DiffM
176
13
0
15 Jul 2024
Towards Retrieval Augmented Generation over Large Video Libraries
Yannis Tevissen
Khalil Guetari
Frédéric Petitpont
RALM
126
2
0
21 Jun 2024
Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval
Jiamian Wang
Guohao Sun
Pichao Wang
Dongfang Liu
S. Dianat
Majid Rabbani
Raghuveer M. Rao
Zhiqiang Tao
VGen
304
60
0
26 Mar 2024
VidLA: Video-Language Alignment at Scale
Computer Vision and Pattern Recognition (CVPR), 2024
Mamshad Nayeem Rizve
Fan Fei
Jayakrishnan Unnikrishnan
Son Tran
Benjamin Z. Yao
Belinda Zeng
Mubarak Shah
Trishul Chilimbi
VLM
AI4TS
180
8
0
21 Mar 2024
Retrieval is Accurate Generation
Bowen Cao
Deng Cai
Leyang Cui
Xuxin Cheng
Wei Bi
Yuexian Zou
Shuming Shi
374
11
0
27 Feb 2024
TaxDiff: Taxonomic-Guided Diffusion Model for Protein Sequence Generation
Zongying Lin
Hao Li
Liuzhenghao Lv
Lin Bin
Junwu Zhang
Calvin Yu-Chian Chen
Li Yuan
Yonghong Tian
172
6
0
27 Feb 2024
ProtIR: Iterative Refinement between Retrievers and Predictors for Protein Function Annotation
Zuobai Zhang
Jiarui Lu
Vijil Chenthamarakshan
Aurélie C. Lozano
Payel Das
Jian Tang
142
1
0
10 Feb 2024
Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach
International Conference on Learning Representations (ICLR), 2024
Shaofeng Zhang
Jinfa Huang
Qiang-feng Zhou
Zhibin Wang
Fan Wang
Jiebo Luo
Junchi Yan
DiffM
249
19
0
28 Jan 2024
DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval
Xiangpeng Yang
Linchao Zhu
Xiaohan Wang
Yi Yang
VLM
280
42
0
19 Jan 2024
Video Understanding with Large Language Models: A Survey
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
...
Feng Zheng
Jianguo Zhang
Chenliang Xu
Jiebo Luo
VLM
675
160
0
29 Dec 2023
Diffusion-Based Particle-DETR for BEV Perception
Asen Nachkov
Martin Danelljan
D. Paudel
Luc Van Gool
DiffM
261
5
0
18 Dec 2023
FreestyleRet: Retrieving Images from Style-Diversified Queries
Hao Li
Curise Jia
Peng Jin
Ze-Long Cheng
Kehan Li
Jialu Sui
Chang Liu
Li-ming Yuan
3DH
302
16
0
05 Dec 2023
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Computer Vision and Pattern Recognition (CVPR), 2023
Peng Jin
Ryuichi Takanobu
Caiwan Zhang
Xiaochun Cao
Li-ming Yuan
MLLM
460
347
0
14 Nov 2023
3DifFusionDet: Diffusion Model for 3D Object Detection with Robust LiDAR-Camera Fusion
Xinhao Xiang
Simon Dräger
Jiawei Zhang
170
7
0
07 Nov 2023
Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs
Neural Information Processing Systems (NeurIPS), 2023
Peng Jin
Yang Wu
Yanbo Fan
Zhongqian Sun
Yang Wei
Li-ming Yuan
DiffM
236
42
0
02 Nov 2023
A Survey on Video Diffusion Models
ACM Computing Surveys (ACM Comput. Surv.), 2023
Zhen Xing
Qijun Feng
Haoran Chen
Jingdong Sun
Hang-Rui Hu
Hang Xu
Zuxuan Wu
Yu-Gang Jiang
EGVM
VGen
394
211
0
16 Oct 2023
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Henghao Zhao
Kevin Qinghong Lin
Rui Yan
Zechao Li
VGen
DiffM
306
8
0
29 Aug 2023
MomentDiff: Generative Video Moment Retrieval from Random to Real
Neural Information Processing Systems (NeurIPS), 2023
P. Li
Chen-Wei Xie
Hongtao Xie
Liming Zhao
Lei Zhang
Yun Zheng
Deli Zhao
Yongdong Zhang
DiffM
VGen
330
84
0
06 Jul 2023
Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Peng Jin
Hao Li
Ze-Long Cheng
Jinfa Huang
Zhennan Wang
Li-ming Yuan
Chang-rui Liu
Jie Chen
259
50
0
20 May 2023
TG-VQA: Ternary Game of Video Question Answering
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Hao Li
Peng Jin
Ze-Long Cheng
Songyang Zhang
Kai-xiang Chen
Zhennan Wang
Chang-rui Liu
Jie Chen
194
12
0
17 May 2023
UATVR: Uncertainty-Adaptive Text-Video Retrieval
IEEE International Conference on Computer Vision (ICCV), 2023
Bo Fang
Wenhao Wu
Chang-rui Liu
Can Ma
Yuxin Song
Weiping Wang
Min Yang
Xiang Ji
Jingdong Wang
209
80
0
16 Jan 2023
Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering
IEEE Transactions on Image Processing (IEEE TIP), 2022
Hao Li
Jinfa Huang
Peng Jin
Guoli Song
Qi Wu
Jie Chen
341
27
0
21 Sep 2022