Visual RAG: Expanding MLLM visual knowledge without fine-tuning

18 January 2025

Papers citing "Visual RAG: Expanding MLLM visual knowledge without fine-tuning"

Towards Film-Making Production Dialogue, Narration, Monologue Adaptive Moving Dubbing Benchmarks
Chaoyi Wang, Junjie Zheng, Zihao Chen, Shiyu Xia, Chaofan Ding, Xiaohao Zhang, Xi Tao, Xiaoming He, Xinhan Di
AuLLM | 94 | 0 | 0 | 30 Apr 2025

HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation
Pei Liu, Xin Liu, Ruoyu Yao, Junming Liu, Siyuan Meng, Ding Wang, Jun Ma
3DV, VLM | 120 | 1 | 0 | 13 Apr 2025

DeepDubber-V1: Towards High Quality and Dialogue, Narration, Monologue Adaptive Movie Dubbing Via Multi-Modal Chain-of-Thoughts Reasoning Guidance
Junjie Zheng, Zihao Chen, Chaofan Ding, Xinhan Di
VGen | 67 | 1 | 0 | 31 Mar 2025

Retrieval Augmented Generation and Understanding in Vision: A Survey and New Outlook
Xu Zheng, Ziqiao Weng, Yuanhuiyi Lyu, Lutao Jiang, Haiwei Xue, Bin Ren, Danda Pani Paudel, N. Sebe, Luc Van Gool, Xuming Hu
3DV | 37 | 1 | 0 | 23 Mar 2025