Don't Miss the Forest for the Trees: Attentional Vision Calibration for
Large Vision Language Models

28 May 2024

Papers citing "Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models"

Title
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation | Hongcheng Gao Jiashu Qu Jingyi Tang Baolong Bi Y. Liu Hongyu Chen Li Liang Li Su Qingming Huang | MLLM VLM LRM | 79 3 0 | 25 Mar 2025
Instruction-Aligned Visual Attention for Mitigating Hallucinations in Large Vision-Language Models | Bin Li Dehong Gao Yeyuan Wang Linbo Jin Shanqing Yu Xiaoyan Cai Libin Yang | VLM | 36 0 0 | 24 Mar 2025
EAZY: Eliminating Hallucinations in LVLMs by Zeroing out Hallucinatory Image Tokens | Liwei Che Tony Qingze Liu Jing Jia Weiyi Qin Ruixiang Tang Vladimir Pavlovic | MLLM VLM | 100 1 0 | 10 Mar 2025
See What You Are Told: Visual Attention Sink in Large Multimodal Models | Seil Kang Jinyeong Kim Junhyeok Kim Seong Jae Hwang | VLM | 101 5 0 | 05 Mar 2025
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding | Wei Suo Lijun Zhang Mengyang Sun Lin Yuanbo Wu Peng Wang Y. Zhang | MLLM VLM | 47 1 0 | 01 Mar 2025
Visual Attention Never Fades: Selective Progressive Attention ReCalibration for Detailed Image Captioning in Multimodal Large Language Models | Mingi Jung Saehuyng Lee Eunji Kim Sungroh Yoon | 66 0 0 | 03 Feb 2025
ECG-Byte: A Tokenizer for End-to-End Generative Electrocardiogram Language Modeling | William Jongwon Han Chaojing Duan M. Rosenberg Emerson Liu Ding Zhao | 62 0 0 | 18 Dec 2024
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning | Di Zhang Jingdi Lei Junxian Li Xunzhi Wang Y. Liu ... S. M. I. Simon X. Yang Jianbo Wu Peng Ye Wanli Ouyang Dongzhan Zhou | OffRL LRM | 105 6 0 | 27 Nov 2024
Law of Vision Representation in MLLMs | Shijia Yang Bohan Zhai Quanzeng You Jianbo Yuan Hongxia Yang Chenfeng Xu | 40 9 0 | 29 Aug 2024
Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models | Fushuo Huo Wenchao Xu Zhong Zhang Haozhao Wang Zhicheng Chen Peilin Zhao | VLM MLLM | 55 18 0 | 04 Aug 2024
Multi-Modal Hallucination Control by Visual Information Grounding | Alessandro Favero L. Zancato Matthew Trager Siddharth Choudhary Pramuditha Perera Alessandro Achille Ashwin Swaminathan Stefano Soatto | MLLM | 63 62 0 | 20 Mar 2024
Incorporating Visual Experts to Resolve the Information Loss in Multimodal Large Language Models | Xin He Longhui Wei Lingxi Xie Qi Tian | 35 8 0 | 06 Jan 2024
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | Zhe Chen Jiannan Wu Wenhai Wang Weijie Su Guo Chen ... Bin Li Ping Luo Tong Lu Yu Qiao Jifeng Dai | VLM MLLM | 135 895 0 | 21 Dec 2023
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback | M. Steyvers Yuan Yao Haoye Zhang Taiwen He Yifeng Han ... Xinyue Hu Zhiyuan Liu Hai-Tao Zheng Maosong Sun Tat-Seng Chua | MLLM VLM | 130 176 0 | 01 Dec 2023
Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding | Sicong Leng Hang Zhang Guanzheng Chen Xin Li Shijian Lu Chunyan Miao Li Bing | VLM MLLM | 82 196 0 | 28 Nov 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | Junnan Li Dongxu Li Silvio Savarese Steven C. H. Hoi | VLM MLLM | 244 4,186 0 | 30 Jan 2023