Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.02325
Cited By
Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training
4 March 2024
David Wan
Jaemin Cho
Elias Stengel-Eskin
Mohit Bansal
VLM
ObjD
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"
28 / 28 papers shown
Title
Localizing Before Answering: A Hallucination Evaluation Benchmark for Grounded Medical Multimodal LLMs
Dung Tien Nguyen
Minh Khoi Ho
Huy Ta
T. Nguyen
Qi Chen
...
Zhibin Liao
Minh Nguyen Nhat To
Johan W. Verjans
Phi Le Nguyen
Vu Minh Hieu Phan
19
0
0
30 Apr 2025
Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation
Ziqiao Ma
Jing Ding
Xuejun Zhang
Dezhi Luo
Jiahe Ding
Sihan Xu
Yuchen Huang
Run Peng
Joyce Chai
44
0
0
22 Apr 2025
URECA: Unique Region Caption Anything
Sangbeom Lim
J. Kim
Heeji Yoon
Jaewoo Jung
Seungryong Kim
29
0
0
07 Apr 2025
TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention
Jinhao Duan
Fei Kong
Hao-Ran Cheng
James Diffenderfer
B. Kailkhura
Lichao Sun
Xiaofeng Zhu
Xiaoshuang Shi
Kaidi Xu
52
0
0
13 Mar 2025
Mitigating Hallucinations in Large Vision-Language Models via Summary-Guided Decoding
Kyungmin Min
Minbeom Kim
Kang-il Lee
Dongryeol Lee
Kyomin Jung
MLLM
60
3
0
20 Feb 2025
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
Miran Heo
Min-Hung Chen
De-An Huang
Sifei Liu
Subhashree Radhakrishnan
Seon Joo Kim
Yu-Chun Wang
Ryo Hachiuma
ObjD
VLM
108
2
0
14 Jan 2025
Supervision-free Vision-Language Alignment
Giorgio Giannone
Ruoteng Li
Qianli Feng
Evgeny Perevodchikov
Rui Chen
Aleix M. Martinez
VLM
56
0
0
08 Jan 2025
Towards Visual Grounding: A Survey
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
41
3
0
31 Dec 2024
VaLiD: Mitigating the Hallucination of Large Vision Language Models by Visual Layer Fusion Contrastive Decoding
Jiaqi Wang
Yifei Gao
Jitao Sang
MLLM
99
2
0
24 Nov 2024
Explaining and Improving Contrastive Decoding by Extrapolating the Probabilities of a Huge and Hypothetical LM
Haw-Shiuan Chang
Nanyun Peng
Mohit Bansal
Anil Ramakrishna
Tagyoung Chung
18
2
0
03 Nov 2024
Natural Language Inference Improves Compositionality in Vision-Language Models
Paola Cascante-Bonilla
Yu Hou
Yang Trista Cao
Hal Daumé III
Rachel Rudinger
ReLM
CoGe
VLM
31
3
0
29 Oct 2024
Reducing Hallucinations in Vision-Language Models via Latent Space Steering
Sheng Liu
Haotian Ye
Lei Xing
James Zou
VLM
LLMSV
23
5
0
21 Oct 2024
ZALM3: Zero-Shot Enhancement of Vision-Language Alignment via In-Context Information in Multi-Turn Multimodal Medical Dialogue
Zhangpu Li
Changhong Zou
Suxue Ma
Zhicheng Yang
Chen Du
...
Xingzhi Sun
Jing Xiao
Kai Zhang
Mei Han
Mei Han
LM&MA
35
1
0
26 Sep 2024
AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge
Han Wang
Archiki Prasad
Elias Stengel-Eskin
Mohit Bansal
69
5
0
11 Sep 2024
Visual Prompting in Multimodal Large Language Models: A Survey
Junda Wu
Zhehao Zhang
Yu Xia
Xintong Li
Zhaoyang Xia
...
Subrata Mitra
Dimitris N. Metaxas
Lina Yao
Jingbo Shang
Julian McAuley
VLM
LRM
38
12
0
05 Sep 2024
Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation
Jian Hu
Jiayi Lin
Junchi Yan
Shaogang Gong
VLM
20
7
0
27 Aug 2024
VACoDe: Visual Augmented Contrastive Decoding
Sihyeon Kim
Boryeong Cho
Sangmin Bae
Sumyeong Ahn
SeYoung Yun
21
0
0
26 Jul 2024
Multi-Object Hallucination in Vision-Language Models
Xuweiyi Chen
Ziqiao Ma
Xuejun Zhang
Sihan Xu
Shengyi Qian
Jianing Yang
David Fouhey
Joyce Chai
36
15
0
08 Jul 2024
Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
Xin Xiao
Bohong Wu
Jiacong Wang
Chunyuan Li
Xun Zhou
Haoyuan Guo
VLM
21
7
0
28 May 2024
RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in LVLMs
Sangmin Woo
Jaehyuk Jang
Donguk Kim
Yubin Choi
Changick Kim
19
1
0
28 May 2024
Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models
Sangmin Woo
Donguk Kim
Jaehyuk Jang
Yubin Choi
Changick Kim
25
12
0
28 May 2024
Contextual Emotion Recognition using Large Vision Language Models
Yasaman Etesam
Özge Nilay Yalçin
Chuxuan Zhang
Angelica Lim
VLM
54
3
0
14 May 2024
Prompt Highlighter: Interactive Control for Multi-Modal LLMs
Yuechen Zhang
Shengju Qian
Bohao Peng
Shu-Lin Liu
Jiaya Jia
MLLM
22
19
0
07 Dec 2023
Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
Sicong Leng
Hang Zhang
Guanzheng Chen
Xin Li
Shijian Lu
Chunyan Miao
Li Bing
VLM
MLLM
82
196
0
28 Nov 2023
Emotional Theory of Mind: Bridging Fast Visual Processing with Slow Linguistic Reasoning
Yasaman Etesam
Özge Nilay Yalçin
Chuxuan Zhang
Angelica Lim
27
2
0
30 Oct 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
H. Rasheed
Muhammad Maaz
Salman Khan
F. Khan
VPVLM
VLM
178
521
0
06 Oct 2022
CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models
Yuan Yao
Ao Zhang
Zhengyan Zhang
Zhiyuan Liu
Tat-Seng Chua
Maosong Sun
MLLM
VPVLM
VLM
186
218
0
24 Sep 2021
1