THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models


8 May 2024
Prannay Kaul
Zhizhong Li
Hao-Yu Yang
Yonatan Dukler
Ashwin Swaminathan
C. Taylor
Stefano Soatto
    HILM

Papers citing "THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models"

15 / 15 papers shown
DASH: Detection and Assessment of Systematic Hallucinations of VLMs
Maximilian Augustin
Yannic Neuhaus
Matthias Hein
VLM
47
1
0
30 Mar 2025
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation
Hongcheng Gao
Jiashu Qu
Jingyi Tang
Baolong Bi
Y. Liu
Hongyu Chen
Li Liang
Li Su
Qingming Huang
MLLM
VLM
LRM
81
3
0
25 Mar 2025
Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding
Shunqi Mao
Chaoyi Zhang
Weidong Cai
MLLM
53
0
0
13 Mar 2025
Towards Statistical Factuality Guarantee for Large Vision-Language Models
Z. Li
Chao Yan
Nicholas J. Jackson
Wendi Cui
B. Li
Jiaxin Zhang
Bradley Malin
67
0
0
27 Feb 2025
Efficient Scaling of Diffusion Transformers for Text-to-Image Generation
Hao Li
Shamit Lal
Zhiheng Li
Yusheng Xie
Ying Wang
...
R. Manmatha
Z. Tu
Stefano Ermon
Stefano Soatto
A. Swaminathan
78
0
0
16 Dec 2024
LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models
Han Qiu
Jiaxing Huang
Peng Gao
Qin Qi
Xiaoqin Zhang
Ling Shao
Shijian Lu
HILM
17
1
0
13 Oct 2024
A Survey on Multimodal Benchmarks: In the Era of Large AI Models
Lin Li
Guikun Chen
Hanrong Shi
Jun Xiao
Long Chen
34
8
0
21 Sep 2024
Mitigating Dialogue Hallucination for Large Vision Language Models via Adversarial Instruction Tuning
Dongmin Park
Zhaofang Qian
Guangxing Han
Ser-Nam Lim
MLLM
28
0
0
15 Mar 2024
How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts
Yusu Qian
Haotian Zhang
Yinfei Yang
Zhe Gan
64
26
0
20 Feb 2024
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
154
280
0
14 Oct 2023
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality
Qinghao Ye
Haiyang Xu
Guohai Xu
Jiabo Ye
Ming Yan
...
Junfeng Tian
Qiang Qi
Ji Zhang
Feiyan Huang
Jingren Zhou
VLM
MLLM
203
883
0
27 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
2,875
0
11 Feb 2021