ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2407.20341
  4. Cited By
BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger
  Visual Cues

BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues

29 July 2024
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
ArXivPDFHTML

Papers citing "BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues"

8 / 8 papers shown
Title
Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives
Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives
Sara Sarto
Marcella Cornia
Rita Cucchiara
41
0
0
18 Mar 2025
Positive-Augmented Contrastive Learning for Vision-and-Language
  Evaluation and Training
Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training
Sara Sarto
Nicholas Moratelli
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
23
3
0
09 Oct 2024
Revisiting Image Captioning Training Paradigm via Direct CLIP-based
  Optimization
Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization
Nicholas Moratelli
Davide Caffagni
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
CLIP
24
1
0
26 Aug 2024
Benchmarking and Improving Detail Image Caption
Benchmarking and Improving Detail Image Caption
Hongyuan Dong
Jiawen Li
Bohong Wu
Jiacong Wang
Yuan Zhang
Haoyuan Guo
VLM
MLLM
18
16
0
29 May 2024
CLAIR: Evaluating Image Captions with Large Language Models
CLAIR: Evaluating Image Captions with Large Language Models
David M. Chan
Suzanne Petryk
Joseph E. Gonzalez
Trevor Darrell
John F. Canny
32
19
0
19 Oct 2023
MiniGPT-v2: large language model as a unified interface for
  vision-language multi-task learning
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
152
280
0
14 Oct 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image
  Encoders and Large Language Models
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
380
4,010
0
28 Jan 2022
1