Learning to Evaluate Image Captioning

17 June 2018
Yin Cui
Guandao Yang
Andreas Veit
Xun Huang
Serge J. Belongie
arXiv: 1806.06422

Papers citing "Learning to Evaluate Image Captioning"

50 / 74 papers shown
G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o
Tony Cheng Tong
Sirui He
Z. Shao
Dit-Yan Yeung
65
3
0
18 Dec 2024
Neptune: The Long Orbit to Benchmarking Long Video Understanding
Arsha Nagrani
Mingda Zhang
Ramin Mehran
Rachel Hornung
N. B. Gundavarapu
...
Boqing Gong
Cordelia Schmid
Mikhail Sirotenko
Yukun Zhu
Tobias Weyand
98
4
0
12 Dec 2024
Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training
Sara Sarto
Nicholas Moratelli
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
23
3
0
09 Oct 2024
DENEB: A Hallucination-Robust Automatic Evaluation Metric for Image Captioning
Kazuki Matsuda
Yuiga Wada
Komei Sugiura
21
0
0
28 Sep 2024
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Uri Berger
Gabriel Stanovsky
Omri Abend
Lea Frermann
24
0
0
09 Aug 2024
HICEScore: A Hierarchical Metric for Image Captioning Evaluation
Zequn Zeng
Jianqiao Sun
Hao Zhang
Tiansheng Wen
Yudi Su
Yan Xie
Zhengjue Wang
Boli Chen
39
3
0
26 Jul 2024
Detecting Errors through Ensembling Prompts (DEEP): An End-to-End LLM Framework for Detecting Factual Errors
Alex Chandler
Devesh Surve
Hui Su
HILM
UQCV
15
1
0
18 Jun 2024
AutoAD III: The Prequel -- Back to the Pixels
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
DiffM
36
20
0
22 Apr 2024
Can Feedback Enhance Semantic Grounding in Large Vision-Language Models?
Yuan-Hong Liao
Rafid Mahmood
Sanja Fidler
David Acuna
VLM
44
7
0
09 Apr 2024
Polos: Multimodal Metric Learning from Human Feedback for Image Captioning
Yuiga Wada
Kanta Kaneda
Daichi Saito
Komei Sugiura
22
24
0
28 Feb 2024
Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy
Simon Ging
M. A. Bravo
Thomas Brox
VLM
38
11
0
11 Feb 2024
Leveraging Large Language Models for NLG Evaluation: Advances and Challenges
Zhen Li
Xiaohan Xu
Tao Shen
Can Xu
Jia-Chen Gu
Yuxuan Lai
Chongyang Tao
Shuai Ma
LM&MA
ELM
26
9
0
13 Jan 2024
See, Say, and Segment: Teaching LMMs to Overcome False Premises
Tsung-Han Wu
Giscard Biamby
David M. Chan
Lisa Dunlap
Ritwik Gupta
Xudong Wang
Joseph E. Gonzalez
Trevor Darrell
VLM
MLLM
30
18
0
13 Dec 2023
CLAIR: Evaluating Image Captions with Large Language Models
David M. Chan
Suzanne Petryk
Joseph E. Gonzalez
Trevor Darrell
John F. Canny
38
19
0
19 Oct 2023
ContextRef: Evaluating Referenceless Metrics For Image Description Generation
Elisa Kreiss
E. Zelikman
Christopher Potts
Nick Haber
8
5
0
21 Sep 2023
Linear Alignment of Vision-language Models for Image Captioning
Fabian Paischer
M. Hofmarcher
Sepp Hochreiter
Thomas Adler
CLIP
VLM
35
0
0
10 Jul 2023
BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks
Kai Zhang
Jun Yu
Eashan Adhikarla
Rong-Er Zhou
Zhilin Yan
...
Xun Chen
Yong Chen
Quanzheng Li
Hongfang Liu
Lichao Sun
LM&MA
MedIm
12
149
0
26 May 2023
Multimodal Data Augmentation for Image Captioning using Diffusion Models
Changrong Xiao
S. Xu
Kunpeng Zhang
DiffM
11
10
0
03 May 2023
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
Sara Sarto
Manuele Barraco
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
8
55
0
21 Mar 2023
Test-Time Distribution Normalization for Contrastively Learned Vision-language Models
Yi Zhou
Juntao Ren
Fengyu Li
Ramin Zabih
Ser-Nam Lim
VLM
21
13
0
22 Feb 2023
Not All Errors are Equal: Learning Text Generation Metrics using Stratified Error Synthesis
Wenda Xu
Yi-Lin Tuan
Yujie Lu
Michael Stephen Saxon
Lei Li
William Yang Wang
31
22
0
10 Oct 2022
Affection: Learning Affective Explanations for Real-World Visual Data
Panos Achlioptas
M. Ovsjanikov
Leonidas J. Guibas
Sergey Tulyakov
59
10
0
04 Oct 2022
A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval
Alex Falcon
G. Serra
O. Lanz
VGen
15
25
0
03 Aug 2022
Contrastive Cross-Modal Knowledge Sharing Pre-training for Vision-Language Representation Learning and Retrieval
Keyu Wen
Zhenshan Tan
Qingrong Cheng
Cheng Chen
X. Gu
VLM
6
0
0
02 Jul 2022
Measuring Representational Harms in Image Captioning
Angelina Wang
Solon Barocas
Kristen Laird
Hanna M. Wallach
11
51
0
14 Jun 2022
Mutual Information Divergence: A Unified Metric for Multimodal Generative Models
Jin-Hwa Kim
Yunji Kim
Jiyoung Lee
Kang Min Yoo
Sang-Woo Lee
EGVM
17
32
0
25 May 2022
RoViST: Learning Robust Metrics for Visual Storytelling
Eileen Wang
S. Han
Josiah Poon
14
7
0
08 May 2022
Reproducibility Issues for BERT-based Evaluation Metrics
Yanran Chen
Jonas Belouadi
Steffen Eger
27
16
0
30 Mar 2022
BERTHA: Video Captioning Evaluation Via Transfer-Learned Human Assessment
Luis Lebron
Yvette Graham
Kevin McGuinness
K. Kouramas
Noel E. O'Connor
14
3
0
25 Jan 2022
Dynamic population-based meta-learning for multi-agent communication with natural language
Abhinav Gupta
Marc Lanctot
Angeliki Lazaridou
LLMAG
17
20
0
27 Oct 2021
Reason induced visual attention for explainable autonomous driving
Sikai Chen
Jiqian Dong
Runjia Du
Yujie Li
S. Labi
29
0
0
11 Oct 2021
COSMic: A Coherence-Aware Generation Metric for Image Descriptions
Mert Inan
P. Sharma
Baber Khalid
Radu Soricut
Matthew Stone
Malihe Alikhani
EGVM
16
13
0
11 Sep 2021
Problem Learning: Towards the Free Will of Machines
Yongfeng Zhang
FaML
13
2
0
01 Sep 2021
Automatic Text Evaluation through the Lens of Wasserstein Barycenters
Pierre Colombo
Guillaume Staerman
Chloé Clavel
Pablo Piantanida
14
41
0
27 Aug 2021
Language Model Augmented Relevance Score
Ruibo Liu
Jason W. Wei
Soroush Vosoughi
14
10
0
19 Aug 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
53
244
0
14 Jul 2021
Contrastive Semantic Similarity Learning for Image Captioning Evaluation with Intrinsic Auto-encoder
Chao Zeng
Tiesong Zhao
Sam Kwong
14
2
0
29 Jun 2021
UMIC: An Unreferenced Metric for Image Captioning via Contrastive Learning
Hwanhee Lee
Seunghyun Yoon
Franck Dernoncourt
Trung Bui
Kyomin Jung
VLM
11
34
0
26 Jun 2021
SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis
Joshua Forster Feinglass
Yezhou Yang
11
18
0
02 Jun 2021
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
Jack Hessel
Ari Holtzman
Maxwell Forbes
Ronan Le Bras
Yejin Choi
CLIP
6
882
0
18 Apr 2021
ArtEmis: Affective Language for Visual Art
Panos Achlioptas
M. Ovsjanikov
Kilichbek Haydarov
Mohamed Elhoseiny
Leonidas J. Guibas
18
115
0
19 Jan 2021
WEmbSim: A Simple yet Effective Metric for Image Captioning
Naeha Sharif
Lyndon White
Bennamoun
Wei Liu
Syed Afaq Ali Shah
15
1
0
24 Dec 2020
LCEval: Learned Composite Metric for Caption Evaluation
Naeha Sharif
Lyndon White
Bennamoun
Wei Liu
Syed Afaq Ali Shah
16
8
0
24 Dec 2020
Intrinsic Image Captioning Evaluation
Chao Zeng
Sam Kwong
11
0
0
14 Dec 2020
Dual Attention on Pyramid Feature Maps for Image Captioning
Litao Yu
Jian Andrew Zhang
Qiang Wu
6
39
0
02 Nov 2020
Quantifying Learnability and Describability of Visual Concepts Emerging in Representation Learning
Iro Laina
Ruth C. Fong
Andrea Vedaldi
OCL
12
13
0
27 Oct 2020
Learning Dual Semantic Relations with Graph Attention for Image-Text Matching
Keyu Wen
Xiaodong Gu
Qingrong Cheng
6
95
0
22 Oct 2020
Multimodal Research in Vision and Language: A Review of Current and Emerging Trends
Shagun Uppal
Sarthak Bhagat
Devamanyu Hazarika
Navonil Majumdar
Soujanya Poria
Roger Zimmermann
Amir Zadeh
18
6
0
19 Oct 2020
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Wei-Neng Chen
Weiping Wang
Li Liu
M. Lew
VLM
105
30
0
16 Oct 2020
Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey
Khyathi Raghavi Chandu
A. Black
10
0
0
14 Oct 2020