2402.18091
Polos: Multimodal Metric Learning from Human Feedback for Image Captioning
28 February 2024
Yuiga Wada
Kanta Kaneda
Daichi Saito
Komei Sugiura
Papers citing "Polos: Multimodal Metric Learning from Human Feedback for Image Captioning"
8 / 8 papers shown
Title
A Video-grounded Dialogue Dataset and Metric for Event-driven Activities
Wiradee Imrattanatrai
Masaki Asada
Kimihiro Hasegawa
Zhi-Qi Cheng
Ken Fukuda
Teruko Mitamura
VGen
43
0
0
30 Jan 2025
Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement
Kei Katsumata
Motonari Kambara
Daichi Yashima
Ryosuke Korekata
Komei Sugiura
51
0
0
28 Jan 2025
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Not All Errors are Equal: Learning Text Generation Metrics using Stratified Error Synthesis
Wenda Xu
Yi-Lin Tuan
Yujie Lu
Michael Stephen Saxon
Lei Li
William Yang Wang
23
22
0
10 Oct 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
378
4,010
0
28 Jan 2022
CIDEr-R: Robust Consensus-based Image Description Evaluation
G. O. D. Santos
Esther Luna Colombini
Sandra Avila
20
18
0
28 Sep 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
51
244
0
14 Jul 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
203
698
0
28 Apr 2021