2005.00908
Cited By
Clue: Cross-modal Coherence Modeling for Caption Generation
2 May 2020
Malihe Alikhani
Piyush Sharma
Shengjie Li
Radu Soricut
Matthew Stone
Papers citing
"Clue: Cross-modal Coherence Modeling for Caption Generation"
6 / 6 papers shown
Title
Coherence-Driven Multimodal Safety Dialogue with Active Learning for Embodied Agents
Sabit Hassan
Hye-Young Chung
Xiang Zhi Tan
Malihe Alikhani
47
0
0
18 Oct 2024
Underspecification in Scene Description-to-Depiction Tasks
Ben Hutchinson
Jason Baldridge
Vinodkumar Prabhakaran
DiffM
66
32
0
11 Oct 2022
Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset
Ashish V. Thapliyal
Jordi Pont-Tuset
Xi Chen
Radu Soricut
VGen
67
72
0
25 May 2022
All You May Need for VQA are Image Captions
Soravit Changpinyo
Doron Kukliansky
Idan Szpektor
Xi Chen
Nan Ding
Radu Soricut
30
70
0
04 May 2022
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
53
254
0
14 Jul 2021
Human-like Controllable Image Captioning with Verb-specific Semantic Roles
Long Chen
Zhihong Jiang
Jun Xiao
Wei Liu
8
74
0
22 Mar 2021