ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1505.01809
  4. Cited By
Language Models for Image Captioning: The Quirks and What Works

Language Models for Image Captioning: The Quirks and What Works

7 May 2015
Jacob Devlin
Hao Cheng
Hao Fang
Saurabh Gupta
Li Deng
Xiaodong He
Geoffrey Zweig
Margaret Mitchell
ArXivPDFHTML

Papers citing "Language Models for Image Captioning: The Quirks and What Works"

32 / 32 papers shown
Title
TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model
TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model
Cheng Yang
Yang Sui
Jinqi Xiao
Lingyi Huang
Yu Gong
...
Jinghua Yan
Y. Bai
P. Sadayappan
Xia Hu
Bo Yuan
VLM
53
0
0
24 Mar 2025
Standardizing the Measurement of Text Diversity: A Tool and a Comparative Analysis of Scores
Standardizing the Measurement of Text Diversity: A Tool and a Comparative Analysis of Scores
Chantal Shaib
Joe Barrow
Jiuding Sun
Alexa F. Siu
Byron C. Wallace
A. Nenkova
66
31
0
01 Mar 2024
Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models
Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models
Luke Vilnis
Yury Zemlyanskiy
Patrick C. Murray
Alexandre Passos
Sumit Sanghai
54
9
0
18 Oct 2022
Multi-Glimpse Network: A Robust and Efficient Classification
  Architecture based on Recurrent Downsampled Attention
Multi-Glimpse Network: A Robust and Efficient Classification Architecture based on Recurrent Downsampled Attention
S. Tan
Runpei Dong
Kaisheng Ma
22
2
0
03 Nov 2021
Every Model Learned by Gradient Descent Is Approximately a Kernel
  Machine
Every Model Learned by Gradient Descent Is Approximately a Kernel Machine
Pedro M. Domingos
MLT
21
70
0
30 Nov 2020
MRRC: Multiple Role Representation Crossover Interpretation for Image
  Captioning With R-CNN Feature Distribution Composition (FDC)
MRRC: Multiple Role Representation Crossover Interpretation for Image Captioning With R-CNN Feature Distribution Composition (FDC)
C. Sur
17
16
0
15 Feb 2020
Going Beneath the Surface: Evaluating Image Captioning for
  Grammaticality, Truthfulness and Diversity
Going Beneath the Surface: Evaluating Image Captioning for Grammaticality, Truthfulness and Diversity
Huiyuan Xie
Tom Sherborne
A. Kuhnle
Ann A. Copestake
DiffM
17
9
0
19 Dec 2019
Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with
  Contextualized Embeddings
Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings
Gregor Wiedemann
Steffen Remus
Avi Chawla
Chris Biemann
11
174
0
23 Sep 2019
Learning Visual Relation Priors for Image-Text Matching and Image
  Captioning with Neural Scene Graph Generators
Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators
Kuang-Huei Lee
Hamid Palangi
Xi Chen
Houdong Hu
Jianfeng Gao
VLM
11
37
0
22 Sep 2019
Compositional Generalization in Image Captioning
Compositional Generalization in Image Captioning
Mitja Nikolaus
Mostafa Abdou
Matthew Lamm
Rahul Aralikatte
Desmond Elliott
CoGe
8
49
0
10 Sep 2019
MeetUp! A Corpus of Joint Activity Dialogues in a Visual Environment
MeetUp! A Corpus of Joint Activity Dialogues in a Visual Environment
N. Ilinykh
Sina Zarrieß
David Schlangen
13
43
0
11 Jul 2019
Sequence-to-Sequence Models for Data-to-Text Natural Language
  Generation: Word- vs. Character-based Processing and Output Diversity
Sequence-to-Sequence Models for Data-to-Text Natural Language Generation: Word- vs. Character-based Processing and Output Diversity
Glorianna Jagfeld
Sabrina Jenne
Ngoc Thang Vu
AIMat
25
24
0
11 Oct 2018
Neural Aesthetic Image Reviewer
Neural Aesthetic Image Reviewer
Wenshan Wang
Su Yang
Weishan Zhang
Jiulong Zhang
6
38
0
28 Feb 2018
Attacking Visual Language Grounding with Adversarial Examples: A Case
  Study on Neural Image Captioning
Attacking Visual Language Grounding with Adversarial Examples: A Case Study on Neural Image Captioning
Hongge Chen
Huan Zhang
Pin-Yu Chen
Jinfeng Yi
Cho-Jui Hsieh
GAN
AAML
19
49
0
06 Dec 2017
Diverse and Accurate Image Description Using a Variational Auto-Encoder
  with an Additive Gaussian Encoding Space
Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space
Liwei Wang
A. Schwing
Svetlana Lazebnik
CoGe
16
175
0
19 Nov 2017
AI Challenger : A Large-scale Dataset for Going Deeper in Image
  Understanding
AI Challenger : A Large-scale Dataset for Going Deeper in Image Understanding
Jiahong Wu
He Zheng
Bo-Lu Zhao
Yixin Li
Baoming Yan
...
Shipei Zhou
G. Lin
Yanwei Fu
Yizhou Wang
Yonggang Wang
VLM
22
149
0
17 Nov 2017
Self-Guiding Multimodal LSTM - when we do not have a perfect training
  dataset for image captioning
Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning
Yang Xian
Yingli Tian
VLM
15
22
0
15 Sep 2017
Towards a Visual Privacy Advisor: Understanding and Predicting Privacy
  Risks in Images
Towards a Visual Privacy Advisor: Understanding and Predicting Privacy Risks in Images
Rakshith Shetty
Bernt Schiele
Mario Fritz
19
223
0
30 Mar 2017
Where to put the Image in an Image Caption Generator
Where to put the Image in an Image Caption Generator
Marc Tanti
Albert Gatt
K. Camilleri
39
96
0
27 Mar 2017
Guided Open Vocabulary Image Captioning with Constrained Beam Search
Guided Open Vocabulary Image Captioning with Constrained Beam Search
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
16
232
0
02 Dec 2016
Semantic Regularisation for Recurrent Image Annotation
Semantic Regularisation for Recurrent Image Annotation
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
21
103
0
16 Nov 2016
Boosting Image Captioning with Attributes
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
11
620
0
05 Nov 2016
Seeing with Humans: Gaze-Assisted Neural Image Captioning
Seeing with Humans: Gaze-Assisted Neural Image Captioning
Yusuke Sugano
Andreas Bulling
14
68
0
18 Aug 2016
SPICE: Semantic Propositional Image Caption Evaluation
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
11
1,883
0
29 Jul 2016
Movie Description
Movie Description
Anna Rohrbach
Atousa Torabi
Marcus Rohrbach
Niket Tandon
C. Pal
Hugo Larochelle
Aaron Courville
Bernt Schiele
3DV
VGen
22
353
0
12 May 2016
Visual Storytelling
Visual Storytelling
Ting-Hao 'Kenneth' Huang
Huang
Francis Ferraro
N. Mostafazadeh
Ishan Misra
...
C. L. Zitnick
Devi Parikh
Lucy Vanderwende
Michel Galley
Margaret Mitchell
VGen
9
464
0
13 Apr 2016
Rich Image Captioning in the Wild
Rich Image Captioning in the Wild
Kenneth Tran
Xiaodong He
Lei Zhang
Jian Sun
Cornelia Carapcea
Chris Thrasher
Chris Buehler
Chris Sienkiewicz
VLM
9
123
0
30 Mar 2016
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for
  Visual Question Answering
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
Huijuan Xu
Kate Saenko
22
760
0
17 Nov 2015
Describing Multimedia Content using Attention-based Encoder--Decoder
  Networks
Describing Multimedia Content using Attention-based Encoder--Decoder Networks
Kyunghyun Cho
Aaron Courville
Yoshua Bengio
26
410
0
04 Jul 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language
Jointly Modeling Embedding and Translation to Bridge Video and Language
Yingwei Pan
Tao Mei
Ting Yao
Houqiang Li
Y. Rui
24
534
0
07 May 2015
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
Junhua Mao
W. Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
VLM
30
1,234
0
20 Dec 2014
Long-term Recurrent Convolutional Networks for Visual Recognition and
  Description
Long-term Recurrent Convolutional Networks for Visual Recognition and Description
Jeff Donahue
Lisa Anne Hendricks
Marcus Rohrbach
Subhashini Venugopalan
S. Guadarrama
Kate Saenko
Trevor Darrell
VLM
23
6,030
0
17 Nov 2014
1