ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1412.6632
  4. Cited By
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

20 December 2014
Junhua Mao
W. Xu
Yi Yang
Jiang Wang
Zhiheng Huang
Alan Yuille
    VLM
ArXivPDFHTML

Papers citing "Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)"

50 / 417 papers shown
Title
Automatic Rule Induction for Interpretable Semi-Supervised Learning
Automatic Rule Induction for Interpretable Semi-Supervised Learning
Reid Pryzant
Ziyi Yang
Yichong Xu
Chenguang Zhu
Michael Zeng
28
9
0
18 May 2022
Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual
  Context for Image Captioning
Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
Chia-Wen Kuo
Z. Kira
6
52
0
09 May 2022
Diverse Image Captioning with Grounded Style
Diverse Image Captioning with Grounded Style
Franz Klein
Shweta Mahajan
S. Roth
14
7
0
03 May 2022
X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks
X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks
Zhaowei Cai
Gukyeong Kwon
Avinash Ravichandran
Erhan Bas
Z. Tu
Rahul Bhotika
Stefano Soatto
ObjD
MLLM
VLM
17
49
0
12 Apr 2022
On Distinctive Image Captioning via Comparing and Reweighting
On Distinctive Image Captioning via Comparing and Reweighting
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
30
16
0
08 Apr 2022
Unpaired Image Captioning by Image-level Weakly-Supervised Visual
  Concept Recognition
Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition
Peipei Zhu
Xiao Wang
Yong Luo
Zhenglong Sun
Wei-Shi Zheng
Yaowei Wang
C. L. P. Chen
22
12
0
07 Mar 2022
Vision-Language Intelligence: Tasks, Representation Learning, and Large
  Models
Vision-Language Intelligence: Tasks, Representation Learning, and Large Models
Feng Li
Hao Zhang
Yi-Fan Zhang
S. Liu
Jian Guo
L. Ni
Pengchuan Zhang
Lei Zhang
AI4TS
VLM
16
36
0
03 Mar 2022
Inference of captions from histopathological patches
Inference of captions from histopathological patches
M. Tsuneki
F. Kanavati
8
29
0
07 Feb 2022
EcoFlow: Efficient Convolutional Dataflows for Low-Power Neural Network
  Accelerators
EcoFlow: Efficient Convolutional Dataflows for Low-Power Neural Network Accelerators
Lois Orosa
Skanda Koppula
Yaman Umuroglu
Konstantinos Kanellopoulos
Juan Gómez Luna
Michaela Blott
K. Vissers
O. Mutlu
26
4
0
04 Feb 2022
Multi-Label Classification on Remote-Sensing Images
Multi-Label Classification on Remote-Sensing Images
A. Singh
B. Uma Shankar
6
0
0
06 Jan 2022
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic
  Arithmetic
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
Yoad Tewel
Yoav Shalev
Idan Schwartz
Lior Wolf
VLM
32
192
0
29 Nov 2021
Contrastive Learning of Visual-Semantic Embeddings
Contrastive Learning of Visual-Semantic Embeddings
Anurag Jain
Yashaswi Verma
SSL
25
1
0
17 Oct 2021
Geometry Attention Transformer with Position-aware LSTMs for Image
  Captioning
Geometry Attention Transformer with Position-aware LSTMs for Image Captioning
Chi-Yin Wang
Yulin Shen
Luping Ji
ViT
39
49
0
01 Oct 2021
Cross Modification Attention Based Deliberation Model for Image
  Captioning
Cross Modification Attention Based Deliberation Model for Image Captioning
Zheng Lian
Yanan Zhang
Haichang Li
Rui Wang
Xiaohui Hu
19
4
0
17 Sep 2021
Sensor-Augmented Egocentric-Video Captioning with Dynamic Modal
  Attention
Sensor-Augmented Egocentric-Video Captioning with Dynamic Modal Attention
Katsuyuki Nakamura
Hiroki Ohashi
Mitsuhiro Okada
EgoV
31
12
0
07 Sep 2021
Group-based Distinctive Image Captioning with Memory Attention
Group-based Distinctive Image Captioning with Memory Attention
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
6
18
0
20 Aug 2021
Caption Generation on Scenes with Seen and Unseen Object Categories
Caption Generation on Scenes with Seen and Unseen Object Categories
B. Demirel
R. G. Cinbis
VLM
15
1
0
13 Aug 2021
A Better Loss for Visual-Textual Grounding
A Better Loss for Visual-Textual Grounding
Davide Rigoni
Luciano Serafini
A. Sperduti
ObjD
17
3
0
11 Aug 2021
Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning
Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning
Bryan Wang
Gang Li
Xin Zhou
Zhourong Chen
Tovi Grossman
Yang Li
162
152
0
07 Aug 2021
Structured Multi-modal Feature Embedding and Alignment for
  Image-Sentence Retrieval
Structured Multi-modal Feature Embedding and Alignment for Image-Sentence Retrieval
Xuri Ge
Fuhai Chen
J. Jose
Zhilong Ji
Zhongqin Wu
Xiao-Chang Liu
18
53
0
05 Aug 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
53
254
0
14 Jul 2021
A comparison of LSTM and GRU networks for learning symbolic sequences
A comparison of LSTM and GRU networks for learning symbolic sequences
Roberto Cahuantzi
Xinye Chen
S. Güttel
11
135
0
05 Jul 2021
Parts2Words: Learning Joint Embedding of Point Clouds and Texts by
  Bidirectional Matching between Parts and Words
Parts2Words: Learning Joint Embedding of Point Clouds and Texts by Bidirectional Matching between Parts and Words
Chuan Tang
Xi Yang
Bojian Wu
Zhizhong Han
Yi Chang
3DPC
28
13
0
05 Jul 2021
Case Relation Transformer: A Crossmodal Language Generation Model for
  Fetching Instructions
Case Relation Transformer: A Crossmodal Language Generation Model for Fetching Instructions
Motonari Kambara
K. Sugiura
ViT
11
6
0
02 Jul 2021
OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and
  Generation
OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation
Jing Liu
Xinxin Zhu
Fei Liu
Longteng Guo
Zijia Zhao
...
Weining Wang
Hanqing Lu
Shiyu Zhou
Jiajun Zhang
Jinqiao Wang
23
36
0
01 Jul 2021
New Encoder Learning for Captioning Heavy Rain Images via Semantic
  Visual Feature Matching
New Encoder Learning for Captioning Heavy Rain Images via Semantic Visual Feature Matching
Chang-Hwan Son
Pung-Hwi Ye
15
3
0
28 May 2021
Writing by Memorizing: Hierarchical Retrieval-based Medical Report
  Generation
Writing by Memorizing: Hierarchical Retrieval-based Medical Report Generation
Xingyi Yang
Muchao Ye
Quanzeng You
Fenglong Ma
MedIm
8
37
0
25 May 2021
Survey of Visual-Semantic Embedding Methods for Zero-Shot Image
  Retrieval
Survey of Visual-Semantic Embedding Methods for Zero-Shot Image Retrieval
K. Ueki
13
3
0
16 May 2021
End-to-End Attention-based Image Captioning
End-to-End Attention-based Image Captioning
Carola Sundaramoorthy
Lin Ziwen Kelvin
Mahak Sarin
Shubham Gupta
ViT
17
6
0
30 Apr 2021
Multi-view Deep One-class Classification: A Systematic Exploration
Multi-view Deep One-class Classification: A Systematic Exploration
Siqi Wang
Jiyuan Liu
Guang Yu
Xinwang Liu
Sihang Zhou
En Zhu
Yuexiang Yang
Jianping Yin
11
1
0
27 Apr 2021
Towards Open-World Text-Guided Face Image Generation and Manipulation
Towards Open-World Text-Guided Face Image Generation and Manipulation
Weihao Xia
Yujiu Yang
Jing-Hao Xue
Baoyuan Wu
DiffM
33
39
0
18 Apr 2021
Integrating Information Theory and Adversarial Learning for Cross-modal
  Retrieval
Integrating Information Theory and Adversarial Learning for Cross-modal Retrieval
Wei-Neng Chen
Yu Liu
E. Bakker
M. Lew
GAN
11
27
0
11 Apr 2021
A Comprehensive Review of the Video-to-Text Problem
A Comprehensive Review of the Video-to-Text Problem
Jesus Perez-Martin
B. Bustos
S. Guimarães
I. Sipiran
Jorge A. Pérez
Grethel Coello Said
13
17
0
27 Mar 2021
Sequential Learning on Liver Tumor Boundary Semantics and Prognostic
  Biomarker Mining
Sequential Learning on Liver Tumor Boundary Semantics and Prognostic Biomarker Mining
Jieneng Chen
K. Yan
Yu-Dong Zhang
Youbao Tang
Xun Xu
...
Lingyun Huang
Jing Xiao
Alan Yuille
Ya-Qin Zhang
Le Lu
6
2
0
09 Mar 2021
Analysis of Convolutional Decoder for Image Caption Generation
Analysis of Convolutional Decoder for Image Caption Generation
Sulabh Katiyar
S. Borgohain
13
0
0
08 Mar 2021
A Universal Model for Cross Modality Mapping by Relational Reasoning
A Universal Model for Cross Modality Mapping by Relational Reasoning
Zun Li
Congyan Lang
Liqian Liang
Tao Wang
Songhe Feng
Jun Wu
Yidong Li
11
2
0
26 Feb 2021
Comparative evaluation of CNN architectures for Image Caption Generation
Comparative evaluation of CNN architectures for Image Caption Generation
Sulabh Katiyar
S. Borgohain
8
24
0
23 Feb 2021
Image Captioning using Deep Stacked LSTMs, Contextual Word Embeddings
  and Data Augmentation
Image Captioning using Deep Stacked LSTMs, Contextual Word Embeddings and Data Augmentation
Sulabh Katiyar
S. Borgohain
VLM
8
14
0
22 Feb 2021
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
Wei-Ning Hsu
David F. Harwath
Christopher Song
James R. Glass
CLIP
27
66
0
31 Dec 2020
SubICap: Towards Subword-informed Image Captioning
SubICap: Towards Subword-informed Image Captioning
Naeha Sharif
Bennamoun
Wei Liu
Syed Afaq Ali Shah
20
2
0
24 Dec 2020
AutoCaption: Image Captioning with Neural Architecture Search
AutoCaption: Image Captioning with Neural Architecture Search
Xinxin Zhu
Weining Wang
Longteng Guo
Jing Liu
16
9
0
16 Dec 2020
StacMR: Scene-Text Aware Cross-Modal Retrieval
StacMR: Scene-Text Aware Cross-Modal Retrieval
Andrés Mafla
Rafael Sampaio de Rezende
Lluís Gómez
Diane Larlus
Dimosthenis Karatzas
3DV
37
14
0
08 Dec 2020
TediGAN: Text-Guided Diverse Face Image Generation and Manipulation
TediGAN: Text-Guided Diverse Face Image Generation and Manipulation
Weihao Xia
Yujiu Yang
Jing-Hao Xue
Baoyuan Wu
DiffM
38
23
0
06 Dec 2020
Robust Image Captioning
Robust Image Captioning
Daniel Yarnell
Xian Wang
11
0
0
06 Dec 2020
Understanding Guided Image Captioning Performance across Domains
Understanding Guided Image Captioning Performance across Domains
Edwin G. Ng
Bo Pang
P. Sharma
Radu Soricut
21
24
0
04 Dec 2020
BERT-hLSTMs: BERT and Hierarchical LSTMs for Visual Storytelling
BERT-hLSTMs: BERT and Hierarchical LSTMs for Visual Storytelling
Jing Su
Qingyun Dai
Frank Guerin
Mian Zhou
16
24
0
03 Dec 2020
Diverse Image Captioning with Context-Object Split Latent Spaces
Diverse Image Captioning with Context-Object Split Latent Spaces
Shweta Mahajan
Stefan Roth
11
41
0
02 Nov 2020
Personalized Multimodal Feedback Generation in Education
Personalized Multimodal Feedback Generation in Education
Haochen Liu
Zitao Liu
Zhongqin Wu
Jiliang Tang
24
9
0
31 Oct 2020
DialogueTRM: Exploring the Intra- and Inter-Modal Emotional Behaviors in
  the Conversation
DialogueTRM: Exploring the Intra- and Inter-Modal Emotional Behaviors in the Conversation
Yuzhao Mao
Qi Sun
Guang Liu
Xiaojie Wang
Weiguo Gao
Xuan Li
Jianping Shen
19
24
0
15 Oct 2020
Spatial Attention as an Interface for Image Captioning Models
Spatial Attention as an Interface for Image Captioning Models
P. Sadler
12
0
0
29 Sep 2020
Previous
123456789
Next