STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset

2 May 2017
Yuya Yoshikawa
Yutaro Shigeto
A. Takeuchi
    3DV

Papers citing "STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset"

16 / 66 papers shown
M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training
Minheng Ni
Haoyang Huang
Lin Su
Edward Cui
Taroon Bharti
Lijuan Wang
Jianfeng Gao
Dongdong Zhang
Nan Duan
288
7
0
04 Jun 2020
Captioning Images Taken by People Who Are Blind
European Conference on Computer Vision (ECCV), 2020
Danna Gurari
Yinan Zhao
Meng Zhang
Nilavra Bhattacharya
334
203
0
20 Feb 2020
UIT-ViIC: A Dataset for the First Evaluation on Vietnamese Image Captioning
International Conference on Computational Collective Intelligence (ICCCI), 2020
Q. Lam
Q. Le
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
133
20
0
01 Feb 2020
Multimodal Machine Translation through Visuals and Speech
Machine Translation (MT), 2019
U. Sulubacak
Ozan Caglayan
Stig-Arne Gronroos
Aku Rouhe
Desmond Elliott
Lucia Specia
Jörg Tiedemann
201
88
0
28 Nov 2019
Bootstrapping Disjoint Datasets for Multilingual Multimodal Representation Learning
Ákos Kádár
Grzegorz Chrupała
Afra Alishahi
Desmond Elliott
227
1
0
09 Nov 2019
Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Alireza Mohammadshahi
R. Lebret
Karl Aberer
118
12
0
08 Oct 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Journal of Artificial Intelligence Research (JAIR), 2019
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
406
142
0
22 Jul 2019
Unsupervised Bilingual Lexicon Induction from Mono-lingual Multimodal Data
AAAI Conference on Artificial Intelligence (AAAI), 2019
Shizhe Chen
Qin Jin
Alexander G. Hauptmann
SSL
86
9
0
02 Jun 2019
Models of Visually Grounded Speech Signal Pay Attention To Nouns: a Bilingual Experiment on English and Japanese
William N. Havard
Jean-Pierre Chevrot
Laurent Besacier
155
25
0
08 Feb 2019
How2: A Large-scale Dataset for Multimodal Language Understanding
Ramon Sanabria
Ozan Caglayan
Shruti Palaskar
Desmond Elliott
Loïc Barrault
Lucia Specia
Florian Metze
VGen
MLLM
253
313
0
01 Nov 2018
Neural Joking Machine : Humorous image captioning
Kota Yoshida
Munetaka Minoguchi
Kenichiro Wani
Akio Nakamura
Hirokatsu Kataoka
106
11
0
30 May 2018
COCO-CN for Cross-Lingual Image Tagging, Captioning and Retrieval
Xirong Li
Chaoxi Xu
Xiaoxu Wang
Weiyu Lan
Zhengxiong Jia
Gang Yang
Jieping Xu
290
181
0
22 May 2018
Findings of the Second Shared Task on Multimodal Machine Translation and Multilingual Image Description
Desmond Elliott
Stella Frank
Loïc Barrault
Fethi Bougares
Lucia Specia
VLM
188
230
0
19 Oct 2017
Emergent Translation in Multi-Agent Communication
Jason Lee
Kyunghyun Cho
Jason Weston
Douwe Kiela
215
69
0
12 Oct 2017
Image Pivoting for Learning Multilingual Multimodal Representations
Spandana Gella
Rico Sennrich
Frank Keller
Mirella Lapata
SSL
152
79
0
24 Jul 2017
Cross-linguistic differences and similarities in image descriptions
Emiel van Miltenburg
Desmond Elliott
Piek Vossen
VLM
194
34
0
06 Jul 2017