arXiv: 1605.00459
Multi30K: Multilingual English-German Image Descriptions

2 May 2016
Desmond Elliott
Stella Frank
K. Sima'an
Lucia Specia
    VLM

Papers citing "Multi30K: Multilingual English-German Image Descriptions"

50 / 102 papers shown
TopicVD: A Topic-Based Dataset of Video-Guided Multimodal Machine Translation for Documentaries
Jinze Lv
Jian Chen
Zi Long
Xianghua Fu
Yin Chen
VGen
42
0
0
09 May 2025
A Transformer-based Neural Architecture Search Method
Shang Wang
Huanrong Tang
Jianquan Ouyang
28
0
0
02 May 2025
Memory Reviving, Continuing Learning and Beyond: Evaluation of Pre-trained Encoders and Decoders for Multimodal Machine Translation
Zhuang Yu
Shiliang Sun
Jing Zhao
Tengfei Song
Hao-Yu Yang
48
0
0
25 Apr 2025
Florenz: Scaling Laws for Systematic Generalization in Vision-Language Models
Julian Spravil
Sebastian Houben
Sven Behnke
VLM
70
0
0
12 Mar 2025
Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation
Yingfeng Luo
Tong Zheng
Yongyu Mu
B. Li
Qinghong Zhang
...
Ziqiang Xu
Peinan Feng
Xiaoqian Liu
Tong Xiao
Jingbo Zhu
AI4CE
158
0
0
09 Mar 2025
Chitranuvad: Adapting Multi-Lingual LLMs for Multimodal Translation
Shaharukh Khan
Ayush Tarun
Ali Faraz
Palash Kamble
Vivek Dahiya
Praveen Kumar Pokala
Ashish Kulkarni
Chandra Khatri
Abhinav Ravi
Shubham Agarwal
136
0
0
27 Feb 2025
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation
Mohammad Mahdi Abootorabi
Amirhosein Zobeiri
Mahdi Dehghani
Mohammadali Mohammadkhani
Bardia Mohammadi
Omid Ghahroodi
M. Baghshah
Ehsaneddin Asgari
RALM
100
4
0
12 Feb 2025
Large Multimodal Models for Low-Resource Languages: A Survey
Marian Lupascu
Ana-Cristina Rogoz
Mihai-Sorin Stupariu
Radu Tudor Ionescu
61
1
0
08 Feb 2025
Brain-inspired sparse training enables Transformers and LLMs to perform as fully connected
Yingtao Zhang
Jialin Zhao
Wenjing Wu
Ziheng Liao
Umberto Michieli
C. Cannistraci
51
0
0
31 Jan 2025
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
Wenhao Chai
Enxin Song
Y. Du
Chenlin Meng
Vashisht Madhavan
Omer Bar-Tal
Jeng-Neng Hwang
Saining Xie
Christopher D. Manning
3DV
84
25
0
04 Oct 2024
Towards Zero-Shot Multimodal Machine Translation
Matthieu Futeral
Cordelia Schmid
Benoît Sagot
Rachel Bawden
35
3
0
18 Jul 2024
AnyTrans: Translate AnyText in the Image with Large Scale Models
Zhipeng Qian
Pei Zhang
Baosong Yang
Kai Fan
Yiwei Ma
Derek F. Wong
Xiaoshuai Sun
Rongrong Ji
VLM
40
1
0
17 Jun 2024
Image captioning in different languages
Emiel van Miltenburg
VLM
39
0
0
31 May 2024
Relay Decoding: Concatenating Large Language Models for Machine Translation
Chengpeng Fu
Xiaocheng Feng
Yi-Chong Huang
Wenshuai Huo
Baohang Li
Hui Wang
Bing Qin
Ting Liu
24
0
0
05 May 2024
Constructing Multilingual Visual-Text Datasets Revealing Visual Multilingual Ability of Vision Language Models
Jesse Atuhurra
Iqra Ali
Tatsuya Hiraoka
Hidetaka Kamigaito
Tomoya Iwakura
Taro Watanabe
44
1
0
29 Mar 2024
CoTBal: Comprehensive Task Balancing for Multi-Task Visual Instruction Tuning
Yanqi Dai
Dong Jing
Nanyi Fei
Zhiwu Lu
Guoxing Yang
55
3
0
07 Mar 2024
Detecting Concrete Visual Tokens for Multimodal Machine Translation
Braeden Bowen
Vipin Vijayan
Scott Grigsby
Timothy Anderson
Jeremy Gwinnup
26
2
0
05 Mar 2024
Evaluating Bias and Fairness in Gender-Neutral Pretrained Vision-and-Language Models
Laura Cabello
Emanuele Bugliarello
Stephanie Brandl
Desmond Elliott
23
7
0
26 Oct 2023
Maestro: Uncovering Low-Rank Structures via Trainable Decomposition
Samuel Horváth
Stefanos Laskaridis
Shashank Rajput
Hongyi Wang
BDL
32
4
0
28 Aug 2023
MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
Bang-ju Yang
Fenglin Liu
X. Wu
Yaowei Wang
Xu Sun
Yuexian Zou
VLM
CLIP
44
13
0
25 Aug 2023
How Good Are LLMs at Out-of-Distribution Detection?
Bo Liu
Li-Ming Zhan
Zexin Lu
Yu Feng
Lei Xue
Xiao-Ming Wu
OODD
30
8
0
20 Aug 2023
AltDiffusion: A Multilingual Text-to-Image Diffusion Model
Fulong Ye
Guangyi Liu
Xinya Wu
Ledell Yu Wu
VLM
32
25
0
19 Aug 2023
Transformer-based Joint Source Channel Coding for Textual Semantic Communication
Shicong Liu
Zhenke Gao
Gaojie Chen
Yu Su
Lu Peng
14
3
0
23 Jul 2023
Iterative Adversarial Attack on Image-guided Story Ending Generation
Youze Wang
Wenbo Hu
Richang Hong
32
3
0
16 May 2023
RC3: Regularized Contrastive Cross-lingual Cross-modal Pre-training
Chulun Zhou
Yunlong Liang
Fandong Meng
Jinan Xu
Jinsong Su
Jie Zhou
VLM
23
4
0
13 May 2023
Few-shot Multimodal Multitask Multilingual Learning
Aman Chadha
Vinija Jain
45
0
0
19 Feb 2023
Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation
Yaoming Zhu
Zewei Sun
Shanbo Cheng
Yuyang Huang
Liwei Wu
Mingxuan Wang
26
10
0
20 Dec 2022
Beyond Mahalanobis-Based Scores for Textual OOD Detection
Pierre Colombo
Eduardo Dadalto Camara Gomes
Guillaume Staerman
Nathan Noiry
Pablo Piantanida
OODD
41
5
0
24 Nov 2022
Multi-Level Knowledge Distillation for Out-of-Distribution Detection in Text
Qianhui Wu
Huiqiang Jiang
Haonan Yin
Börje F. Karlsson
Chin-Yew Lin
30
10
0
21 Nov 2022
ERNIE-UniX2: A Unified Cross-lingual Cross-modal Framework for Understanding and Generation
Bin Shan
Yaqian Han
Weichong Yin
Shuohuan Wang
Yu Sun
Hao Tian
Hua-Hong Wu
Haifeng Wang
MLLM
VLM
11
7
0
09 Nov 2022
Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks
Colin Leong
Joshua Nemecek
Jacob Mansdorfer
Anna Filighera
A. Owodunni
Daniel Whitenack
VLM
AI4CE
39
24
0
26 Oct 2022
Low-resource Neural Machine Translation with Cross-modal Alignment
Zhe Yang
Qingkai Fang
Yang Feng
VLM
34
9
0
13 Oct 2022
MuMUR : Multilingual Multimodal Universal Retrieval
Avinash Madasu
Estelle Aflalo
Gabriela Ben-Melech Stan
Shachar Rosenman
Shao-Yen Tseng
Gedas Bertasius
Vasudev Lal
39
3
0
24 Aug 2022
Scalable K-FAC Training for Deep Neural Networks with Distributed Preconditioning
Lin Zhang
S. Shi
Wei Wang
Bo-wen Li
28
10
0
30 Jun 2022
VALHALLA: Visual Hallucination for Machine Translation
Yi Li
Rameswar Panda
Yoon Kim
Chun-Fu Chen
Rogerio Feris
David D. Cox
Nuno Vasconcelos
MLLM
38
38
0
31 May 2022
BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset
Mohammad Faiyaz Khan
S. M. S. Shifath
Md. Saiful Islam
16
6
0
28 May 2022
A Blessing of Dimensionality in Membership Inference through Regularization
Jasper Tan
Daniel LeJeune
Blake Mason
Hamid Javadi
Richard G. Baraniuk
32
18
0
27 May 2022
Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset
Ashish V. Thapliyal
Jordi Pont-Tuset
Xi Chen
Radu Soricut
VGen
84
72
0
25 May 2022
HiVLP: Hierarchical Vision-Language Pre-Training for Fast Image-Text Retrieval
Feilong Chen
Xiuyi Chen
Jiaxin Shi
Duzhen Zhang
Jianlong Chang
Qi Tian
VLM
CLIP
34
6
0
24 May 2022
Utilizing Language-Image Pretraining for Efficient and Robust Bilingual Word Alignment
Tuan Dinh
Jy-yong Sohn
Shashank Rajput
Timothy Ossowski
Yifei Ming
Junjie Hu
Dimitris Papailiopoulos
Kangwook Lee
20
0
0
23 May 2022
Neural Machine Translation with Phrase-Level Universal Visual Representations
Qingkai Fang
Yang Feng
31
40
0
19 Mar 2022
Delving Deeper into Cross-lingual Visual Question Answering
Chen Cecilia Liu
Jonas Pfeiffer
Anna Korhonen
Ivan Vulić
Iryna Gurevych
26
8
0
15 Feb 2022
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages
Emanuele Bugliarello
Fangyu Liu
Jonas Pfeiffer
Siva Reddy
Desmond Elliott
E. Ponti
Ivan Vulić
MLLM
VLM
ELM
45
62
0
27 Jan 2022
VISA: An Ambiguous Subtitles Dataset for Visual Scene-Aware Machine Translation
Yihang Li
Shuichiro Shimizu
Weiqi Gu
Chenhui Chu
Sadao Kurohashi
13
13
0
20 Jan 2022
Arch-Net: Model Distillation for Architecture Agnostic Model Deployment
Weixin Xu
Zipeng Feng
Shuangkang Fang
Song Yuan
Yi Yang
Shuchang Zhou
MQ
24
1
0
01 Nov 2021
On the Privacy Risks of Deploying Recurrent Neural Networks in Machine Learning Models
Yunhao Yang
Parham Gohari
Ufuk Topcu
AAML
28
3
0
06 Oct 2021
xGQA: Cross-Lingual Visual Question Answering
Jonas Pfeiffer
Gregor Geigle
Aishwarya Kamath
Jan-Martin O. Steitz
Stefan Roth
Ivan Vulić
Iryna Gurevych
28
56
0
13 Sep 2021
Vision Matters When It Should: Sanity Checking Multimodal Machine Translation Models
Jiaoda Li
Duygu Ataman
Rico Sennrich
18
28
0
08 Sep 2021
Product-oriented Machine Translation with Cross-modal Cross-lingual Pre-training
Yuqing Song
Shizhe Chen
Qin Jin
Wei Luo
Jun Xie
Fei Huang
16
18
0
25 Aug 2021
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision
Zirui Wang
Jiahui Yu
Adams Wei Yu
Zihang Dai
Yulia Tsvetkov
Yuan Cao
VLM
MLLM
51
779
0
24 Aug 2021