1511.06709
Improving Neural Machine Translation Models with Monolingual Data
20 November 2015
Rico Sennrich
Barry Haddow
Alexandra Birch
Papers citing
"Improving Neural Machine Translation Models with Monolingual Data"
50 / 1,201 papers shown
Title
Scaling Low-Resource MT via Synthetic Data Generation with LLMs
Ona de Gibert
Joseph Attieh
Teemu Vahtola
Mikko Aulamo
Zihao Li
Raúl Vázquez
Tiancheng Hu
Jörg Tiedemann
SyDa
16
0
0
20 May 2025
Pivot Language for Low-Resource Machine Translation
Abhimanyu Talwar
Julien Laasri
7
0
0
20 May 2025
SMOTExT: SMOTE meets Large Language Models
Mateusz Bystroński
Mikołaj Hołysz
Grzegorz Piotrowski
Nitesh V. Chawla
Tomasz Kajdanowicz
12
0
0
19 May 2025
Data Augmentation With Back translation for Low Resource languages: A case of English and Luganda
Richard Kimera
DongNyeong Heo
Daniela N. Rim
Heeyoul Choi
164
0
0
05 May 2025
Bemba Speech Translation: Exploring a Low-Resource African Language
Muhammad Hazim Al Farouq
Aman Kassahun Wassie
Yasmin Moslem
46
0
0
05 May 2025
Towards High-Fidelity Synthetic Multi-platform Social Media Datasets via Large Language Models
Henry Tari
Nojus Sereiva
Rishabh Kaushal
T. Bertaglia
Adriana Iamnitchi
35
0
0
02 May 2025
Improving Retrieval-Augmented Neural Machine Translation with Monolingual Data
Maxime Bouthors
Josep Crego
François Yvon
RALM
LRM
56
0
0
30 Apr 2025
Even Small Reasoners Should Quote Their Sources: Introducing the Pleias-RAG Model Family
Pierre-Carl Langlais
Pavel Chizhov
Mattia Nee
Carlos Rosas Hinostroza
Matthieu Delsart
Irène Girard
Othman Hicheur
Anastasia Stasenko
Ivan P. Yamshchikov
LRM
68
0
0
25 Apr 2025
Automatic Evaluation Metrics for Document-level Translation: Overview, Challenges and Trends
Jiaxin Guo
Xiaoyu Chen
Zhiqiang Rao
Jinlong Yang
Zongyao Li
Hengchao Shang
Daimeng Wei
Hao Yang
44
0
0
21 Apr 2025
High-Resource Translation: Turning Abundance into Accessibility
Abhiram Reddy Yanampally
24
0
0
08 Apr 2025
Is LLM the Silver Bullet to Low-Resource Languages Machine Translation?
Yewei Song
Lujun Li
Cedric Lothritz
Saad Ezzini
Lama Sleem
Niccolo Gentile
Radu State
Tegawende F. Bissyande
Jacques Klein
52
1
0
31 Mar 2025
SPADE: Systematic Prompt Framework for Automated Dialogue Expansion in Machine-Generated Text Detection
Haoyi Li
Angela Yifei Yuan
Soyeon Caren Han
Christopher Leckie
50
0
0
19 Mar 2025
Synthetic Data Generation Using Large Language Models: Advances in Text and Code
Mihai Nadas
Laura Diosan
Andreea Tomescu
SyDa
72
0
0
18 Mar 2025
Domain Adaptation for Japanese Sentence Embeddings with Contrastive Learning based on Synthetic Sentence Generation
Zihao Chen
H. Handa
Miho Ohsaki
Kimiaki Shirahama
59
0
0
12 Mar 2025
A kinetic-based regularization method for data science applications
Abhisek Ganguly
Alessandro Gabbana
Vybhav Rao
Sauro Succi
Santosh Ansumali
52
0
0
06 Mar 2025
SpiritSight Agent: Advanced GUI Agent with One Look
Zhiyuan Huang
Ziming Cheng
Junting Pan
Zhaohui Hou
Mingjie Zhan
LLMAG
101
2
0
05 Mar 2025
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
Rui Zhao
Weijia Mao
Mike Zheng Shou
66
0
0
05 Mar 2025
ReaderLM-v2: Small Language Model for HTML to Markdown and JSON
Feng Wang
Zesheng Shi
Bo Wang
Nan Wang
Han Xiao
RALM
81
1
0
03 Mar 2025
Chitranuvad: Adapting Multi-Lingual LLMs for Multimodal Translation
Shaharukh Khan
Ayush Tarun
Ali Faraz
Palash Kamble
Vivek Dahiya
Praveen Kumar Pokala
Ashish Kulkarni
Chandra Khatri
Abhinav Ravi
Shubham Agarwal
181
0
0
27 Feb 2025
MAGE: Multi-Head Attention Guided Embeddings for Low Resource Sentiment Classification
Varun Vashisht
Shri Kiran Srinivasan
Mihir Konduskar
Jaskaran Singh Walia
Vukosi Marivate
47
0
0
25 Feb 2025
Generalizing From Short to Long: Effective Data Synthesis for Long-Context Instruction Tuning
Wenhao Zhu
Pinzhen Chen
Hanxu Hu
Shujian Huang
Fei Yuan
Jiajun Chen
Alexandra Birch
SyDa
72
1
0
24 Feb 2025
From Priest to Doctor: Domain Adaptation for Low-Resource Neural Machine Translation
Ali Marashian
Enora Rice
Luke Gessler
Alexis Palmer
K. Wense
81
1
0
24 Feb 2025
Deterministic Reversible Data Augmentation for Neural Machine Translation
Jiashu Yao
Heyan Huang
Zeming Liu
Yuhang Guo
51
0
0
21 Feb 2025
Text-to-SQL Domain Adaptation via Human-LLM Collaborative Data Annotation
Yuan Tian
Daniel Lee
Fei Wu
Tung Mai
Kun Qian
Siddhartha Sahai
Tianyi Zhang
Yunyao Li
SyDa
45
0
0
21 Feb 2025
Diversity-Oriented Data Augmentation with Large Language Models
Zaitian Wang
Jinghan Zhang
Xinhao Zhang
Kunpeng Liu
Pengfei Wang
Yuanchun Zhou
80
1
0
17 Feb 2025
TARDiS : Text Augmentation for Refining Diversity and Separability
Kyungmin Kim
Sanghun Im
Gibaeg Kim
Heung-Seon Oh
VLM
34
0
0
06 Jan 2025
Language verY Rare for All
Ibrahim Merad
Amos Wolf
Ziad Mazzawi
Yannick Léo
77
0
0
18 Dec 2024
Domain-adaptative Continual Learning for Low-resource Tasks: Evaluation on Nepali
Sharad Duwal
Suraj Prasai
Suresh Manandhar
CLL
84
1
0
18 Dec 2024
CALICO: Conversational Agent Localization via Synthetic Data Generation
Andy Rosenbaum
Pegah Kharazmi
Ershad Banijamali
Lu Zeng
Christopher DiPersio
...
Gokmen Oz
Clement Chung
Karolina Owczarzak
Fabian Triefenbach
Wael Hamza
SyDa
86
0
0
06 Dec 2024
EzSQL: An SQL intermediate representation for improving SQL-to-text Generation
Meher Bhardwaj
Hrishikesh Ethari
Dennis Singh Moirangthem
AI4TS
82
0
0
28 Nov 2024
Cyber-Attack Technique Classification Using Two-Stage Trained Large Language Models
Weiqiu You
Youngja Park
76
0
0
27 Nov 2024
Evaluating LLM Prompts for Data Augmentation in Multi-label Classification of Ecological Texts
Anna Glazkova
Olga Zakharova
82
2
0
22 Nov 2024
Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models
Yeming Wen
Swarat Chaudhuri
34
0
0
11 Nov 2024
BhasaAnuvaad: A Speech Translation Dataset for 13 Indian Languages
Sparsh Jain
Ashwin Sankar
Devilal Choudhary
Dhairya Suman
Nikhil Narasimhan
Mohammed Safi Ur Rahman Khan
Anoop Kunchukuttan
Mitesh M. Khapra
Raj Dabre
44
2
0
07 Nov 2024
Self-Compositional Data Augmentation for Scientific Keyphrase Generation
Mael Houbre
Florian Boudin
B. Daille
Akiko Aizawa
37
0
0
05 Nov 2024
Grounding Natural Language to SQL Translation with Data-Based Self-Explanations
Yuankai Fan
Tonghui Ren
Can Huang
Zhenying He
Xinyu Wang
LRM
47
1
0
05 Nov 2024
Constraint Back-translation Improves Complex Instruction Following of Large Language Models
Y. Qi
Hao Peng
Xueliang Wang
Bin Xu
Lei Hou
Juanzi Li
64
1
0
31 Oct 2024
Deep Learning and Data Augmentation for Detecting Self-Admitted Technical Debt
E. Sutoyo
Paris Avgeriou
Andrea Capiluppi
29
2
0
21 Oct 2024
Quantity vs. Quality of Monolingual Source Data in Automatic Text Translation: Can It Be Too Little If It Is Too Good?
Idris Abdulmumin
B. Galadanci
G. Aliyu
Shamsuddeen Hassan Muhammad
37
1
0
17 Oct 2024
A Little Human Data Goes A Long Way
Dhananjay Ashok
Jonathan May
SyDa
41
2
0
17 Oct 2024
Learning to Predict Usage Options of Product Reviews with LLM-Generated Labels
Leo Kohlenberg
Leonard Horns
Frederic Sadrieh
Nils Kiele
Matthis Clausen
Konstantin Ketterer
Avetis Navasardyan
Tamara Czinczoll
Gerard de Melo
Ralf Herbrich
34
0
0
16 Oct 2024
Expanding Chatbot Knowledge in Customer Service: Context-Aware Similar Question Generation Using Large Language Models
Mengze Hong
Yuanfeng Song
Di Jiang
Lu Wang
Zichang Guo
Yuanqin He
Zhiyang Su
Qing Li
40
2
0
16 Oct 2024
Effective Self-Mining of In-Context Examples for Unsupervised Machine Translation with LLMs
Abdellah El Mekki
Muhammad Abdul-Mageed
LRM
36
0
0
14 Oct 2024
ChakmaNMT: A Low-resource Machine Translation On Chakma Language
Aunabil Chakma
Aditya Chakma
Soham Khisa
Chumui Tripura
Masum Hasan
Rifat Shahriyar
23
0
0
14 Oct 2024
Extended Japanese Commonsense Morality Dataset with Masked Token and Label Enhancement
Takumi Ohashi
Tsubasa Nakagawa
Hitoshi Iyatomi
27
0
0
12 Oct 2024
SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs
Wenxi Chen
Ziyang Ma
Xiquan Li
Xuenan Xu
Yuzhe Liang
Zhisheng Zheng
Kai Yu
Xie Chen
23
5
0
12 Oct 2024
Neural machine translation system for Lezgian, Russian and Azerbaijani languages
Alidar Asvarov
Andrey Grabovoy
37
0
0
07 Oct 2024
Parallel Corpus Augmentation using Masked Language Models
Vibhuti Kumari
Narayana Murthy Kavi
24
0
0
04 Oct 2024
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
Seanie Lee
Haebin Seong
Dong Bok Lee
Minki Kang
Xiaoyin Chen
Dominik Wagner
Yoshua Bengio
Juho Lee
Sung Ju Hwang
67
2
0
02 Oct 2024
Is Preference Alignment Always the Best Option to Enhance LLM-Based Translation? An Empirical Analysis
Hippolyte Gisserot-Boukhlef
Ricardo Rei
Emmanuel Malherbe
Céline Hudelot
Pierre Colombo
Nuno M. Guerreiro
32
2
0
30 Sep 2024