2001.09907
Cited By
PMIndia -- A Collection of Parallel Corpora of Languages of India
27 January 2020
Barry Haddow
Faheem Kirefu
Papers citing
"PMIndia -- A Collection of Parallel Corpora of Languages of India"
35 / 35 papers shown
Leveraging the Cross-Domain & Cross-Linguistic Corpus for Low Resource NMT: A Case Study On Bhili-Hindi-English Parallel Corpus
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Pooja Singh
Shashwat Bhardwaj
V. Sharma
Sandeep Kumar
129
0
0
01 Nov 2025
A kinetic-based regularization method for data science applications
Abhisek Ganguly
Alessandro Gabbana
Vybhav Rao
Sauro Succi
Santosh Ansumali
375
5
0
06 Mar 2025
Using Language Models to Disambiguate Lexical Choices in Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Josh Barua
Sanjay Subramanian
Kayo Yin
Alane Suhr
198
2
0
08 Nov 2024
SPRING Lab IITM's submission to Low Resource Indic Language Translation Shared Task
Conference on Machine Translation (WMT), 2024
Hamees Sayed
Advait Joglekar
S. Umesh
270
1
0
01 Nov 2024
Decoding the Diversity: A Review of the Indic AI Research Landscape
Sankalp KJ
Vinija Jain
S. Bhaduri
Tamoghna Roy
Vasu Sharma
266
9
0
13 Jun 2024
ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models
David Anugraha
Genta Indra Winata
Chenyue Li
Patrick Amadeus Irawan
En-Shiun Annie Lee
355
13
0
13 Jun 2024
APE-then-QE: Correcting then Filtering Pseudo Parallel Corpora for MT Training Data Creation
Akshay Batheja
S. Deoghare
Diptesh Kanojia
Pushpak Bhattacharyya
139
1
0
18 Dec 2023
Improving Access to Justice for the Indian Population: A Benchmark for Evaluating Translation of Legal Text to Indian Languages
Sayan Mahapatra
Debtanu Datta
Shubham Soni
A. Goswami
Saptarshi Ghosh
AILaw
ELM
136
3
0
15 Oct 2023
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Neural Information Processing Systems (NeurIPS), 2023
Sneha Kudugunta
Isaac Caswell
Biao Zhang
Xavier Garcia
Christopher A. Choquette-Choo
...
Derrick Xin
Aditya Kusupati
Romi Stella
Ankur Bapna
Orhan Firat
286
201
0
09 Sep 2023
"A Little is Enough": Few-Shot Quality Estimation based Corpus Filtering improves Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Akshay Batheja
P. Bhattacharyya
250
5
0
06 Jun 2023
Leveraging Auxiliary Domain Parallel Data in Intermediate Task Fine-tuning for Low-resource Translation
Shravan Nayak
Surangika Ranathunga
Sarubi Thillainathan
Rikki Hung
Anthony Rinaldi
Yining Wang
Jonah Mackey
Andrew Ho
E. Lee
298
6
0
02 Jun 2023
AxomiyaBERTa: A Phonologically-aware Transformer Model for Assamese
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Abhijnan Nath
Sheikh Mannan
Nikhil Krishnaswamy
189
7
0
23 May 2023
Machine Translation by Projecting Text into the Same Phonetic-Orthographic Space Using a Common Encoding
Amit Kumar
Shantipriya Parida
A. Pratap
Anil Kumar Singh
216
2
0
21 May 2023
PMIndiaSum: Multilingual and Cross-lingual Headline Summarization for Languages in India
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ashok Urlana
Pinzhen Chen
Zheng Zhao
Shay B. Cohen
Manish Shrivastava
Barry Haddow
191
13
0
15 May 2023
Evaluating Inter-Bilingual Semantic Parsing for Indian Languages
Divyanshu Aggarwal
V. Gupta
Anoop Kunchukuttan
210
3
0
25 Apr 2023
Improving Machine Translation with Phrase Pair Injection and Corpus Filtering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Akshay Batheja
P. Bhattacharyya
138
15
0
19 Jan 2023
Towards Leaving No Indic Language Behind: Building Monolingual Corpora, Benchmark and Models for Indic Languages
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Sumanth Doddapaneni
Rahul Aralikatte
Gowtham Ramesh
Shreyansh Goyal
Mitesh M. Khapra
Anoop Kunchukuttan
Pratyush Kumar
ELM
356
124
0
11 Dec 2022
Improving Multilingual Neural Machine Translation System for Indic Languages
Sudhansu Bala Das
Atharv Biradar
Tapas Kumar Mishra
B. Patra
237
41
0
27 Sep 2022
NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Genta Indra Winata
Alham Fikri Aji
Samuel Cahyawijaya
Rahmad Mahendra
Fajri Koto
...
Pascale Fung
Timothy Baldwin
Jey Han Lau
Rico Sennrich
Sebastian Ruder
259
110
0
31 May 2022
Building Machine Translation Systems for the Next Thousand Languages
Ankur Bapna
Isaac Caswell
Julia Kreutzer
Orhan Firat
D. Esch
...
Apurva Shah
Yanping Huang
Zhifeng Chen
Yonghui Wu
Macduff Hughes
325
110
0
09 May 2022
IndicXNLI: Evaluating Multilingual Inference for Indian Languages
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Divyanshu Aggarwal
V. Gupta
Anoop Kunchukuttan
172
35
0
19 Apr 2022
Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation?
Findings (Findings), 2022
E. Lee
Sarubi Thillainathan
Shravan Nayak
Surangika Ranathunga
David Ifeoluwa Adelani
Ruisi Su
Arya D. McCarthy
VLM
356
51
0
16 Mar 2022
Cost-Effective Training in Low-Resource Neural Machine Translation
Sai Koneru
Danni Liu
Jan Niehues
138
1
0
14 Jan 2022
Analyzing Architectures for Neural Machine Translation Using Low Computational Resources
Aditya Mandke
Onkar Litake
Dipali M. Kadam
149
1
0
06 Nov 2021
MURAL: Multimodal, Multitask Retrieval Across Languages
Aashi Jain
Mandy Guo
Krishna Srinivasan
Ting Chen
Sneha Kudugunta
Chao Jia
Yinfei Yang
Jason Baldridge
VLM
304
61
0
10 Sep 2021
IndicBART: A Pre-trained Model for Indic Natural Language Generation
Findings (Findings), 2021
Raj Dabre
Himani Shrotriya
Anoop Kunchukuttan
Ratish Puduppully
Mitesh M. Khapra
Pratyush Kumar
252
87
0
07 Sep 2021
Survey of Low-Resource Machine Translation
Computational Linguistics (CL), 2021
Barry Haddow
Rachel Bawden
Antonio Valerio Miceli Barone
Jindřich Helcl
Alexandra Birch
AIMat
515
200
0
01 Sep 2021
AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
VLM
LM&MA
313
314
0
12 Aug 2021
Itihasa: A large-scale corpus for Sanskrit to English translation
Workshop on Asian Translation (WAT), 2021
Rahul Aralikatte
Miryam de Lhoneux
Anoop Kunchukuttan
Anders Søgaard
223
27
0
06 Jun 2021
Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages
Transactions of the Association for Computational Linguistics (TACL), 2021
Gowtham Ramesh
Sumanth Doddapaneni
Aravinth Bheemaraj
Mayank Jobanputra
AK Raghavan
...
K. Deepak
Vivek Raghavan
Anoop Kunchukuttan
Pratyush Kumar
Mitesh Khapra
LRM
373
269
0
12 Apr 2021
Unsupervised Machine Translation On Dravidian Languages
Sai Koneru
Danni Liu
Jan Niehues
218
7
0
29 Mar 2021
MuRIL: Multilingual Representations for Indian Languages
Simran Khanuja
Diksha Bansal
Sarvesh Mehtani
Savya Khosla
Atreyee Dey
...
Shachi Dave
Shruti Gupta
Subhash Chandra Bose Gali
Vishnu Subramanian
Partha P. Talukdar
356
377
0
19 Mar 2021
Improving Zero-Shot Translation by Disentangling Positional Information
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Danni Liu
Jan Niehues
James Cross
Francisco Guzmán
Xian Li
231
49
0
30 Dec 2020
Exploring Pair-Wise NMT for Indian Languages
International Conference on Natural Language Processing (ICON), 2020
Kartheek Akella
Sai Himal Allu
S. Ragupathi
Aman Singhal
Zeeshan Khan
Vinay P. Namboodiri
C. V. Jawahar
163
8
0
10 Dec 2020
Revisiting Low Resource Status of Indian Languages in Machine Translation
Jerin Philip
Shashank Siripragada
Vinay P. Namboodiri
C. V. Jawahar
263
30
0
11 Aug 2020