arXiv: 2001.09907
PMIndia -- A Collection of Parallel Corpora of Languages of India


27 January 2020
Barry Haddow
Faheem Kirefu

Papers citing "PMIndia -- A Collection of Parallel Corpora of Languages of India"

35 / 35 papers shown
Leveraging the Cross-Domain & Cross-Linguistic Corpus for Low Resource NMT: A Case Study On Bhili-Hindi-English Parallel Corpus
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
Pooja Singh
Shashwat Bhardwaj
V. Sharma
Sandeep Kumar
129
0
0
01 Nov 2025
A kinetic-based regularization method for data science applications
Abhisek Ganguly
Alessandro Gabbana
Vybhav Rao
Sauro Succi
Santosh Ansumali
375
5
0
06 Mar 2025
Using Language Models to Disambiguate Lexical Choices in Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Josh Barua
Sanjay Subramanian
Kayo Yin
Alane Suhr
198
2
0
08 Nov 2024
SPRING Lab IITM's submission to Low Resource Indic Language Translation Shared Task
Conference on Machine Translation (WMT), 2024
Hamees Sayed
Advait Joglekar
S. Umesh
270
1
0
01 Nov 2024
Decoding the Diversity: A Review of the Indic AI Research Landscape
Sankalp KJ
Vinija Jain
S. Bhaduri
Tamoghna Roy
Vasu Sharma
266
9
0
13 Jun 2024
ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models
David Anugraha
Genta Indra Winata
Chenyue Li
Patrick Amadeus Irawan
En-Shiun Annie Lee
355
13
0
13 Jun 2024
APE-then-QE: Correcting then Filtering Pseudo Parallel Corpora for MT Training Data Creation
Akshay Batheja
S. Deoghare
Helen Treharne
Pushpak Bhattacharyya
139
1
0
18 Dec 2023
Improving Access to Justice for the Indian Population: A Benchmark for Evaluating Translation of Legal Text to Indian Languages
Sayan Mahapatra
Debtanu Datta
Shubham Soni
A. Goswami
Saptarshi Ghosh
AILaw, ELM
136
3
0
15 Oct 2023
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Neural Information Processing Systems (NeurIPS), 2023
Sneha Kudugunta
Isaac Caswell
Biao Zhang
Xavier Garcia
Christopher A. Choquette-Choo
...
Derrick Xin
Aditya Kusupati
Romi Stella
Ankur Bapna
Orhan Firat
286
201
0
09 Sep 2023
"A Little is Enough": Few-Shot Quality Estimation based Corpus Filtering improves Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Akshay Batheja
P. Bhattacharyya
250
5
0
06 Jun 2023
Leveraging Auxiliary Domain Parallel Data in Intermediate Task Fine-tuning for Low-resource Translation
Shravan Nayak
Surangika Ranathunga
Sarubi Thillainathan
Rikki Hung
Anthony Rinaldi
Yining Wang
Jonah Mackey
Andrew Ho
E. Lee
298
6
0
02 Jun 2023
AxomiyaBERTa: A Phonologically-aware Transformer Model for Assamese
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Abhijnan Nath
Sheikh Mannan
Nikhil Krishnaswamy
189
7
0
23 May 2023
Machine Translation by Projecting Text into the Same Phonetic-Orthographic Space Using a Common Encoding
Amit Kumar
Shantipriya Parida
A. Pratap
Anil Kumar Singh
216
2
0
21 May 2023
PMIndiaSum: Multilingual and Cross-lingual Headline Summarization for Languages in India
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ashok Urlana
Pinzhen Chen
Zheng Zhao
Shay B. Cohen
Manish Shrivastava
Barry Haddow
191
13
0
15 May 2023
Evaluating Inter-Bilingual Semantic Parsing for Indian Languages
Divyanshu Aggarwal
V. Gupta
Anoop Kunchukuttan
210
3
0
25 Apr 2023
Improving Machine Translation with Phrase Pair Injection and Corpus Filtering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Akshay Batheja
P. Bhattacharyya
138
15
0
19 Jan 2023
Towards Leaving No Indic Language Behind: Building Monolingual Corpora, Benchmark and Models for Indic Languages
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Sumanth Doddapaneni
Rahul Aralikatte
Gowtham Ramesh
Shreyansh Goyal
Mitesh M. Khapra
Anoop Kunchukuttan
Pratyush Kumar
ELM
356
124
0
11 Dec 2022
Improving Multilingual Neural Machine Translation System for Indic Languages
Sudhansu Bala Das
Atharv Biradar
Tapas Kumar Mishra
B. Patra
237
41
0
27 Sep 2022
NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Genta Indra Winata
Alham Fikri Aji
Samuel Cahyawijaya
Rahmad Mahendra
Fajri Koto
...
Pascale Fung
Timothy Baldwin
Jey Han Lau
Rico Sennrich
Sebastian Ruder
259
110
0
31 May 2022
Building Machine Translation Systems for the Next Thousand Languages
Ankur Bapna
Isaac Caswell
Julia Kreutzer
Orhan Firat
D. Esch
...
Apurva Shah
Yanping Huang
Zhiwen Chen
Yonghui Wu
Macduff Hughes
325
110
0
09 May 2022
IndicXNLI: Evaluating Multilingual Inference for Indian Languages
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Divyanshu Aggarwal
V. Gupta
Anoop Kunchukuttan
172
35
0
19 Apr 2022
Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation?
Findings (Findings), 2022
E. Lee
Sarubi Thillainathan
Shravan Nayak
Surangika Ranathunga
David Ifeoluwa Adelani
Ruisi Su
Arya D. McCarthy
VLM
356
51
0
16 Mar 2022
Cost-Effective Training in Low-Resource Neural Machine Translation
Sai Koneru
Danni Liu
Jan Niehues
138
1
0
14 Jan 2022
Analyzing Architectures for Neural Machine Translation Using Low Computational Resources
Aditya Mandke
Onkar Litake
Dipali M. Kadam
149
1
0
06 Nov 2021
MURAL: Multimodal, Multitask Retrieval Across Languages
Aashi Jain
Mandy Guo
Krishna Srinivasan
Ting-Li Chen
Sneha Kudugunta
Chao Jia
Yinfei Yang
Jason Baldridge
VLM
304
61
0
10 Sep 2021
IndicBART: A Pre-trained Model for Indic Natural Language Generation
Findings (Findings), 2021
Raj Dabre
Himani Shrotriya
Anoop Kunchukuttan
Ratish Puduppully
Mitesh M. Khapra
Pratyush Kumar
252
87
0
07 Sep 2021
Survey of Low-Resource Machine Translation
Computational Linguistics (CL), 2021
Barry Haddow
Rachel Bawden
Antonio Valerio Miceli Barone
Jindřich Helcl
Alexandra Birch
AIMat
515
200
0
01 Sep 2021
AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
VLM, LM&MA
313
314
0
12 Aug 2021
Itihasa: A large-scale corpus for Sanskrit to English translation
Workshop on Asian Translation (WAT), 2021
Rahul Aralikatte
Miryam de Lhoneux
Anoop Kunchukuttan
Anders Søgaard
223
27
0
06 Jun 2021
Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages
Transactions of the Association for Computational Linguistics (TACL), 2021
Gowtham Ramesh
Sumanth Doddapaneni
Aravinth Bheemaraj
Mayank Jobanputra
AK Raghavan
...
K. Deepak
Vivek Raghavan
Anoop Kunchukuttan
Pratyush Kumar
Mitesh Khapra
LRM
373
269
0
12 Apr 2021
Unsupervised Machine Translation On Dravidian Languages
Sai Koneru
Danni Liu
Jan Niehues
218
7
0
29 Mar 2021
MuRIL: Multilingual Representations for Indian Languages
Simran Khanuja
Diksha Bansal
Sarvesh Mehtani
Savya Khosla
Atreyee Dey
...
Shachi Dave
Shruti Gupta
Subhash Chandra Bose Gali
Vishnu Subramanian
Partha P. Talukdar
356
377
0
19 Mar 2021
Improving Zero-Shot Translation by Disentangling Positional Information
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Danni Liu
Jan Niehues
James Cross
Francisco Guzmán
Xian Li
231
49
0
30 Dec 2020
Exploring Pair-Wise NMT for Indian Languages
ICON (ICON), 2020
Kartheek Akella
Sai Himal Allu
S. Ragupathi
Aman Singhal
Zeeshan Khan
Vinay P. Namboodiri
C. V. Jawahar
163
8
0
10 Dec 2020
Revisiting Low Resource Status of Indian Languages in Machine Translation
Jerin Philip
Shashank Siripragada
Vinay P. Namboodiri
C. V. Jawahar
263
30
0
11 Aug 2020