Data Augmentation and Terminology Integration for Domain-Specific Sinhala-English-Tamil Statistical Machine Translation

5 November 2020
Aloka Fernando
Surangika Ranathunga
G. Dias

Papers citing "Data Augmentation and Terminology Integration for Domain-Specific Sinhala-English-Tamil Statistical Machine Translation"

13 / 13 papers shown
Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation
Sarubi Thillainathan
Songchen Yuan
E. Lee
Sanath Jayasena
Surangika Ranathunga
322
2
0
28 Mar 2025
Improving the quality of Web-mined Parallel Corpora of Low-Resource Languages using Debiasing Heuristics
Aloka Fernando
Nisansa de Silva
Menan Velayuthan
Charitha Rathnayake
Surangika Ranathunga
413
1
0
26 Feb 2025
Unsupervised Bilingual Lexicon Induction for Low Resource Languages
Charitha Rathnayake
P. R. S. Thilakarathna
Uthpala Nethmini
Rishemjith Kaur
Surangika Ranathunga
287
0
0
22 Dec 2024
A Multi-way Parallel Named Entity Annotated Corpus for English, Tamil and Sinhala
Surangika Ranathunga
Asanka Ranasinghe
Janaka Shamal
Ayodya Dandeniya
Rashmi Galappaththi
Malithi Samaraweera
434
1
0
03 Dec 2024
SiTSE: Sinhala Text Simplification Dataset and Evaluation
Surangika Ranathunga
Rumesh Sirithunga
Himashi Rathnayake
Lahiru De Silva
Thamindu Aluthwala
Saman Peramuna
Ravi Shekhar
413
3
0
02 Dec 2024
Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation
Tong Su
Xin Peng
Sarubi Thillainathan
David Guzmán
Surangika Ranathunga
En-Shiun Annie Lee
259
8
0
05 Apr 2024
Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024
Surangika Ranathunga
Nisansa de Silva
Menan Velayuthan
Aloka Fernando
Charitha Rathnayake
310
18
0
12 Feb 2024
Leveraging Auxiliary Domain Parallel Data in Intermediate Task Fine-tuning for Low-resource Translation
Shravan Nayak
Surangika Ranathunga
Sarubi Thillainathan
Rikki Hung
Anthony Rinaldi
Yining Wang
Jonah Mackey
Andrew Ho
E. Lee
324
7
0
02 Jun 2023
Data Augmentation to Address Out-of-Vocabulary Problem in Low-Resource Sinhala-English Neural Machine Translation
Pacific Asia Conference on Language, Information and Computation (PACLIC), 2022
Aloka Fernando
Surangika Ranathunga
249
8
0
18 May 2022
Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation?
Findings (Findings), 2022
E. Lee
Sarubi Thillainathan
Shravan Nayak
Surangika Ranathunga
David Ifeoluwa Adelani
Ruisi Su
Arya D. McCarthy
VLM
404
53
0
16 Mar 2022
Metric Learning in Multilingual Sentence Similarity Measurement for Document Alignment
Charith Rajitha
Lakmali Piyarathne
Dilan Sachintha
Surangika Ranathunga
145
4
0
21 Aug 2021
Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages
Transactions of the Association for Computational Linguistics (TACL), 2021
Gowtham Ramesh
Sumanth Doddapaneni
Aravinth Bheemaraj
Mayank Jobanputra
AK Raghavan
...
K. Deepak
Vivek Raghavan
Anoop Kunchukuttan
Pratyush Kumar
Mitesh Khapra
LRM
440
280
0
12 Apr 2021
Survey on Publicly Available Sinhala Natural Language Processing Tools and Research
Nisansa de Silva
1.6K
66
0
05 Jun 2019