2210.10692
Separating Grains from the Chaff: Using Data Filtering to Improve Multilingual Translation for Low-Resourced African Languages
Conference on Machine Translation (WMT), 2022
19 October 2022
Idris Abdulmumin
Michael Beukman
Jesujoba Oluwadara Alabi
Chris C. Emezue
Everlyn Asiko
Tosin Adewumi
Shamsuddeen Hassan Muhammad
Mofetoluwa Adeyemi
Oreen Yousuf
Sahib Singh
T. Gwadabe
Papers citing
"Separating Grains from the Chaff: Using Data Filtering to Improve Multilingual Translation for Low-Resourced African Languages"
7 / 7 papers shown
Title
Synthetic Voice Data for Automatic Speech Recognition in African Languages
Brian DeRenzi
Anna Dixon
Mohamed Aymane Farhi
Christian Resch
164
2
0
23 Jul 2025
The NaijaVoices Dataset: Cultivating Large-Scale, High-Quality, Culturally-Rich Speech Data for African Languages
Chris C. Emezue
NaijaVoices Community
Busayo Awobade
A. Owodunni
Handel Emezue
...
Nefertiti Nneoma Emezue
Sewade Ogun
Bunmi Akinremi
David Ifeoluwa Adelani
Chris Pal
229
4
0
26 May 2025
HausaNLP: Current Status, Challenges and Future Directions for Hausa Natural Language Processing
Shamsuddeen Hassan Muhammad
Ibrahim Said Ahmad
Idris Abdulmumin
Falalu Ibrahim Lawan
Babangida Sani
...
Sani Abdullahi Sani
Ali Usman Umar
T. Gwadabe
Kenneth Church
Vukosi Marivate
AI4TS
352
1
0
20 May 2025
Critical Learning Periods: Leveraging Early Training Dynamics for Efficient Data Pruning
E. Chimoto
Jay Gala
Orevaoghene Ahia
Julia Kreutzer
Bruce A. Bassett
Sara Hooker
VLM
322
6
0
29 May 2024
Leveraging Closed-Access Multilingual Embedding for Automatic Sentence Alignment in Low Resource Languages
Idris Abdulmumin
Auwal Abubakar Khalid
Shamsuddeen Hassan Muhammad
Ibrahim Said Ahmad
L. Aliyu
Babangida Sani
B.M. Abduljalil
Sani Ahmad Hassan
193
0
0
20 Nov 2023
SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval)
International Workshop on Semantic Evaluation (SemEval), 2023
Shamsuddeen Hassan Muhammad
Idris Abdulmumin
Seid Muhie Yimam
David Ifeoluwa Adelani
Ibrahim Said Ahmad
N. Ousidhoum
Abinew Ali Ayele
Saif M. Mohammad
Meriem Beloucif
Sebastian Ruder
241
74
0
13 Apr 2023
The Impact of Data Corruption on Named Entity Recognition for Low-resourced Languages
Manuel A. Fokam
Michael Beukman
150
0
0
09 Aug 2022