The State and Fate of Linguistic Diversity and Inclusion in the NLP World

Annual Meeting of the Association for Computational Linguistics (ACL), 2020
20 April 2020
Pratik M. Joshi
Sebastin Santy
A. Budhiraja
Kalika Bali
Monojit Choudhury
    LMTD

Papers citing "The State and Fate of Linguistic Diversity and Inclusion in the NLP World"

50 / 573 papers shown
M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG
David Anugraha
Patrick Amadeus Irawan
A. Singh
En-Shiun Annie Lee
Genta Indra Winata
VLM
196
1
0
05 Dec 2025
Adapting Large Language Models to Low-Resource Tibetan: A Two-Stage Continual and Supervised Fine-Tuning Study
Lifeng Chen
Ryan Lai
Tianming Liu
CLL
188
0
0
03 Dec 2025
Modeling Topics and Sociolinguistic Variation in Code-Switched Discourse: Insights from Spanish-English and Spanish-Guaraní
Nemika Tyagi
Nelvin Licona-Guevara
Olga Kellert
90
0
0
03 Dec 2025
CACARA: Cross-Modal Alignment Leveraging a Text-Centric Approach for Cost-Effective Multimodal and Multilingual Learning
Diego A. B. Moreira
Alef Iury Ferreira
Jhessica Silva
G. O. D. Santos
Gustavo Bonil
...
Simone Tiemi Hashiguti
Nádia Da Silva
Carolina Scarton
Hélio Pedrini
Sandra Avila
116
0
0
29 Nov 2025
Named Entity Recognition for the Kurdish Sorani Language: Dataset Creation and Comparative Analysis
Bakhtawar Abdalla
Rebwar Mala Nabi
Hassan Eshkiki
Fabio Caraffini
121
1
0
27 Nov 2025
AfriStereo: A Culturally Grounded Dataset for Evaluating Stereotypical Bias in Large Language Models
Yann Le Beux
Oluchi Audu
Oche D. Ankeli
Dhananjay Balakrishnan
Melissah Weya
Marie D. Ralaiarinosy
Ignatius Ezeani
165
0
0
27 Nov 2025
Donors and Recipients: On Asymmetric Transfer Across Tasks and Languages with Parameter-Efficient Fine-Tuning
Kajetan Dymkiewicz
Ivan Vulić
Helen Yannakoudakis
Eilam Shapira
Roi Reichart
Anna Korhonen
191
1
0
17 Nov 2025
Rethinking what Matters: Effective and Robust Multilingual Realignment for Low-Resource Languages
Quang Phuoc Nguyen
David Anugraha
Felix Gaschi
Jun Bin Cheng
En-Shiun Annie Lee
220
0
0
09 Nov 2025
Who Gets Heard? Rethinking Fairness in AI for Music Systems
Atharva Mehta
Shivam Chauhan
Megha Sharma
Gus Xia
Kaustuv Kanti Ganguli
Nishanth Chandran
Zeerak Talat
Monojit Choudhury
131
0
0
08 Nov 2025
Evaluating Machine Translation Datasets for Low-Web Data Languages: A Gendered Lens
Hellina Hailu Nigatu
Bethelhem Yemane Mamo
Bontu Fufa Balcha
Debora Taye Tesfaye
Elbethel Daniel Zewdie
Ikram Behiru Nesiru
Jitu Ewnetu Hailu
Senait Mengesha Yayo
107
0
0
05 Nov 2025
EvalCards: A Framework for Standardized Evaluation Reporting
Ruchira Dhar
Danae Sanchez Villegas
Antonia Karamolegkou
Alice Schiavone
Yifei Yuan
...
Monorama Swain
Stephanie Brandl
Daniel Hershcovich
Anders Søgaard
Desmond Elliott
101
2
0
05 Nov 2025
Safer in Translation? Presupposition Robustness in Indic Languages
Aadi Palnitkar
Arjun Suresh
Rishi Rajesh
Puneet Puli
128
0
0
03 Nov 2025
Why Do Multilingual Reasoning Gaps Emerge in Reasoning Language Models?
Deokhyung Kang
Seonjeong Hwang
Daehui Kim
Hyounghun Kim
Gary Geunbae Lee
LRM
ELM
252
3
0
31 Oct 2025
Simple Additions, Substantial Gains: Expanding Scripts, Languages, and Lineage Coverage in URIEL+
Mason Shipton
York Hay Ng
Aditya Khan
Phuong H. Hoang
Xiang Lu
A. Seza Doğruöz
En-Shiun Annie Lee
185
0
0
31 Oct 2025
Between Myths and Metaphors: Rethinking LLMs for SRH in Conservative Contexts
Ameemah Humayun
Bushra Zubair
Maryam Mustafa
144
0
0
31 Oct 2025
Evaluating LLMs on Generating Age-Appropriate Child-Like Conversations
Syed Zohaib Hassan
Pål Halvorsen
Miriam S. Johnson
Pierre Lison
93
1
0
28 Oct 2025
Confabulations from ACL Publications (CAP): A Dataset for Scientific Hallucination Detection
Federica Gamba
Aman Sinha
Timothee Mickus
Raul Vazquez
Patanjali Bhamidipati
...
Aryan Chandramania
Rohit Agarwal
Chuyuan Li
Ioana Buhnila
Radhika Mamidi
HILM
235
4
0
25 Oct 2025
Modality Matching Matters: Calibrating Language Distances for Cross-Lingual Transfer in URIEL+
York Hay Ng
Aditya Khan
Xiang Lu
Matteo Salloum
Michael Zhou
Phuong H. Hoang
A. Seza Doğruöz
En-Shiun Annie Lee
207
1
0
22 Oct 2025
Identity-Aware Large Language Models require Cultural Reasoning
Alistair Plum
Anne-Marie Lutgen
Christoph Purschke
Achim Rettinger
LRM
145
5
0
21 Oct 2025
ChiKhaPo: A Large-Scale Multilingual Benchmark for Evaluating Lexical Comprehension and Generation in Large Language Models
Emily Chang
Niyati Bafna
ELM
196
0
0
19 Oct 2025
MERLIN: A Testbed for Multilingual Multimodal Entity Recognition and Linking
Sathyanarayanan Ramamoorthy
Vishwa Shah
Simran Khanuja
Zaid A. W. Sheikh
Shan Jie
Ann Chia
Shearman Chua
Graham Neubig
166
0
0
16 Oct 2025
Document Intelligence in the Era of Large Language Models: A Survey
Weishi Wang
Hengchang Hu
Zhijie Zhang
Zhaochen Li
Hongxin Shao
Daniel Dahlmeier
AI4TS
277
4
0
15 Oct 2025
Sparse Subnetwork Enhancement for Underrepresented Languages in Large Language Models
Daniil Gurgurov
Josef van Genabith
Simon Ostermann
MoE
258
0
0
15 Oct 2025
Cost Analysis of Human-corrected Transcription for Predominately Oral Languages
Yacouba Diarra
Nouhoum Souleymane Coulibaly
Michael Leventhal
95
3
0
14 Oct 2025
Invisible Languages of the LLM Universe
Saurabh Khanna
Xinxu Li
110
6
0
13 Oct 2025
BabyBabelLM: A Multilingual Benchmark of Developmentally Plausible Training Data
Jaap Jumelet
Abdellah Fourtassi
Akari Haga
Bastian Bunzeck
Bhargav Shandilya
...
Yurii Paniv
Ziyin Zhang
Arianna Bisazza
Alex Warstadt
Leshem Choshen
184
2
0
11 Oct 2025
HUME: Measuring the Human-Model Performance Gap in Text Embedding Tasks
Adnan El Assadi
Isaac Chung
Roman Solomatin
Niklas Muennighoff
Kenneth Enevoldsen
310
1
0
11 Oct 2025
SkipSR: Faster Super Resolution with Token Skipping
Rohan Choudhury
Shanchuan Lin
Jianyi Wang
Hao Chen
Qi Zhao
Feng Cheng
Lu Jiang
Kris Kitani
László A. Jeni
SupR
279
0
0
09 Oct 2025
Sunflower: A New Approach To Expanding Coverage of African Languages in Large Language Models
Benjamin Akera
Evelyn Nafula Ouma
Gilbert Yiga
Patrick Walukagga
Phionah Natukunda
...
Imran Sekalala
Nimpamya Janat Namara
Engineer Bainomugisha
Ernest Mwebaze
John Quinn
216
1
0
08 Oct 2025
Lemma Dilemma: On Lemma Generation Without Domain- or Language-Specific Training Data
Olia Toporkov
Alan Akbik
Rodrigo Agerri
183
0
0
08 Oct 2025
Pragyaan: Designing and Curating High-Quality Cultural Post-Training Datasets for Indian Languages
Neel Prabhanjan Rachamalla
Aravind Konakalla
Gautam Rajeev
Ashish Kulkarni
Chandra Khatri
Shubham Agarwal
185
2
0
08 Oct 2025
The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP
Sheriff Issaka
Keyi Wang
Yinka Ajibola
Oluwatumininu Samuel-Ipaye
Zhaoyi Zhang
...
Jemimah Osei
Carlene Ajeneza
Persis Boateng
Prisca Adwoa Dufie Yeboah
Saadia Gabriel
153
0
0
07 Oct 2025
mR3: Multilingual Rubric-Agnostic Reward Reasoning Models
David Anugraha
Shou-Yi Hung
Zilu Tang
Annie En-Shiun Lee
Derry Wijaya
Genta Indra Winata
LRM
539
7
0
01 Oct 2025
Multilingual Vision-Language Models, A Survey
Andrei-Alexandru Manea
Jindřich Libovický
VLM
215
1
0
26 Sep 2025
UPDESH: Synthesizing Grounded Instruction Tuning Data for 13 Indic Languages
Pranjal A. Chitale
Varun Gumma
Sanchit Ahuja
Prashant Kodali
Manan Uppadhyay
Deepthi Sudharsan
Sunayana Sitaram
SyDa
333
0
0
25 Sep 2025
Low-Resource English-Tigrinya MT: Leveraging Multilingual Models, Custom Tokenizers, and Clean Evaluation Benchmarks
Hailay Teklehaymanot
Gebrearegawi Gidey
Wolfgang Nejdl
160
0
0
24 Sep 2025
Scaling, Simplification, and Adaptation: Lessons from Pretraining on Machine-Translated Text
Dan John Velasco
M. R
CLL
LRM
153
0
0
22 Sep 2025
DIVERS-Bench: Evaluating Language Identification Across Domain Shifts and Code-Switching
Jessica Ojo
Zina Kamel
David Ifeoluwa Adelani
146
3
0
22 Sep 2025
Enhancing Cross-Lingual Transfer through Reversible Transliteration: A Huffman-Based Approach for Low-Resource Languages
Wenhao Zhuang
Yuan Sun
Xiaobing Zhao
151
1
0
22 Sep 2025
Cross-Attention is Half Explanation in Speech-to-Text Models
Sara Papi
Dennis Fucci
Marco Gaido
Matteo Negri
L. Bentivogli
LRM
226
1
0
22 Sep 2025
Towards Open-Ended Discovery for Low-Resource NLP
Bonaventure F. P. Dossou
Henri Aïdasso
168
0
0
22 Sep 2025
TigerCoder: A Novel Suite of LLMs for Code Generation in Bangla
Nishat Raihan
Antonios Anastasopoulos
Marcos Zampieri
188
15
0
11 Sep 2025
COCO-Urdu: A Large-Scale Urdu Image-Caption Dataset with Multimodal Quality Estimation
Umair Hassan
125
0
0
10 Sep 2025
Advancing Conversational AI with Shona Slang: A Dataset and Hybrid Model for Digital Inclusion
Happymore Masoka
69
0
0
10 Sep 2025
Exploring Subjective Tasks in Farsi: A Survey Analysis and Evaluation of Language Models
Donya Rooein
Flor Miriam Plaza del Arco
Debora Nozza
Dirk Hovy
219
0
0
06 Sep 2025
No Text Needed: Forecasting MT Quality and Inequity from Fertility and Metadata
Jessica M. Lundin
Ada Zhang
David Adelani
Cody Carroll
109
0
0
05 Sep 2025
Social Bias in Multilingual Language Models: A Survey
Lance Calvin Lim Gamboa
Yue Feng
Mark Lee
305
0
0
27 Aug 2025
It's All About In-Context Learning! Teaching Extremely Low-Resource Languages to LLMs
Yue Li
Zhixue Zhao
Carolina Scarton
187
6
0
26 Aug 2025
Quantifying Language Disparities in Multilingual Large Language Models
Songbo Hu
Ivan Vulić
Anna Korhonen
155
4
0
23 Aug 2025
Toward Responsible ASR for African American English Speakers: A Scoping Review of Bias and Equity in Speech Technology
Jay L. Cunningham
Adinawa Adjagbodjou
Jeffrey Basoah
Jainaba Jawara
Kowe Kadoma
Aaleyah Lewis
133
2
0
20 Aug 2025