Dataset Geography: Mapping Language Data to Language Users

Annual Meeting of the Association for Computational Linguistics (ACL), 2021
7 December 2021
Fahim Faisal
Yinkai Wang
Antonios Anastasopoulos

Papers citing "Dataset Geography: Mapping Language Data to Language Users"

19 / 19 papers shown
Do You Know About My Nation? Investigating Multilingual Language Models' Cultural Literacy Through Factual Knowledge
Eshaan Tanwar
Anwoy Chatterjee
Michael Stephen Saxon
Alon Albalak
William Wang
Tanmoy Chakraborty
168
2
0
01 Nov 2025
Modality Matching Matters: Calibrating Language Distances for Cross-Lingual Transfer in URIEL+
York Hay Ng
Aditya Khan
Xiang Lu
Matteo Salloum
Michael Zhou
Phuong H. Hoang
A. Seza Doğruöz
En-Shiun Annie Lee
199
1
0
22 Oct 2025
Bridging Cultural Distance Between Models Default and Local Classroom Demands: How Global Teachers Adopt GenAI to Support Everyday Teaching Practices
Ruiwei Xiao
Qing Xiao
Xinying Hou
Hanqi Li
Phenyo Phemelo Moletsane
Hong Shen
John Stamper
213
1
0
13 Sep 2025
Conflicts in Texts: Data, Implications and Challenges
Siyi Liu
Dan Roth
1.0K
1
0
28 Apr 2025
DEPT: Decoupled Embeddings for Pre-training Language Models
International Conference on Learning Representations (ICLR), 2024
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
1.4K
2
0
07 Oct 2024
Worldwide Federated Training of Language Models
Alexandru Iacob
Lorenzo Sani
Bill Marino
Preslav Aleksandrov
William F. Shen
Nicholas D. Lane
FedML
453
7
0
23 May 2024
The Future of Large Language Model Pre-training is Federated
Lorenzo Sani
Alexandru Iacob
Zeyu Cao
Bill Marino
Yan Gao
...
Wanru Zhao
William F. Shen
Preslav Aleksandrov
Xinchi Qiu
Nicholas D. Lane
AI4CE
503
43
0
17 May 2024
Validating and Exploring Large Geographic Corpora
International Conference on Language Resources and Evaluation (LREC), 2024
Jonathan Dunn
220
0
0
13 Mar 2024
On the Scaling Laws of Geographical Representation in Language Models
Nathan Godey
Eric Villemonte de la Clergerie
Benoît Sagot
351
12
0
29 Feb 2024
Towards Better Inclusivity: A Diverse Tweet Corpus of English Varieties
Law (LAW), 2024
Nhi Pham
Lachlan Pham
Adam L. Meyers
152
4
0
21 Jan 2024
A Material Lens on Coloniality in NLP
William B. Held
Camille Harris
Michael Best
Diyi Yang
429
22
0
14 Nov 2023
SituatedGen: Incorporating Geographical and Temporal Contexts into Generative Commonsense Reasoning
Neural Information Processing Systems (NeurIPS), 2023
Yunxiang Zhang
Xiaojun Wan
AILaw
LRM
307
10
0
21 Jun 2023
Geographic and Geopolitical Biases of Language Models
Fahim Faisal
Antonios Anastasopoulos
285
32
0
20 Dec 2022
TaTa: A Multilingual Table-to-Text Dataset for African Languages
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Sebastian Gehrmann
Sebastian Ruder
Vitaly Nikolaev
Jan A. Botha
Michael Chavinda
Ankur P. Parikh
Clara E. Rivera
LMTD
383
14
0
31 Oct 2022
MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
David Ifeoluwa Adelani
Graham Neubig
Sebastian Ruder
Shruti Rijhwani
Michael Beukman
...
Idris Abdulmumin
Odunayo Ogundepo
Oreen Yousuf
Tatiana Moteu Ngoli
Dietrich Klakow
332
62
0
22 Oct 2022
Some Languages are More Equal than Others: Probing Deeper into the Linguistic Disparity in the NLP World
Surangika Ranathunga
Nisansa de Silva
331
62
0
16 Oct 2022
GeoMLAMA: Geo-Diverse Commonsense Probing on Multilingual Pre-Trained Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Da Yin
Hritik Bansal
Masoud Monajatipoor
Liunian Harold Li
Kai-Wei Chang
285
39
0
24 May 2022
Graph-based Ensemble Machine Learning for Student Performance Prediction
Yinkai Wang
A. Ding
Kaiyi Guan
Shixi Wu
Yuanqi Du
214
7
0
15 Dec 2021
TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages
Transactions of the Association for Computational Linguistics (TACL), 2020
J. Clark
Eunsol Choi
Michael Collins
Dan Garrette
Tom Kwiatkowski
Vitaly Nikolaev
J. Palomaki
730
718
0
10 Mar 2020