2505.16538
Cited By
Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models
22 May 2025
Ercong Nie
Helmut Schmid
Hinrich Schütze
Papers citing
"Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models"
34 / 34 papers shown
Title
Language Confusion Gate: Language-Aware Decoding Through Model Self-Distillation
Collin Zhang
Fei Huang
C. Yuan
Junyang Lin
97
0
0
20 Oct 2025
Lost in Multilinguality: Dissecting Cross-lingual Factual Inconsistency in Transformer Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Mingyang Wang
Heike Adel
Lukas Lange
Yihong Liu
Ercong Nie
Jannik Strötgen
Hinrich Schütze
HILM
267
23
0
05 Apr 2025
Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zeping Yu
Sophia Ananiadou
LRM
MILM
267
22
0
21 Sep 2024
Beyond English-Centric LLMs: What Language Do Multilingual Language Models Think in?
Chengzhi Zhong
Fei Cheng
Qianying Liu
Junfeng Jiang
Zhen Wan
Chenhui Chu
Yugo Murawaki
Sadao Kurohashi
LRM
228
37
0
20 Aug 2024
Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yifei Wang
Yuheng Chen
Wanting Wen
Yu Sheng
Linjing Li
D. Zeng
KELM
317
14
0
06 Aug 2024
Understanding and Mitigating Language Confusion in LLMs
Kelly Marchisio
Wei-Yin Ko
Alexandre Berard
Théo Dehaze
Sebastian Ruder
498
56
0
28 Jun 2024
Unlocking the Future: Exploring Look-Ahead Planning Mechanistic Interpretability in Large Language Models
Tianyi Men
Pengfei Cao
Zhuoran Jin
Yubo Chen
Kang Liu
Jun Zhao
LLMAG
AIFin
219
15
0
23 Jun 2024
Key ingredients for effective zero-shot cross-lingual knowledge transfer in generative tasks
Nadezhda Chirkova
Vassilina Nikoulina
271
12
0
19 Feb 2024
Do Llamas Work in English? On the Latent Language of Multilingual Transformers
Chris Wendler
V. Veselovsky
Giovanni Monea
Robert West
541
212
0
16 Feb 2024
Multilingual Instruction Tuning With Just a Pinch of Multilinguality
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Uri Shaham
Jonathan Herzig
Roee Aharoni
Idan Szpektor
Reut Tsarfaty
Matan Eyal
LRM
362
68
0
03 Jan 2024
LLaMA Beyond English: An Empirical Study on Language Capability Transfer
Jun Zhao
Zhihao Zhang
Luhui Gao
Tao Gui
Xuanjing Huang
ELM
329
103
0
02 Jan 2024
Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?
Tannon Kew
Florian Schottmann
Rico Sennrich
LRM
226
48
0
20 Dec 2023
Future Lens: Anticipating Subsequent Tokens from a Single Hidden State
Koyena Pal
Jiuding Sun
Andrew Yuan
Byron C. Wallace
David Bau
181
87
0
08 Nov 2023
Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Decoding
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Rico Sennrich
Jannis Vamvas
Alireza Mohammadshahi
HILM
437
51
0
13 Sep 2023
Extrapolating Large Language Models to Non-English by Aligning Languages
Wenhao Zhu
Yunzhe Lv
Qingxiu Dong
Fei Yuan
Jingjing Xu
Shujian Huang
Lingpeng Kong
Jiajun Chen
Lei Li
265
85
0
09 Aug 2023
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Viet Dac Lai
Chien Van Nguyen
Nghia Trung Ngo
Thuat Nguyen
Franck Dernoncourt
Ryan Rossi
Thien Huu Nguyen
ALM
353
196
0
29 Jul 2023
Why Does Zero-Shot Cross-Lingual Generation Fail? An Explanation and a Solution
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Tianjian Li
Kenton W. Murray
231
28
0
27 May 2023
BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Akari Asai
Sneha Kudugunta
Xinyan Velocity Yu
Terra Blevins
Hila Gonen
Machel Reid
Yulia Tsvetkov
Sebastian Ruder
Hannaneh Hajishirzi
295
80
0
24 May 2023
mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jonas Pfeiffer
Francesco Piccinno
Massimo Nicosia
Xinyi Wang
Machel Reid
Sebastian Ruder
VLM
LRM
256
32
0
23 May 2023
Towards Automated Circuit Discovery for Mechanistic Interpretability
Neural Information Processing Systems (NeurIPS), 2023
Arthur Conmy
Augustine N. Mavor-Parker
Aengus Lynch
Stefan Heimersheim
Adrià Garriga-Alonso
493
439
0
28 Apr 2023
Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Mor Geva
Jasmijn Bastings
Katja Filippova
Amir Globerson
KELM
677
412
0
28 Apr 2023
MEGA: Multilingual Evaluation of Generative AI
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Kabir Ahuja
Harshita Diddee
Rishav Hada
Millicent Ochieng
Krithika Ramesh
...
T. Ganu
Sameer Segal
Maxamed Axmed
Kalika Bali
Sunayana Sitaram
LM&MA
LRM
ELM
529
342
0
22 Mar 2023
Eliciting Latent Predictions from Transformers with the Tuned Lens
Nora Belrose
Zach Furman
Logan Smith
Danny Halawi
Igor V. Ostrovsky
Lev McKinney
Stella Biderman
Jacob Steinhardt
552
307
0
14 Mar 2023
A Survey of Code-switching: Linguistic and Social Perspectives for Language Technologies
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
A. Seza Doğruöz
Sunayana Sitaram
Barbara E. Bullock
Almeida Jacqueline Toribio
213
94
0
05 Jan 2023
The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Genta Indra Winata
Alham Fikri Aji
Zheng-Xin Yong
Thamar Solorio
293
48
0
19 Dec 2022
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
International Conference on Learning Representations (ICLR), 2022
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
552
760
0
01 Nov 2022
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Mor Geva
Avi Caciularu
Ke Wang
Yoav Goldberg
KELM
581
456
0
28 Mar 2022
Locating and Editing Factual Associations in GPT
Neural Information Processing Systems (NeurIPS), 2022
Kevin Meng
David Bau
A. Andonian
Yonatan Belinkov
KELM
895
1,898
0
10 Feb 2022
Transformer Feed-Forward Layers Are Key-Value Memories
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Mor Geva
R. Schuster
Jonathan Berant
Omer Levy
KELM
599
1,118
0
29 Dec 2020
Understanding the Role of Individual Units in a Deep Neural Network
Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2020
David Bau
Jun-Yan Zhu
Hendrik Strobelt
Àgata Lapedriza
Bolei Zhou
Antonio Torralba
GAN
240
496
0
10 Sep 2020
GLUECoS : An Evaluation Benchmark for Code-Switched NLP
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Simran Khanuja
Sandipan Dandapat
A. Srinivasan
Sunayana Sitaram
Monojit Choudhury
ELM
219
172
0
26 Apr 2020
XNLI: Evaluating Cross-lingual Sentence Representations
Alexis Conneau
Guillaume Lample
Ruty Rinott
Adina Williams
Samuel R. Bowman
Holger Schwenk
Veselin Stoyanov
ELM
333
1,520
0
13 Sep 2018
FastText.zip: Compressing text classification models
Armand Joulin
Edouard Grave
Piotr Bojanowski
Matthijs Douze
Tomas Mikolov
MQ
414
1,288
0
12 Dec 2016
Bag of Tricks for Efficient Text Classification
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2016
Armand Joulin
Edouard Grave
Piotr Bojanowski
Tomas Mikolov
VLM
908
4,867
0
06 Jul 2016