Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs

24 February 2025

Abstract

We explore Cross-lingual Backdoor ATtacks (X-BAT) in multilingual Large Language Models (mLLMs), revealing how backdoors inserted in one language can automatically transfer to others through shared embedding spaces. Using toxicity classification as a case study, we demonstrate that attackers can compromise multilingual systems by poisoning data in a single language, with rare tokens serving as specific, effective triggers. Our findings expose a critical vulnerability in the fundamental architecture that enables cross-lingual transfer in these models. Our code and data are publicly available at this https URL.

@article{beniwal2025_2502.16901,
  title={Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs},
  author={Himanshu Beniwal and Sailesh Panda and Mayank Singh},
  journal={arXiv preprint arXiv:2502.16901},
  year={2025}
}
