TokAlign: Efficient Vocabulary Adaptation via Token Alignment | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
On Linear Representations and Pretraining Data Frequency in Language Models | International Conference on Learning Representations (ICLR), 2025 |
Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation | Annual Meeting of the Association for Computational Linguistics (ACL), 2024 |
Performance Evaluation of Tokenizers in Large Language Models for the Assamese Language | International Journal of Information Technology (IJIT), 2024 |