MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs. North American Chapter of the Association for Computational Linguistics (NAACL), 2025 |
Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024 |