
Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty
Papers citing "Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty"
43 citing papers (3 listed below)
| Title | Venue | Year |
|---|---|---|
| Plug-in and Fine-tuning: Bridging the Gap between Small Language Models and Large Language Models | Annual Meeting of the Association for Computational Linguistics (ACL) | 2025 |
| GREEN-CODE: Learning to Optimize Energy Efficiency in LLM-based Code Generation | IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid) | 2025 |
| Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression | International Conference on Machine Learning (ICML) | 2024 |