MiniLLM: Knowledge Distillation of Large Language Models

MiniLLM: Knowledge Distillation of Large Language Models

14 June 2023

Papers citing "MiniLLM: Knowledge Distillation of Large Language Models"

17 / 17 papers shown

Title
Training Domain Draft Models for Speculative Decoding: Best Practices and Insights Fenglu Hong Ravi Raju Jonathan Li Bo Li Urmish Thakker Avinash Ravichandran Swayambhoo Jain Changran Hu 35 0 0 10 Mar 2025
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time Justin Deschenaux Çağlar Gülçehre 39 2 0 28 Oct 2024
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling W. Xu Rujun Han Z. Wang L. Le Dhruv Madeka Lei Li W. Wang Rishabh Agarwal Chen-Yu Lee Tomas Pfister 69 8 0 15 Oct 2024
ARB-LLM: Alternating Refined Binarizations for Large Language Models Zhiteng Li X. Yan Tianao Zhang Haotong Qin Dong Xie Jiang Tian Zhongchao Shi Linghe Kong Yulun Zhang Xiaokang Yang MQ 26 2 0 04 Oct 2024
Parameter Efficient Diverse Paraphrase Generation Using Sequence-Level Knowledge Distillation Lasal Jayawardena Prasan Yapa BDL 16 1 0 19 Apr 2024
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression Xin Wang Yu Zheng Zhongwei Wan Mi Zhang MQ 55 43 0 12 Mar 2024
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes Lucio Dery Steven Kolawole Jean-Francois Kagey Virginia Smith Graham Neubig Ameet Talwalkar 33 27 0 08 Feb 2024
Knowledge Fusion of Large Language Models Fanqi Wan Xinting Huang Deng Cai Xiaojun Quan Wei Bi Shuming Shi MoMe 22 61 0 19 Jan 2024
Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models Arnav Chavan Nahush Lele Deepak Gupta 10 1 0 12 Dec 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation Xiaowei Huang Wenjie Ruan Wei Huang Gao Jin Yizhen Dong ... Sihao Wu Peipei Xu Dengyu Wu André Freitas Mustafa A. Mustafa ALM 27 81 0 19 May 2023
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions Minghao Wu Abdul Waheed Chiyu Zhang Muhammad Abdul-Mageed Alham Fikri Aji ALM 121 115 0 27 Apr 2023
Instruction Tuning with GPT-4 Baolin Peng Chunyuan Li Pengcheng He Michel Galley Jianfeng Gao SyDa ALM LM&MA 157 576 0 06 Apr 2023
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 301 11,730 0 04 Mar 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization Victor Sanh Albert Webson Colin Raffel Stephen H. Bach Lintang Sutawika ... T. Bers Stella Biderman Leo Gao Thomas Wolf Alexander M. Rush LRM 203 1,651 0 15 Oct 2021
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems Sergey Levine Aviral Kumar George Tucker Justin Fu OffRL GP 321 1,944 0 04 May 2020
Scaling Laws for Neural Language Models Jared Kaplan Sam McCandlish T. Henighan Tom B. Brown B. Chess R. Child Scott Gray Alec Radford Jeff Wu Dario Amodei 220 4,424 0 23 Jan 2020
Language GANs Falling Short Massimo Caccia Lucas Page-Caccia W. Fedus Hugo Larochelle Joelle Pineau Laurent Charlin 112 214 0 06 Nov 2018