arXiv:2009.07253
Autoregressive Knowledge Distillation through Imitation Learning
15 September 2020
Alexander Lin, Jeremy Wohlwend, Howard Chen, Tao Lei

Papers citing "Autoregressive Knowledge Distillation through Imitation Learning"
8 / 8 papers shown
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models
Jahyun Koo, Yerin Hwang, Yongil Kim, Taegwan Kang, Hyunkyung Bae, Kyomin Jung
25 Oct 2024

Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
W. Xu, Rujun Han, Z. Wang, L. Le, Dhruv Madeka, Lei Li, W. Wang, Rishabh Agarwal, Chen-Yu Lee, Tomas Pfister
15 Oct 2024

f-Divergence Minimization for Sequence-Level Knowledge Distillation
Yuqiao Wen, Zichao Li, Wenyu Du, Lili Mou
27 Jul 2023

Target-Side Augmentation for Document-Level Machine Translation
Guangsheng Bao, Zhiyang Teng, Yue Zhang
08 May 2023

Improving Scheduled Sampling with Elastic Weight Consolidation for Neural Machine Translation
Michalis Korakakis, Andreas Vlachos
13 Sep 2021

Teaching Autoregressive Language Models Complex Tasks By Demonstration
Gabriel Recchia
05 Sep 2021

Text Summarization with Pretrained Encoders
Yang Liu, Mirella Lapata
22 Aug 2019

Effective Approaches to Attention-based Neural Machine Translation
Thang Luong, Hieu H. Pham, Christopher D. Manning
17 Aug 2015