Distilling Large Language Models into Tiny and Effective Students using pQRNN

21 January 2021

Papers citing "Distilling Large Language Models into Tiny and Effective Students using pQRNN"

9 / 9 papers shown

Title
Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models Harshita Diddee Sandipan Dandapat Monojit Choudhury T. Ganu Kalika Bali 27 5 0 27 Oct 2022
Unsupervised Term Extraction for Highly Technical Domains Francesco Fusco Peter W. J. Staar Diego Antognini 20 4 0 24 Oct 2022
pNLP-Mixer: an Efficient all-MLP Architecture for Language Francesco Fusco Damian Pascual Peter W. J. Staar Diego Antognini 26 29 0 09 Feb 2022
Tiny Neural Models for Seq2Seq A. Kandoor 18 0 0 07 Aug 2021
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better Gaurav Menghani VLM MedIm 23 360 0 16 Jun 2021
Rethinking embedding coupling in pre-trained language models Hyung Won Chung Thibault Févry Henry Tsai Melvin Johnson Sebastian Ruder 93 142 0 24 Oct 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT Sheng Shen Zhen Dong Jiayu Ye Linjian Ma Z. Yao A. Gholami Michael W. Mahoney Kurt Keutzer MQ 225 574 0 12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding Alex Jinpeng Wang Amanpreet Singh Julian Michael Felix Hill Omer Levy Samuel R. Bowman ELM 294 6,950 0 20 Apr 2018
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation Yonghui Wu M. Schuster Z. Chen Quoc V. Le Mohammad Norouzi ... Alex Rudnick Oriol Vinyals G. Corrado Macduff Hughes J. Dean AIMat 716 6,740 0 26 Sep 2016