arXiv:2311.13657
Efficient Transformer Knowledge Distillation: A Performance Review
22 November 2023
Nathan Brown, Ashton Williamson, Tahj Anderson, Logan Lawrence
Papers citing "Efficient Transformer Knowledge Distillation: A Performance Review" (5 of 5 papers shown)
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Mutian He, Philip N. Garner (09 Oct 2024)
LSG Attention: Extrapolation of pretrained Transformers to long sequences
Charles Condevaux, S. Harispe (13 Oct 2022)
Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed (28 Jul 2020)
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang (18 Mar 2020)
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman (20 Apr 2018)