arXiv:2407.21082
Accelerating Large Language Model Inference with Self-Supervised Early Exits
30 July 2024
Florian Valade
LRM
Papers citing "Accelerating Large Language Model Inference with Self-Supervised Early Exits"
3 / 3 papers shown
Title
EERO: Early Exit with Reject Option for Efficient Classification with limited budget
Florian Valade
Mohamed Hebiri
Paul Gay
21
2
0
06 Feb 2024
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
138
183
0
31 Dec 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
214
505
0
12 Sep 2019