ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2012.14682
  4. Cited By
CascadeBERT: Accelerating Inference of Pre-trained Language Models via
  Calibrated Complete Models Cascade

CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade

29 December 2020
Lei Li
Yankai Lin
Deli Chen
Shuhuai Ren
Peng Li
Jie Zhou
Xu Sun
ArXivPDFHTML

Papers citing "CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade"

10 / 10 papers shown
Title
DYNAMAX: Dynamic computing for Transformers and Mamba based architectures
DYNAMAX: Dynamic computing for Transformers and Mamba based architectures
Miguel Nogales
Matteo Gambella
Manuel Roveri
56
0
0
29 Apr 2025
A Unified Approach to Routing and Cascading for LLMs
A Unified Approach to Routing and Cascading for LLMs
Jasper Dekoninck
Maximilian Baader
Martin Vechev
60
2
0
17 Feb 2025
DE$^3$-BERT: Distance-Enhanced Early Exiting for BERT based on
  Prototypical Networks
DE3^33-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks
Jianing He
Qi Zhang
Weiping Ding
Duoqian Miao
Jun Zhao
Liang Hu
LongBing Cao
29
3
0
03 Feb 2024
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts
  in the Vocabulary Space
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
Mor Geva
Avi Caciularu
Ke Wang
Yoav Goldberg
KELM
13
332
0
28 Mar 2022
A Simple Hash-Based Early Exiting Approach For Language Understanding
  and Generation
A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation
Tianxiang Sun
Xiangyang Liu
Wei-wei Zhu
Zhichao Geng
Lingling Wu
Yilong He
Yuan Ni
Guotong Xie
Xuanjing Huang
Xipeng Qiu
19
40
0
03 Mar 2022
Towards Efficient NLP: A Standard Evaluation and A Strong Baseline
Towards Efficient NLP: A Standard Evaluation and A Strong Baseline
Xiangyang Liu
Tianxiang Sun
Junliang He
Jiawen Wu
Lingling Wu
Xinyu Zhang
Hao Jiang
Zhao Cao
Xuanjing Huang
Xipeng Qiu
ELM
11
46
0
13 Oct 2021
Learning to Cascade: Confidence Calibration for Improving the Accuracy
  and Computational Cost of Cascade Inference Systems
Learning to Cascade: Confidence Calibration for Improving the Accuracy and Computational Cost of Cascade Inference Systems
Shohei Enomoto
Takeharu Eda
UQCV
34
17
0
15 Apr 2021
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu
Wangchunshu Zhou
Tao Ge
Furu Wei
Ming Zhou
221
196
0
07 Feb 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
225
571
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018
1