2004.04037
Cited By
v1
v2 (latest)
DynaBERT: Dynamic BERT with Adaptive Width and Depth
Neural Information Processing Systems (NeurIPS), 2020
8 April 2020
Lu Hou
Zhiqi Huang
Lifeng Shang
Xin Jiang
Xiao Chen
Qun Liu
MQ
ArXiv (abs)
PDF
HTML
Github (3098★)
Papers citing "DynaBERT: Dynamic BERT with Adaptive Width and Depth"
50 / 229 papers shown
Route Experts by Sequence, not by Token
Tiansheng Wen
Y. Wang
Aosong Feng
Long Ma
Xinyang Liu
Y. Wang
Lixuan Guo
Bo Chen
Stefanie Jegelka
Chenyu You
MoE
KELM
253
1
0
30 Mar 2026
AdaPerceiver: Transformers with Adaptive Width, Depth, and Tokens
Purvish Jajal
Nick Eliopoulos
Benjamin Shiue-Hal Chou
George K. Thiruvathukal
Yung-Hsiang Lu
James C. Davis
182
0
0
22 Nov 2025
NeuronMM: High-Performance Matrix Multiplication for LLM Inference on AWS Trainium
Dinghong Song
Jierui Xu
Weichu Yang
Pengfei Su
Dong Li
232
0
0
29 Oct 2025
Elastic ViTs from Pretrained Models without Retraining
Walter Simoncini
Michael Dorkenwald
Tijmen Blankevoort
Cees G. M. Snoek
Yuki Markus Asano
VLM
196
0
0
20 Oct 2025
Efficient Adaptive Transformer: An Empirical Study and Reproducible Framework
Jan Miller
143
0
0
14 Oct 2025
Entropy Meets Importance: A Unified Head Importance-Entropy Score for Stable and Efficient Transformer Pruning
Minsik Choi
Hyegang Son
Changhoon Kim
Young Geun Kim
AAML
183
0
0
10 Oct 2025
Encode, Think, Decode: Scaling test-time reasoning with recursive latent thoughts
Yeskendir Koishekenov
Aldo Lipani
Nicola Cancedda
LRM
204
4
0
08 Oct 2025
Deep Hierarchical Learning with Nested Subspace Networks for Large Language Models
Paulius Rauba
M. Schaar
205
2
0
22 Sep 2025
SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-Alignment
Yuqing Huang
Rongyang Zhang
Qimeng Wang
Chengqiang Lu
Yan Gao
...
Xuyang Zhi
Guiquan Liu
Xin Li
Hao Wang
Tong Xu
CLL
207
7
0
04 Sep 2025
EA-ViT: Efficient Adaptation for Elastic Vision Transformer
Chen Zhu
Wangbo Zhao
Huiwen Zhang
Samir Khaki
Yuhao Zhou
...
Zhihang Yuan
Yuzhang Shang
Xiaojiang Peng
Kai Wang
Dawei Yang
229
4
0
25 Jul 2025
ACME: Adaptive Customization of Large Models via Distributed Systems
IEEE International Conference on Distributed Computing Systems (ICDCS), 2025
Ziming Dai
Chao Qiu
Fei Gao
Yunfeng Zhao
Xiaofei Wang
340
1
0
20 Jul 2025
ThinkingViT: Matryoshka Thinking Vision Transformer for Elastic Inference
A. Hojjat
Janek Haberer
Soren Pirk
Olaf Landsiedel
LRM
270
3
0
14 Jul 2025
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
Sangmin Bae
Yujin Kim
Reza Bayat
S. Kim
Jiyoun Ha
...
Adam Fisch
Hrayr Harutyunyan
Ziwei Ji
Aaron Courville
Se-Young Yun
MoE
360
44
0
14 Jul 2025
AnchorFormer: Differentiable Anchor Attention for Efficient Vision Transformer
Pattern Recognition Letters (Pattern Recogn. Lett.), 2025
Jiquan Shan
Junxiao Wang
Lifeng Zhao
Liang Cai
Hongyuan Zhang
Ioannis Liritzis
ViT
859
8
0
22 May 2025
On Multilingual Encoder Language Model Compression for Low-Resource Languages
Daniil Gurgurov
Michal Gregor
Josef van Genabith
Simon Ostermann
506
0
0
22 May 2025
How to Train Your Metamorphic Deep Neural Network
Thomas Sommariva
Simone Calderara
Angelo Porrello
274
0
0
07 May 2025
DYNAMAX: Dynamic computing for Transformers and Mamba based architectures
Miguel Nogales
Matteo Gambella
Manuel Roveri
334
2
0
29 Apr 2025
AdaVid: Adaptive Video-Language Pretraining
Chaitanya Patel
Juan Carlos Niebles
Ehsan Adeli
VLM
245
0
0
16 Apr 2025
DyDiT++: Diffusion Transformers with Timestep and Spatial Dynamics for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Xiaojiang Peng
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
750
3
0
09 Apr 2025
Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models
ACM Computing Surveys (ACM Comput. Surv.), 2025
Xubin Wang
Zhiqing Tang
Jianxiong Guo
Tianhui Meng
Chenhao Wang
Tian-sheng Wang
Weijia Jia
474
115
0
08 Mar 2025
Balcony: A Lightweight Approach to Dynamic Inference of Generative Language Models
Benyamin Jamialahmadi
Parsa Kavehzadeh
Mehdi Rezagholizadeh
Parsa Farinneya
Hossein Rajabzadeh
A. Jafari
Boxing Chen
Marzieh S. Tahaei
358
3
0
06 Mar 2025
Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
Tiansheng Wen
Yifei Wang
Zequn Zeng
Zhong Peng
Yudi Su
Xinyang Liu
Bo Chen
Hongwei Liu
Stefanie Jegelka
Chenyu You
CLL
808
17
0
03 Mar 2025
Ten Challenging Problems in Federated Foundation Models
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2025
Tao Fan
Hanlin Gu
Xuemei Cao
Chee Seng Chan
Qian Chen
...
Yu Zhang
Xiaojin Zhang
Zhenzhe Zheng
Lixin Fan
Qiang Yang
FedML
704
45
0
14 Feb 2025
M2R2: Mixture of Multi-Rate Residuals for Efficient Transformer Inference
Nikhil Bhendawade
Mahyar Najibi
Devang Naik
Irina Belousova
MoE
519
1
0
04 Feb 2025
Tailored-LLaMA: Optimizing Few-Shot Learning in Pruned LLaMA Models with Task-Specific Prompts
European Conference on Artificial Intelligence (ECAI), 2024
Danyal Aftab
Steven Davy
ALM
407
3
0
10 Jan 2025
Cognitive Edge Computing: A Comprehensive Survey on Optimizing Large Models and AI Agents for Pervasive Deployment
International Conference on Artificial Neural Networks (ICANN), 2025
Xubin Wang
Weijia Jia
616
21
0
04 Jan 2025
TT-MPD: Test Time Model Pruning and Distillation
Haihang Wu
Wei Wang
T. Malepathirana
Sachith Seneviratne
D. Oetomo
Saman K. Halgamuge
375
0
0
10 Dec 2024
Slicing Vision Transformer for Flexible Inference
Neural Information Processing Systems (NeurIPS), 2024
Yitian Zhang
Huseyin Coskun
Xu Ma
Huan Wang
Ke Ma
Xi Chen
Derek Hao Hu
Y. Fu
ViT
379
2
0
06 Dec 2024
CE-CoLLM: Efficient and Adaptive Large Language Models Through Cloud-Edge Collaboration
Hongpeng Jin
Yanzhao Wu
715
27
0
05 Nov 2024
DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization
Neural Information Processing Systems (NeurIPS), 2024
Haowei Zhu
Dehua Tang
Ji Liu
Mingjie Lu
Jintu Zheng
...
Spandan Tiwari
Ashish Sirasao
Jun-Hai Yong
Bin Wang
E. Barsoum
DiffM
214
39
0
22 Oct 2024
Pre-training Distillation for Large Language Models: A Design Space Exploration
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Hao Peng
Xin Lv
Yushi Bai
Zijun Yao
Jing Zhang
Lei Hou
Juanzi Li
307
11
0
21 Oct 2024
FiRST: Finetuning Router-Selective Transformers for Input-Adaptive Latency Reduction
Akriti Jain
Saransh Sharma
Koyel Mukherjee
Soumyabrata Pal
434
0
0
16 Oct 2024
Neural Metamorphosis
European Conference on Computer Vision (ECCV), 2024
Xingyi Yang
Xinchao Wang
381
5
0
10 Oct 2024
ElasticTok: Adaptive Tokenization for Image and Video
International Conference on Learning Representations (ICLR), 2024
Wilson Yan
Matei A. Zaharia
Volodymyr Mnih
Pieter Abbeel
Aleksandra Faust
Hao Liu
VGen
491
28
0
10 Oct 2024
Presto! Distilling Steps and Layers for Accelerating Music Generation
International Conference on Learning Representations (ICLR), 2024
Cheng-i Wang
Ge Zhu
Jonah Casebeer
Julian McAuley
Taylor Berg-Kirkpatrick
Nicholas J. Bryan
492
18
0
07 Oct 2024
Dynamic Diffusion Transformer
International Conference on Learning Representations (ICLR), 2024
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Yibing Song
Gao Huang
Fan Wang
Yang You
437
53
0
04 Oct 2024
HydraViT: Stacking Heads for a Scalable ViT
Neural Information Processing Systems (NeurIPS), 2024
Janek Haberer
A. Hojjat
Olaf Landsiedel
255
7
0
26 Sep 2024
Mixture of Efficient Diffusion Experts Through Automatic Interval and Sub-Network Selection
European Conference on Computer Vision (ECCV), 2024
Alireza Ganjdanesh
Yan Kang
Yuchen Liu
Richard Y. Zhang
Zhe Lin
Heng Huang
DiffM
380
13
0
23 Sep 2024
Efficient Training of Large Vision Models via Advanced Automated Progressive Learning
Changlin Li
Jiawei Zhang
Sihao Lin
Zongxin Yang
Junwei Liang
Xiaodan Liang
Xiaojun Chang
VLM
307
2
0
06 Sep 2024
Exploiting Student Parallelism for Efficient GPU Inference of BERT-like Models in Online Services
Weiyan Wang
Yilun Jin
Yiming Zhang
Victor Junqiu Wei
Han Tian
Li Chen
Jinbao Xue
Yangyu Tao
Di Wang
Kai Chen
292
0
0
22 Aug 2024
Membership Inference Attack Against Masked Image Modeling
Hui Yuan
Xinlei He
Ning Yu
Yang Zhang
253
3
0
13 Aug 2024
A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models
International Conference on Machine Learning (ICML), 2024
Taehong Moon
Moonseok Choi
Eunggu Yun
Jongmin Yoon
Gayoung Lee
Jaewoong Cho
Juho Lee
285
9
0
12 Aug 2024
LLAVADI: What Matters For Multimodal Large Language Models Distillation
Shilin Xu
Xiangtai Li
Haobo Yuan
Lu Qi
Yunhai Tong
Ming-Hsuan Yang
258
19
0
28 Jul 2024
Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
Yifei Gao
Jie Ou
Lei Wang
Fanhua Shang
Jaji Wu
MQ
443
0
0
22 Jul 2024
Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment
Yuhao Ji
Chao Fang
Shaobo Ma
Haikuo Shao
Zhongfeng Wang
MQ
339
4
0
16 Jul 2024
Team up GBDTs and DNNs: Advancing Efficient and Effective Tabular Prediction with Tree-hybrid MLPs
Jiahuan Yan
Jintai Chen
Qianxing Wang
Benlin Liu
Jian Wu
264
9
0
13 Jul 2024
AMD: Automatic Multi-step Distillation of Large-scale Vision Models
Cheng Han
Qifan Wang
S. Dianat
Majid Rabbani
Raghuveer M. Rao
Yi Fang
Qiang Guan
Lifu Huang
Dongfang Liu
VLM
238
18
0
05 Jul 2024
Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application
Chuanpeng Yang
Wang Lu
Yao Zhu
Yidong Wang
Qian Chen
Chenlong Gao
Bingjie Yan
Yiqiang Chen
ALM
KELM
307
101
0
02 Jul 2024
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
Enshu Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Matthew B. Blaschko
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MoE
287
33
0
01 Jul 2024
Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other
Yifei Gao
Jie Ou
Lei Wang
Yuting Xiao
Zhiyuan Xiang
Ruiting Dai
Jun Cheng
MQ
274
5
0
24 Jun 2024