Patient Knowledge Distillation for BERT Model Compression (arXiv:1908.09355)
25 August 2019
S. Sun, Yu Cheng, Zhe Gan, Jingjing Liu
Papers citing "Patient Knowledge Distillation for BERT Model Compression" (50 of 491 papers shown)
Extreme Compression of Large Language Models via Additive Quantization
Vage Egiazarian, Andrei Panferov, Denis Kuznedelev, Elias Frantar, Artem Babenko, Dan Alistarh (11 Jan 2024) [MQ]

Understanding LLMs: A Comprehensive Overview from Training to Inference
Yi-Hsueh Liu, Haoyang He, Tianle Han, Xu-Yao Zhang, Mengyuan Liu, ..., Xintao Hu, Tuo Zhang, Ning Qiang, Tianming Liu, Bao Ge (04 Jan 2024) [SyDa]

ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference
Ziqian Zeng, Yihuai Hong, Hongliang Dai, Huiping Zhuang, Cen Chen (19 Dec 2023)

Large Multimodal Model Compression via Efficient Pruning and Distillation at AntGroup
Maolin Wang, Yao-Min Zhao, Jiajia Liu, Jingdong Chen, Chenyi Zhuang, Jinjie Gu, Ruocheng Guo, Xiangyu Zhao (10 Dec 2023)

SlimSAM: 0.1% Data Makes Segment Anything Slim
Zigeng Chen, Gongfan Fang, Xinyin Ma, Xinchao Wang (08 Dec 2023)

Accelerating Convolutional Neural Network Pruning via Spatial Aura Entropy
Bogdan Musat, Razvan Andonie (08 Dec 2023)

Target-agnostic Source-free Domain Adaptation for Regression Tasks
Tianlang He, Zhiqiu Xia, Jierun Chen, Haoliang Li, S.-H. Gary Chan (01 Dec 2023)

Exponentially Faster Language Modelling
Peter Belcak, Roger Wattenhofer (15 Nov 2023)

Towards the Law of Capacity Gap in Distilling Language Models
Chen Zhang, Dawei Song, Zheyu Ye, Yan Gao (13 Nov 2023) [ELM]

What is Lost in Knowledge Distillation?
Manas Mohanty, Tanya Roosta, Peyman Passban (07 Nov 2023)

Co-training and Co-distillation for Quality Improvement and Compression of Language Models
Hayeon Lee, Rui Hou, Jongpil Kim, Davis Liang, Hongbo Zhang, Sung Ju Hwang, Alexander Min (06 Nov 2023)

Data-Free Distillation of Language Model by Text-to-Text Transfer
Zheyuan Bai, Xinduo Liu, Hailin Hu, Tianyu Guo, Qinghua Zhang, Yunhe Wang (03 Nov 2023)

EELBERT: Tiny Models through Dynamic Embeddings
Gabrielle Cohn, Rishika Agarwal, Deepanshu Gupta, Siddharth Patwardhan (31 Oct 2023)

Label Poisoning is All You Need
Rishi Jha, J. Hayase, Sewoong Oh (29 Oct 2023) [AAML]

Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules
Chaojun Xiao, Yuqi Luo, Wenbin Zhang, Pengle Zhang, Xu Han, ..., Zhengyan Zhang, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou (24 Oct 2023)

Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression
Jiduan Liu, Jiahao Liu, Qifan Wang, Jingang Wang, Xunliang Cai, Dongyan Zhao, R. Wang, Rui Yan (24 Oct 2023)

CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model
Kaiyan Zhang, Ning Ding, Biqing Qi, Xuekai Zhu, Xinwei Long, Bowen Zhou (24 Oct 2023)

MCC-KD: Multi-CoT Consistent Knowledge Distillation
Hongzhan Chen, Siyue Wu, Xiaojun Quan, Rui Wang, Ming Yan, Ji Zhang (23 Oct 2023) [LRM]

Breaking through Deterministic Barriers: Randomized Pruning Mask Generation and Selection
Jianwei Li, Weizhi Gao, Qi Lei, Dongkuan Xu (19 Oct 2023)

PELA: Learning Parameter-Efficient Models with Low-Rank Approximation
Yangyang Guo, Guangzhi Wang, Mohan S. Kankanhalli (16 Oct 2023)

NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models
Jongwoo Ko, Seungjoon Park, Yujin Kim, Sumyeong Ahn, Du-Seong Chang, Euijai Ahn, SeYoung Yun (16 Oct 2023)

One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language Models
Hang Shao, Bei Liu, Bo Xiao, Ke Zeng, Guanglu Wan, Yanmin Qian (14 Oct 2023)

A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models
Takuma Udagawa, Aashka Trivedi, Michele Merler, Bishwaranjan Bhattacharjee (13 Oct 2023)

Pit One Against Many: Leveraging Attention-head Embeddings for Parameter-efficient Multi-head Attention
Huiyin Xue, Nikolaos Aletras (11 Oct 2023)

Sparse Fine-tuning for Inference Acceleration of Large Language Models
Eldar Kurtic, Denis Kuznedelev, Elias Frantar, Michael Goin, Dan Alistarh (10 Oct 2023)

Module-wise Adaptive Distillation for Multimodality Foundation Models
Chen Liang, Jiahui Yu, Ming-Hsuan Yang, Matthew A. Brown, Yin Cui, Tuo Zhao, Boqing Gong, Tianyi Zhou (06 Oct 2023)

Talking Models: Distill Pre-trained Knowledge to Downstream Models via Interactive Communication
Zhe Zhao, Qingyun Liu, Huan Gui, Bang An, Lichan Hong, Ed H. Chi (04 Oct 2023)

Ensemble Distillation for Unsupervised Constituency Parsing
Behzad Shayegh, Yanshuai Cao, Xiaodan Zhu, Jackie C.K. Cheung, Lili Mou (03 Oct 2023)

A Comprehensive Review of Generative AI in Healthcare
Yasin Shokrollahi, Sahar Yarmohammadtoosky, Matthew M. Nikahd, Pengfei Dong, Xianqi Li, Linxia Gu (01 Oct 2023) [MedIm, AI4CE]

ELIP: Efficient Language-Image Pre-training with Fewer Vision Tokens
Yangyang Guo, Haoyu Zhang, Yongkang Wong, Liqiang Nie, Mohan S. Kankanhalli (28 Sep 2023) [VLM]

Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems
Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Masayasu Muraoka, G. Saon (07 Sep 2023)

Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection
Yifan Xu, Mengdan Zhang, Xiaoshan Yang, Changsheng Xu (30 Aug 2023) [ObjD]

Improving Knowledge Distillation for BERT Models: Loss Functions, Mapping Methods, and Weight Tuning
Apoorv Dankar, Adeem Jassani, Kartikaeya Kumar (26 Aug 2023)

Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers
Jiawen Xie, Pengyu Cheng, Xiao Liang, Yong Dai, Nan Du (25 Aug 2023)

DLIP: Distilling Language-Image Pre-training
Huafeng Kuang, Jie Wu, Xiawu Zheng, Ming Li, Xuefeng Xiao, Rui Wang, Min Zheng, Rongrong Ji (24 Aug 2023) [VLM]

Radio2Text: Streaming Speech Recognition Using mmWave Radio Signals
Running Zhao, Jiang-Tao Luca Yu, H. Zhao, Edith C. H. Ngai (16 Aug 2023)

Slot Induction via Pre-trained Language Model Probing and Multi-level Contrastive Learning
Hoang Nguyen, Chenwei Zhang, Ye Liu, Philip S. Yu (09 Aug 2023)

Sci-CoT: Leveraging Large Language Models for Enhanced Knowledge Distillation in Small Models for Scientific QA
Yuhan Ma, Haiqi Jiang, Chenyou Fan (09 Aug 2023) [LRM]

Teacher-Student Architecture for Knowledge Distillation: A Survey
Chengming Hu, Xuan Li, Danyang Liu, Haolun Wu, Xi Chen, Ju Wang, Xue Liu (08 Aug 2023)

Accurate Retraining-free Pruning for Pretrained Encoder-based Language Models
Seungcheol Park, Ho-Jin Choi, U. Kang (07 Aug 2023) [VLM]

f-Divergence Minimization for Sequence-Level Knowledge Distillation
Yuqiao Wen, Zichao Li, Wenyu Du, Lili Mou (27 Jul 2023)

Sensi-BERT: Towards Sensitivity Driven Fine-Tuning for Parameter-Efficient BERT
Souvik Kundu, S. Nittur, Maciej Szankin, Sairam Sundaresan (14 Jul 2023) [MQ]

Frameless Graph Knowledge Distillation
Dai Shi, Zhiqi Shao, Yi Guo, Junbin Gao (13 Jul 2023)

Transformers in Healthcare: A Survey
Subhash Nerella, S. Bandyopadhyay, Jiaqing Zhang, Miguel Contreras, Scott Siegel, ..., Jessica Sena, B. Shickel, A. Bihorac, Kia Khezeli, Parisa Rashidi (30 Jun 2023) [MedIm, AI4CE]

Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference
Junyan Li, Li Lyna Zhang, Jiahang Xu, Yujing Wang, Shaoguang Yan, ..., Ting Cao, Hao-Lun Sun, Weiwei Deng, Qi Zhang, Mao Yang (26 Jun 2023)

Low-Rank Prune-And-Factorize for Language Model Compression
Siyu Ren, Kenny Q. Zhu (25 Jun 2023)

LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
Yixiao Li, Yifan Yu, Qingru Zhang, Chen Liang, Pengcheng He, Weizhu Chen, Tuo Zhao (20 Jun 2023)

Bridging the Gap between Decision and Logits in Decision-based Knowledge Distillation for Pre-trained Language Models
Qinhong Zhou, Zonghan Yang, Peng Li, Yang Liu (15 Jun 2023)

MiniLLM: Knowledge Distillation of Large Language Models
Yuxian Gu, Li Dong, Furu Wei, Minlie Huang (14 Jun 2023) [ALM]

EM-Network: Oracle Guided Self-distillation for Sequence Learning
J. Yoon, Sunghwan Ahn, Hyeon Seung Lee, Minchan Kim, Seokhwan Kim, N. Kim (14 Jun 2023) [VLM]