TinyBERT: Distilling BERT for Natural Language Understanding

Findings (Findings), 2019
23 September 2019
Xiaoqi Jiao
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
F. Wang
Qun Liu
    VLM

Papers citing "TinyBERT: Distilling BERT for Natural Language Understanding"

50 / 1,055 papers shown
ELAD: Explanation-Guided Large Language Models Active Distillation
Yifei Zhang
Bo Pan
Chen Ling
Yuntong Hu
Bo Pan
225
10
0
20 Feb 2024
PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning
Gyeongman Kim
Doohyuk Jang
Eunho Yang
VLM
283
19
0
20 Feb 2024
Distilling Large Language Models for Text-Attributed Graph Learning
Bo Pan
Zhengwu Zhang
Yifei Zhang
Yuntong Hu
Bo Pan
211
25
0
19 Feb 2024
Utilizing BERT for Information Retrieval: Survey, Applications, Resources, and Challenges
Jiajia Wang
Jimmy Xiangji Huang
Xinhui Tu
Junmei Wang
Angela J. Huang
Md Tahmid Rahman Laskar
Amran Bhuiyan
342
93
0
18 Feb 2024
Efficiency at Scale: Investigating the Performance of Diminutive Language Models in Clinical Tasks
Niall Taylor
U. Ghose
Omid Rohanian
Mohammadmahdi Nouriborji
Andrey Kormilitzin
David Clifton
A. Nevado-Holgado
LM&MA ALM
250
11
0
16 Feb 2024
Fast Vocabulary Transfer for Language Model Compression
Leonidas Gee
Andrea Zugarini
Leonardo Rigutini
Paolo Torroni
182
41
0
15 Feb 2024
Multi-word Tokenization for Sequence Compression
Leonidas Gee
Leonardo Rigutini
Marco Ernandes
Andrea Zugarini
195
14
0
15 Feb 2024
NutePrune: Efficient Progressive Pruning with Numerous Teachers for Large Language Models
Shengrui Li
Junzhe Chen
Xueting Han
Jing Bai
262
8
0
15 Feb 2024
Model Compression and Efficient Inference for Large Language Models: A Survey
Wenxiao Wang
Wei Chen
Yicong Luo
Yongliu Long
Zhengkai Lin
Liye Zhang
Binbin Lin
Deng Cai
Xiaofei He
MQ
284
88
0
15 Feb 2024
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes
Lucio Dery
Steven Kolawole
Jean-Francois Kagey
Virginia Smith
Graham Neubig
Ameet Talwalkar
276
46
0
08 Feb 2024
DE$^3$-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks
Jianing He
Tao Gui
Weiping Ding
Duoqian Miao
Jun Zhao
Liang Hu
LongBing Cao
194
6
0
03 Feb 2024
TransFR: Transferable Federated Recommendation with Adapter Tuning on Pre-trained Language Models
Honglei Zhang
Zhiwei Li
Haoxuan Li
Xin Zhou
J. Zhang
Yidong Li
209
4
0
02 Feb 2024
Security and Privacy Challenges of Large Language Models: A Survey
B. Das
M. H. Amini
Yanzhao Wu
PILM ELM
383
307
0
30 Jan 2024
A Comprehensive Survey of Compression Algorithms for Language Models
Seungcheol Park
Jaehyeon Choi
Sojin Lee
U. Kang
MQ
329
20
0
27 Jan 2024
Keep Decoding Parallel with Effective Knowledge Distillation from Language Models to End-to-end Speech Recognisers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Michael Hentschel
Yuta Nishikawa
Tatsuya Komatsu
Yusuke Fujita
266
5
0
22 Jan 2024
Confidence Preservation Property in Knowledge Distillation Abstractions
SGAI Conferences (SGAI), 2024
Dmitry Vengertsev
Elena Sherman
202
1
0
21 Jan 2024
Knowledge Fusion of Large Language Models
Fanqi Wan
Xinting Huang
Deng Cai
Xiaojun Quan
Wei Bi
Shuming Shi
MoMe
250
99
0
19 Jan 2024
Large Language Models for Scientific Information Extraction: An Empirical Study for Virology
Mahsa Shamsabadi
Jennifer D'Souza
Sören Auer
305
12
0
18 Jan 2024
Solving Continual Offline Reinforcement Learning with Decision Transformer
Kaixin Huang
Li Shen
Chen Zhao
Chun Yuan
Dacheng Tao
CLL OffRL
256
6
0
16 Jan 2024
Knowledge Distillation for Closed-Source Language Models
Hongzhan Chen
Xiaojun Quan
Hehong Chen
Ming Yan
Ji Zhang
BDL
128
3
0
13 Jan 2024
An Empirical Investigation into the Effect of Parameter Choices in Knowledge Distillation
Md Arafat Sultan
Aashka Trivedi
Parul Awasthy
Avirup Sil
229
0
0
12 Jan 2024
Location Aware Modular Biencoder for Tourism Question Answering
International Joint Conference on Natural Language Processing (IJCNLP), 2024
Jinyan Su
Martin Tomko
Timothy Baldwin
KELM
178
1
0
04 Jan 2024
Understanding LLMs: A Comprehensive Overview from Training to Inference
Yi-Hsueh Liu
Haoyang He
Tianle Han
Xu-Yao Zhang
Mengyuan Liu
...
Xiaoyan Cai
Tuo Zhang
Ning Qiang
Tianming Liu
Bao Ge
SyDa
458
121
0
04 Jan 2024
Safety and Performance, Why Not Both? Bi-Objective Optimized Model Compression against Heterogeneous Attacks Toward AI Software Deployment
IEEE Transactions on Software Engineering (TSE), 2024
Jie Zhu
Leye Wang
Xiao Han
Anmin Liu
Tao Xie
AAML
203
6
0
02 Jan 2024
Beyond Output Matching: Bidirectional Alignment for Enhanced In-Context Learning
Chengwei Qin
Wenhan Xia
Fangkai Jiao
Chen Chen
Yuchen Hu
Bosheng Ding
R. Chen
Shafiq Joty
355
7
0
28 Dec 2023
Large Language Models for Conducting Advanced Text Analytics Information Systems Research
Benjamin Ampel
Chi-Heng Yang
Junjie Hu
Hsinchun Chen
349
12
0
27 Dec 2023
Knowledge Distillation of LLM for Automatic Scoring of Science Education Assessments
Ehsan Latif
Luyang Fang
Ping Ma
Xiaoming Zhai
249
9
0
26 Dec 2023
Multi-Task Multi-Agent Shared Layers are Universal Cognition of Multi-Agent Coordination
Jiawei Wang
Jian Zhao
Zhengtao Cao
Ruili Feng
Rongjun Qin
Yang Yu
195
1
0
25 Dec 2023
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
Xupeng Miao
Zhihao Zhang
Xinhao Cheng
Hongyi Jin
Tianqi Chen
Zhihao Jia
392
119
0
23 Dec 2023
DSFormer: Effective Compression of Text-Transformers by Dense-Sparse Weight Factorization
Rahul Chand
Yashoteja Prabhu
Pratyush Kumar
181
5
0
20 Dec 2023
Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data
Yiwei Li
Peiwen Yuan
Shaoxiong Feng
Boyuan Pan
Bin Sun
Xinglin Wang
Heda Wang
Kan Li
LRM
233
28
0
20 Dec 2023
ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference
Huiping Zhuang
Yihuai Hong
Hongliang Dai
Huiping Zhuang
Cen Chen
276
17
0
19 Dec 2023
A Multimodal Approach for Advanced Pest Detection and Classification
Jinli Duan
Haoyu Ding
Sung Kim
89
6
0
18 Dec 2023
Can persistent homology whiten Transformer-based black-box models? A case study on BERT compression
Luis Balderas
Miguel Lastra
José M. Benítez
120
2
0
17 Dec 2023
LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language
Pierpaolo Basile
Elio Musacchio
Marco Polignano
Lucia Siciliani
G. Fiameni
Giovanni Semeraro
218
57
0
15 Dec 2023
Large Multimodal Model Compression via Efficient Pruning and Distillation at AntGroup
Xinjian Zhao
Yao-Min Zhao
Jiajia Liu
Jingdong Chen
Chenyi Zhuang
Jinjie Gu
Ruocheng Guo
Xiangyu Zhao
145
8
0
10 Dec 2023
Building Variable-sized Models via Learngene Pool
AAAI Conference on Artificial Intelligence (AAAI), 2023
Boyu Shi
Shiyu Xia
Xu Yang
Haokun Chen
Zhi Kou
Xin Geng
179
5
0
10 Dec 2023
Transformer as Linear Expansion of Learngene
AAAI Conference on Artificial Intelligence (AAAI), 2023
Shiyu Xia
Miaosen Zhang
Xu Yang
Ruiming Chen
Haokun Chen
Xin Geng
191
11
0
09 Dec 2023
Language Model Knowledge Distillation for Efficient Question Answering in Spanish
A. Bazaga
Pietro Lio
G. Micklem
169
1
0
07 Dec 2023
Sample-based Dynamic Hierarchical Transformer with Layer and Head Flexibility via Contextual Bandit
Fanfei Meng
Lele Zhang
Yu Chen
Yuxin Wang
223
10
0
05 Dec 2023
Jellyfish: A Large Language Model for Data Preprocessing
Haochen Zhang
Yuyang Dong
Chuan Xiao
Masafumi Oyamada
509
36
0
04 Dec 2023
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey
Tianyu Ding
Tianyi Chen
Haidong Zhu
Jiachen Jiang
Yiqi Zhong
Jinxin Zhou
Guangzhi Wang
Zhihui Zhu
Ilya Zharkov
Luming Liang
392
33
0
01 Dec 2023
LinguaLinked: A Distributed Large Language Model Inference System for Mobile Devices
Junchen Zhao
Yurun Song
Simeng Liu
Ian G. Harris
Sangeetha Abdu Jyothi
209
9
0
01 Dec 2023
Compression of end-to-end non-autoregressive image-to-speech system for low-resourced devices
Gokul Srinivasagan
Michael Deisher
Munir Georges
VLM
198
0
0
30 Nov 2023
Mergen: The First Manchu-Korean Machine Translation Model Trained on Augmented Data
Jean Seo
Sungjoo Byun
Minha Kang
Sangah Lee
162
3
0
29 Nov 2023
E-ViLM: Efficient Video-Language Model via Masked Video Modeling with Semantic Vector-Quantized Tokenizer
Jacob Zhiyuan Fang
Skyler Zheng
Vasu Sharma
Robinson Piramuthu
VLM
388
1
0
28 Nov 2023
PEA-Diffusion: Parameter-Efficient Adapter with Knowledge Distillation in non-English Text-to-Image Generation
European Conference on Computer Vision (ECCV), 2023
Jiancang Ma
Chen Chen
Qingsong Xie
H. Lu
DiffM VLM
217
8
0
28 Nov 2023
Cosine Similarity Knowledge Distillation for Individual Class Information Transfer
Gyeongdo Ham
Seonghak Kim
Suin Lee
Jae-Hyeok Lee
Daeshik Kim
166
9
0
24 Nov 2023
Efficient and Robust Jet Tagging at the LHC with Knowledge Distillation
Ryan Liu
A. Gandrakota
J. Ngadiuba
M. Spiropulu
J. Vlimant
240
3
0
23 Nov 2023
Knowledge Distillation Based Semantic Communications For Multiple Users
IEEE Transactions on Wireless Communications (IEEE TWC), 2023
Chenguang Liu
Yuxin Zhou
Yunfei Chen
Shuang-Hua Yang
141
15
0
23 Nov 2023