TinyBERT: Distilling BERT for Natural Language Understanding

Findings (Findings), 2019

23 September 2019

Xiaoqi Jiao

Yichun Yin

Lifeng Shang

Xin Jiang

Linlin Li

Qun Liu

Papers citing "TinyBERT: Distilling BERT for Natural Language Understanding"

50 / 1,055 papers shown

MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation

IEEE International Conference on Robotics and Automation (ICRA), 2024

Junyou Zhu

Yanyuan Qiao

Siqi Zhang

Xingjian He

Qi Wu

Jing Liu

VLM

382

27 Sep 2024

Cascade Prompt Learning for Vision-Language Model Adaptation

European Conference on Computer Vision (ECCV), 2024

Ge Wu

Xin Zhang

Zheng Li

Zhaowei Chen

Jiajun Liang

Jian Yang

Xiang Li

VLM

308

26 Sep 2024

Reducing and Exploiting Data Augmentation Noise through Meta Reweighting Contrastive Learning for Text Classification

Guanyi Mou

Yichuan Li

Kyumin Lee

276

26 Sep 2024

Harnessing Shared Relations via Multimodal Mixup Contrastive Learning for Multimodal Classification

442

26 Sep 2024

An Effective, Robust and Fairness-aware Hate Speech Detection Framework

Guanyi Mou

Kyumin Lee

288

25 Sep 2024

DSG-KD: Knowledge Distillation from Domain-Specific to General Language Models

IEEE Access (IEEE Access), 2024

139

23 Sep 2024

Towards Building Efficient Sentence BERT Models using Layer Pruning

Pacific Asia Conference on Language, Information and Computation (PACLIC), 2024

Anushka Shelke

Riya Savant

Raviraj Joshi

143

21 Sep 2024

On Importance of Pruning and Distillation for Efficient Low Resource NLP

Raviraj Joshi

Geetanjali Kale

233

21 Sep 2024

Leveraging Distillation Techniques for Document Understanding: A Case Study with FLAN-T5

Jahrestagung der Gesellschaft für Informatik (GI Jahrestagung), 2024

Marcel Lamott

Muhammad Armaghan Shakir

168

17 Sep 2024

jina-embeddings-v3: Multilingual Embeddings With Task LoRA

Bo Wang

...

Feng Wang

Georgios Mastrapas

Andreas Koukounas

Nan Wang

Han Xiao

RALM

560

108

16 Sep 2024

Recent Advances in Attack and Defense Approaches of Large Language Models

339

05 Sep 2024

Instruct-DeBERTa: A Hybrid Approach for Aspect-based Sentiment Analysis on Textual Reviews

Sachintha Rajith Ponnamperuma

G. Sandamali

K. L. Sudheera

164

23 Aug 2024

MegaFake: A Theory-Driven Dataset of Fake News Generated by Large Language Models

Han Zhu

243

19 Aug 2024

FuseChat: Knowledge Fusion of Chat Models

Xiaojun Quan

353

15 Aug 2024

PhishLang: A Real-Time, Fully Client-Side Phishing Detection Framework Using MobileBERT

Sayak Saha Roy

Shirin Nilizadeh

380

11 Aug 2024

ProFuser: Progressive Fusion of Large Language Models

350

09 Aug 2024

Towards Resilient and Efficient LLMs: A Comparative Study of Efficiency, Performance, and Adversarial Robustness

Artificial Intelligence and Cloud Computing Conference (AICC), 2024

Xiaojing Fan

Chunliang Tao

AAML

271

08 Aug 2024

MPC-Minimized Secure LLM Inference

Deevashwer Rathee

Dacheng Li

Ion Stoica

Hao Zhang

Raluca A. Popa

268

07 Aug 2024

Cross-layer Attention Sharing for Pre-trained Large Language Models

...

254

04 Aug 2024

Pruning Large Language Models with Semi-Structural Adaptive Sparse Training

Weiyu Huang

Yuezhou Hu

Guohao Jian

Jun Zhu

Jianfei Chen

314

30 Jul 2024

Dataset Distillation for Offline Reinforcement Learning

Jonathan Light

Yuanzhe Liu

Ziniu Hu

344

29 Jul 2024

Overcoming Uncertain Incompleteness for Robust Multimodal Sequential Diagnosis Prediction via Curriculum Data Erasing Guided Knowledge Distillation

Heejoon Koo

508

28 Jul 2024

LLAVADI: What Matters For Multimodal Large Language Models Distillation

Xiangtai Li

Ming-Hsuan Yang

212

28 Jul 2024

Graph-Structured Speculative Decoding

Dongyan Zhao

Rui Yan

185

23 Jul 2024

Reconstruct the Pruned Model without Any Retraining

Shengchao Hu

220

18 Jul 2024

Retrieval-Augmented Generation for Natural Language Processing: A Survey

Shangyu Wu

Yufei Cui

...

Xue Liu

432

18 Jul 2024

Sharif-STR at SemEval-2024 Task 1: Transformer as a Regression Model for Fine-Grained Scoring of Textual Semantic Relations

Seyedeh Fatemeh Ebrahimi

176

17 Jul 2024

Word Embedding Dimension Reduction via Weakly-Supervised Feature Selection

248

17 Jul 2024

VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding

342

17 Jul 2024

R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models

IEEE Transactions on Information Forensics and Security (IEEE TIFS), 2024

350

16 Jul 2024

Multi-Granularity Semantic Revision for Large Language Model Distillation

170

14 Jul 2024

AutoTask: Task Aware Multi-Faceted Single Model for Multi-Task Ads Relevance

Shouchang Guo

Sonam Damani

Keng-hao Chang

154

09 Jul 2024

Aspect-Based Sentiment Analysis Techniques: A Comparative Study

Sachintha Rajith Ponnamperuma

G. Sandamali

K. L. Sudheera

187

03 Jul 2024

MLKD-BERT: Multi-level Knowledge Distillation for Pre-trained Language Models

136

03 Jul 2024

Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application

Chuanpeng Yang

Wang Lu

Yao Zhu

Yidong Wang

Yiqiang Chen

284

02 Jul 2024

Memory$^3$: Language Modeling with Explicit Memory

Zhiyu Li

...

Weinan E

219

01 Jul 2024

FoldGPT: Simple and Effective Large Language Model Compression Scheme

235

01 Jul 2024

Direct Preference Knowledge Distillation for Large Language Models

435

28 Jun 2024

InFiConD: Interactive No-code Fine-tuning with Concept-based Knowledge Distillation

Liang Gou

354

25 Jun 2024

Dual-Space Knowledge Distillation for Large Language Models

Songming Zhang

Xue Zhang

Zengkui Sun

Yufeng Chen

Jinan Xu

290

25 Jun 2024

Exploring compressibility of transformer based text-to-music (TTM) models

Vasileios Moschopoulos

Thanasis Kotsiopoulos

Pablo Peso Parada

Konstantinos Nikiforidis

Alexandros Stergiadis

Karthikeyan P. Saravanan

163

24 Jun 2024

The Privileged Students: On the Value of Initialization in Multilingual Knowledge Distillation

Haryo Akbarianto Wibowo

Thamar Solorio

Alham Fikri Aji

189

24 Jun 2024

A Complete Survey on LLM-based AI Chatbots

Sumit Kumar Dam

Choong Seon Hong

Yu Qiao

Chaoning Zhang

278

123

17 Jun 2024

An Empirical Investigation of Matrix Factorization Methods for Pre-trained Transformers

Ashim Gupta

Sina Mahdipour Saravani

P. Sadayappan

Vivek Srikumar

248

17 Jun 2024

Self-Regulated Data-Free Knowledge Amalgamation for Text Classification

Prashanth Vijayaraghavan

220

16 Jun 2024

Optimized Speculative Sampling for GPU Hardware Accelerators

Seanie Lee

209

16 Jun 2024

Discovering influential text using convolutional neural networks

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

199

14 Jun 2024

GenDistiller: Distilling Pre-trained Language Models based on an Autoregressive Generative Model

324

12 Jun 2024

Survey for Landing Generative AI in Social and E-commerce Recsys -- the Industry Perspectives

Da Xu

134

10 Jun 2024

VTrans: Accelerating Transformer Compression with Variational Information Bottleneck based Pruning

Oshin Dutta

Ritvik Gupta

Sumeet Agarwal

325

07 Jun 2024
