Communities
Connect sessions
AI calendar
Organizations
Join Slack
Contact Sales
Search
Open menu
Home
Papers
All Papers
0 / 0 papers shown
Home
Papers
1909.10351
Cited By
v1
v2
v3
v4
v5 (latest)
TinyBERT: Distilling BERT for Natural Language Understanding
Findings (Findings), 2019
23 September 2019
Xiaoqi Jiao
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
F. Wang
Qun Liu
VLM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"TinyBERT: Distilling BERT for Natural Language Understanding"
50 / 1,055 papers shown
MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation
IEEE International Conference on Robotics and Automation (ICRA), 2024
Junyou Zhu
Yanyuan Qiao
Siqi Zhang
Xingjian He
Qi Wu
Jing Liu
VLM
382
4
0
27 Sep 2024
Cascade Prompt Learning for Vision-Language Model Adaptation
European Conference on Computer Vision (ECCV), 2024
Ge Wu
Xin Zhang
Zheng Li
Zhaowei Chen
Jiajun Liang
Jian Yang
Xiang Li
VLM
308
23
0
26 Sep 2024
Reducing and Exploiting Data Augmentation Noise through Meta Reweighting Contrastive Learning for Text Classification
Guanyi Mou
Yichuan Li
Kyumin Lee
276
3
0
26 Sep 2024
Harnessing Shared Relations via Multimodal Mixup Contrastive Learning for Multimodal Classification
Raja Kumar
Raghav Singhal
Pranamya Kulkarni
Deval Mehta
Kshitij S. Jadhav
442
3
0
26 Sep 2024
An Effective, Robust and Fairness-aware Hate Speech Detection Framework
Guanyi Mou
Kyumin Lee
288
3
0
25 Sep 2024
DSG-KD: Knowledge Distillation from Domain-Specific to General Language Models
IEEE Access (IEEE Access), 2024
Sangyeon Cho
Jangyeong Jeon
Dongjoon Lee
Changhee Lee
Junyeong Kim
139
2
0
23 Sep 2024
Towards Building Efficient Sentence BERT Models using Layer Pruning
Pacific Asia Conference on Language, Information and Computation (PACLIC), 2024
Anushka Shelke
Riya Savant
Raviraj Joshi
143
1
0
21 Sep 2024
On Importance of Pruning and Distillation for Efficient Low Resource NLP
Aishwarya Mirashi
Purva Lingayat
Srushti Sonavane
Tejas Padhiyar
Raviraj Joshi
Geetanjali Kale
233
2
0
21 Sep 2024
Leveraging Distillation Techniques for Document Understanding: A Case Study with FLAN-T5
Jahrestagung der Gesellschaft für Informatik (GI Jahrestagung), 2024
Marcel Lamott
Muhammad Armaghan Shakir
168
4
0
17 Sep 2024
jina-embeddings-v3: Multilingual Embeddings With Task LoRA
Saba Sturua
Isabelle Mohr
Mohammad Kalim Akram
Michael Gunther
Bo Wang
...
Feng Wang
Georgios Mastrapas
Andreas Koukounas
Nan Wang
Han Xiao
RALM
560
108
0
16 Sep 2024
Recent Advances in Attack and Defense Approaches of Large Language Models
Jing Cui
Yishi Xu
Zhewei Huang
Shuchang Zhou
Jianbin Jiao
Junge Zhang
PILM
AAML
339
8
0
05 Sep 2024
Instruct-DeBERTa: A Hybrid Approach for Aspect-based Sentiment Analysis on Textual Reviews
Dineth Jayakody
A. V. A. Malkith
Koshila Isuranda
Vishal Thenuwara
Nisansa de Silva
Sachintha Rajith Ponnamperuma
G. Sandamali
K. L. Sudheera
164
8
0
23 Aug 2024
MegaFake: A Theory-Driven Dataset of Fake News Generated by Large Language Models
Lionel Z. Wang
Yiming Ma
Renfei Gao
Beichen Guo
Han Zhu
Wenqi Fan
Zexin Lu
Ka Chung Ng
SyDa
243
8
0
19 Aug 2024
FuseChat: Knowledge Fusion of Chat Models
Fanqi Wan
Longguang Zhong
Ziyi Yang
Ruijun Chen
Xiaojun Quan
ALM
KELM
MoMe
353
38
0
15 Aug 2024
PhishLang: A Real-Time, Fully Client-Side Phishing Detection Framework Using MobileBERT
Sayak Saha Roy
Shirin Nilizadeh
380
0
0
11 Aug 2024
ProFuser: Progressive Fusion of Large Language Models
Tianyuan Shi
Fanqi Wan
Canbin Huang
Xiaojun Quan
Chenliang Li
Ming Yan
J. Zhang
Minhua Huang
Wu Kai
MoMe
350
3
0
09 Aug 2024
Towards Resilient and Efficient LLMs: A Comparative Study of Efficiency, Performance, and Adversarial Robustness
Artificial Intelligence and Cloud Computing Conference (AICC), 2024
Xiaojing Fan
Chunliang Tao
AAML
271
33
0
08 Aug 2024
MPC-Minimized Secure LLM Inference
Deevashwer Rathee
Dacheng Li
Ion Stoica
Hao Zhang
Raluca A. Popa
268
8
0
07 Aug 2024
Cross-layer Attention Sharing for Pre-trained Large Language Models
Yongyu Mu
Yuzhang Wu
Yuchun Fan
Chenglong Wang
Hengyu Li
...
Murun Yang
Fandong Meng
Jie Zhou
Tong Xiao
Jingbo Zhu
254
6
0
04 Aug 2024
Pruning Large Language Models with Semi-Structural Adaptive Sparse Training
Weiyu Huang
Yuezhou Hu
Guohao Jian
Jun Zhu
Jianfei Chen
314
17
0
30 Jul 2024
Dataset Distillation for Offline Reinforcement Learning
Jonathan Light
Yuanzhe Liu
Ziniu Hu
DD
344
4
0
29 Jul 2024
Overcoming Uncertain Incompleteness for Robust Multimodal Sequential Diagnosis Prediction via Curriculum Data Erasing Guided Knowledge Distillation
Heejoon Koo
508
0
0
28 Jul 2024
LLAVADI: What Matters For Multimodal Large Language Models Distillation
Shilin Xu
Xiangtai Li
Haobo Yuan
Lu Qi
Yunhai Tong
Ming-Hsuan Yang
212
15
0
28 Jul 2024
Graph-Structured Speculative Decoding
Zhuocheng Gong
Jiahao Liu
Ziyue Wang
Pengfei Wu
Jingang Wang
Xunliang Cai
Dongyan Zhao
Rui Yan
185
6
0
23 Jul 2024
Reconstruct the Pruned Model without Any Retraining
Pingjie Wang
Ziqing Fan
Shengchao Hu
Zhe Chen
Yanfeng Wang
Yu Wang
220
2
0
18 Jul 2024
Retrieval-Augmented Generation for Natural Language Processing: A Survey
Shangyu Wu
Ying Xiong
Yufei Cui
Haolun Wu
Can Chen
...
Lianming Huang
Xue Liu
Tei-Wei Kuo
Nan Guan
Chun Jason Xue
3DV
RALM
432
90
0
18 Jul 2024
Sharif-STR at SemEval-2024 Task 1: Transformer as a Regression Model for Fine-Grained Scoring of Textual Semantic Relations
Seyedeh Fatemeh Ebrahimi
Karim Akhavan Azari
Amirmasoud Iravani
Hadi Alizadeh
Zeinab Taghavi
Hossein Sameti
176
4
0
17 Jul 2024
Word Embedding Dimension Reduction via Weakly-Supervised Feature Selection
Jintang Xue
Yun Cheng Wang
Chengwei Wei
C.-C. Jay Kuo
248
1
0
17 Jul 2024
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
Ofir Abramovich
Niv Nayman
Sharon Fogel
I. Lavi
Ron Litman
Shahar Tsiper
Royee Tichauer
Srikar Appalaraju
Shai Mazor
R. Manmatha
VLM
342
6
0
17 Jul 2024
R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models
IEEE Transactions on Information Forensics and Security (IEEE TIFS), 2024
Aladin Djuhera
Amin Seffo
Xinyan Li
U. Mönich
Holger Boche
Walid Saad
350
8
0
16 Jul 2024
Multi-Granularity Semantic Revision for Large Language Model Distillation
Xiaoyu Liu
Yun-feng Zhang
Wei Li
Simiao Li
Xu Huang
Hanting Chen
Yehui Tang
Jie Hu
Zhiwei Xiong
Yunhe Wang
170
3
0
14 Jul 2024
AutoTask: Task Aware Multi-Faceted Single Model for Multi-Task Ads Relevance
Shouchang Guo
Sonam Damani
Keng-hao Chang
154
0
0
09 Jul 2024
Aspect-Based Sentiment Analysis Techniques: A Comparative Study
Dineth Jayakody
Koshila Isuranda
A. V. A. Malkith
Nisansa de Silva
Sachintha Rajith Ponnamperuma
G. Sandamali
K. L. Sudheera
187
6
0
03 Jul 2024
MLKD-BERT: Multi-level Knowledge Distillation for Pre-trained Language Models
Ying Zhang
Ziheng Yang
Shufan Ji
KELM
136
2
0
03 Jul 2024
Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application
Chuanpeng Yang
Wang Lu
Yao Zhu
Yidong Wang
Qian Chen
Chenlong Gao
Bingjie Yan
Yiqiang Chen
ALM
KELM
284
69
0
02 Jul 2024
Memory
3
\text{Memory}^3
Memory
3
: Language Modeling with Explicit Memory
Hongkang Yang
Peng Liu
Wenjin Wang
Huayi Lai
Zhiyu Li
...
Yu Yu
Kai Chen
Feiyu Xiong
Linpeng Tang
Weinan E
219
32
0
01 Jul 2024
FoldGPT: Simple and Effective Large Language Model Compression Scheme
Songwei Liu
Chao Zeng
Lianqiang Li
Chenqian Yan
Xing Mei
Lean Fu
Fangmin Chen
235
8
0
01 Jul 2024
Direct Preference Knowledge Distillation for Large Language Models
Yixing Li
Yuxian Gu
Li Dong
Dequan Wang
Yu Cheng
Furu Wei
435
13
0
28 Jun 2024
InFiConD: Interactive No-code Fine-tuning with Concept-based Knowledge Distillation
Jinbin Huang
Wenbin He
Liang Gou
Liu Ren
Chris Bryan
354
0
0
25 Jun 2024
Dual-Space Knowledge Distillation for Large Language Models
Songming Zhang
Xue Zhang
Zengkui Sun
Yufeng Chen
Jinan Xu
290
16
0
25 Jun 2024
Exploring compressibility of transformer based text-to-music (TTM) models
Vasileios Moschopoulos
Thanasis Kotsiopoulos
Pablo Peso Parada
Konstantinos Nikiforidis
Alexandros Stergiadis
Gerasimos Papakostas
Md. Asif Jalal
Jisi Zhang
Anastasios Drosou
Karthikeyan P. Saravanan
163
0
0
24 Jun 2024
The Privileged Students: On the Value of Initialization in Multilingual Knowledge Distillation
Haryo Akbarianto Wibowo
Thamar Solorio
Alham Fikri Aji
189
3
0
24 Jun 2024
A Complete Survey on LLM-based AI Chatbots
Sumit Kumar Dam
Choong Seon Hong
Yu Qiao
Chaoning Zhang
278
123
0
17 Jun 2024
An Empirical Investigation of Matrix Factorization Methods for Pre-trained Transformers
Ashim Gupta
Sina Mahdipour Saravani
P. Sadayappan
Vivek Srikumar
248
2
0
17 Jun 2024
Self-Regulated Data-Free Knowledge Amalgamation for Text Classification
Prashanth Vijayaraghavan
Hongzhi Wang
Luyao Shi
Tyler Baldwin
David Beymer
Ehsan Degan
220
3
0
16 Jun 2024
Optimized Speculative Sampling for GPU Hardware Accelerators
Dominik Wagner
Seanie Lee
Ilja Baumann
Philipp Seeberger
Korbinian Riedhammer
Tobias Bocklet
209
4
0
16 Jun 2024
Discovering influential text using convolutional neural networks
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Megan Ayers
Luke Sanford
Margaret E. Roberts
Eddie Yang
199
0
0
14 Jun 2024
GenDistiller: Distilling Pre-trained Language Models based on an Autoregressive Generative Model
Yingying Gao
Shilei Zhang
Chao Deng
Junlan Feng
324
1
0
12 Jun 2024
Survey for Landing Generative AI in Social and E-commerce Recsys -- the Industry Perspectives
Da Xu
Danqing Zhang
Guangyu Yang
Bo Yang
Shuyuan Xu
Lingling Zheng
Cindy Liang
134
4
0
10 Jun 2024
VTrans: Accelerating Transformer Compression with Variational Information Bottleneck based Pruning
Oshin Dutta
Ritvik Gupta
Sumeet Agarwal
325
4
0
07 Jun 2024
Previous
1
2
3
4
5
...
20
21
22
Next