1910.01108
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
2 October 2019
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
Papers citing
"DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter"
50 / 131 papers shown
Title
ViTOC: Vision Transformer and Object-aware Captioner
Feiyang Huang
63
0
0
09 Nov 2024
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
Zhaofeng Wu
Xinyan Velocity Yu
Dani Yogatama
Jiasen Lu
Yoon Kim
AIFin
62
17
0
07 Nov 2024
Navigating Extremes: Dynamic Sparsity in Large Output Spaces
Nasib Ullah
Erik Schultheis
Mike Lasby
Yani Andrew Ioannou
Rohit Babbar
44
0
0
05 Nov 2024
Generative AI-Powered Plugin for Robust Federated Learning in Heterogeneous IoT Networks
Youngjoon Lee
J. Gong
Joonhyuk Kang
67
0
0
31 Oct 2024
InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models
Haoyang Li
Xiaogeng Liu
SILM
58
5
0
30 Oct 2024
Vulnerability of LLMs to Vertically Aligned Text Manipulations
Zhecheng Li
Yijiao Wang
Bryan Hooi
Yujun Cai
Zhen Xiong
Nanyun Peng
Kai-Wei Chang
93
1
0
26 Oct 2024
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models
Jahyun Koo
Yerin Hwang
Yongil Kim
Taegwan Kang
Hyunkyung Bae
Kyomin Jung
83
0
0
25 Oct 2024
Natural Language Processing for the Legal Domain: A Survey of Tasks, Datasets, Models, and Challenges
Farid Ariai
Gianluca Demartini
ELM
AILaw
VLM
50
4
0
25 Oct 2024
Is the GPU Half-Empty or Half-Full? Practical Scheduling Techniques for LLMs
Ferdi Kossmann
Bruce Fontaine
Daya Khudia
Michael Cafarella
Samuel Madden
204
2
0
23 Oct 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu
Hao Zhou
Fandong Meng
Jie Zhou
Minlie Huang
122
5
0
22 Oct 2024
A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators
Han Zhou
Jordy Van Landeghem
Teodora Popordanoska
Matthew B. Blaschko
55
2
0
20 Oct 2024
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
Wenyuan Xu
Rujun Han
Zhenting Wang
L. Le
Dhruv Madeka
Lei Li
Wenjie Wang
Rishabh Agarwal
Chen-Yu Lee
Tomas Pfister
100
9
0
15 Oct 2024
Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Zou
Tatsunori Hashimoto
VLM
128
5
0
14 Oct 2024
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation
Qingwen Bu
Hongyang Li
Li Chen
Jisong Cai
Jia Zeng
Heming Cui
Maoqing Yao
Yu Qiao
73
5
0
10 Oct 2024
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Mutian He
Philip N. Garner
112
0
0
09 Oct 2024
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models
Muhammad Jehanzeb Mirza
Mengjie Zhao
Zhuoyuan Mao
Sivan Doveh
Wei Lin
...
Yuki Mitsufuji
Horst Possegger
Rogerio Feris
Leonid Karlinsky
James Glass
VLM
118
1
0
08 Oct 2024
Efficient Inference for Large Language Model-based Generative Recommendation
Xinyu Lin
Chaoqun Yang
Wenjie Wang
Yongqi Li
Cunxiao Du
Fuli Feng
See-Kiong Ng
Tat-Seng Chua
91
4
0
07 Oct 2024
Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning
Aaditya Naik
Jason Liu
Claire Wang
Amish Sethi
Saikat Dutta
Mayur Naik
Eric Wong
51
2
0
04 Oct 2024
Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation
Kun Wu
Yichen Zhu
Jinming Li
Junjie Wen
Ning Liu
Zhiyuan Xu
Qinru Qiu
96
6
0
27 Sep 2024
Sample Compression Unleashed: New Generalization Bounds for Real Valued Losses
Mathieu Bazinet
Valentina Zantedeschi
Pascal Germain
MLT
AI4CE
45
2
0
26 Sep 2024
One missing piece in Vision and Language: A Survey on Comics Understanding
Emanuele Vivoli
Andrey Barsky
Mohamed Ali Souibgui
Artemis Llabrés
Marco Bertini
Dimosthenis Karatzas
55
4
0
14 Sep 2024
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
134
26
0
10 Sep 2024
On The Role of Prompt Construction In Enhancing Efficacy and Efficiency of LLM-Based Tabular Data Generation
Banooqa H. Banday
Kowshik Thopalli
Tanzima Z. Islam
Jayaraman J. Thiagarajan
80
0
0
06 Sep 2024
DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture
Qianlong Xiang
Miao Zhang
Yuzhang Shang
Jianlong Wu
Yan Yan
Liqiang Nie
DiffM
85
10
0
05 Sep 2024
Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability
Joakim Edin
Andreas Geert Motzfeldt
Casper L. Christensen
Tuukka Ruotsalo
Lars Maaløe
Maria Maistro
84
4
0
15 Aug 2024
The advantages of context specific language models: the case of the Erasmian Language Model
João Gonçalves
Nick Jelicic
Michele Murgia
Evert Stamhuis
55
0
0
13 Aug 2024
LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection
Mervat Abassy
Kareem Elozeiri
Alexander Aziz
Minh Ngoc Ta
Raj Vardhan Tomar
...
Alham Fikri Aji
Artem Shelmanov
Nizar Habash
Iryna Gurevych
Preslav Nakov
DeLMO
72
17
0
08 Aug 2024
Private Collaborative Edge Inference via Over-the-Air Computation
Selim F. Yilmaz
Burak Hasircioglu
Li Qiao
Deniz Gunduz
FedML
97
1
0
30 Jul 2024
Overcoming Uncertain Incompleteness for Robust Multimodal Sequential Diagnosis Prediction via Curriculum Data Erasing Guided Knowledge Distillation
Heejoon Koo
78
0
0
28 Jul 2024
FsPONER: Few-shot Prompt Optimization for Named Entity Recognition in Domain-specific Scenarios
Yongjian Tang
Rakebul Hasan
Thomas Runkler
101
2
0
10 Jul 2024
Direct Preference Knowledge Distillation for Large Language Models
Yixing Li
Yuxian Gu
Li Dong
Dequan Wang
Yu Cheng
Furu Wei
62
6
0
28 Jun 2024
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon
USVSN Sai Prashanth
Alvin Deng
Kyle O'Brien
Jyothir S V
Mohammad Aflah Khan
...
Jacob Ray Fuehne
Stella Biderman
Tracy Ke
Katherine Lee
Naomi Saphra
104
12
0
25 Jun 2024
A Syntax-Injected Approach for Faster and More Accurate Sentiment Analysis
Muhammad Imran
Olga Kellert
Carlos Gómez-Rodríguez
28
1
0
21 Jun 2024
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
Jiaming Zhou
Teli Ma
Kun-Yu Lin
Ronghe Qiu
Zifan Wang
Junwei Liang
80
7
0
20 Jun 2024
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
Patrick Emami
Zhaonan Li
Saumya Sinha
Truc Nguyen
86
1
0
30 May 2024
An Empirical Analysis of Forgetting in Pre-trained Models with Incremental Low-Rank Updates
Albin Soutif-Cormerais
Simone Magistri
Joost van de Weijer
Andrew D. Bagdanov
51
1
0
28 May 2024
SoK: Leveraging Transformers for Malware Analysis
Pradip Kunwar
Kshitiz Aryal
Maanak Gupta
Mahmoud Abdelsalam
Elisa Bertino
103
0
0
27 May 2024
Tokenization Matters! Degrading Large Language Models through Challenging Their Tokenization
Dixuan Wang
Yanda Li
Junyuan Jiang
Zepeng Ding
Ziqin Luo
Guochao Jiang
Jiaqing Liang
Deqing Yang
51
13
0
27 May 2024
What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models
Abdelrahman Abdelhamed
Mahmoud Afifi
Alec Go
MLLM
VLM
68
3
0
24 May 2024
Full Line Code Completion: Bringing AI to Desktop
Anton Semenkin
Vitaliy Bibaev
Yaroslav Sokolov
Kirill Krylov
Alexey Kalina
...
Mikhail Podvitskii
Petr Surkov
Yaroslav Golubev
Nikita Povarov
T. Bryksin
56
2
0
14 May 2024
Large Language Models for Cyber Security: A Systematic Literature Review
HanXiang Xu
Shenao Wang
Ningke Li
Kaidi Wang
Yanjie Zhao
Kai Chen
Ting Yu
Yang Liu
Haoyu Wang
71
33
0
08 May 2024
LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models
Dingkun Zhang
Sijia Li
Chen Chen
Qingsong Xie
H. Lu
54
25
0
17 Apr 2024
Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding
Jingjing Hu
Dan Guo
Kun Li
Zhan Si
Xun Yang
Xiaojun Chang
Meng Wang
77
3
0
21 Mar 2024
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Alexander Khazatsky
Karl Pertsch
Suraj Nair
Ashwin Balakrishna
Sudeep Dasari
...
Thomas Kollar
Sergey Levine
Chelsea Finn
104
197
0
19 Mar 2024
Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality
Rahul Zalkikar
Kanchan Chandra
69
1
0
21 Feb 2024
Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models
S. Hayati
Taehee Jung
Tristan Bodding-Long
Sudipta Kar
A. Sethy
Joo-Kyung Kim
Dongyeop Kang
ALM
LRM
68
7
0
18 Feb 2024
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes
Lucio Dery
Steven Kolawole
Jean-Francois Kagey
Virginia Smith
Graham Neubig
Ameet Talwalkar
54
29
0
08 Feb 2024
Large Language Model Agent for Hyper-Parameter Optimization
Siyi Liu
Chen Gao
Yong Li
70
21
0
02 Feb 2024
Credit Risk Meets Large Language Models: Building a Risk Indicator from Loan Descriptions in P2P Lending
Mario Sanz-Guerrero
Javier Arroyo
47
5
0
29 Jan 2024
FREE: The Foundational Semantic Recognition for Modeling Environmental Ecosystems
Shiyuan Luo
Juntong Ni
Shengyu Chen
Runlong Yu
Yiqun Xie
Licheng Liu
Zhenong Jin
Huaxiu Yao
Xiaowei Jia
65
8
0
17 Nov 2023