DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

2 October 2019

Papers citing "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter"

50 / 131 papers shown

ViTOC: Vision Transformer and Object-aware Captioner Feiyang Huang 63 0 0 09 Nov 2024
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities Zhaofeng Wu Xinyan Velocity Yu Dani Yogatama Jiasen Lu Yoon Kim AIFin 62 17 0 07 Nov 2024
Navigating Extremes: Dynamic Sparsity in Large Output Spaces Nasib Ullah Erik Schultheis Mike Lasby Yani Andrew Ioannou Rohit Babbar 44 0 0 05 Nov 2024
Generative AI-Powered Plugin for Robust Federated Learning in Heterogeneous IoT Networks Youngjoon Lee J. Gong Joonhyuk Kang 67 0 0 31 Oct 2024
InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models Haoyang Li Xiaogeng Liu SILM 58 5 0 30 Oct 2024
Vulnerability of LLMs to Vertically Aligned Text Manipulations Zhecheng Li Yijiao Wang Bryan Hooi Yujun Cai Zhen Xiong Nanyun Peng Kai-Wei Chang 93 1 0 26 Oct 2024
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models Jahyun Koo Yerin Hwang Yongil Kim Taegwan Kang Hyunkyung Bae Kyomin Jung 83 0 0 25 Oct 2024
Natural Language Processing for the Legal Domain: A Survey of Tasks, Datasets, Models, and Challenges Farid Ariai Gianluca Demartini ELM AILaw VLM 50 4 0 25 Oct 2024
Is the GPU Half-Empty or Half-Full? Practical Scheduling Techniques for LLMs Ferdi Kossmann Bruce Fontaine Daya Khudia Michael Cafarella Samuel Madden 204 2 0 23 Oct 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models Yuxian Gu Hao Zhou Fandong Meng Jie Zhou Minlie Huang 122 5 0 22 Oct 2024
A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators Han Zhou Jordy Van Landeghem Teodora Popordanoska Matthew B. Blaschko 55 2 0 20 Oct 2024
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling Wenyuan Xu Rujun Han Zhenting Wang L. Le Dhruv Madeka Lei Li Wenjie Wang Rishabh Agarwal Chen-Yu Lee Tomas Pfister 100 9 0 15 Oct 2024
Locality Alignment Improves Vision-Language Models Ian Covert Tony Sun James Zou Tatsunori Hashimoto VLM 128 5 0 14 Oct 2024
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation Qingwen Bu Hongyang Li Li Chen Jisong Cai Jia Zeng Heming Cui Maoqing Yao Yu Qiao 73 5 0 10 Oct 2024
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity Mutian He Philip N. Garner 112 0 0 09 Oct 2024
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models Muhammad Jehanzeb Mirza Mengjie Zhao Zhuoyuan Mao Sivan Doveh Wei Lin ... Yuki Mitsufuji Horst Possegger Rogerio Feris Leonid Karlinsky James Glass VLM 118 1 0 08 Oct 2024
Efficient Inference for Large Language Model-based Generative Recommendation Xinyu Lin Chaoqun Yang Wenjie Wang Yongqi Li Cunxiao Du Fuli Feng See-Kiong Ng Tat-Seng Chua 91 4 0 07 Oct 2024
Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning Aaditya Naik Jason Liu Claire Wang Amish Sethi Saikat Dutta Mayur Naik Eric Wong 51 2 0 04 Oct 2024
Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation Kun Wu Yichen Zhu Jinming Li Junjie Wen Ning Liu Zhiyuan Xu Qinru Qiu 96 6 0 27 Sep 2024
Sample Compression Unleashed: New Generalization Bounds for Real Valued Losses Mathieu Bazinet Valentina Zantedeschi Pascal Germain MLT AI4CE 45 2 0 26 Sep 2024
One missing piece in Vision and Language: A Survey on Comics Understanding Emanuele Vivoli Andrey Barsky Mohamed Ali Souibgui Artemis LLabres Marco Bertini Dimosthenis Karatzas 55 4 0 14 Sep 2024
What is the Role of Small Models in the LLM Era: A Survey Lihu Chen Gaël Varoquaux ALM 134 26 0 10 Sep 2024
On The Role of Prompt Construction In Enhancing Efficacy and Efficiency of LLM-Based Tabular Data Generation Banooqa H. Banday Kowshik Thopalli Tanzima Z. Islam Jayaraman J. Thiagarajan 80 0 0 06 Sep 2024
DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture Qianlong Xiang Miao Zhang Yuzhang Shang Jianlong Wu Yan Yan Liqiang Nie DiffM 85 10 0 05 Sep 2024
Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability Joakim Edin Andreas Geert Motzfeldt Casper L. Christensen Tuukka Ruotsalo Lars Maaløe Maria Maistro 84 4 0 15 Aug 2024
The advantages of context specific language models: the case of the Erasmian Language Model João Gonçalves Nick Jelicic Michele Murgia Evert Stamhuis 55 0 0 13 Aug 2024
LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection Mervat Abassy Kareem Elozeiri Alexander Aziz Minh Ngoc Ta Raj Vardhan Tomar ... Alham Fikri Aji Artem Shelmanov Nizar Habash Iryna Gurevych Preslav Nakov DeLMO 72 17 0 08 Aug 2024
Private Collaborative Edge Inference via Over-the-Air Computation Selim F. Yilmaz Burak Hasircioglu Li Qiao Deniz Gunduz FedML 97 1 0 30 Jul 2024
Overcoming Uncertain Incompleteness for Robust Multimodal Sequential Diagnosis Prediction via Curriculum Data Erasing Guided Knowledge Distillation Heejoon Koo 78 0 0 28 Jul 2024
FsPONER: Few-shot Prompt Optimization for Named Entity Recognition in Domain-specific Scenarios Yongjian Tang Rakebul Hasan Thomas Runkler 101 2 0 10 Jul 2024
Direct Preference Knowledge Distillation for Large Language Models Yixing Li Yuxian Gu Li Dong Dequan Wang Yu Cheng Furu Wei 62 6 0 28 Jun 2024
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon USVSN Sai Prashanth Alvin Deng Kyle O'Brien Jyothir S V Mohammad Aflah Khan ... Jacob Ray Fuehne Stella Biderman Tracy Ke Katherine Lee Naomi Saphra 104 12 0 25 Jun 2024
A Syntax-Injected Approach for Faster and More Accurate Sentiment Analysis Muhammad Imran Olga Kellert Carlos Gómez-Rodríguez 28 1 0 21 Jun 2024
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation Jiaming Zhou Teli Ma Kun-Yu Lin Ronghe Qiu Zifan Wang Junwei Liang 80 7 0 20 Jun 2024
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems Patrick Emami Zhaonan Li Saumya Sinha Truc Nguyen 86 1 0 30 May 2024
An Empirical Analysis of Forgetting in Pre-trained Models with Incremental Low-Rank Updates Albin Soutif--Cormerais Simone Magistri Joost van de Weijer Andrew D. Bagdanov 51 1 0 28 May 2024
SoK: Leveraging Transformers for Malware Analysis Pradip Kunwar Kshitiz Aryal Maanak Gupta Mahmoud Abdelsalam Elisa Bertino 103 0 0 27 May 2024
Tokenization Matters! Degrading Large Language Models through Challenging Their Tokenization Dixuan Wang Yanda Li Junyuan Jiang Zepeng Ding Ziqin Luo Guochao Jiang Jiaqing Liang Deqing Yang 51 13 0 27 May 2024
What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models Abdelrahman Abdelhamed Mahmoud Afifi Alec Go MLLM VLM 68 3 0 24 May 2024
Full Line Code Completion: Bringing AI to Desktop Anton Semenkin Vitaliy Bibaev Yaroslav Sokolov Kirill Krylov Alexey Kalina ... Mikhail Podvitskii Petr Surkov Yaroslav Golubev Nikita Povarov T. Bryksin 56 2 0 14 May 2024
Large Language Models for Cyber Security: A Systematic Literature Review HanXiang Xu Shenao Wang Ningke Li Kaidi Wang Yanjie Zhao Kai Chen Ting Yu Yang Liu Haoyu Wang 71 33 0 08 May 2024
LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models Dingkun Zhang Sijia Li Chen Chen Qingsong Xie H. Lu 54 25 0 17 Apr 2024
Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding Jingjing Hu Dan Guo Kun Li Zhan Si Xun Yang Xiaojun Chang Meng Wang 77 3 0 21 Mar 2024
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset Alexander Khazatsky Karl Pertsch Suraj Nair Ashwin Balakrishna Sudeep Dasari ... Thomas Kollar Sergey Levine Chelsea Finn 104 197 0 19 Mar 2024
Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality Rahul Zalkikar Kanchan Chandra 69 1 0 21 Feb 2024
Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models S. Hayati Taehee Jung Tristan Bodding-Long Sudipta Kar A. Sethy Joo-Kyung Kim Dongyeop Kang ALM LRM 68 7 0 18 Feb 2024
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes Lucio Dery Steven Kolawole Jean-Francois Kagey Virginia Smith Graham Neubig Ameet Talwalkar 54 29 0 08 Feb 2024
Large Language Model Agent for Hyper-Parameter Optimization Siyi Liu Chen Gao Yong Li 70 21 0 02 Feb 2024
Credit Risk Meets Large Language Models: Building a Risk Indicator from Loan Descriptions in P2P Lending Mario Sanz-Guerrero Javier Arroyo 47 5 0 29 Jan 2024
FREE: The Foundational Semantic Recognition for Modeling Environmental Ecosystems Shiyuan Luo Juntong Ni Shengyu Chen Runlong Yu Yiqun Xie Licheng Liu Zhenong Jin Huaxiu Yao Xiaowei Jia 65 8 0 17 Nov 2023