Do Deep Nets Really Need to be Deep?
Lei Jimmy Ba, R. Caruana
arXiv:1312.6184 · 21 December 2013
Papers citing "Do Deep Nets Really Need to be Deep?" (showing 50 of 337)

Each entry lists the title, authors, topic tags (where assigned), the listing's three per-paper counters, and the publication date.

Large Language Models Can Self-Improve
Jiaxin Huang, S. Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, Jiawei Han
ReLM, AI4MH, LRM · 47 · 566 · 0 · 20 Oct 2022

IDa-Det: An Information Discrepancy-aware Distillation for 1-bit Detectors
Sheng Xu, Yanjing Li, Bo-Wen Zeng, Teli Ma, Baochang Zhang, Xianbin Cao, Penglei Gao, Jinhu Lv
30 · 15 · 0 · 07 Oct 2022

Meta-Ensemble Parameter Learning
Zhengcong Fei, Shuman Tian, Junshi Huang, Xiaoming Wei, Xiaolin K. Wei
OOD · 44 · 2 · 0 · 05 Oct 2022

Using Knowledge Distillation to improve interpretable models in a retail banking context
Maxime Biehler, Mohamed Guermazi, Célim Starck
62 · 2 · 0 · 30 Sep 2022

Compressed Gastric Image Generation Based on Soft-Label Dataset Distillation for Medical Data Sharing
Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama
DD · 32 · 40 · 0 · 29 Sep 2022

MLink: Linking Black-Box Models from Multiple Domains for Collaborative Inference
Mu Yuan, Lan Zhang, Zimu Zheng, Yi-Nan Zhang, Xiang-Yang Li
25 · 2 · 0 · 28 Sep 2022

Efficient Few-Shot Learning Without Prompts
Lewis Tunstall, Nils Reimers, Unso Eun Seo Jo, Luke Bates, Daniel Korat, Moshe Wasserblat, Oren Pereg
VLM · 34 · 182 · 0 · 22 Sep 2022

Semi-Supervised and Unsupervised Deep Visual Learning: A Survey
Yanbei Chen, Massimiliano Mancini, Xiatian Zhu, Zeynep Akata
45 · 113 · 0 · 24 Aug 2022

Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey
Dalin Zhang, Kaixuan Chen, Yan Zhao, B. Yang, Li-Ping Yao, Christian S. Jensen
48 · 3 · 0 · 22 Aug 2022

Effectiveness of Function Matching in Driving Scene Recognition
Shingo Yashima
26 · 1 · 0 · 20 Aug 2022

Causality-Inspired Taxonomy for Explainable Artificial Intelligence
Pedro C. Neto, Tiago B. Gonçalves, João Ribeiro Pinto, W. Silva, Ana F. Sequeira, Arun Ross, Jaime S. Cardoso
XAI · 36 · 12 · 0 · 19 Aug 2022

Safety and Performance, Why not Both? Bi-Objective Optimized Model Compression toward AI Software Deployment
Jie Zhu, Leye Wang, Xiao Han
28 · 9 · 0 · 11 Aug 2022

ProSelfLC: Progressive Self Label Correction Towards A Low-Temperature Entropy State
Xinshao Wang, Yang Hua, Elyor Kodirov, S. Mukherjee, David A. Clifton, N. Robertson
19 · 6 · 0 · 30 Jun 2022

Knowledge Distillation of Transformer-based Language Models Revisited
Chengqiang Lu, Jianwei Zhang, Yunfei Chu, Zhengyu Chen, Jingren Zhou, Fei Wu, Haiqing Chen, Hongxia Yang
VLM · 27 · 10 · 0 · 29 Jun 2022

Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks
Zhiwei Bai, Tao Luo, Z. Xu, Yaoyu Zhang
31 · 4 · 0 · 26 May 2022

Improving the Latent Space of Image Style Transfer
Yun-Hao Bai, Cairong Wang, C. Yuan, Yanbo Fan, Jue Wang
DRL · 37 · 0 · 0 · 24 May 2022

Knowledge Distillation via the Target-aware Transformer
Sihao Lin, Hongwei Xie, Bing Wang, Kaicheng Yu, Xiaojun Chang, Xiaodan Liang, G. Wang
ViT · 20 · 104 · 0 · 22 May 2022

A Closer Look at Branch Classifiers of Multi-exit Architectures
Shaohui Lin, Bo Ji, Rongrong Ji, Angela Yao
12 · 4 · 0 · 28 Apr 2022

HRPose: Real-Time High-Resolution 6D Pose Estimation Network Using Knowledge Distillation
Qingze Guan, Zihao Sheng, Shibei Xue
3DH · 19 · 15 · 0 · 20 Apr 2022

Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation
Minsoo Kang, Jaeyoo Park, Bohyung Han
CLL · 27 · 179 · 0 · 02 Apr 2022

R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis
Huan Wang, Jian Ren, Zeng Huang, Kyle Olszewski, Menglei Chai, Yun Fu, Sergey Tulyakov
42 · 80 · 0 · 31 Mar 2022

Knowledge Distillation with the Reused Teacher Classifier
Defang Chen, Jianhan Mei, Hailin Zhang, C. Wang, Yan Feng, Chun-Yen Chen
36 · 166 · 0 · 26 Mar 2022

Efficient Sub-structured Knowledge Distillation
Wenye Lin, Yangming Li, Lemao Liu, Shuming Shi, Haitao Zheng
12 · 1 · 0 · 09 Mar 2022

The rise of the lottery heroes: why zero-shot pruning is hard
Enzo Tartaglione
29 · 6 · 0 · 24 Feb 2022

Distilled Neural Networks for Efficient Learning to Rank
F. M. Nardini, Cosimo Rulli, Salvatore Trani, Rossano Venturini
FedML · 29 · 16 · 0 · 22 Feb 2022

Submodlib: A Submodular Optimization Library
Vishal Kaushal, Ganesh Ramakrishnan, Rishabh K. Iyer
43 · 12 · 0 · 22 Feb 2022

Distillation with Contrast is All You Need for Self-Supervised Point Cloud Representation Learning
Kexue Fu, Peng Gao, Renrui Zhang, Hongsheng Li, Yu Qiao, Manning Wang
SSL, 3DPC · 28 · 23 · 0 · 09 Feb 2022

Keyword localisation in untranscribed speech using visually grounded speech models
Kayode Olaleye, Dan Oneaţă, Herman Kamper
32 · 7 · 0 · 02 Feb 2022

Recycling Model Updates in Federated Learning: Are Gradient Subspaces Low-Rank?
Sheikh Shams Azam, Seyyedali Hosseinalipour, Qiang Qiu, Christopher G. Brinton
FedML · 26 · 20 · 0 · 01 Feb 2022

Training Thinner and Deeper Neural Networks: Jumpstart Regularization
Carles Roger Riera Molina, Camilo Rey, Thiago Serra, Eloi Puertas, O. Pujol
27 · 4 · 0 · 30 Jan 2022

Dynamic Rectification Knowledge Distillation
Fahad Rahman Amik, Ahnaf Ismat Tasin, Silvia Ahmed, M. M. L. Elahi, Nabeel Mohammed
28 · 5 · 0 · 27 Jan 2022

Enabling Deep Learning on Edge Devices through Filter Pruning and Knowledge Transfer
Kaiqi Zhao, Yitao Chen, Ming Zhao
25 · 3 · 0 · 22 Jan 2022

Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems
Yoshitomo Matsubara, Luca Soldaini, Eric Lind, Alessandro Moschitti
29 · 6 · 0 · 15 Jan 2022

An Experimental Study of the Impact of Pre-training on the Pruning of a Convolutional Neural Network
Nathan Hubens, M. Mancas, B. Gosselin, Marius Preda, T. Zaharia
VLM, CVBM · 23 · 8 · 0 · 15 Dec 2021

The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image
Yuki M. Asano, Aaqib Saeed
43 · 7 · 0 · 01 Dec 2021

Improving Deep Learning Interpretability by Saliency Guided Training
Aya Abdelsalam Ismail, H. C. Bravo, S. Feizi
FAtt · 20 · 80 · 0 · 29 Nov 2021

Multi-label Iterated Learning for Image Classification with Label Ambiguity
Sai Rajeswar, Pau Rodríguez López, Soumye Singhal, David Vazquez, Rameswar Panda
VLM · 26 · 30 · 0 · 23 Nov 2021

Meta-Teacher For Face Anti-Spoofing
Yunxiao Qin, Zitong Yu, Longbin Yan, Zezheng Wang, Chenxu Zhao, Zhen Lei
CVBM · 25 · 61 · 0 · 12 Nov 2021

Oracle Teacher: Leveraging Target Information for Better Knowledge Distillation of CTC Models
J. Yoon, H. Kim, Hyeon Seung Lee, Sunghwan Ahn, N. Kim
36 · 1 · 0 · 05 Nov 2021

RGP: Neural Network Pruning through Its Regular Graph Structure
Zhuangzhi Chen, Jingyang Xiang, Yao Lu, Qi Xuan, Xiaoniu Yang
27 · 1 · 0 · 28 Oct 2021

PAC-Bayesian Learning of Aggregated Binary Activated Neural Networks with Probabilities over Representations
Louis Fortier-Dubois, Gaël Letarte, Benjamin Leblanc, François Laviolette, Pascal Germain
UQCV · 17 · 0 · 0 · 28 Oct 2021

Pixel-by-Pixel Cross-Domain Alignment for Few-Shot Semantic Segmentation
A. Tavera, Fabio Cermelli, Carlo Masone, Barbara Caputo
29 · 19 · 0 · 22 Oct 2021

An Economy of Neural Networks: Learning from Heterogeneous Experiences
A. Kuriksha
19 · 7 · 0 · 22 Oct 2021

Augmenting Knowledge Distillation With Peer-To-Peer Mutual Learning For Model Compression
Usma Niyaz, Deepti R. Bathula
18 · 8 · 0 · 21 Oct 2021

Adaptive Distillation: Aggregating Knowledge from Multiple Paths for Efficient Distillation
Sumanth Chennupati, Mohammad Mahdi Kamani, Zhongwei Cheng, Lin Chen
26 · 4 · 0 · 19 Oct 2021

Efficient and Private Federated Learning with Partially Trainable Networks
Hakim Sidahmed, Zheng Xu, Ankush Garg, Yuan Cao, Mingqing Chen
FedML · 49 · 13 · 0 · 06 Oct 2021

Multilingual AMR Parsing with Noisy Knowledge Distillation
Deng Cai, Xin Li, Jackie Chun-Sing Ho, Lidong Bing, W. Lam
27 · 18 · 0 · 30 Sep 2021

Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network
Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari
34 · 3 · 0 · 22 Sep 2021

A Studious Approach to Semi-Supervised Learning
Sahil Khose, Shruti Jain, V. Manushree
13 · 0 · 0 · 18 Sep 2021

Comfetch: Federated Learning of Large Networks on Constrained Clients via Sketching
Tahseen Rabbani, Brandon Yushan Feng, Marco Bornstein, Kyle Rui Sang, Yifan Yang, Arjun Rajkumar, A. Varshney, Furong Huang
FedML · 59 · 2 · 0 · 17 Sep 2021