ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Do Deep Nets Really Need to be Deep?
Lei Jimmy Ba, R. Caruana · 21 December 2013 · arXiv:1312.6184
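For context on the indexed paper: Ba and Caruana train a shallow "student" model to regress the logits of a larger "teacher" with an L2 mimic loss, rather than fitting the one-hot labels directly. A minimal sketch of that mimic objective, using NumPy with synthetic data (the sizes, the linear stand-in for the teacher, and the plain gradient-descent loop are illustrative assumptions, not the paper's actual setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 256 examples, 20 features, 5 classes (illustrative sizes).
X = rng.normal(size=(256, 20))

# Stand-in teacher: in the paper this would be a deep net; here a fixed
# linear map just supplies target logits for the student to mimic.
W_teacher = rng.normal(size=(20, 5))
teacher_logits = X @ W_teacher

# Shallow student trained on the L2 mimic loss 0.5 * ||X W - z||^2,
# where z are the teacher's logits (not the hard labels).
W = np.zeros((20, 5))
lr = 0.1
for _ in range(500):
    pred = X @ W
    grad = X.T @ (pred - teacher_logits) / len(X)  # gradient of the mimic loss
    W -= lr * grad

mse = float(np.mean((X @ W - teacher_logits) ** 2))
print(f"mimic MSE: {mse:.6f}")
```

Because both teacher and student are linear here, the student can match the logits almost exactly; the paper's point is that regressing soft logits transfers far more information per example than hard labels do.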

Papers citing "Do Deep Nets Really Need to be Deep?"

50 of 337 citing papers shown.
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede · 17 Sep 2020

Prime-Aware Adaptive Distillation
Youcai Zhang, Zhonghao Lan, Yuchen Dai, Fangao Zeng, Yan Bai, Jie Chang, Yichen Wei · 04 Aug 2020

Teacher-Student Training and Triplet Loss for Facial Expression Recognition under Occlusion
Mariana-Iuliana Georgescu, Radu Tudor Ionescu · CVBM · 03 Aug 2020

When stakes are high: balancing accuracy and transparency with Model-Agnostic Interpretable Data-driven suRRogates
Roel Henckaerts, Katrien Antonio, Marie-Pier Côté · 14 Jul 2020

T-Basis: a Compact Representation for Neural Networks
Anton Obukhov, M. Rakhuba, Stamatios Georgoulis, Menelaos Kanakis, Dengxin Dai, Luc Van Gool · 13 Jul 2020

Tracking-by-Trackers with a Distilled and Reinforced Model
Matteo Dunnhofer, N. Martinel, C. Micheloni · VOT, OffRL · 08 Jul 2020

A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition
Anurag Kumar, V. Ithapu · 30 Jun 2020
Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation
Rasool Fakoor, Jonas W. Mueller, Nick Erickson, Pratik Chaudhari, Alex Smola · 25 Jun 2020

Modeling Lost Information in Lossy Image Compression
Yaolong Wang, Mingqing Xiao, Chang-Shu Liu, Shuxin Zheng, Tie-Yan Liu · 22 Jun 2020

When Does Preconditioning Help or Hurt Generalization?
S. Amari, Jimmy Ba, Roger C. Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu · 18 Jun 2020

Dataset Condensation with Gradient Matching
Bo Zhao, Konda Reddy Mopuri, Hakan Bilen · DD · 10 Jun 2020

Knowledge Distillation: A Survey
Jianping Gou, B. Yu, Stephen J. Maybank, Dacheng Tao · VLM · 09 Jun 2020

Self-Distillation as Instance-Specific Label Smoothing
Zhilu Zhang, M. Sabuncu · 09 Jun 2020

Hardware Implementation of Spiking Neural Networks Using Time-To-First-Spike Encoding
Seongbin Oh, D. Kwon, Gyuho Yeom, Won-Mook Kang, Soochang Lee, S. Woo, Jaehyeon Kim, Min Kyu Park, Jong-Ho Lee · 09 Jun 2020
ResKD: Residual-Guided Knowledge Distillation
Xuewei Li, Songyuan Li, Bourahla Omar, Fei Wu, Xi Li · 08 Jun 2020

An Overview of Neural Network Compression
James O'Neill · AI4CE · 05 Jun 2020

Is deeper better? It depends on locality of relevant features
Takashi Mori, Masahito Ueda · OOD · 26 May 2020

Feature Statistics Guided Efficient Filter Pruning
Hang Li, Chen Ma, Wenyuan Xu, Xue Liu · 21 May 2020

Large scale weakly and semi-supervised learning for low-resource video ASR
Kritika Singh, Vimal Manohar, Alex Xiao, Sergey Edunov, Ross B. Girshick, Vitaliy Liptchinsky, Christian Fuegen, Yatharth Saraf, Geoffrey Zweig, Abdel-rahman Mohamed · 16 May 2020

Addressing Missing Labels in Large-Scale Sound Event Recognition Using a Teacher-Student Framework With Loss Masking
Eduardo Fonseca, Shawn Hershey, Manoj Plakal, D. Ellis, A. Jansen, R. C. Moore, Xavier Serra · NoLa · 02 May 2020

Pruning artificial neural networks: a way to find well-generalizing, high-entropy sharp minima
Enzo Tartaglione, Andrea Bragagnolo, Marco Grangetto · 30 Apr 2020
A Review of Privacy-preserving Federated Learning for the Internet-of-Things
Christopher Briggs, Zhong Fan, Péter András · 24 Apr 2020

Distilling Knowledge for Fast Retrieval-based Chat-bots
Amir Vakili Tahami, Kamyar Ghajar, A. Shakery · 23 Apr 2020

Structure-Level Knowledge Distillation For Multilingual Sequence Labeling
Xinyu Wang, Yong-jia Jiang, Nguyen Bach, Tao Wang, Fei Huang, Kewei Tu · 08 Apr 2020

Direct Speech-to-image Translation
Jiguo Li, Xinfeng Zhang, Chuanmin Jia, Jizheng Xu, Li Zhang, Y. Wang, Siwei Ma, Wen Gao · 07 Apr 2020

A Survey of Methods for Low-Power Deep Learning and Computer Vision
Abhinav Goel, Caleb Tung, Yung-Hsiang Lu, George K. Thiruvathukal · VLM · 24 Mar 2020

Distilling Knowledge from Graph Convolutional Networks
Yiding Yang, Jiayan Qiu, Xiuming Zhang, Dacheng Tao, Xinchao Wang · 23 Mar 2020

Teacher-Student Domain Adaptation for Biosensor Models
Lawrence Phillips, David B. Grimes, Yihan Li · OOD · 17 Mar 2020
Towards Practical Lottery Ticket Hypothesis for Adversarial Training
Bai Li, Shiqi Wang, Yunhan Jia, Yantao Lu, Zhenyu Zhong, Lawrence Carin, Suman Jana · AAML · 06 Mar 2020

Residual Knowledge Distillation
Mengya Gao, Yujun Shen, Quanquan Li, Chen Change Loy · 21 Feb 2020

Taurus: A Data Plane Architecture for Per-Packet ML
Tushar Swamy, Alexander Rucker, M. Shahbaz, Ishan Gaur, K. Olukotun · 12 Feb 2020

Lightweight 3D Human Pose Estimation Network Training Using Teacher-Student Learning
D. Hwang, Suntae Kim, Nicolas Monet, Hideki Koike, Soonmin Bae · 3DH · 15 Jan 2020

PoPS: Policy Pruning and Shrinking for Deep Reinforcement Learning
Dor Livne, Kobi Cohen · 14 Jan 2020

Resource-Efficient Neural Networks for Embedded Systems
Wolfgang Roth, Günther Schindler, Lukas Pfeifenberger, Robert Peharz, Sebastian Tschiatschek, Holger Fröning, Franz Pernkopf, Zoubin Ghahramani · 07 Jan 2020

SAM: Squeeze-and-Mimic Networks for Conditional Visual Driving Policy Learning
Albert Zhao, Tong He, Yitao Liang, Haibin Huang, Mathias Niepert, Stefano Soatto · 06 Dec 2019
Blockwisely Supervised Neural Architecture Search with Knowledge Distillation
Changlin Li, Jiefeng Peng, Liuchun Yuan, Guangrun Wang, Xiaodan Liang, Liang Lin, Xiaojun Chang · 29 Nov 2019

Preparing Lessons: Improve Knowledge Distillation with Better Supervision
Tiancheng Wen, Shenqi Lai, Xueming Qian · 18 Nov 2019

Self-training with Noisy Student improves ImageNet classification
Qizhe Xie, Minh-Thang Luong, Eduard H. Hovy, Quoc V. Le · NoLa · 11 Nov 2019

Domain Robustness in Neural Machine Translation
Mathias Müller, Annette Rios Gonzales, Rico Sennrich · 08 Nov 2019

Real-time Memory Efficient Large-pose Face Alignment via Deep Evolutionary Network
Bin Sun, Ming Shao, Siyu Xia, Y. Fu · 3DH, CVBM · 25 Oct 2019

Contrastive Representation Distillation
Yonglong Tian, Dilip Krishnan, Phillip Isola · 23 Oct 2019

Distilling BERT into Simple Neural Networks with Unlabeled Transfer Data
Subhabrata Mukherjee, Ahmed Hassan Awadallah · 04 Oct 2019

On the Efficacy of Knowledge Distillation
Ligang He, Rui Mao · 03 Oct 2019
Exascale Deep Learning to Accelerate Cancer Research
Robert M. Patton, J. T. Johnston, Steven R. Young, Catherine D. Schuman, T. Potok, ..., Junghoon Chae, L. Hou, Shahira Abousamra, Dimitris Samaras, Joel H. Saltz · 26 Sep 2019

Compact Trilinear Interaction for Visual Question Answering
Tuong Khanh Long Do, Thanh-Toan Do, Huy Tran, Erman Tjiputra, Quang-Dieu Tran · 26 Sep 2019

FEED: Feature-level Ensemble for Knowledge Distillation
Seonguk Park, Nojun Kwak · FedML · 24 Sep 2019

Adversarial Learning with Margin-based Triplet Embedding Regularization
Yaoyao Zhong, Weihong Deng · AAML · 20 Sep 2019

Extreme Low Resolution Activity Recognition with Confident Spatial-Temporal Attention Transfer
Yucai Bai, Qinglong Zou, Xieyuanli Chen, Lingxi Li, Zhengming Ding, Long Chen · 09 Sep 2019

A Novel Design of Adaptive and Hierarchical Convolutional Neural Networks using Partial Reconfiguration on FPGA
Mohammad Farhadi, Mehdi Ghasemi, Yezhou Yang · 05 Sep 2019

Knowledge Distillation for End-to-End Person Search
Bharti Munjal, Fabio Galasso, S. Amin · FedML · 03 Sep 2019