EIE: Efficient Inference Engine on Compressed Deep Neural Network (arXiv:1602.01528)
4 February 2016
Song Han, Xingyu Liu, Huizi Mao, Jing Pu, A. Pedram, M. Horowitz, W. Dally
Papers citing "EIE: Efficient Inference Engine on Compressed Deep Neural Network" (showing 50 of 211)
Multiply-and-Fire (MNF): An Event-driven Sparse Neural Network Accelerator. Miao Yu, Tingting Xiang, Venkata Pavan Kumar Miriyala, Trevor E. Carlson. 20 Apr 2022.
Accelerating Attention through Gradient-Based Learned Runtime Pruning. Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, H. Esmaeilzadeh, Mingu Kang. 07 Apr 2022.
Energy-Latency Attacks via Sponge Poisoning. Antonio Emanuele Cinà, Ambra Demontis, Battista Biggio, Fabio Roli, Marcello Pelillo. 14 Mar 2022. [SILM]
GROW: A Row-Stationary Sparse-Dense GEMM Accelerator for Memory-Efficient Graph Convolutional Neural Networks. Ranggi Hwang, M. Kang, Jiwon Lee, D. Kam, Youngjoo Lee, Minsoo Rhu. 01 Mar 2022. [GNN]
EcoFlow: Efficient Convolutional Dataflows for Low-Power Neural Network Accelerators. Lois Orosa, Skanda Koppula, Yaman Umuroglu, Konstantinos Kanellopoulos, Juan Gómez Luna, Michaela Blott, K. Vissers, O. Mutlu. 04 Feb 2022.
Real-Time Gaze Tracking with Event-Driven Eye Segmentation. Yu Feng, Nathan Goulding, Asif Khan, Hans Reyserhove, Yuhao Zhu. 19 Jan 2022.
Problem-dependent attention and effort in neural networks with applications to image resolution and model selection. Chris Rohlfs. 05 Jan 2022.
Speedup deep learning models on GPU by taking advantage of efficient unstructured pruning and bit-width reduction. Marcin Pietroń, Dominik Zurek. 28 Dec 2021.
Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting. Minghai Qin, Tianyun Zhang, Fei Sun, Yen-kuang Chen, M. Fardad, Yanzhi Wang, Yuan Xie. 21 Dec 2021.
Automated Deep Learning: Neural Architecture Search Is Not the End. Xuanyi Dong, D. Kedziora, Katarzyna Musial, Bogdan Gabrys. 16 Dec 2021.
Hidden-Fold Networks: Random Recurrent Residuals Using Sparse Supermasks. Ángel López García-Arias, Masanori Hashimoto, Masato Motomura, Jaehoon Yu. 24 Nov 2021.
SPA-GCN: Efficient and Flexible GCN Accelerator with an Application for Graph Similarity Computation. Atefeh Sohrabizadeh, Yuze Chi, Jason Cong. 10 Nov 2021. [GNN]
Phantom: A High-Performance Computational Core for Sparse Convolutional Neural Networks. Mahmood Azhar Qureshi, Arslan Munir. 09 Nov 2021.
Generalized Depthwise-Separable Convolutions for Adversarially Robust and Efficient Neural Networks. Hassan Dbouk, Naresh R. Shanbhag. 28 Oct 2021. [AAML]
Bandwidth Utilization Side-Channel on ML Inference Accelerators. Sarbartha Banerjee, Shijia Wei, Prakash Ramrakhyani, Mohit Tiwari. 14 Oct 2021.
Memory-Efficient CNN Accelerator Based on Interlayer Feature Map Compression. Zhuang Shao, Xiaoliang Chen, Li Du, Lei Chen, Yuan Du, Weihao Zhuang, Huadong Wei, Chenjia Xie, Zhongfeng Wang. 12 Oct 2021.
Prune Your Model Before Distill It. Jinhyuk Park, Albert No. 30 Sep 2021. [VLM]
On the Accuracy of Analog Neural Network Inference Accelerators. T. Xiao, Ben Feinberg, C. Bennett, V. Prabhakar, Prashant Saxena, V. Agrawal, S. Agarwal, M. Marinella. 03 Sep 2021.
Design and Scaffolded Training of an Efficient DNN Operator for Computer Vision on the Edge. Vinod Ganesan, Pratyush Kumar. 25 Aug 2021.
Differentiable Subset Pruning of Transformer Heads. Jiaoda Li, Ryan Cotterell, Mrinmaya Sachan. 10 Aug 2021.
Training Compact CNNs for Image Classification using Dynamic-coded Filter Fusion. Mingbao Lin, Bohong Chen, Fei Chao, Rongrong Ji. 14 Jul 2021. [VLM]
Improving the Efficiency of Transformers for Resource-Constrained Devices. Hamid Tabani, Ajay Balasubramaniam, Shabbir Marzban, Elahe Arani, Bahram Zonooz. 30 Jun 2021.
Layer Folding: Neural Network Depth Reduction using Activation Linearization. Amir Ben Dror, Niv Zehngut, Avraham Raviv, E. Artyomov, Ran Vitek, R. Jevnisek. 17 Jun 2021.
VersaGNN: a Versatile accelerator for Graph neural networks. Feng Shi, Yiqiao Jin, Song-Chun Zhu. 04 May 2021. [GNN]
SETGAN: Scale and Energy Trade-off GANs for Image Applications on Mobile Platforms. Nitthilan Kanappan Jayakodi, J. Doppa, P. Pande. 23 Mar 2021. [GAN]
Extending Sparse Tensor Accelerators to Support Multiple Compression Formats. Eric Qin, Geonhwa Jeong, William Won, Sheng-Chun Kao, Hyoukjun Kwon, S. Srinivasan, Dipankar Das, G. Moon, S. Rajamanickam, T. Krishna. 18 Mar 2021.
unzipFPGA: Enhancing FPGA-based CNN Engines with On-the-Fly Weights Generation. Stylianos I. Venieris, Javier Fernandez-Marques, Nicholas D. Lane. 09 Mar 2021.
Knowledge Evolution in Neural Networks. Ahmed Taha, Abhinav Shrivastava, L. Davis. 09 Mar 2021.
Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search. Kartik Hegde, Po-An Tsai, Sitao Huang, Vikas Chandra, A. Parashar, Christopher W. Fletcher. 02 Mar 2021.
Pruning and Quantization for Deep Neural Network Acceleration: A Survey. Tailin Liang, C. Glossner, Lei Wang, Shaobo Shi, Xiaotong Zhang. 24 Jan 2021. [MQ]
BRDS: An FPGA-based LSTM Accelerator with Row-Balanced Dual-Ratio Sparsification. Seyed Abolfazl Ghasemzadeh, E. Tavakoli, M. Kamal, A. Afzali-Kusha, Massoud Pedram. 07 Jan 2021.
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead. Maurizio Capra, Beatrice Bussolino, Alberto Marchisio, Guido Masera, Maurizio Martina, Muhammad Shafique. 21 Dec 2020. [BDL]
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning. Hanrui Wang, Zhekai Zhang, Song Han. 17 Dec 2020.
Robustness and Transferability of Universal Attacks on Compressed Models. Alberto G. Matachana, Kenneth T. Co, Luis Muñoz-González, David Martínez, Emil C. Lupu. 10 Dec 2020. [AAML]
The Why, What and How of Artificial General Intelligence Chip Development. Alex P. James. 08 Dec 2020.
Bringing AI To Edge: From Deep Learning's Perspective. Di Liu, Hao Kong, Xiangzhong Luo, Weichen Liu, Ravi Subramaniam. 25 Nov 2020.
In-Memory Nearest Neighbor Search with FeFET Multi-Bit Content-Addressable Memories. Arman Kazemi, M. Sharifi, Ann Franchesca Laguna, F. Müller, R. Rajaei, R. Olivo, T. Kämpfe, M. Niemier, X. S. Hu. 13 Nov 2020. [MQ]
LazyBatching: An SLA-aware Batching System for Cloud Machine Learning Inference. Yujeong Choi, Yunseong Kim, Minsoo Rhu. 25 Oct 2020.
Tensor Casting: Co-Designing Algorithm-Architecture for Personalized Recommendation Training. Youngeun Kwon, Yunjae Lee, Minsoo Rhu. 25 Oct 2020.
FPRaker: A Processing Element For Accelerating Neural Network Training. Omar Mohamed Awad, Mostafa Mahmoud, Isak Edo Vivancos, Ali Hadi Zadeh, Ciaran Bannon, Anand Jayarajan, Gennady Pekhimenko, Andreas Moshovos. 15 Oct 2020.
Efficient Transformer-based Large Scale Language Representations using Hardware-friendly Block Structured Pruning. Bingbing Li, Zhenglun Kong, Tianyun Zhang, Ji Li, Z. Li, Hang Liu, Caiwen Ding. 17 Sep 2020. [VLM]
MSP: An FPGA-Specific Mixed-Scheme, Multi-Precision Deep Neural Network Quantization Framework. Sung-En Chang, Yanyu Li, Mengshu Sun, Weiwen Jiang, Runbin Shi, Xue Lin, Yanzhi Wang. 16 Sep 2020. [MQ]
Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity. Cong Guo, B. Hsueh, Jingwen Leng, Yuxian Qiu, Yue Guan, Zehuan Wang, Xiaoying Jia, Xipeng Li, M. Guo, Yuhao Zhu. 29 Aug 2020.
CLAN: Continuous Learning using Asynchronous Neuroevolution on Commodity Edge Devices. Parth Mannan, A. Samajdar, T. Krishna. 27 Aug 2020.
Training Sparse Neural Networks using Compressed Sensing. Jonathan W. Siegel, Jianhong Chen, Pengchuan Zhang, Jinchao Xu. 21 Aug 2020.
Artificial Neural Networks and Fault Injection Attacks. Shahin Tajik, F. Ganji. 17 Aug 2020. [SILM]
Always-On 674µW @ 4GOP/s Error Resilient Binary Neural Networks with Aggressive SRAM Voltage Scaling on a 22nm IoT End-Node. Alfio Di Mauro, Francesco Conti, Pasquale Davide Schiavone, D. Rossi, Luca Benini. 17 Jul 2020.
AQD: Towards Accurate Fully-Quantized Object Detection. Peng Chen, Jing Liu, Bohan Zhuang, Mingkui Tan, Chunhua Shen. 14 Jul 2020. [MQ]
Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights. Shail Dave, Riyadh Baghdadi, Tony Nowatzki, Sasikanth Avancha, Aviral Shrivastava, Baoxin Li. 02 Jul 2020.
AdaDeep: A Usage-Driven, Automated Deep Model Compression Framework for Enabling Ubiquitous Intelligent Mobiles. Sicong Liu, Junzhao Du, Kaiming Nan, Zimu Zhou, Zhangyang Wang, Yingyan Lin. 08 Jun 2020.