Trained Ternary Quantization

4 December 2016

Song Han

Papers citing "Trained Ternary Quantization"

50 / 509 papers shown

Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions Stylianos I. Venieris Alexandros Kouris C. Bouganis 6 184 0 15 Mar 2018
Quantization of Fully Convolutional Networks for Accurate Biomedical Image Segmentation Xiaowei Xu Q. Lu Yu Hu Lin Yang X. S. Hu D. Z. Chen Yiyu Shi MedIm 21 85 0 13 Mar 2018
Deep Neural Network Compression with Single and Multiple Level Quantization Yuhui Xu Yongzhuang Wang Aojun Zhou Weiyao Lin H. Xiong MQ 20 114 0 06 Mar 2018
An Optimal Control Approach to Deep Learning and Applications to Discrete-Weight Neural Networks Qianxiao Li Shuji Hao 11 75 0 04 Mar 2018
The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches Md. Zahangir Alom T. Taha C. Yakopcic Stefan Westberg P. Sidike Mst Shamima Nasrin B. Van Essen A. Awwal V. Asari VLM 29 873 0 03 Mar 2018
WRPN & Apprentice: Methods for Training and Inference using Low-Precision Numerics Asit K. Mishra Debbie Marr 14 6 0 01 Mar 2018
PBGen: Partial Binarization of Deconvolution-Based Generators for Edge Intelligence Jinglan Liu Jiaxin Zhang Yukun Ding Xiaowei Xu Meng-Long Jiang Yiyu Shi 28 4 0 26 Feb 2018
Loss-aware Weight Quantization of Deep Networks Lu Hou James T. Kwok MQ 15 127 0 23 Feb 2018
Model compression via distillation and quantization A. Polino Razvan Pascanu Dan Alistarh MQ 17 718 0 15 Feb 2018
Training and Inference with Integers in Deep Neural Networks Shuang Wu Guoqi Li F. Chen Luping Shi MQ 19 389 0 13 Feb 2018
On the Universal Approximability and Complexity Bounds of Quantized ReLU Neural Networks Yukun Ding Jinglan Liu Jinjun Xiong Yiyu Shi MQ 21 21 0 10 Feb 2018
AMC: AutoML for Model Compression and Acceleration on Mobile Devices Yihui He Ji Lin Zhijian Liu Hanrui Wang Li-Jia Li Song Han 33 1,339 0 10 Feb 2018
Effective Quantization Approaches for Recurrent Neural Networks Md. Zahangir Alom A. Moody N. Maruyama B. Van Essen T. Taha MQ 8 33 0 07 Feb 2018
Deep Versus Wide Convolutional Neural Networks for Object Recognition on Neuromorphic System Md. Zahangir Alom Theodora Josue Md Nayim Rahman Will Mitchell C. Yakopcic T. Taha 9 20 0 07 Feb 2018
Universal Deep Neural Network Compression Yoojin Choi Mostafa El-Khamy Jungwon Lee MQ 81 85 0 07 Feb 2018
Recent Advances in Efficient Computation of Deep Convolutional Neural Networks Jian Cheng Peisong Wang Gang Li Qinghao Hu Hanqing Lu 16 3 0 03 Feb 2018
Build a Compact Binary Neural Network through Bit-level Sensitivity and Data Pruning Yixing Li Fengbo Ren MQ 14 12 0 03 Feb 2018
Alternating Multi-bit Quantization for Recurrent Neural Networks Chen Xu Jianqiang Yao Zhouchen Lin Wenwu Ou Yuanbin Cao Zhirong Wang H. Zha MQ 27 115 0 01 Feb 2018
TernaryNet: Faster Deep Model Inference without GPUs for Medical 3D Segmentation using Sparse and Binary Convolutions M. Heinrich Maximilian Blendowski Ozan Oktay MedIm 14 40 0 29 Jan 2018
Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks Jason Kuen Xiangfei Kong Zhe-nan Lin G. Wang Jianxiong Yin Simon See Yap-Peng Tan BDL 8 25 0 29 Jan 2018
Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights Arun Mallya Dillon Davis Svetlana Lazebnik CLL 10 35 0 19 Jan 2018
BinaryRelax: A Relaxation Approach For Training Deep Neural Networks With Quantized Weights Penghang Yin Shuai Zhang J. Lyu Stanley Osher Y. Qi Jack Xin MQ 22 78 0 19 Jan 2018
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference Benoit Jacob S. Kligys Bo Chen Menglong Zhu Matthew Tang Andrew G. Howard Hartwig Adam Dmitry Kalenichenko MQ 18 3,043 0 15 Dec 2017
StrassenNets: Deep Learning with a Multiplication Budget Michael Tschannen Aran Khanna Anima Anandkumar 9 29 0 11 Dec 2017
Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks Hardik Sharma Jongse Park Naveen Suda Liangzhen Lai Benson Chau J. Kim Vikas Chandra H. Esmaeilzadeh MQ 16 485 0 05 Dec 2017
Deep Learning for Real-Time Crime Forecasting and its Ternarization Bao Wang Penghang Yin Andrea L. Bertozzi P. Brantingham Stanley J. Osher Jack Xin AI4TS 36 82 0 23 Nov 2017
Deep Expander Networks: Efficient Deep Networks from Graph Theory Ameya Prabhu G. Varma A. Namboodiri GNN 24 70 0 23 Nov 2017
Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy Asit K. Mishra Debbie Marr FedML 27 330 0 15 Nov 2017
SparCE: Sparsity aware General Purpose Core Extensions to Accelerate Deep Neural Networks Sanchari Sen Shubham Jain Swagath Venkataramani A. Raghunathan 16 30 0 07 Nov 2017
Efficient Inferencing of Compressed Deep Neural Networks Dharma Teja Vooturi Saurabh Goyal Anamitra R. Choudhury Yogish Sabharwal Ashish Verma 16 6 0 01 Nov 2017
Minimum Energy Quantized Neural Networks Bert Moons Koen Goetschalckx Nick Van Berckelaer Marian Verhelst MQ 19 123 0 01 Nov 2017
Towards Effective Low-bitwidth Convolutional Neural Networks Bohan Zhuang Chunhua Shen Mingkui Tan Lingqiao Liu Ian Reid MQ 26 231 0 01 Nov 2017
Deep Learning as a Mixed Convex-Combinatorial Optimization Problem A. Friesen Pedro M. Domingos 18 20 0 31 Oct 2017
A Survey of Model Compression and Acceleration for Deep Neural Networks Yu Cheng Duo Wang Pan Zhou Zhang Tao 23 1,087 0 23 Oct 2017
Deep Neural Network Approximation using Tensor Sketching S. Kasiviswanathan Nina Narodytska Hongxia Jin 16 9 0 21 Oct 2017
Learning Discrete Weights Using the Local Reparameterization Trick Oran Shayer Dan Levi Ethan Fetaya 13 88 0 21 Oct 2017
TensorQuant - A Simulation Toolbox for Deep Neural Network Quantization D. Loroch Norbert Wehn Franz-Josef Pfreundt J. Keuper MQ 25 23 0 13 Oct 2017
To prune, or not to prune: exploring the efficacy of pruning for model compression Michael Zhu Suyog Gupta 37 1,248 0 05 Oct 2017
WRPN: Wide Reduced-Precision Networks Asit K. Mishra Eriko Nurvitadhi Jeffrey J. Cook Debbie Marr MQ 20 266 0 04 Sep 2017
BitNet: Bit-Regularized Deep Neural Networks Aswin Raghavan Mohamed R. Amer S. Chai Graham Taylor MQ 27 10 0 16 Aug 2017
Learning Accurate Low-Bit Deep Neural Networks with Stochastic Quantization Yinpeng Dong Renkun Ni Jianguo Li Yurong Chen Jun Zhu Hang Su MQ 18 62 0 03 Aug 2017
Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform Chaim Baskin Natan Liss Evgenii Zheltonozhskii A. Bronstein A. Mendelson GNN MQ 28 35 0 31 Jul 2017
Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM Cong Leng Hao Li Shenghuo Zhu R. L. Jin MQ 27 286 0 24 Jul 2017
Ternary Residual Networks Abhisek Kundu K. Banerjee Naveen Mellempudi Dheevatsa Mudigere Dipankar Das Bharat Kaul Pradeep Dubey 23 8 0 15 Jul 2017
Model compression as constrained optimization, with application to neural nets. Part II: quantization M. A. Carreira-Perpiñán Yerlan Idelbayev MQ 9 37 0 13 Jul 2017
Model compression as constrained optimization, with application to neural nets. Part I: general framework Miguel Á. Carreira-Perpiñán MQ 12 32 0 05 Jul 2017
Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations Yoonho Boo Wonyong Sung MQ 22 11 0 01 Jul 2017
Hardware-efficient on-line learning through pipelined truncated-error backpropagation in binary-state networks H. Elsayed Bruno U. Pedroni Sadique Sheik Gert Cauwenberghs 13 8 0 15 Jun 2017
YellowFin and the Art of Momentum Tuning Jian Zhang Ioannis Mitliagkas ODL 15 108 0 12 Jun 2017
Training Quantized Nets: A Deeper Understanding Hao Li Soham De Zheng Xu Christoph Studer H. Samet Tom Goldstein MQ 17 209 0 07 Jun 2017