Papers citing 'HAQ: Hardware-Aware Automated Quantization with Mixed Precision'

Title
Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators. International Conference on Machine Learning (ICML), 2021 Yonggan Fu Yongan Zhang Yang Zhang David D. Cox Yingyan Lin MQ 266 21 0 11 Jun 2021
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification. Neural Information Processing Systems (NeurIPS), 2021 Yongming Rao Wenliang Zhao Benlin Liu Jiwen Lu Jie Zhou Cho-Jui Hsieh ViT 397 900 0 03 Jun 2021
RED : Looking for Redundancies for Data-Free Structured Compression of Deep Neural Networks. Neural Information Processing Systems (NeurIPS), 2021 Edouard Yvinec Arnaud Dapogny Matthieu Cord Kévin Bailly CVBM 120 29 0 31 May 2021
NAAS: Neural Accelerator Architecture Search. Design Automation Conference (DAC), 2021 Chengyue Wu Mengtian Yang Song Han 128 70 0 27 May 2021
Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale. IEEE Micro, 2021 Zhaoxia Deng Jongsoo Park P. T. P. Tang Haixin Liu ... S. Nadathur Changkyu Kim Maxim Naumov S. Naghshineh M. Smelyanskiy 139 12 0 26 May 2021
DTNN: Energy-efficient Inference with Dendrite Tree Inspired Neural Networks for Edge Vision Applications Yaoyu Zhang Wai Teng Tang Matthew Kay Fei Lee Chuping Qu Weng-Fai Wong Rick Siow Mong Goh 126 0 0 25 May 2021
BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer. Neural Information Processing Systems (NeurIPS), 2021 Haoping Bai Mengsi Cao Ping Huang Jiulong Shan MQ 172 38 0 19 May 2021
Pareto-Optimal Quantized ResNet Is Mostly 4-bit AmirAli Abdolrashidi Lisa Wang Shivani Agrawal J. Malmaud Oleg Rybakov Chas Leichner Lukasz Lew MQ 137 44 0 07 May 2021
On the Adversarial Robustness of Quantized Neural Networks. ACM Great Lakes Symposium on VLSI (GLSVLSI), 2021 Micah Gorsline James T. Smith Cory E. Merkel AAML 170 23 0 01 May 2021
HAO: Hardware-aware neural Architecture Optimization for Efficient Inference. IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), 2021 Zhen Dong Yizhao Gao Qijing Huang J. Wawrzynek Hayden Kwok-Hay So Kurt Keutzer 151 40 0 26 Apr 2021
Spatio-Temporal Pruning and Quantization for Low-latency Spiking Neural Networks. IEEE International Joint Conference on Neural Network (IJCNN), 2021 Sayeed Shafayet Chowdhury Isha Garg Kaushik Roy 193 46 0 26 Apr 2021
InstantNet: Automated Generation and Deployment of Instantaneously Switchable-Precision Networks. Design Automation Conference (DAC), 2021 Yonggan Fu Zhongzhi Yu Yongan Zhang Lezhi Li Chaojian Li Yongyuan Liang Mingchao Jiang Zinan Lin Yingyan Lin 250 4 0 22 Apr 2021
Differentiable Model Compression via Pseudo Quantization Noise Alexandre Défossez Yossi Adi Gabriel Synnaeve DiffM MQ 178 58 0 20 Apr 2021
Coarse-to-Fine Searching for Efficient Generative Adversarial Networks Jiahao Wang Han Shu Weihao Xia Yujiu Yang Yunhe Wang GAN 123 5 0 19 Apr 2021
TENT: Efficient Quantization of Neural Networks on the tiny Edge with Tapered FixEd PoiNT H. F. Langroudi Vedant Karia Tej Pandit Dhireesha Kudithipudi MQ 116 11 0 06 Apr 2021
LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference. IEEE International Conference on Computer Vision (ICCV), 2021 Ben Graham Alaaeldin El-Nouby Hugo Touvron Pierre Stock Armand Joulin Edouard Grave Matthijs Douze ViT 310 936 0 02 Apr 2021
Network Quantization with Element-wise Gradient Scaling. Computer Vision and Pattern Recognition (CVPR), 2021 Junghyup Lee Jeimin Jeon Bumsub Ham MQ 187 140 0 02 Apr 2021
Training Multi-bit Quantized and Binarized Networks with A Learnable Symmetric Quantizer. IEEE Access, 2021 Phuoc Pham J. Abraham Jaeyong Chung MQ 173 15 0 01 Apr 2021
Bit-Mixer: Mixed-precision networks with runtime bit-width selection. IEEE International Conference on Computer Vision (ICCV), 2021 Adrian Bulat Georgios Tzimiropoulos MQ 146 30 0 31 Mar 2021
RCT: Resource Constrained Training for Edge AI. IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021 Tian Huang Yaoyu Zhang Ming Yan Qiufeng Wang Rick Siow Mong Goh 267 11 0 26 Mar 2021
n-hot: Efficient bit-level sparsity for powers-of-two neural network quantization Yuiko Sakuma Hiroshi Sumihiro Jun Nishikawa Toshiki Nakamura Ryoji Ikegaya MQ 155 3 0 22 Mar 2021
Data-free mixed-precision quantization using novel sensitivity metric. International Conference on Information Photonics (ICIP), 2021 Donghyun Lee M. Cho Seungwon Lee Joonho Song Changkyu Choi MQ 186 2 0 18 Mar 2021
Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices. Pattern Recognition (Pattern Recogn.), 2021 Md Mohaimenuzzaman Christoph Bergmeir I. West B. Meyer 354 51 0 05 Mar 2021
Anycost GANs for Interactive Image Synthesis and Editing. Computer Vision and Pattern Recognition (CVPR), 2021 Ji Lin Richard Y. Zhang F. Ganz Song Han Jun-Yan Zhu 207 93 0 04 Mar 2021
Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search. International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2021 Kartik Hegde Po-An Tsai Sitao Huang Vikas Chandra A. Parashar Christopher W. Fletcher 182 108 0 02 Mar 2021
Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths. IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021 Ximeng Sun Yikang Shen Chun-Fu Chen Naigang Wang Bowen Pan Kailash Gopalakrishnan A. Oliva Rogerio Feris Kate Saenko MQ 264 7 0 02 Mar 2021
FjORD: Fair and Accurate Federated Learning under heterogeneous targets with Ordered Dropout. Neural Information Processing Systems (NeurIPS), 2021 Samuel Horváth Stefanos Laskaridis Mario Almeida Ilias Leondiadis Stylianos I. Venieris Nicholas D. Lane 601 322 0 26 Feb 2021
Ps and Qs: Quantization-aware pruning for efficient low latency neural network inference. Frontiers in Artificial Intelligence (Front. Artif. Intell.), 2021 B. Hawks Javier Mauricio Duarte Nicholas J. Fraser Alessandro Pappalardo N. Tran Yaman Umuroglu MQ 219 65 0 22 Feb 2021
BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization. International Conference on Learning Representations (ICLR), 2021 Huanrui Yang Lin Duan Yiran Chen Hai Helen Li MQ 186 78 0 20 Feb 2021
GradFreeBits: Gradient Free Bit Allocation for Dynamic Low Precision Neural Networks Ben Bodner G. B. Shalom Eran Treister MQ 167 2 0 18 Feb 2021
An Information-Theoretic Justification for Model Pruning. International Conference on Artificial Intelligence and Statistics (AISTATS), 2021 Berivan Isik Tsachy Weissman Albert No 294 39 0 16 Feb 2021
FAT: Learning Low-Bitwidth Parametric Representation via Frequency-Aware Transformation Chaofan Tao Rui Lin Quan Chen Zhaoyang Zhang Ping Luo Ngai Wong MQ 133 8 0 15 Feb 2021
Confounding Tradeoffs for Neural Network Quantization Sahaj Garg Anirudh Jain Joe Lou Mitchell Nahmias MQ 147 20 0 12 Feb 2021
Dynamic Precision Analog Computing for Neural Networks. IEEE Journal of Selected Topics in Quantum Electronics (JSTQE), 2021 Sahaj Garg Joe Lou Anirudh Jain Mitchell Nahmias 213 40 0 12 Feb 2021
BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction. International Conference on Learning Representations (ICLR), 2021 Yuhang Li Yazhe Niu Xu Tan Yang Yang Peng Hu Tao Gui F. Yu Wei Wang Shi Gu MQ 304 545 0 10 Feb 2021
AHAR: Adaptive CNN for Energy-efficient Human Activity Recognition in Low-power Edge Devices. IEEE Internet of Things Journal (IEEE IoT Journal), 2021 Nafiul Rashid B. U. Demirel Mohammad Abdullah Al Faruque 258 94 0 03 Feb 2021
Rethinking Floating Point Overheads for Mixed Precision DNN Accelerators. Conference on Machine Learning and Systems (MLSys), 2021 Hamzah Abdel-Aziz Ali Shafiee J. Shin A. Pedram Joseph Hassoun MQ 170 13 0 27 Jan 2021
Pruning and Quantization for Deep Neural Network Acceleration: A Survey. Neurocomputing, 2021 Tailin Liang C. Glossner Lei Wang Shaobo Shi Xiaotong Zhang MQ 464 842 0 24 Jan 2021
Network Pruning using Adaptive Exemplar Filters. IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021 Mingbao Lin Rongrong Ji Shaojie Li Yan Wang Yongjian Wu Feiyue Huang QiXiang Ye VLM 144 64 0 20 Jan 2021
Multi-Task Network Pruning and Embedded Optimization for Real-time Deployment in ADAS F. Dellinger T. Boulay Diego Mendoza Barrenechea Said El-Hachimi Isabelle Leang Fabian Burger 108 3 0 19 Jan 2021
Single-path Bit Sharing for Automatic Loss-aware Model Compression. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021 Jing Liu Bohan Zhuang Peng Chen Chunhua Shen Jianfei Cai Zhuliang Yu MQ 186 13 0 13 Jan 2021
SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training. IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021 Xiaohan Chen Yang Zhao Yue Wang Pengfei Xu Haoran You Chaojian Li Y. Fu Yingyan Lin Zinan Lin 287 1 0 04 Jan 2021
BinaryBERT: Pushing the Limit of BERT Quantization. Annual Meeting of the Association for Computational Linguistics (ACL), 2020 Haoli Bai Wei Zhang Lu Hou Lifeng Shang Jing Jin Xin Jiang Qun Liu Michael Lyu Irwin King MQ 435 247 0 31 Dec 2020
FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training. Neural Information Processing Systems (NeurIPS), 2020 Y. Fu Haoran You Yang Zhao Yue Wang Chaojian Li K. Gopalakrishnan Zinan Lin Yingyan Lin MQ 253 34 0 24 Dec 2020
Adaptive Precision Training for Resource Constrained Devices. IEEE International Conference on Distributed Computing Systems (ICDCS), 2020 Tian Huang Yaoyu Zhang Qiufeng Wang 164 6 0 23 Dec 2020
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead. IEEE Access, 2020 Maurizio Capra Beatrice Bussolino Alberto Marchisio Guido Masera Maurizio Martina Mohamed Bennai BDL 283 172 0 21 Dec 2020
Machine Learning Systems in the IoT: Trustworthiness Trade-offs for Edge Intelligence. International Conference on Cognitive Machine Intelligence (ICCMI), 2020 Wiebke Toussaint Aaron Yi Ding 260 13 0 01 Dec 2020
Ax-BxP: Approximate Blocked Computation for Precision-Reconfigurable Deep Neural Network Acceleration Reena Elangovan Shubham Jain A. Raghunathan 240 7 0 25 Nov 2020
Bringing AI To Edge: From Deep Learning's Perspective. Neurocomputing, 2020 Di Liu Hao Kong Xiangzhong Luo Weichen Liu Ravi Subramaniam 226 151 0 25 Nov 2020
Larq Compute Engine: Design, Benchmark, and Deploy State-of-the-Art Binarized Neural Networks. Conference on Machine Learning and Systems (MLSys), 2020 T. Bannink Arash Bakhtiari Adam Hillier Lukas Geiger T. D. Bruin Leon Overweel J. Neeven K. Helwegen 3DV MQ 232 44 0 18 Nov 2020