2006.08509
APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
15 June 2020
Tianzhe Wang
Kuan-Chieh Jackson Wang
Han Cai
Ji Lin
Zhijian Liu
Song Han
MQ
Papers citing
"APQ: Joint Search for Network Architecture, Pruning and Quantization Policy"
42 / 42 papers shown
Title
BF-IMNA: A Bit Fluid In-Memory Neural Architecture for Neural Network Acceleration
M. Rakka
Rachid Karami
A. Eltawil
M. Fouda
Fadi J. Kurdahi
MQ
37
1
0
03 Nov 2024
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Burcu Harma
Ayan Chakraborty
Elizaveta Kostenok
Danila Mishin
Dongho Ha
...
Martin Jaggi
Ming Liu
Yunho Oh
Suvinay Subramanian
Amir Yazdanbakhsh
MQ
37
5
0
31 May 2024
Exploring Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators
Jan Klhufek
Miroslav Safar
Vojtěch Mrázek
Z. Vašíček
Lukáš Sekanina
MQ
32
1
0
08 Apr 2024
MPTQ-ViT: Mixed-Precision Post-Training Quantization for Vision Transformer
Y. Tai
An-Yeu Wu
MQ
26
6
0
26 Jan 2024
LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization
Qianhui Liu
Jiaqi Yan
Malu Zhang
Gang Pan
Haizhou Li
32
4
0
26 Jan 2024
InstaTune: Instantaneous Neural Architecture Search During Fine-Tuning
S. N. Sridhar
Souvik Kundu
Sairam Sundaresan
Maciej Szankin
Anthony Sarah
17
3
0
29 Aug 2023
FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search
Jordan Dotzel
Gang Wu
Andrew Li
M. Umar
Yun Ni
...
Liqun Cheng
Martin G. Dixon
N. Jouppi
Quoc V. Le
Sheng R. Li
MQ
25
3
0
07 Aug 2023
Data Aware Neural Architecture Search
Emil Njor
J. Madsen
Xenofon Fafoutis
16
7
0
04 Apr 2023
Pruning Compact ConvNets for Efficient Inference
Sayan Ghosh
Karthik Prasad
Xiaoliang Dai
Peizhao Zhang
Bichen Wu
Graham Cormode
Peter Vajda
VLM
19
4
0
11 Jan 2023
Towards Hardware-Specific Automatic Compression of Neural Networks
Torben Krieger
Bernhard Klein
Holger Fröning
MQ
19
2
0
15 Dec 2022
Vertical Layering of Quantized Neural Networks for Heterogeneous Inference
Hai Wu
Ruifei He
Hao Hao Tan
Xiaojuan Qi
Kaibin Huang
MQ
19
2
0
10 Dec 2022
Efficient Quantized Sparse Matrix Operations on Tensor Cores
Shigang Li
Kazuki Osawa
Torsten Hoefler
72
31
0
14 Sep 2022
SONAR: Joint Architecture and System Optimization Search
Elias Jääsaari
Michelle Ma
Ameet Talwalkar
Tianqi Chen
36
1
0
25 Aug 2022
DeepPicarMicro: Applying TinyML to Autonomous Cyber Physical Systems
Michael Bechtel
QiTao Weng
H. Yun
BDL
21
7
0
23 Aug 2022
Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey
Dalin Zhang
Kaixuan Chen
Yan Zhao
B. Yang
Li-Ping Yao
Christian S. Jensen
43
3
0
22 Aug 2022
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
18
11
0
11 Aug 2022
Design of High-Throughput Mixed-Precision CNN Accelerators on FPGA
Cecilia Latotzke
Tim Ciesielski
T. Gemmeke
MQ
13
7
0
09 Aug 2022
Learnable Mixed-precision and Dimension Reduction Co-design for Low-storage Activation
Yu-Shan Tai
Cheng-Yang Chang
Chieh-Fang Teng
An-Yeu Wu
25
5
0
16 Jul 2022
OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization
Peng Hu
Xi Peng
Hongyuan Zhu
M. Aly
Jie Lin
MQ
39
59
0
23 May 2022
PVNAS: 3D Neural Architecture Search with Point-Voxel Convolution
Zhijian Liu
Haotian Tang
Shengyu Zhao
Kevin Shao
Song Han
3DPC
13
39
0
25 Apr 2022
A Semi-Decoupled Approach to Fast and Optimal Hardware-Software Co-Design of Neural Accelerators
Bingqian Lu
Zheyu Yan
Yiyu Shi
Shaolei Ren
21
2
0
25 Mar 2022
QOC: Quantum On-Chip Training with Parameter Shift and Gradient Pruning
Hanrui Wang
Zi-Chen Li
Jiaqi Gu
Yongshan Ding
D. Pan
Song Han
32
52
0
26 Feb 2022
UDC: Unified DNAS for Compressible TinyML Models
Igor Fedorov
Ramon Matas
Hokchhay Tann
Chu Zhou
Matthew Mattina
P. Whatmough
AI4CE
21
13
0
15 Jan 2022
Automated Deep Learning: Neural Architecture Search Is Not the End
Xuanyi Dong
D. Kedziora
Katarzyna Musial
Bogdan Gabrys
25
26
0
16 Dec 2021
Sharpness-aware Quantization for Deep Neural Networks
Jing Liu
Jianfei Cai
Bohan Zhuang
MQ
27
24
0
24 Nov 2021
An Approach for Combining Multimodal Fusion and Neural Architecture Search Applied to Knowledge Tracing
Xinyi Ding
Tao Han
Yili Fang
Eric C. Larson
31
6
0
08 Nov 2021
When to Prune? A Policy towards Early Structural Pruning
Maying Shen
Pavlo Molchanov
Hongxu Yin
J. Álvarez
VLM
20
52
0
22 Oct 2021
S-Cyc: A Learning Rate Schedule for Iterative Pruning of ReLU-based Networks
Shiyu Liu
Chong Min John Tan
Mehul Motani
CLL
21
4
0
17 Oct 2021
Joint Channel and Weight Pruning for Model Acceleration on Moblie Devices
Tianli Zhao
Xi Sheryl Zhang
Wentao Zhu
Jiaxing Wang
Sen Yang
Ji Liu
Jian Cheng
43
2
0
15 Oct 2021
DHA: End-to-End Joint Optimization of Data Augmentation Policy, Hyper-parameter and Architecture
Kaichen Zhou
Lanqing Hong
Shuailiang Hu
Fengwei Zhou
Binxin Ru
Jiashi Feng
Zhenguo Li
54
10
0
13 Sep 2021
QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits
Hanrui Wang
Yongshan Ding
Jiaqi Gu
Zirui Li
Yujun Lin
D. Pan
Frederic T. Chong
Song Han
14
170
0
22 Jul 2021
Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators
Yonggan Fu
Yongan Zhang
Yang Zhang
David D. Cox
Yingyan Lin
MQ
50
17
0
11 Jun 2021
unzipFPGA: Enhancing FPGA-based CNN Engines with On-the-Fly Weights Generation
Stylianos I. Venieris
Javier Fernandez-Marques
Nicholas D. Lane
16
11
0
09 Mar 2021
A Comprehensive Survey on Hardware-Aware Neural Architecture Search
Hadjer Benmeziane
K. E. Maghraoui
Hamza Ouarnoughi
Smail Niar
Martin Wistuba
Naigang Wang
26
95
0
22 Jan 2021
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead
Maurizio Capra
Beatrice Bussolino
Alberto Marchisio
Guido Masera
Maurizio Martina
Muhammad Shafique
BDL
53
140
0
21 Dec 2020
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
Hanrui Wang
Zhekai Zhang
Song Han
20
373
0
17 Dec 2020
Bringing AI To Edge: From Deep Learning's Perspective
Di Liu
Hao Kong
Xiangzhong Luo
Weichen Liu
Ravi Subramaniam
42
116
0
25 Nov 2020
Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search
Mingzhu Shen
Feng Liang
Ruihao Gong
Yuhang Li
Chuming Li
Chen Lin
F. Yu
Junjie Yan
Wanli Ouyang
MQ
23
36
0
09 Oct 2020
Transform Quantization for CNN (Convolutional Neural Network) Compression
Sean I. Young
Wang Zhe
David S. Taubman
B. Girod
MQ
27
69
0
02 Sep 2020
Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
Haotian Tang
Zhijian Liu
Shengyu Zhao
Yujun Lin
Ji Lin
Hanrui Wang
Song Han
3DPC
31
630
0
31 Jul 2020
NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications
Tien-Ju Yang
Andrew G. Howard
Bo Chen
Xiao Zhang
Alec Go
Mark Sandler
Vivienne Sze
Hartwig Adam
90
515
0
09 Apr 2018
Neural Architecture Search with Reinforcement Learning
Barret Zoph
Quoc V. Le
264
5,326
0
05 Nov 2016