2010.04354
Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search
9 October 2020
Mingzhu Shen
Feng Liang
Ruihao Gong
Yuhang Li
Chuming Li
Chen Lin
F. Yu
Junjie Yan
Wanli Ouyang
MQ
Papers citing
"Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search"
8 / 8 papers shown
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Bram Adams
Ahmed E. Hassan
VLM
34
0
0
01 Nov 2024
On the Impact of Black-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Emad Fallahzadeh
Bram Adams
Ahmed E. Hassan
MQ
32
3
0
25 Mar 2024
Vertical Layering of Quantized Neural Networks for Heterogeneous Inference
Hai Wu
Ruifei He
Hao Hao Tan
Xiaojuan Qi
Kaibin Huang
MQ
19
2
0
10 Dec 2022
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Xiuying Wei
Yunchen Zhang
Xiangguo Zhang
Ruihao Gong
Shanghang Zhang
Qi Zhang
F. Yu
Xianglong Liu
MQ
22
145
0
27 Sep 2022
Efficient Adaptive Activation Rounding for Post-Training Quantization
Zhengyi Li
Cong Guo
Zhanda Zhu
Yangjie Zhou
Yuxian Qiu
Xiaotian Gao
Jingwen Leng
Minyi Guo
MQ
25
3
0
25 Aug 2022
OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization
Peng Hu
Xi Peng
Hongyuan Zhu
M. Aly
Jie Lin
MQ
31
59
0
23 May 2022
QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization
Xiuying Wei
Ruihao Gong
Yuhang Li
Xianglong Liu
F. Yu
MQ
VLM
17
165
0
11 Mar 2022
MQBench: Towards Reproducible and Deployable Model Quantization Benchmark
Yuhang Li
Mingzhu Shen
Jian Ma
Yan Ren
Mingxin Zhao
Qi Zhang
Ruihao Gong
F. Yu
Junjie Yan
MQ
35
49
0
05 Nov 2021