2205.11141
Cited By
OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization
23 May 2022
Peng Hu
Xi Peng
Hongyuan Zhu
M. Aly
Jie Lin
MQ
Papers citing "OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization"
17 / 17 papers shown
Title
Automatic Joint Structured Pruning and Quantization for Efficient Neural Network Training and Compression
Xiaoyi Qu
David Aponte
Colby R. Banbury
Daniel P. Robinson
Tianyu Ding
K. Koishida
Ilya Zharkov
Tianyi Chen
MQ
57
0
0
23 Feb 2025
Progressive Binarization with Semi-Structured Pruning for LLMs
X. Yan
Tianao Zhang
Zhiteng Li
Yulun Zhang
MQ
54
0
0
03 Feb 2025
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Bram Adams
Ahmed E. Hassan
VLM
34
0
0
01 Nov 2024
Channel-Wise Mixed-Precision Quantization for Large Language Models
Zihan Chen
Bike Xie
Jundong Li
Cong Shen
MQ
22
2
0
16 Oct 2024
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Peijie Dong
Lujun Li
Dayou Du
Yuhan Chen
Zhenheng Tang
...
Wei Xue
Wenhan Luo
Qi-fei Liu
Yi-Ting Guo
Xiaowen Chu
MQ
43
4
0
03 Aug 2024
The Impact of Quantization and Pruning on Deep Reinforcement Learning Models
Heng Lu
Mehdi Alemi
Reza Rawassizadeh
29
1
0
05 Jul 2024
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Burcu Harma
Ayan Chakraborty
Elizaveta Kostenok
Danila Mishin
Dongho Ha
...
Martin Jaggi
Ming Liu
Yunho Oh
Suvinay Subramanian
Amir Yazdanbakhsh
MQ
29
4
0
31 May 2024
Differentiable Search for Finding Optimal Quantization Strategy
Lianqiang Li
Chenqian Yan
Yefei Chen
MQ
19
2
0
10 Apr 2024
Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization
K. Balaskas
Andreas Karatzas
Christos Sad
K. Siozios
Iraklis Anagnostopoulos
Georgios Zervakis
Jörg Henkel
MQ
17
10
0
23 Dec 2023
Pruning vs Quantization: Which is Better?
Andrey Kuzmin
Markus Nagel
M. V. Baalen
Arash Behboodi
Tijmen Blankevoort
MQ
17
47
0
06 Jul 2023
Group channel pruning and spatial attention distilling for object detection
Yun Chu
Pu Li
Yong Bai
Zhuhua Hu
Yongqing Chen
Jiafeng Lu
VLM
22
13
0
02 Jun 2023
Towards Optimal Compression: Joint Pruning and Quantization
Ben Zandonati
Glenn Bucagu
Adrian Alan Pol
M. Pierini
Olya Sirkin
Tal Kopetz
MQ
13
2
0
15 Feb 2023
Mixed-Precision Neural Networks: A Survey
M. Rakka
M. Fouda
Pramod P. Khargonekar
Fadi J. Kurdahi
MQ
16
11
0
11 Aug 2022
Recursive Least Squares for Training and Pruning Convolutional Neural Networks
Tianzong Yu
Chunyuan Zhang
Yuan Wang
Meng-tao Ma
Qingwei Song
20
1
0
13 Jan 2022
Single-path Bit Sharing for Automatic Loss-aware Model Compression
Jing Liu
Bohan Zhuang
Peng Chen
Chunhua Shen
Jianfei Cai
Mingkui Tan
MQ
11
7
0
13 Jan 2021
Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search
Mingzhu Shen
Feng Liang
Ruihao Gong
Yuhang Li
Chuming Li
Chen Lin
F. Yu
Junjie Yan
Wanli Ouyang
MQ
13
36
0
09 Oct 2020
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
948
20,471
0
17 Apr 2017