1806.08342
Cited By
Quantizing deep convolutional networks for efficient inference: A whitepaper
21 June 2018
Raghuraman Krishnamoorthi
MQ
Papers citing
"Quantizing deep convolutional networks for efficient inference: A whitepaper"
50 / 464 papers shown
Title
Quantized Approximate Signal Processing (QASP): Towards Homomorphic Encryption for audio
Tu Duyen Nguyen
Adrien Lesage
Clotilde Cantini
Rachid Riad
21
0
0
15 May 2025
Self-Supervised Event Representations: Towards Accurate, Real-Time Perception on SoC FPGAs
K. Jeziorek
T. Kryjak
AI4TS
39
0
0
12 May 2025
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Sanjay Surendranath Girija
Shashank Kapoor
Lakshit Arora
Dipen Pradhan
Aman Raj
Ankit Shetgaonkar
54
0
0
05 May 2025
StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models
Yeona Hong
Hyewon Han
Woo-Jin Chung
Hong-Goo Kang
MQ
28
0
0
21 Apr 2025
Cycle Training with Semi-Supervised Domain Adaptation: Bridging Accuracy and Efficiency for Real-Time Mobile Scene Detection
Huu-Phong Phan-Nguyen
Anh Dao
T. Nguyen
Tuan Quang
H. Tran
Tinh-Anh Nguyen-Nhu
Huy-Thach Pham
Quan Nguyen
Hoang M. Le
Quang-Vinh Dinh
30
0
0
12 Apr 2025
Efficient FPGA-accelerated Convolutional Neural Networks for Cloud Detection on CubeSats
Angela Cratere
M. Salim Farissi
Andrea Carbone
Marcello Asciolla
Maria Rizzi
Francesco Dell'Olio
Augusto Nascetti
Dario Spiller
28
1
0
04 Apr 2025
APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers
Zhuguanyu Wu
Jiayi Zhang
Jiaxin Chen
Jinyang Guo
Di Huang
Yunhong Wang
MQ
47
1
0
03 Apr 2025
QSViT: A Methodology for Quantizing Spiking Vision Transformers
Rachmad Vidya Wicaksana Putra
Saad Iftikhar
Muhammad Shafique
MQ
44
0
0
01 Apr 2025
Real-Time Navigation for Autonomous Aerial Vehicles Using Video
Khizar Anjum
Parul Pandey
Vidyasagar Sadhu
Roberto Tron
D. Pompili
41
0
0
01 Apr 2025
A 71.2-μW Speech Recognition Accelerator with Recurrent Spiking Neural Network
Chih-Chyau Yang
Tian-Sheuan Chang
60
1
0
27 Mar 2025
Efficient Personalization of Quantized Diffusion Model without Backpropagation
H. Seo
Wongi Jeong
Kyungryeol Lee
Se Young Chun
DiffM
MQ
78
0
0
19 Mar 2025
FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers
Ruichen Chen
Keith G. Mills
Di Niu
MQ
54
0
0
19 Mar 2025
Knowledge Distillation: Enhancing Neural Network Compression with Integrated Gradients
David E. Hernandez
J. Chang
Torbjörn E. M. Nordling
58
0
0
17 Mar 2025
Real-Time Multi-Object Tracking using YOLOv8 and SORT on a SoC FPGA
Michal Danilowicz
T. Kryjak
VOT
58
0
0
17 Mar 2025
ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba
Juncan Deng
Shuaiting Li
Zeyu Wang
Kedong Xu
Hong Gu
Kejie Huang
MQ
60
0
0
12 Mar 2025
Accelerating Diffusion Sampling via Exploiting Local Transition Coherence
Shangwen Zhu
Han Zhang
Zhantao Yang
Qianyu Peng
Zhao Pu
H. Wang
Fan Cheng
DiffM
48
0
0
12 Mar 2025
AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model
Wenlun Zhang
Shimpei Ando
Kentaro Yoshioka
VLM
MQ
67
0
0
05 Mar 2025
Exploring Model Quantization in GenAI-based Image Inpainting and Detection of Arable Plants
Sourav Modak
Ahmet Oğuz Saltık
Anthony Stein
MQ
48
0
0
04 Mar 2025
SpinQuant: LLM quantization with learned rotations
Zechun Liu
Changsheng Zhao
Igor Fedorov
Bilge Soran
Dhruv Choudhary
Raghuraman Krishnamoorthi
Vikas Chandra
Yuandong Tian
Tijmen Blankevoort
MQ
127
84
0
21 Feb 2025
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization
Zechun Liu
Changsheng Zhao
Hanxian Huang
Sijia Chen
Jing Zhang
...
Yuandong Tian
Bilge Soran
Raghuraman Krishnamoorthi
Tijmen Blankevoort
Vikas Chandra
MQ
73
3
0
04 Feb 2025
Implicit Bias in Matrix Factorization and its Explicit Realization in a New Architecture
Yikun Hou
Suvrit Sra
A. Yurtsever
29
0
0
28 Jan 2025
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Yunshan Zhong
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Fei Chao
Zhanpeng Zeng
Rongrong Ji
MQ
94
0
0
31 Dec 2024
Improving Quantization-aware Training of Low-Precision Network via Block Replacement on Full-Precision Counterpart
Chengting Yu
Shu Yang
Fengzhao Zhang
Hanzhi Ma
Aili Wang
Er-ping Li
MQ
77
2
0
20 Dec 2024
PTSBench: A Comprehensive Post-Training Sparsity Benchmark Towards Algorithms and Models
Zining Wang
J. Guo
Ruihao Gong
Yang Yong
Aishan Liu
Yushi Huang
Jiaheng Liu
X. Liu
71
1
0
10 Dec 2024
CPTQuant -- A Novel Mixed Precision Post-Training Quantization Techniques for Large Language Models
Amitash Nanda
Sree Bhargavi Balija
D. Sahoo
MQ
59
0
0
03 Dec 2024
Behavior Backdoor for Deep Learning Models
J. T. Wang
Pengfei Zhang
R. Tao
Jian Yang
Hao Liu
X. Liu
Y. X. Wei
Yao Zhao
AAML
75
0
0
02 Dec 2024
Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control
Seongmin Park
Hyungmin Kim
Wonseok Jeon
Juyoung Yang
Byeongwook Jeon
Yoonseon Oh
Jungwook Choi
93
1
0
02 Dec 2024
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
Xu Ouyang
Tao Ge
Thomas Hartvigsen
Zhisong Zhang
Haitao Mi
Dong Yu
MQ
90
3
0
26 Nov 2024
Efficient Ternary Weight Embedding Model: Bridging Scalability and Performance
Jiayi Chen
Chen Wu
S. Zhang
Nan Li
L. Zhang
Qi Zhang
69
0
0
23 Nov 2024
Llama Guard 3-1B-INT4: Compact and Efficient Safeguard for Human-AI Conversations
Igor Fedorov
Kate Plawiak
Lemeng Wu
Tarek Elgamal
Naveen Suda
...
Bilge Soran
Zacharie Delpierre Coudert
Rachad Alao
Raghuraman Krishnamoorthi
Vikas Chandra
72
4
0
18 Nov 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
148
0
0
29 Oct 2024
Channel-Wise Mixed-Precision Quantization for Large Language Models
Zihan Chen
Bike Xie
Jundong Li
Cong Shen
MQ
27
2
0
16 Oct 2024
PrivQuant: Communication-Efficient Private Inference with Quantized Network/Protocol Co-Optimization
Tianshi Xu
Shuzhang Zhong
Wenxuan Zeng
Runsheng Wang
Meng Li
MQ
29
0
0
12 Oct 2024
Continuous Approximations for Improving Quantization Aware Training of LLMs
He Li
Jianhang Hong
Yuanzhuo Wu
Snehal Adbol
Zonglin Li
MQ
21
1
0
06 Oct 2024
Constraint Guided Model Quantization of Neural Networks
Quinten Van Baelen
P. Karsmakers
MQ
26
0
0
30 Sep 2024
Efficient Federated Intrusion Detection in 5G ecosystem using optimized BERT-based model
Frederic Adjewa
Moez Esseghir
Leila Merghem-Boulahia
35
5
0
28 Sep 2024
SPAQ-DL-SLAM: Towards Optimizing Deep Learning-based SLAM for Resource-Constrained Embedded Platforms
Niraj Pudasaini
Muhammad Abdullah Hanif
Muhammad Shafique
26
0
0
22 Sep 2024
Thinking in Granularity: Dynamic Quantization for Image Super-Resolution by Intriguing Multi-Granularity Clues
Mingshen Wang
Zhao Zhang
Feng Li
Ke Xu
Kang Miao
Meng Wang
MQ
SupR
38
1
0
22 Sep 2024
Sampling Latent Material-Property Information From LLM-Derived Embedding Representations
Luke P J Gilligan
M. Cobelli
Hasan M. Sayeed
Taylor D. Sparks
Stefano Sanvito
19
0
0
18 Sep 2024
Privacy-Preserving SAM Quantization for Efficient Edge Intelligence in Healthcare
Zhikai Li
Jing Zhang
Qingyi Gu
MedIm
36
1
0
14 Sep 2024
Computer Vision Model Compression Techniques for Embedded Systems: A Survey
Alexandre Lopes
Fernando Pereira dos Santos
D. Oliveira
Mauricio Schiezaro
Hélio Pedrini
28
5
0
15 Aug 2024
DopQ-ViT: Towards Distribution-Friendly and Outlier-Aware Post-Training Quantization for Vision Transformers
Lianwei Yang
Haisong Gong
Qingyi Gu
MQ
32
3
0
06 Aug 2024
Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization
Róisín Luo
Alexandru Drimbarean
Walsh Simon
Colm O'Riordan
MQ
31
0
0
01 Aug 2024
Pixel Embedding: Fully Quantized Convolutional Neural Network with Differentiable Lookup Table
Hiroyuki Tokunaga
Joel Nicholls
Daria Vazhenina
Atsunori Kanemura
MQ
16
1
0
23 Jul 2024
Comprehensive Study on Performance Evaluation and Optimization of Model Compression: Bridging Traditional Deep Learning and Large Language Models
Aayush Saxena
Arit Kumar Bishwas
Ayush Ashok Mishra
Ryan Armstrong
19
1
0
22 Jul 2024
Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
Yifei Gao
Jie Ou
Lei Wang
Fanhua Shang
Jaji Wu
MQ
45
0
0
22 Jul 2024
AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
Zhuguanyu Wu
Jiaxin Chen
Hanwen Zhong
Di Huang
Yun Wang
MQ
40
9
0
17 Jul 2024
NITRO-D: Native Integer-only Training of Deep Convolutional Neural Networks
Alberto Pirillo
Luca Colombo
Manuel Roveri
MQ
29
0
0
16 Jul 2024
On-Device Training of Fully Quantized Deep Neural Networks on Cortex-M Microcontrollers
M. Deutel
Frank Hannig
Christopher Mutschler
Jürgen Teich
MQ
25
0
0
15 Jul 2024
LeanQuant: Accurate Large Language Model Quantization with Loss-Error-Aware Grid
Tianyi Zhang
Anshumali Shrivastava
MQ
31
4
0
14 Jul 2024