Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1712.05877
Cited By
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
15 December 2017
Benoit Jacob
S. Kligys
Bo Chen
Menglong Zhu
Matthew Tang
Andrew G. Howard
Hartwig Adam
Dmitry Kalenichenko
MQ
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference"
50 / 1,255 papers shown
Title
Efficient Mixed Precision Quantization in Graph Neural Networks
Samir Moustafa
Nils M. Kriege
Wilfried Gansterer
GNN
MQ
35
0
0
14 May 2025
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference
Tollef Emil Jørgensen
MQ
49
0
0
13 May 2025
Private LoRA Fine-tuning of Open-Source LLMs with Homomorphic Encryption
Jordan Fréry
Roman Bredehoft
Jakub Klemsa
Arthur Meyre
Andrei Stoian
21
0
0
12 May 2025
Sigma-Delta Neural Network Conversion on Loihi 2
Matthew Brehove
Sadia Anjum Tumpa
Espoir Kyubwa
Naresh Menon
Vijaykrishnan Narayanan
16
0
0
09 May 2025
Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model
Navin Ranjan
Andreas E. Savakis
MQ
VLM
63
0
0
08 May 2025
Radio: Rate-Distortion Optimization for Large Language Model Compression
Sean I. Young
MQ
21
0
0
05 May 2025
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Sanjay Surendranath Girija
Shashank Kapoor
Lakshit Arora
Dipen Pradhan
Aman Raj
Ankit Shetgaonkar
52
0
0
05 May 2025
Enhancing AI Face Realism: Cost-Efficient Quality Improvement in Distilled Diffusion Models with a Fully Synthetic Dataset
Jakub Wąsala
Bartłomiej Wrzalski
Kornelia Noculak
Yuliia Tarasenko
Oliwer Krupa
Jan Kocoń
Grzegorz Chodak
31
0
0
04 May 2025
RWKVQuant: Quantizing the RWKV Family with Proxy Guided Hybrid of Scalar and Vector Quantization
Chen Xu
Yuxuan Yue
Zukang Xu
Xing Hu
Jiangyong Yu
Zhixuan Chen
Sifan Zhou
Zhihang Yuan
Dawei Yang
MQ
24
0
0
02 May 2025
Pack-PTQ: Advancing Post-training Quantization of Neural Networks by Pack-wise Reconstruction
Changjun Li
Runqing Jiang
Zhuo Song
Pengpeng Yu
Ye Zhang
Yulan Guo
MQ
56
0
0
01 May 2025
FineScope : Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation
Chaitali Bhattacharyya
Yeseong Kim
45
0
0
01 May 2025
DNAD: Differentiable Neural Architecture Distillation
Xuan Rao
Bo Zhao
Derong Liu
34
1
0
25 Apr 2025
Silenzio: Secure Non-Interactive Outsourced MLP Training
Jonas Sander
T. Eisenbarth
28
0
0
24 Apr 2025
AlphaGrad: Non-Linear Gradient Normalization Optimizer
Soham Sane
ODL
53
0
0
22 Apr 2025
FGMP: Fine-Grained Mixed-Precision Weight and Activation Quantization for Hardware-Accelerated LLM Inference
Coleman Hooper
Charbel Sakr
Ben Keller
Rangharajan Venkatesan
Kurt Keutzer
S.
Brucek Khailany
MQ
42
0
0
19 Apr 2025
Collaborative Learning of On-Device Small Model and Cloud-Based Large Model: Advances and Future Directions
Chaoyue Niu
Yucheng Ding
Junhui Lu
Zhengxiang Huang
Hang Zeng
Yutong Dai
Xuezhen Tu
Chengfei Lv
Fan Wu
Guihai Chen
27
1
0
17 Apr 2025
Tin-Tin: Towards Tiny Learning on Tiny Devices with Integer-based Neural Network Training
Yi Hu
Jinhang Zuo
Eddie Zhang
Bob Iannucci
Carlee Joe-Wong
24
0
0
13 Apr 2025
Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
Yamato Arai
Yuma Ichikawa
MQ
29
0
0
13 Apr 2025
Low-Bit Integerization of Vision Transformers using Operand Reodering for Efficient Hardware
Ching-Yi Lin
Sahil Shah
MQ
64
0
0
11 Apr 2025
Pychop: Emulating Low-Precision Arithmetic in Numerical Methods and Neural Networks
Erin Carson
Xinye Chen
49
0
0
10 Apr 2025
PoGO: A Scalable Proof of Useful Work via Quantized Gradient Descent and Merkle Proofs
José I. Orlicki
24
0
0
10 Apr 2025
Generative Artificial Intelligence for Internet of Things Computing: A Systematic Survey
Fabrizio Mangione
Claudio Savaglio
Giancarlo Fortino
22
0
0
10 Apr 2025
Efficient Deployment of Spiking Neural Networks on SpiNNaker2 for DVS Gesture Recognition Using Neuromorphic Intermediate Representation
Sirine Arfa
Bernhard Vogginger
Chen Liu
Johannes Partzsch
Mark Schöne
Christian Mayr
24
0
0
09 Apr 2025
MetaCLBench: Meta Continual Learning Benchmark on Resource-Constrained Edge Devices
Sijia Li
Young D. Kwon
Lik-Hang Lee
Pan Hui
34
0
0
31 Mar 2025
Model Hemorrhage and the Robustness Limits of Large Language Models
Ziyang Ma
Z. Li
L. Zhang
Gui-Song Xia
Bo Du
Liangpei Zhang
Dacheng Tao
54
0
0
31 Mar 2025
Gemma 3 Technical Report
Gemma Team
Aishwarya B Kamath
Johan Ferret
Shreya Pathak
Nino Vieillard
...
Harshal Tushar Lehri
Hussein Hazimeh
Ian Ballantyne
Idan Szpektor
Ivan Nardini
VLM
85
30
0
25 Mar 2025
PRIOT: Pruning-Based Integer-Only Transfer Learning for Embedded Systems
Honoka Anada
Sefutsu Ryu
Masayuki Usui
Tatsuya Kaneko
Shinya Takamaeda-Yamazaki
49
1
0
21 Mar 2025
QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge
Xuan Shen
Weize Ma
Jing Liu
Changdi Yang
Rui Ding
...
Wei Niu
Yanzhi Wang
Pu Zhao
Jun Lin
Jiuxiang Gu
MQ
57
0
0
20 Mar 2025
FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers
Ruichen Chen
Keith G. Mills
Di Niu
MQ
52
0
0
19 Mar 2025
Real-Time Multi-Object Tracking using YOLOv8 and SORT on a SoC FPGA
Michal Danilowicz
T. Kryjak
VOT
56
0
0
17 Mar 2025
Bridging Language Models and Financial Analysis
Alejandro Lopez-Lira
Jihoon Kwon
Sangwoon Yoon
Jy-yong Sohn
Chanyeol Choi
AIFin
39
0
0
14 Mar 2025
Accurate INT8 Training Through Dynamic Block-Level Fallback
Pengle Zhang
Jia wei
Jintao Zhang
Jun-Jie Zhu
Jianfei Chen
MQ
74
3
0
13 Mar 2025
Helios 2.0: A Robust, Ultra-Low Power Gesture Recognition System Optimised for Event-Sensor based Wearables
Prarthana Bhattacharyya
Joshua Mitton
Ryan Page
Owen Morgan
Oliver Powell
...
Kemi Jacobs
Paolo Baesso
Taru Muhonen
R. Vigars
Louis Berridge
43
0
0
10 Mar 2025
Breaking the Limits of Quantization-Aware Defenses: QADT-R for Robustness Against Patch-Based Adversarial Attacks in QNNs
Amira Guesmi
B. Ouni
Muhammad Shafique
MQ
AAML
36
0
0
10 Mar 2025
Hardware-Accelerated Event-Graph Neural Networks for Low-Latency Time-Series Classification on SoC FPGA
Hiroshi Nakano
Krzysztof Blachut
K. Jeziorek
Piotr Wzorek
Manon Dampfhoffer
Thomas Mesquida
Hiroaki Nishi
T. Kryjak
Thomas Dalgaty
GNN
60
0
0
09 Mar 2025
MoFE: Mixture of Frozen Experts Architecture
Jean Seo
Jaeyoon Kim
Hyopil Shin
MoE
155
0
0
09 Mar 2025
SAQ-SAM: Semantically-Aligned Quantization for Segment Anything Model
Jing Zhang
Z. Li
Qingyi Gu
MQ
VLM
51
0
0
09 Mar 2025
MergeQuant: Accurate 4-bit Static Quantization of Large Language Models by Channel-wise Calibration
Jinguang Wang
J. Wang
Haifeng Sun
Tingting Yang
Zirui Zhuang
Wanyi Ning
Yuexi Yin
Q. Qi
Jianxin Liao
MQ
MoMe
44
0
0
07 Mar 2025
QArtSR: Quantization via Reverse-Module and Timestep-Retraining in One-Step Diffusion based Image Super-Resolution
Libo Zhu
Haotong Qin
Kaicheng Yang
W. J. Li
Yong Guo
Yulun Zhang
Susanto Rahardja
Xiaokang Yang
MQ
DiffM
64
0
0
07 Mar 2025
Security and Real-time FPGA integration for Learned Image Compression
Alaa Mazouz
Carl De Sousa Tria
Sumanta Chaudhuri
A. Fiandrotti
Marco Cagnanzzo
Mihai P. Mitrea
Enzo Tartaglione
43
1
0
06 Mar 2025
LVLM-Compress-Bench: Benchmarking the Broader Impact of Large Vision-Language Model Compression
Souvik Kundu
Anahita Bhiwandiwalla
Sungduk Yu
Phillip Howard
Tiep Le
S. N. Sridhar
David Cobbley
Hao Kang
Vasudev Lal
MQ
54
1
0
06 Mar 2025
One Model to Train them All: Hierarchical Self-Distillation for Enhanced Early Layer Embeddings
Andrea Gurioli
Federico Pennino
João Monteiro
Maurizio Gabbrielli
46
0
0
04 Mar 2025
Lossy Neural Compression for Geospatial Analytics: A Review
Carlos Gomes
Isabelle Wittmann
Damien Robert
Johannes Jakubik
Tim Reichelt
...
Romeo Kienzler
Rania Briq
Sabrina Benassou
Michele Lazzarini
C. Albrecht
90
2
0
03 Mar 2025
Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
Tiansheng Wen
Yifei Wang
Zequn Zeng
Zhong Peng
Yudi Su
Xinyang Liu
Bo Chen
Hongwei Liu
Stefanie Jegelka
Chenyu You
CLL
66
2
0
03 Mar 2025
Optimal Brain Apoptosis
Mingyuan Sun
Zheng Fang
Jiaxu Wang
Junjie Jiang
Delei Kong
Chenming Hu
Yuetong Fang
Renjing Xu
AAML
66
0
0
25 Feb 2025
A Survey of Zero-Knowledge Proof Based Verifiable Machine Learning
Zhizhi Peng
Taotao Wang
Chonghe Zhao
Guofu Liao
Zibin Lin
Y. Liu
Bin Cao
Long Shi
Qing Yang
Shengli Zhang
59
2
0
25 Feb 2025
Optimizing DNN Inference on Multi-Accelerator SoCs at Training-time
Matteo Risso
Alessio Burrello
Daniele Jahier Pagliari
41
0
0
24 Feb 2025
More for Keys, Less for Values: Adaptive KV Cache Quantization
Mohsen Hariri
Lam Nguyen
Sixu Chen
Shaochen Zhong
Qifan Wang
Xia Hu
Xiaotian Han
V. Chaudhary
MQ
38
0
0
24 Feb 2025
Verification of Bit-Flip Attacks against Quantized Neural Networks
Yedi Zhang
Lei Huang
Pengfei Gao
Fu Song
Jun Sun
Jin Song Dong
AAML
47
0
0
22 Feb 2025
GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning
Sifan Zhou
Shuo Wang
Zhihang Yuan
Mingjia Shi
Yuzhang Shang
Dawei Yang
ALM
MQ
85
0
0
18 Feb 2025
1
2
3
4
...
24
25
26
Next