Scalable Methods for 8-bit Training of Neural Networks (arXiv:1805.11046)
Ron Banner, Itay Hubara, Elad Hoffer, Daniel Soudry · MQ · 25 May 2018
Papers citing "Scalable Methods for 8-bit Training of Neural Networks" (50 of 167 papers shown)
Silenzio: Secure Non-Interactive Outsourced MLP Training
Jonas Sander, T. Eisenbarth · 24 Apr 2025

Binarized Mamba-Transformer for Lightweight Quad Bayer HybridEVS Demosaicing
Shiyang Zhou, Haijin Zeng, Yunfan Lu, Tong Shao, Ke Tang, Yongyong Chen, Jie Liu, Jingyong Su · Mamba · 20 Mar 2025

Accurate INT8 Training Through Dynamic Block-Level Fallback
Pengle Zhang, Jia wei, Jintao Zhang, Jun-Jie Zhu, Jianfei Chen · MQ · 13 Mar 2025

GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning
Sifan Zhou, Shuo Wang, Zhihang Yuan, Mingjia Shi, Yuzhang Shang, Dawei Yang · ALM, MQ · 18 Feb 2025

QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models
Jiajun Zhou, Yifan Yang, Kai Zhen, Z. Liu, Yequan Zhao, Ershad Banijamali, Athanasios Mouchtaris, Ngai Wong, Zheng Zhang · MQ · 17 Feb 2025

Membership Inference Risks in Quantized Models: A Theoretical and Empirical Study
Eric Aubinais, Philippe Formont, Pablo Piantanida, Elisabeth Gassiat · 10 Feb 2025

Optimizing Large Language Model Training Using FP4 Quantization
Ruizhe Wang, Yeyun Gong, Xiao Liu, Guoshuai Zhao, Ziyue Yang, Baining Guo, Zhengjun Zha, Peng Cheng · MQ · 28 Jan 2025

Rethinking Post-Training Quantization: Introducing a Statistical Pre-Calibration Approach
Alireza Ghaffari, Sharareh Younesian, Boxing Chen, Vahid Partovi Nia, M. Asgharian · MQ · 17 Jan 2025

Fast and Slow Gradient Approximation for Binary Neural Network Optimization
Xinquan Chen, Junqi Gao, Biqing Qi, Dong Li, Yiang Luo, Fangyuan Li, Pengfei Li · MQ · 16 Dec 2024

Towards Accurate and Efficient Sub-8-Bit Integer Training
Wenjin Guo, Donglai Liu, Weiying Xie, Yunsong Li, Xuefei Ning, Zihan Meng, Shulin Zeng, Jie Lei, Zhenman Fang, Yu Wang · MQ · 17 Nov 2024

Lossless KV Cache Compression to 2%
Zhen Yang, Jizong Han, Kan Wu, Ruobing Xie, An Wang, X. Sun, Zhanhui Kang · VLM, MQ · 20 Oct 2024

Error Diffusion: Post Training Quantization with Block-Scaled Number Formats for Neural Networks
Alireza Khodamoradi, K. Denolf, Eric Dellinger · MQ · 15 Oct 2024

Differentiable Weightless Neural Networks
Alan T. L. Bacellar, Zachary Susskind, Mauricio Breternitz Jr., E. John, L. John, P. Lima, F. M. G. França · 14 Oct 2024

Compressing VAE-Based Out-of-Distribution Detectors for Embedded Deployment
Aditya Bansal, Michael Yuhas, Arvind Easwaran · OODD · 02 Sep 2024

1-Bit FQT: Pushing the Limit of Fully Quantized Training to 1-bit
Chang Gao, J. Chen, Kang Zhao, Jiaqi Wang, Liping Jing · MQ · 26 Aug 2024

Robust Iterative Value Conversion: Deep Reinforcement Learning for Neurochip-driven Edge Robots
Y. Kadokawa, Tomohito Kodera, Yoshihisa Tsurumine, Shinya Nishimura, Takamitsu Matsubara · 23 Aug 2024

Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
Yifei Gao, Jie Ou, Lei Wang, Fanhua Shang, Jaji Wu · MQ · 22 Jul 2024

NITRO-D: Native Integer-only Training of Deep Convolutional Neural Networks
Alberto Pirillo, Luca Colombo, Manuel Roveri · MQ · 16 Jul 2024

VcLLM: Video Codecs are Secretly Tensor Codecs
Ceyu Xu, Yongji Wu, Xinyu Yang, Beidi Chen, Matthew Lentz, Danyang Zhuo, Lisa Wu Wills · 29 Jun 2024

Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other
Yifei Gao, Jie Ou, Lei Wang, Yuting Xiao, Zhiyuan Xiang, Ruiting Dai, Jun Cheng · MQ · 24 Jun 2024

Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation
Yuchen Yang, Yingdong Shi, Cheems Wang, Xiantong Zhen, Yuxuan Shi, Jun Xu · 24 Jun 2024

NaviSplit: Dynamic Multi-Branch Split DNNs for Efficient Distributed Autonomous Navigation
Timothy K Johnsen, Ian Harshbarger, Zixia Xia, Marco Levorato · 18 Jun 2024

Center-Sensitive Kernel Optimization for Efficient On-Device Incremental Learning
Dingwen Zhang, Yan Li, De-Chun Cheng, N. Wang, J. Han · CLL · 13 Jun 2024

LoQT: Low Rank Adapters for Quantized Training
Sebastian Loeschcke, M. Toftrup, M. Kastoryano, Serge J. Belongie, Vésteinn Snæbjarnarson · MQ · 26 May 2024

AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization Method for LLMs
Alireza Ghaffari, Sharareh Younesian, Vahid Partovi Nia, Boxing Chen, M. Asgharian · MQ · 22 May 2024

Acceleration Algorithms in GNNs: A Survey
Lu Ma, Zeang Sheng, Xunkai Li, Xin Gao, Zhezheng Hao, Ling Yang, Wentao Zhang, Bin Cui · GNN · 07 May 2024

Collage: Light-Weight Low-Precision Strategy for LLM Training
Tao Yu, Gaurav Gupta, Karthick Gopalswamy, Amith R. Mamidala, Hao Zhou, Jeffrey Huynh, Youngsuk Park, Ron Diamant, Anoop Deoras, Jun Huan · MQ · 06 May 2024

Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey
Feng Liang, Zhen Zhang, Haifeng Lu, Victor C. M. Leung, Yanyi Guo, Xiping Hu · GNN · 09 Apr 2024

Lightweight Deep Learning for Resource-Constrained Environments: A Survey
Hou-I Liu, Marco Galindo, Hongxia Xie, Lai-Kuan Wong, Hong-Han Shuai, Yung-Hui Li, Wen-Huang Cheng · 08 Apr 2024

Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Haocheng Xi, Yuxiang Chen, Kang Zhao, Kaijun Zheng, Jianfei Chen, Jun Zhu · MQ · 19 Mar 2024

Better Schedules for Low Precision Training of Deep Neural Networks
Cameron R. Wolfe, Anastasios Kyrillidis · 04 Mar 2024

Trainable Fixed-Point Quantization for Deep Learning Acceleration on FPGAs
Dingyi Dai, Yichi Zhang, Jiahao Zhang, Zhanqiu Hu, Yaohui Cai, Qi Sun, Zhiru Zhang · MQ · 31 Jan 2024

Effect of Weight Quantization on Learning Models by Typical Case Analysis
Shuhei Kashiwamura, Ayaka Sakata, Masaaki Imaizumi · MQ · 30 Jan 2024

Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators
Yaniv Blumenfeld, Itay Hubara, Daniel Soudry · 25 Jan 2024

Enabling On-device Continual Learning with Binary Neural Networks
Lorenzo Vorabbi, Davide Maltoni, Guido Borghi, Stefano Santi · MQ · 18 Jan 2024

Knowledge Translation: A New Pathway for Model Compression
Wujie Sun, Defang Chen, Jiawei Chen, Yan Feng, Chun-Yen Chen, Can Wang · 11 Jan 2024

FP8-BERT: Post-Training Quantization for Transformer
Jianwei Li, Tianchi Zhang, Ian En-Hsu Yen, Dongkuan Xu · MQ · 10 Dec 2023

Low-Precision Mixed-Computation Models for Inference on Edge
Seyedarmin Azizi, M. Nazemi, M. Kamal, Massoud Pedram · MQ · 03 Dec 2023

Improving the Robustness of Quantized Deep Neural Networks to White-Box Attacks using Stochastic Quantization and Information-Theoretic Ensemble Training
Saurabh Farkya, Aswin Raghavan, Avi Ziskind · 30 Nov 2023

Mirage: An RNS-Based Photonic Accelerator for DNN Training
Cansu Demirkıran, Guowei Yang, D. Bunandar, Ajay Joshi · 29 Nov 2023

PolyThrottle: Energy-efficient Neural Network Inference on Edge Devices
Minghao Yan, Hongyi Wang, Shivaram Venkataraman · 30 Oct 2023

Efficient Low-rank Backpropagation for Vision Transformer Adaptation
Yuedong Yang, Hung-Yueh Chiang, Guihong Li, Diana Marculescu, R. Marculescu · 26 Sep 2023

EPTQ: Enhanced Post-Training Quantization via Label-Free Hessian
Ofir Gordon, H. Habi, Arnon Netzer · MQ · 20 Sep 2023

On-Device Learning with Binary Neural Networks
Lorenzo Vorabbi, Davide Maltoni, Stefano Santi · MQ · 29 Aug 2023

Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation
Shuang Wang, B. Eravcı, Rustam Guliyev, Hakan Ferhatosmanoglu · GNN, MQ · 29 Aug 2023

ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats
Xiaoxia Wu, Z. Yao, Yuxiong He · MQ · 19 Jul 2023

Self-Distilled Quantization: Achieving High Compression Rates in Transformer-Based Language Models
James O'Neill, Sourav Dutta · VLM, MQ · 12 Jul 2023

QBitOpt: Fast and Accurate Bitwidth Reallocation during Training
Jorn W. T. Peters, Marios Fournarakis, Markus Nagel, M. V. Baalen, Tijmen Blankevoort · MQ · 10 Jul 2023

Adaptive Sharpness-Aware Pruning for Robust Sparse Networks
Anna Bair, Hongxu Yin, Maying Shen, Pavlo Molchanov, J. Álvarez · 25 Jun 2023

Training Transformers with 4-bit Integers
Haocheng Xi, Changhao Li, Jianfei Chen, Jun Zhu · MQ · 21 Jun 2023