ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

Scalable Methods for 8-bit Training of Neural Networks (arXiv:1805.11046)
Ron Banner, Itay Hubara, Elad Hoffer, Daniel Soudry
Tags: MQ
25 May 2018

Papers citing "Scalable Methods for 8-bit Training of Neural Networks"

50 / 167 papers shown
Silenzio: Secure Non-Interactive Outsourced MLP Training
Jonas Sander, T. Eisenbarth
24 Apr 2025

Binarized Mamba-Transformer for Lightweight Quad Bayer HybridEVS Demosaicing
Shiyang Zhou, Haijin Zeng, Yunfan Lu, Tong Shao, Ke Tang, Yongyong Chen, Jie Liu, Jingyong Su
Tags: Mamba
20 Mar 2025

Accurate INT8 Training Through Dynamic Block-Level Fallback
Pengle Zhang, Jia Wei, Jintao Zhang, Jun-Jie Zhu, Jianfei Chen
Tags: MQ
13 Mar 2025

GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning
Sifan Zhou, Shuo Wang, Zhihang Yuan, Mingjia Shi, Yuzhang Shang, Dawei Yang
Tags: ALM, MQ
18 Feb 2025

QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models
Jiajun Zhou, Yifan Yang, Kai Zhen, Z. Liu, Yequan Zhao, Ershad Banijamali, Athanasios Mouchtaris, Ngai Wong, Zheng Zhang
Tags: MQ
17 Feb 2025

Membership Inference Risks in Quantized Models: A Theoretical and Empirical Study
Eric Aubinais, Philippe Formont, Pablo Piantanida, Elisabeth Gassiat
10 Feb 2025

Optimizing Large Language Model Training Using FP4 Quantization
Ruizhe Wang, Yeyun Gong, Xiao Liu, Guoshuai Zhao, Ziyue Yang, Baining Guo, Zhengjun Zha, Peng Cheng
Tags: MQ
28 Jan 2025

Rethinking Post-Training Quantization: Introducing a Statistical Pre-Calibration Approach
Alireza Ghaffari, Sharareh Younesian, Boxing Chen, Vahid Partovi Nia, M. Asgharian
Tags: MQ
17 Jan 2025

Fast and Slow Gradient Approximation for Binary Neural Network Optimization
Xinquan Chen, Junqi Gao, Biqing Qi, Dong Li, Yiang Luo, Fangyuan Li, Pengfei Li
Tags: MQ
16 Dec 2024

Towards Accurate and Efficient Sub-8-Bit Integer Training
Wenjin Guo, Donglai Liu, Weiying Xie, Yunsong Li, Xuefei Ning, Zihan Meng, Shulin Zeng, Jie Lei, Zhenman Fang, Yu Wang
Tags: MQ
17 Nov 2024

Lossless KV Cache Compression to 2%
Zhen Yang, Jizong Han, Kan Wu, Ruobing Xie, An Wang, X. Sun, Zhanhui Kang
Tags: VLM, MQ
20 Oct 2024

Error Diffusion: Post Training Quantization with Block-Scaled Number Formats for Neural Networks
Alireza Khodamoradi, K. Denolf, Eric Dellinger
Tags: MQ
15 Oct 2024

Differentiable Weightless Neural Networks
Alan T. L. Bacellar, Zachary Susskind, Mauricio Breternitz Jr., E. John, L. John, P. Lima, F. M. G. França
14 Oct 2024

Compressing VAE-Based Out-of-Distribution Detectors for Embedded Deployment
Aditya Bansal, Michael Yuhas, Arvind Easwaran
Tags: OODD
02 Sep 2024

1-Bit FQT: Pushing the Limit of Fully Quantized Training to 1-bit
Chang Gao, J. Chen, Kang Zhao, Jiaqi Wang, Liping Jing
Tags: MQ
26 Aug 2024

Robust Iterative Value Conversion: Deep Reinforcement Learning for Neurochip-driven Edge Robots
Y. Kadokawa, Tomohito Kodera, Yoshihisa Tsurumine, Shinya Nishimura, Takamitsu Matsubara
23 Aug 2024

Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
Yifei Gao, Jie Ou, Lei Wang, Fanhua Shang, Jaji Wu
Tags: MQ
22 Jul 2024

NITRO-D: Native Integer-only Training of Deep Convolutional Neural Networks
Alberto Pirillo, Luca Colombo, Manuel Roveri
Tags: MQ
16 Jul 2024

VcLLM: Video Codecs are Secretly Tensor Codecs
Ceyu Xu, Yongji Wu, Xinyu Yang, Beidi Chen, Matthew Lentz, Danyang Zhuo, Lisa Wu Wills
29 Jun 2024

Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other
Yifei Gao, Jie Ou, Lei Wang, Yuting Xiao, Zhiyuan Xiang, Ruiting Dai, Jun Cheng
Tags: MQ
24 Jun 2024

Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation
Yuchen Yang, Yingdong Shi, Cheems Wang, Xiantong Zhen, Yuxuan Shi, Jun Xu
24 Jun 2024

NaviSplit: Dynamic Multi-Branch Split DNNs for Efficient Distributed Autonomous Navigation
Timothy K Johnsen, Ian Harshbarger, Zixia Xia, Marco Levorato
18 Jun 2024

Center-Sensitive Kernel Optimization for Efficient On-Device Incremental Learning
Dingwen Zhang, Yan Li, De-Chun Cheng, N. Wang, J. Han
Tags: CLL
13 Jun 2024

LoQT: Low Rank Adapters for Quantized Training
Sebastian Loeschcke, M. Toftrup, M. Kastoryano, Serge J. Belongie, Vésteinn Snæbjarnarson
Tags: MQ
26 May 2024

AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization Method for LLMs
Alireza Ghaffari, Sharareh Younesian, Vahid Partovi Nia, Boxing Chen, M. Asgharian
Tags: MQ
22 May 2024

Acceleration Algorithms in GNNs: A Survey
Lu Ma, Zeang Sheng, Xunkai Li, Xin Gao, Zhezheng Hao, Ling Yang, Wentao Zhang, Bin Cui
Tags: GNN
07 May 2024

Collage: Light-Weight Low-Precision Strategy for LLM Training
Tao Yu, Gaurav Gupta, Karthick Gopalswamy, Amith R. Mamidala, Hao Zhou, Jeffrey Huynh, Youngsuk Park, Ron Diamant, Anoop Deoras, Jun Huan
Tags: MQ
06 May 2024

Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey
Feng Liang, Zhen Zhang, Haifeng Lu, Victor C. M. Leung, Yanyi Guo, Xiping Hu
Tags: GNN
09 Apr 2024

Lightweight Deep Learning for Resource-Constrained Environments: A Survey
Hou-I Liu, Marco Galindo, Hongxia Xie, Lai-Kuan Wong, Hong-Han Shuai, Yung-Hui Li, Wen-Huang Cheng
08 Apr 2024

Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization
Haocheng Xi, Yuxiang Chen, Kang Zhao, Kaijun Zheng, Jianfei Chen, Jun Zhu
Tags: MQ
19 Mar 2024

Better Schedules for Low Precision Training of Deep Neural Networks
Cameron R. Wolfe, Anastasios Kyrillidis
04 Mar 2024

Trainable Fixed-Point Quantization for Deep Learning Acceleration on FPGAs
Dingyi Dai, Yichi Zhang, Jiahao Zhang, Zhanqiu Hu, Yaohui Cai, Qi Sun, Zhiru Zhang
Tags: MQ
31 Jan 2024

Effect of Weight Quantization on Learning Models by Typical Case Analysis
Shuhei Kashiwamura, Ayaka Sakata, Masaaki Imaizumi
Tags: MQ
30 Jan 2024

Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators
Yaniv Blumenfeld, Itay Hubara, Daniel Soudry
25 Jan 2024

Enabling On-device Continual Learning with Binary Neural Networks
Lorenzo Vorabbi, Davide Maltoni, Guido Borghi, Stefano Santi
Tags: MQ
18 Jan 2024

Knowledge Translation: A New Pathway for Model Compression
Wujie Sun, Defang Chen, Jiawei Chen, Yan Feng, Chun-Yen Chen, Can Wang
11 Jan 2024

FP8-BERT: Post-Training Quantization for Transformer
Jianwei Li, Tianchi Zhang, Ian En-Hsu Yen, Dongkuan Xu
Tags: MQ
10 Dec 2023

Low-Precision Mixed-Computation Models for Inference on Edge
Seyedarmin Azizi, M. Nazemi, M. Kamal, Massoud Pedram
Tags: MQ
03 Dec 2023

Improving the Robustness of Quantized Deep Neural Networks to White-Box Attacks using Stochastic Quantization and Information-Theoretic Ensemble Training
Saurabh Farkya, Aswin Raghavan, Avi Ziskind
30 Nov 2023

Mirage: An RNS-Based Photonic Accelerator for DNN Training
Cansu Demirkıran, Guowei Yang, D. Bunandar, Ajay Joshi
29 Nov 2023

PolyThrottle: Energy-efficient Neural Network Inference on Edge Devices
Minghao Yan, Hongyi Wang, Shivaram Venkataraman
30 Oct 2023

Efficient Low-rank Backpropagation for Vision Transformer Adaptation
Yuedong Yang, Hung-Yueh Chiang, Guihong Li, Diana Marculescu, R. Marculescu
26 Sep 2023

EPTQ: Enhanced Post-Training Quantization via Label-Free Hessian
Ofir Gordon, H. Habi, Arnon Netzer
Tags: MQ
20 Sep 2023

On-Device Learning with Binary Neural Networks
Lorenzo Vorabbi, Davide Maltoni, Stefano Santi
Tags: MQ
29 Aug 2023

Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation
Shuang Wang, B. Eravcı, Rustam Guliyev, Hakan Ferhatosmanoglu
Tags: GNN, MQ
29 Aug 2023

ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats
Xiaoxia Wu, Z. Yao, Yuxiong He
Tags: MQ
19 Jul 2023

Self-Distilled Quantization: Achieving High Compression Rates in Transformer-Based Language Models
James O'Neill, Sourav Dutta
Tags: VLM, MQ
12 Jul 2023

QBitOpt: Fast and Accurate Bitwidth Reallocation during Training
Jorn W. T. Peters, Marios Fournarakis, Markus Nagel, M. V. Baalen, Tijmen Blankevoort
Tags: MQ
10 Jul 2023

Adaptive Sharpness-Aware Pruning for Robust Sparse Networks
Anna Bair, Hongxu Yin, Maying Shen, Pavlo Molchanov, J. Álvarez
25 Jun 2023

Training Transformers with 4-bit Integers
Haocheng Xi, Changhao Li, Jianfei Chen, Jun Zhu
Tags: MQ
21 Jun 2023