Training DNNs with Hybrid Block Floating Point (arXiv:1804.01526)

4 April 2018
M. Drumond, Tao Lin, Martin Jaggi, Babak Falsafi
arXiv (abs) · PDF · HTML

Papers citing "Training DNNs with Hybrid Block Floating Point"

38 papers shown
MX+: Pushing the Limits of Microscaling Formats for Efficient Large Language Model Serving
Jungi Lee, Junyong Park, Soohyun Cha, Jaehoon Cho, Jaewoong Sim
92 · 1 · 0 · 16 Oct 2025
GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Sifan Zhou, Shuo Wang, Zhihang Yuan, Mingjia Shi, Yuzhang Shang, Dawei Yang
MQ, ALM · 564 · 11 · 0 · 18 Feb 2025
Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format. International Symposium on High-Performance Computer Architecture (HPCA), 2024
Chao Fang, Man Shi, Robin Geens, Arne Symons, Zhongfeng Wang, Marian Verhelst
406 · 9 · 0 · 24 Nov 2024
AMXFP4: Taming Activation Outliers with Asymmetric Microscaling Floating-Point for 4-bit LLM Inference. Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Janghwan Lee, Jiwoong Park, Jinseok Kim, Yongjik Kim, Jungju Oh, Jinwook Oh, Jungwook Choi
326 · 8 · 0 · 15 Nov 2024
BitQ: Tailoring Block Floating Point Precision for Improved DNN Efficiency on Resource-Constrained Devices
Yongqi Xu, Yujian Lee, Gao Yi, Bosheng Liu, Yucong Chen, Peng Liu, Jigang Wu, Xiaoming Chen, Yinhe Han
MQ · 273 · 3 · 0 · 25 Sep 2024
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Burcu Harma, Ayan Chakraborty, Elizaveta Kostenok, Danila Mishin, Dongho Ha, ..., Martin Jaggi, Ming Liu, Yunho Oh, Suvinay Subramanian, Amir Yazdanbakhsh
MQ · 345 · 19 · 0 · 31 May 2024
DaCapo: Accelerating Continuous Learning in Autonomous Systems for Video Analytics
Yoonsung Kim, Changhun Oh, Jinwoo Hwang, Wonung Kim, Seongryong Oh, Yubin Lee, Hardik Sharma, Amir Yazdanbakhsh, Jongse Park
443 · 16 · 0 · 21 Mar 2024
LQER: Low-Rank Quantization Error Reconstruction for LLMs
Cheng Zhang, Jianyi Cheng, George A. Constantinides, Yiren Zhao
MQ · 418 · 23 · 0 · 04 Feb 2024
Mirage: An RNS-Based Photonic Accelerator for DNN Training. International Symposium on Computer Architecture (ISCA), 2023
Cansu Demirkıran, Guowei Yang, D. Bunandar, Ajay Joshi
231 · 8 · 0 · 29 Nov 2023
Microscaling Data Formats for Deep Learning
B. Rouhani, Ritchie Zhao, Ankit More, Mathew Hall, Alireza Khodamoradi, ..., Maxim Naumov, Colin Verrilli, Ralph Wittig, Doug Burger, Eric S. Chung
MQ · 389 · 115 · 0 · 16 Oct 2023
Number Systems for Deep Neural Network Architectures: A Survey
Ghada Alsuhli, Vasileios Sakellariou, H. Saleh, Mahmoud Al-Qutayri, Baker Mohammad, T. Stouraitis
199 · 5 · 0 · 11 Jul 2023
Training Transformers with 4-bit Integers. Neural Information Processing Systems (NeurIPS), 2023
Haocheng Xi, Changhao Li, Jianfei Chen, Jun Zhu
MQ · 306 · 74 · 0 · 21 Jun 2023
Multiplication-Free Transformer Training via Piecewise Affine Operations. Neural Information Processing Systems (NeurIPS), 2023
Atli Kosson, Martin Jaggi
227 · 8 · 0 · 26 May 2023
Stable and low-precision training for large-scale vision-language models. Neural Information Processing Systems (NeurIPS), 2023
Mitchell Wortsman, Tim Dettmers, Luke Zettlemoyer, Ari S. Morcos, Ali Farhadi, Ludwig Schmidt
MQ, MLLM, VLM · 290 · 68 · 0 · 25 Apr 2023
Dynamic Stashing Quantization for Efficient Transformer Training. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Guofu Yang, Daniel Lo, Robert D. Mullins, Yiren Zhao
MQ · 191 · 9 · 0 · 09 Mar 2023
With Shared Microexponents, A Little Shifting Goes a Long Way. International Symposium on Computer Architecture (ISCA), 2023
Bita Darvish Rouhani, Ritchie Zhao, V. Elango, Rasoul Shafipour, Mathew Hall, ..., Eric S. Chung, Zhaoxia Deng, S. Naghshineh, Jongsoo Park, Maxim Naumov
MQ · 291 · 62 · 0 · 16 Feb 2023
Training with Mixed-Precision Floating-Point Assignments
Wonyeol Lee, Rahul Sharma, A. Aiken
MQ · 207 · 7 · 0 · 31 Jan 2023
Accuracy Booster: Enabling 4-bit Fixed-point Arithmetic for DNN Training
Simla Burcu Harma, Canberk Sonmez, Nicholas Sperry, Babak Falsafi, Martin Jaggi, Yunho Oh
MQ · 297 · 5 · 0 · 19 Nov 2022
LightNorm: Area and Energy-Efficient Batch Normalization Hardware for On-Device DNN Training. International Conference on Computer Design (ICCD), 2022
Seock-Hwan Noh, Junsang Park, Dahoon Park, Jahyun Koo, Jeik Choi, Jaeha Kung
84 · 10 · 0 · 04 Nov 2022
Approximating Continuous Convolutions for Deep Network Compression. British Machine Vision Conference (BMVC), 2022
Theo W. Costain, V. Prisacariu
163 · 0 · 0 · 17 Oct 2022
Is Integer Arithmetic Enough for Deep Learning Training? Neural Information Processing Systems (NeurIPS), 2022
Alireza Ghaffari, Marzieh S. Tahaei, Mohammadreza Tayaranian, M. Asgharian, V. Nia
MQ · 215 · 21 · 0 · 18 Jul 2022
Adaptive Block Floating-Point for Analog Deep Learning Hardware
Ayon Basumallik, D. Bunandar, Nicholas Dronen, Nicholas Harris, Ludmila Levkova, Calvin McCarter, Lakshmi Nair, David Walter, David Widemann
153 · 10 · 0 · 12 May 2022
Schrödinger's FP: Dynamic Adaptation of Floating-Point Containers for Deep Learning Training
Miloš Nikolić, Enrique Torres Sanchez, Jia-Hui Wang, Ali Hadi Zadeh, Mostafa Mahmoud, Ameer Abdelhadi, Kareem Ibrahim, Andreas Moshovos
MQ · 182 · 1 · 0 · 28 Apr 2022
FPGA-based AI Smart NICs for Scalable Distributed AI Training Systems. IEEE Computer Architecture Letters (CAL), 2022
Rui Ma, E. Georganas, A. Heinecke, Andrew Boutros, Eriko Nurvitadhi
GNN · 139 · 19 · 0 · 22 Apr 2022
FlexBlock: A Flexible DNN Training Accelerator with Multi-Mode Block Floating Point Support. IEEE Transactions on Computers (IEEE Trans. Comput.), 2022
Seock-Hwan Noh, Jahyun Koo, Seunghyun Lee, Jongse Park, Jaeha Kung
AI4CE · 184 · 27 · 0 · 13 Mar 2022
Resource-Efficient Deep Learning: A Survey on Model-, Arithmetic-, and Implementation-Level Techniques. ACM Computing Surveys (CSUR), 2021
JunKyu Lee, L. Mukhanov, A. S. Molahosseini, U. Minhas, Yang Hua, Jesus Martinez del Rincon, K. Dichev, Cheol-Ho Hong, Hans Vandierendonck
181 · 36 · 0 · 30 Dec 2021
FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding. International Symposium on High-Performance Computer Architecture (HPCA), 2021
Sai Qian Zhang, Bradley McDanel, H. T. Kung
MQ · 133 · 86 · 0 · 28 Oct 2021
8-bit Optimizers via Block-wise Quantization
Tim Dettmers, M. Lewis, Sam Shleifer, Luke Zettlemoyer
MQ · 364 · 383 · 0 · 06 Oct 2021
EFloat: Entropy-coded Floating Point Format for Compressing Vector Embedding Models
R. Bordawekar, B. Abali, Ming-Hung Chen
MQ · 147 · 3 · 0 · 04 Feb 2021
Rethinking Floating Point Overheads for Mixed Precision DNN Accelerators. Conference on Machine Learning and Systems (MLSys), 2021
Hamzah Abdel-Aziz, Ali Shafiee, J. Shin, A. Pedram, Joseph Hassoun
MQ · 174 · 13 · 0 · 27 Jan 2021
A Statistical Framework for Low-bitwidth Training of Deep Neural Networks. Neural Information Processing Systems (NeurIPS), 2020
Jianfei Chen, Yujie Gai, Z. Yao, Michael W. Mahoney, Joseph E. Gonzalez
MQ · 133 · 69 · 0 · 27 Oct 2020
FPRaker: A Processing Element For Accelerating Neural Network Training
Omar Mohamed Awad, Mostafa Mahmoud, Isak Edo Vivancos, Ali Hadi Zadeh, Ciaran Bannon, Anand Jayarajan, Gennady Pekhimenko, Andreas Moshovos
157 · 15 · 0 · 15 Oct 2020
An FPGA Accelerated Method for Training Feed-forward Neural Networks Using Alternating Direction Method of Multipliers and LSMR
Seyedeh Niusha Alavi Foumani, Ce Guo, Wayne Luk
118 · 3 · 0 · 06 Sep 2020
TensorDash: Exploiting Sparsity to Accelerate Deep Neural Network Training and Inference. International Symposium on Microarchitecture (MICRO), 2020
Mostafa Mahmoud, Isak Edo Vivancos, Ali Hadi Zadeh, Omar Mohamed Awad, Gennady Pekhimenko, Jorge Albericio, Andreas Moshovos
MoE · 204 · 63 · 0 · 01 Sep 2020
Boosted and Differentially Private Ensembles of Decision Trees
Richard Nock, Wilko Henecka
177 · 2 · 0 · 26 Jan 2020
DSConv: Efficient Convolution Operator
Marcelo Gennari, Roger Fawcett, V. Prisacariu
MQ · 120 · 97 · 0 · 07 Jan 2019
Distributed Learning over Unreliable Networks
Chen Yu, Hanlin Tang, Cédric Renggli, S. Kassing, Ankit Singla, Dan Alistarh, Ce Zhang, Ji Liu
OOD · 244 · 66 · 0 · 17 Oct 2018
Communication Compression for Decentralized Training
Hanlin Tang, Shaoduo Gan, Ce Zhang, Tong Zhang, Ji Liu
343 · 293 · 0 · 17 Mar 2018