Learned Step Size Quantization

21 February 2019
S. K. Esser
J. McKinstry
Deepika Bablani
R. Appuswamy
D. Modha
    MQ

Papers citing "Learned Step Size Quantization"

50 / 142 papers shown
Diffusion Model Quantization: A Review
Qian Zeng
Chenggong Hu
Mingli Song
Jie Song
MQ
45
0
0
08 May 2025
Learning from Loss Landscape: Generalizable Mixed-Precision Quantization via Adaptive Sharpness-Aware Gradient Aligning
Lianbo Ma
Jianlun Ma
Yuee Zhou
Guoyang Xie
Qiang He
Zhichao Lu
MQ
45
0
0
08 May 2025
PROM: Prioritize Reduction of Multiplications Over Lower Bit-Widths for Efficient CNNs
Lukas Meiner
Jens Mehnert
A. P. Condurache
MQ
42
0
0
06 May 2025
Radio: Rate-Distortion Optimization for Large Language Model Compression
Sean I. Young
MQ
21
0
0
05 May 2025
Pack-PTQ: Advancing Post-training Quantization of Neural Networks by Pack-wise Reconstruction
Changjun Li
Runqing Jiang
Zhuo Song
Pengpeng Yu
Ye Zhang
Yulan Guo
MQ
56
0
0
01 May 2025
Gradual Binary Search and Dimension Expansion : A general method for activation quantization in LLMs
Lucas Maisonnave
Cyril Moineau
Olivier Bichler
Fabrice Rastello
MQ
40
0
0
18 Apr 2025
GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning
Sifan Zhou
Shuo Wang
Zhihang Yuan
Mingjia Shi
Yuzhang Shang
Dawei Yang
ALM
MQ
85
0
0
18 Feb 2025
LowRA: Accurate and Efficient LoRA Fine-Tuning of LLMs under 2 Bits
Zikai Zhou
Qizheng Zhang
Hermann Kumbong
Kunle Olukotun
MQ
226
0
0
12 Feb 2025
BILLNET: A Binarized Conv3D-LSTM Network with Logic-gated residual architecture for hardware-efficient video inference
Van Thien Nguyen
William Guicquero
Gilles Sicard
3DV
MQ
74
2
0
24 Jan 2025
MOGNET: A Mux-residual quantized Network leveraging Online-Generated weights
Van Thien Nguyen
William Guicquero
Gilles Sicard
MQ
75
1
0
17 Jan 2025
Histogram-Equalized Quantization for logic-gated Residual Neural Networks
Van Thien Nguyen
William Guicquero
Gilles Sicard
MQ
41
1
0
10 Jan 2025
Dedicated Inference Engine and Binary-Weight Neural Networks for Lightweight Instance Segmentation
Tse-Wei Chen
Wei Tao
Dongyue Zhao
Kazuhiro Mima
Tadayuki Ito
Kinya Osa
Masami Kato
MQ
31
0
0
03 Jan 2025
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Yunshan Zhong
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Fei Chao
Zhanpeng Zeng
Rongrong Ji
MQ
94
0
0
31 Dec 2024
PTQ4VM: Post-Training Quantization for Visual Mamba
Younghyun Cho
Changhun Lee
Seonggon Kim
Eunhyeok Park
MQ
Mamba
43
2
0
29 Dec 2024
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Bram Adams
Ahmed E. Hassan
VLM
36
0
0
01 Nov 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
145
0
0
29 Oct 2024
Q-VLM: Post-training Quantization for Large Vision-Language Models
Changyuan Wang
Ziwei Wang
Xiuwei Xu
Yansong Tang
Jie Zhou
Jiwen Lu
MQ
32
1
0
10 Oct 2024
JPEG Inspired Deep Learning
Ahmed H. Salamah
Kaixiang Zheng
Yiwen Liu
E. Yang
27
0
0
09 Oct 2024
QT-DoG: Quantization-aware Training for Domain Generalization
Saqib Javed
Hieu Le
Mathieu Salzmann
OOD
MQ
28
1
0
08 Oct 2024
Foundations of Large Language Model Compression -- Part 1: Weight Quantization
Sean I. Young
MQ
40
1
0
03 Sep 2024
Realizing Unaligned Block-wise Pruning for DNN Acceleration on Mobile Devices
Hayun Lee
Dongkun Shin
MQ
26
0
0
29 Jul 2024
MimiQ: Low-Bit Data-Free Quantization of Vision Transformers with Encouraging Inter-Head Attention Similarity
Kanghyun Choi
Hyeyoon Lee
Dain Kwon
Sunjong Park
Kyuyeun Kim
Noseong Park
Jinho Lee
Jinho Lee
MQ
40
1
0
29 Jul 2024
Temporal Feature Matters: A Framework for Diffusion Model Quantization
Yushi Huang
Ruihao Gong
Xianglong Liu
Jing Liu
Yuhang Li
Jiwen Lu
Dacheng Tao
DiffM
MQ
49
0
0
28 Jul 2024
Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
Yifei Gao
Jie Ou
Lei Wang
Fanhua Shang
Jaji Wu
MQ
45
0
0
22 Jul 2024
MetaAug: Meta-Data Augmentation for Post-Training Quantization
Cuong Pham
Hoang Anh Dung
Cuong C. Nguyen
Trung Le
Dinh Q. Phung
Gustavo Carneiro
Thanh-Toan Do
MQ
40
0
0
20 Jul 2024
LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices
Jung Hyun Lee
Jeonghoon Kim
J. Yang
S. Kwon
Eunho Yang
Kang Min Yoo
Dongsoo Lee
MQ
36
2
0
16 Jul 2024
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Xingrun Xing
Boyan Gao
Zheng Zhang
David A. Clifton
Shitao Xiao
LI DU
Guoqi Li
Jiajun Zhang
50
5
0
05 Jul 2024
ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking
Wenshuo Li
Xinghao Chen
Han Shu
Yehui Tang
Yunhe Wang
MQ
31
2
0
17 Jun 2024
Asymptotic Unbiased Sample Sampling to Speed Up Sharpness-Aware Minimization
Jiaxin Deng
Junbiao Pang
Baochang Zhang
66
1
0
12 Jun 2024
xTern: Energy-Efficient Ternary Neural Network Inference on RISC-V-Based Edge Systems
Georg Rutishauser
Joan Mihali
Moritz Scherer
Luca Benini
24
1
0
29 May 2024
Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection
Yunqian Fan
Xiuying Wei
Ruihao Gong
Yuqing Ma
Xiangguo Zhang
Qi Zhang
Xianglong Liu
MQ
27
2
0
10 May 2024
AdaQAT: Adaptive Bit-Width Quantization-Aware Training
Cédric Gernigon
Silviu-Ioan Filip
Olivier Sentieys
Clément Coggiola
Mickael Bruno
23
2
0
22 Apr 2024
Instance-Aware Group Quantization for Vision Transformers
Jaehyeon Moon
Dohyung Kim
Junyong Cheon
Bumsub Ham
MQ
ViT
27
6
0
01 Apr 2024
Better Schedules for Low Precision Training of Deep Neural Networks
Cameron R. Wolfe
Anastasios Kyrillidis
45
1
0
04 Mar 2024
Boosting Neural Representations for Videos with a Conditional Decoder
Xinjie Zhang
Ren Yang
Dailan He
Xingtong Ge
Tongda Xu
Yan Wang
Hongwei Qin
Jun Zhang
34
15
0
28 Feb 2024
Effective Gradient Sample Size via Variation Estimation for Accelerating Sharpness aware Minimization
Jiaxin Deng
Junbiao Pang
Baochang Zhang
Tian Wang
40
1
0
24 Feb 2024
Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward
Arnav Chavan
Raghav Magazine
Shubham Kushwaha
M. Debbah
Deepak Gupta
16
18
0
02 Feb 2024
USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models
Shaojin Ding
David Qiu
David Rim
Yanzhang He
Oleg Rybakov
...
Tara N. Sainath
Zhonglin Han
Jian Li
Amir Yazdanbakhsh
Shivani Agrawal
MQ
26
9
0
13 Dec 2023
CBQ: Cross-Block Quantization for Large Language Models
Xin Ding
Xiaoyu Liu
Zhijun Tu
Yun-feng Zhang
Wei Li
...
Hanting Chen
Yehui Tang
Zhiwei Xiong
Baoqun Yin
Yunhe Wang
MQ
27
13
0
13 Dec 2023
SmoothQuant+: Accurate and Efficient 4-bit Post-Training WeightQuantization for LLM
Jiayi Pan
Chengcan Wang
Kaifu Zheng
Yangguang Li
Zhenyu Wang
Bin Feng
MQ
35
7
0
06 Dec 2023
TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
Yushi Huang
Ruihao Gong
Jing Liu
Tianlong Chen
Xianglong Liu
DiffM
MQ
17
37
0
27 Nov 2023
Effortless Cross-Platform Video Codec: A Codebook-Based Method
Kuan Tian
Yonghang Guan
Jin-Peng Xiang
Jun Zhang
Xiao Han
Wei Yang
32
1
0
16 Oct 2023
MobileNVC: Real-time 1080p Neural Video Compression on a Mobile Device
T. V. Rozendaal
Tushar Singhal
Hoang Le
Guillaume Sautière
Amir Said
...
Hitarth Mehta
Frank Mayer
Liang Zhang
Markus Nagel
Auke Wiggers
37
11
0
02 Oct 2023
Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation
Shuang Wang
B. Eravcı
Rustam Guliyev
Hakan Ferhatosmanoglu
GNN
MQ
19
6
0
29 Aug 2023
Efficient Neural PDE-Solvers using Quantization Aware Training
W.V.S.O. van den Dool
Tijmen Blankevoort
Max Welling
Yuki M. Asano
MQ
27
3
0
14 Aug 2023
MRQ:Support Multiple Quantization Schemes through Model Re-Quantization
Manasa Manohara
Sankalp Dayal
Tarqi Afzal
Rahul Bakshi
Kahkuen Fu
MQ
20
0
0
01 Aug 2023
Overcoming Distribution Mismatch in Quantizing Image Super-Resolution Networks
Chee Hong
Kyoung Mu Lee
SupR
MQ
19
1
0
25 Jul 2023
Quantized Feature Distillation for Network Quantization
Kevin Zhu
Yin He
Jianxin Wu
MQ
26
9
0
20 Jul 2023
Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
13
88
0
22 Jun 2023
LLM-QAT: Data-Free Quantization Aware Training for Large Language Models
Zechun Liu
Barlas Oğuz
Changsheng Zhao
Ernie Chang
Pierre Stock
Yashar Mehdad
Yangyang Shi
Raghuraman Krishnamoorthi
Vikas Chandra
MQ
46
187
0
29 May 2023