Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
18 April 2023 · arXiv:2304.09145 (v3, latest)
Xiuying Wei, Yunchen Zhang, Yuhang Li, Xiangguo Zhang, Yazhe Niu, Jian Ren, Zhengang Li
MQ
Links: arXiv (abs) · PDF · HTML · HuggingFace (1 upvote) · GitHub (46★)
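
The title names the paper's core mechanism: a per-channel shift and scale is applied to the activations and folded equivalently into the adjacent weights and bias, so that outlier channels become easier to quantize while the layer output is unchanged. The Python sketch below is a minimal illustration of that equivalence, assuming a PyTorch-style linear layer; the function name and the simple min/max statistics are illustrative assumptions, not the authors' implementation.

    # Illustrative sketch only: the shift/scale statistics below are assumptions,
    # not the released Outlier Suppression+ code.
    import torch

    def fold_shift_and_scale(x, weight, bias):
        """Fold a per-channel shift and scale of activations x into the next
        linear layer (weight, bias) so the layer output stays identical while
        the transformed activations have a centered, tighter range."""
        # Per-channel shift: center each activation channel (absorbed into the bias).
        shift = (x.max(dim=0).values + x.min(dim=0).values) / 2
        # Per-channel scale: shrink outlier channels toward a unit range.
        scale = (x - shift).abs().max(dim=0).values.clamp(min=1e-5)

        x_eq = (x - shift) / scale       # easier-to-quantize activations
        w_eq = weight * scale            # scale folded into the weight columns
        b_eq = bias + weight @ shift     # shift folded into the bias
        return x_eq, w_eq, b_eq

    # Equivalence check: the folded layer reproduces the original output.
    x = torch.randn(8, 16)
    x[:, 3] *= 50.0                      # make one channel an outlier
    w, b = torch.randn(4, 16), torch.randn(4)
    x_eq, w_eq, b_eq = fold_shift_and_scale(x, w, b)
    assert torch.allclose(x_eq @ w_eq.T + b_eq, x @ w.T + b, atol=1e-3)

In the paper itself the shift and scale values are chosen by optimization (the "optimal" in the title) rather than the simple statistics used above; the snippet only shows why the folding is lossless.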

Papers citing "Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling"

25 / 25 papers shown
Energy-Efficient and Dequantization-Free Q-LLMs: A Spiking Neural Network Approach to Salient Value Mitigation
  Chenyu Wang, Zhanglu Yan, Zhi Zhou, Xu Chen, Weng-Fai Wong
  MQ · 152 · 0 · 0 · 22 Oct 2025
Mixed-Precision Quantization for Language Models: Techniques and Prospects
  M. Rakka, Marios Fournarakis, Olga Krestinskaya, Jinane Bazzi, K. Salama, Fadi J. Kurdahi, A. Eltawil, M. Fouda
  MQ · 227 · 0 · 0 · 19 Oct 2025
Interpreting the Effects of Quantization on LLMs
  Manpreet Singh, Hassan Sajjad
  MQ · MILM · 377 · 3 · 0 · 22 Aug 2025
Neural Network Quantization for Microcontrollers: A Comprehensive Survey of Methods, Platforms, and Applications
  Hamza A. Abushahla, Dara Varam, Ariel J. N. Panopio, Mohamed I. AlHajri
  MQ · 368 · 1 · 0 · 20 Aug 2025
Why Do Some Inputs Break Low-Bit LLM Quantization?
  Ting-Yun Chang, Muru Zhang, Jesse Thomason, Robin Jia
  MQ · 259 · 1 · 0 · 24 May 2025
Fast and Low-Cost Genomic Foundation Models via Outlier Removal
  Haozheng Luo, Chenghao Qiu, Maojiang Su, Zhihan Zhou, Zoe Mehta, Guo Ye, Jerry Yao-Chieh Hu, Han Liu
  AAML · 440 · 4 · 0 · 01 May 2025
MergeQuant: Accurate 4-bit Static Quantization of Large Language Models by Channel-wise Calibration
  Jinguang Wang, Jiangming Wang, Haifeng Sun, Tingting Yang, Zirui Zhuang, Wanyi Ning, Yuexi Yin, Q. Qi, Jianxin Liao
  MQ · MoMe · 199 · 3 · 0 · 07 Mar 2025
Compressing Language Models for Specialized Domains
  Miles Williams, G. Chrysostomou, Vitor Jeronymo, Nikolaos Aletras
  MQ · 300 · 1 · 0 · 25 Feb 2025
Deploying Foundation Model Powered Agent Services: A Survey
  Wenchao Xu, Jinyu Chen, Peirong Zheng, Xiaoquan Yi, Tianyi Tian, ..., Quan Wan, Yining Qi, Yunfeng Fan, Qinliang Su, Xuemin Shen
  AI4CE · 471 · 5 · 0 · 18 Dec 2024
The Super Weight in Large Language Models
  Mengxia Yu, De Wang, Qi Shan, Colorado Reed, Alvin Wan
  MQ · MILM · 332 · 32 · 0 · 11 Nov 2024
FlatQuant: Flatness Matters for LLM Quantization
  Yuxuan Sun, Ruikang Liu, Haoli Bai, Han Bao, Kang Zhao, ..., Lu Hou, Chun Yuan, Xin Jiang, Wen Liu, Jun Yao
  MQ · 586 · 28 · 0 · 12 Oct 2024
MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation
  European Conference on Computer Vision (ECCV), 2024
  Shuzhao Xie, Weixiang Zhang, Chen Tang, Yunpeng Bai, Rongwei Lu, Shijia Ge, Zhi Wang
  3DGS · 263 · 33 · 0 · 15 Sep 2024
LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices
  Jung Hyun Lee, Jeonghoon Kim, J. Yang, S. Kwon, Eunho Yang, Kang Min Yoo, Dongsoo Lee
  MQ · 344 · 5 · 0 · 16 Jul 2024
OutlierTune: Efficient Channel-Wise Quantization for Large Language Models
  Jinguang Wang, Yuexi Yin, Haifeng Sun, Qi Qi, Jingyu Wang, Zirui Zhuang, Tingting Yang, Jianxin Liao
  187 · 2 · 0 · 27 Jun 2024
Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization
  Seungwoo Son, Wonpyo Park, Woohyun Han, Kyuyeun Kim, Jaeho Lee
  MQ · 230 · 21 · 0 · 17 Jun 2024
ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models
  Jing Liu, Yazhe Niu, Mingyang Zhang, Yefei He, Jianfei Cai, Bohan Zhuang
  MoE · 132 · 2 · 0 · 13 Jun 2024
Mitigating Quantization Errors Due to Activation Spikes in GLU-Based LLMs
  Jaewoo Yang, Hayun Kim, Younghoon Kim
  207 · 20 · 0 · 23 May 2024
SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models
  Haojie Duanmu, Zhihang Yuan, Xiuhong Li, Jiangfei Duan, Xingcheng Zhang, Dahua Lin
  MQ · 272 · 29 · 0 · 10 May 2024
Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models
  Neural Information Processing Systems (NeurIPS), 2024
  Wanyun Cui, Qianle Wang
  MQ · 185 · 10 · 0 · 03 Apr 2024
Minimize Quantization Output Error with Bias Compensation
  Cheng Gong, Haoshuai Zheng, Mengting Hu, Zheng Lin, Deng-Ping Fan, Yuzhi Zhang, Tao Li
  MQ · 148 · 3 · 0 · 02 Apr 2024
QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning
  Jiun-Man Chen, Yu-Hsuan Chao, Yu-Jie Wang, Ming-Der Shieh, Chih-Chung Hsu, Wei-Fen Lin
  MQ · 258 · 2 · 0 · 11 Mar 2024
A Comprehensive Evaluation of Quantization Strategies for Large Language Models
  Renren Jin, Jiangcun Du, Wuwei Huang, Wei Liu, Jian Luan, Sijin Yu, Deyi Xiong
  MQ · 287 · 67 · 0 · 26 Feb 2024
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
  Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
  Wenhua Cheng, Weiwei Zhang, Haihao Shen, Yiyang Cai, Xin He, Kaokao Lv, Yi. Liu
  MQ · 504 · 33 · 0 · 11 Sep 2023
A Survey on Model Compression for Large Language Models
  Transactions of the Association for Computational Linguistics (TACL), 2023
  Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
  378 · 352 · 0 · 15 Aug 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
  Conference on Machine Learning and Systems (MLSys), 2023
  Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Wei-Ming Chen, Wei-Chen Wang, Guangxuan Xiao, Xingyu Dang, Chuang Gan, Song Han
  EDLMQ · 831 · 946 · 0 · 01 Jun 2023