arXiv:2405.14917
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
23 May 2024
Wei Huang
Haotong Qin
Yangdong Liu
Yawei Li
Xianglong Liu
Luca Benini
Michele Magno
Xiaojuan Qi
MQ
Papers citing
"SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models"
12 / 12 papers shown
Radio: Rate-Distortion Optimization for Large Language Model Compression
Sean I. Young
MQ
17
0
0
05 May 2025
SpinQuant: LLM quantization with learned rotations
Zechun Liu
Changsheng Zhao
Igor Fedorov
Bilge Soran
Dhruv Choudhary
Raghuraman Krishnamoorthi
Vikas Chandra
Yuandong Tian
Tijmen Blankevoort
MQ
127
76
0
21 Feb 2025
BF-IMNA: A Bit Fluid In-Memory Neural Architecture for Neural Network Acceleration
M. Rakka
Rachid Karami
A. Eltawil
M. Fouda
Fadi J. Kurdahi
MQ
24
1
0
03 Nov 2024
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Wei Huang
Yue Liao
Jianhui Liu
Ruifei He
Haoru Tan
Shiming Zhang
Hongsheng Li
Si Liu
Xiaojuan Qi
MoE
36
3
0
08 Oct 2024
Foundations of Large Language Model Compression -- Part 1: Weight Quantization
Sean I. Young
MQ
32
1
0
03 Sep 2024
An empirical study of LLaMA3 quantization: from LLMs to MLLMs
Wei Huang
Xingyu Zheng
Xudong Ma
Haotong Qin
Chengtao Lv
Hong Chen
Jie Luo
Xiaojuan Qi
Xianglong Liu
Michele Magno
MQ
49
36
0
22 Apr 2024
Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization
Aniruddha Nrusimha
Mayank Mishra
Naigang Wang
Dan Alistarh
Rameswar Panda
Yoon Kim
MQ
54
8
0
04 Apr 2024
AffineQuant: Affine Transformation Quantization for Large Language Models
Yuexiao Ma
Huixia Li
Xiawu Zheng
Feng Ling
Xuefeng Xiao
Rui Wang
Shilei Wen
Fei Chao
Rongrong Ji
MQ
38
16
0
19 Mar 2024
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks
Albert Tseng
Jerry Chee
Qingyao Sun
Volodymyr Kuleshov
Christopher De Sa
MQ
123
91
0
06 Feb 2024
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Wei Huang
Yangdong Liu
Haotong Qin
Ying Li
Shiming Zhang
Xianglong Liu
Michele Magno
Xiaojuan Qi
MQ
77
63
0
06 Feb 2024
Extreme Compression of Large Language Models via Additive Quantization
Vage Egiazarian
Andrei Panferov
Denis Kuznedelev
Elias Frantar
Artem Babenko
Dan Alistarh
MQ
98
87
0
11 Jan 2024
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
206
2,232
0
22 Mar 2023