2407.10032
LeanQuant: Accurate Large Language Model Quantization with Loss-Error-Aware Grid
14 July 2024
Tianyi Zhang
Anshumali Shrivastava
MQ
Papers citing
"LeanQuant: Accurate Large Language Model Quantization with Loss-Error-Aware Grid"
6 / 6 papers shown
Title
GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference
Chao Zeng
Songwei Liu
Shu Yang
Fangmin Chen
Xing Mei
Lean Fu
MQ
38
0
0
23 Dec 2024
Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression
Zhichao Xu
Ashim Gupta
Tao Li
Oliver Bentham
Vivek Srikumar
40
8
0
06 Jul 2024
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks
Albert Tseng
Jerry Chee
Qingyao Sun
Volodymyr Kuleshov
Christopher De Sa
MQ
126
92
0
06 Feb 2024
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Saleh Ashkboos
Maximilian L. Croci
Marcelo Gennari do Nascimento
Torsten Hoefler
James Hensman
VLM
125
145
0
26 Jan 2024
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng
Lianmin Zheng
Binhang Yuan
Zhuohan Li
Max Ryabinin
...
Joseph E. Gonzalez
Percy Liang
Christopher Ré
Ion Stoica
Ce Zhang
144
366
0
13 Mar 2023
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
4,048
0
24 May 2022