Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2405.06001
Cited By
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit
9 May 2024
Ruihao Gong
Yang Yong
Shiqiao Gu
Yushi Huang
Chentao Lv
Yunchen Zhang
Xianglong Liu
Dacheng Tao
MQ
Re-assign community
ArXiv
PDF
HTML
Papers citing
"LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit"
12 / 12 papers shown
Title
Stability in Single-Peaked Strategic Resource Selection Games
Henri Zeiler
21
3
0
09 May 2025
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
Tianyi Zhang
Yang Sui
Shaochen Zhong
V. Chaudhary
Xia Hu
Anshumali Shrivastava
MQ
32
0
0
15 Apr 2025
Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models
Hung-Yueh Chiang
Chi-chih Chang
N. Frumkin
Kai-Chiang Wu
Mohamed S. Abdelfattah
Diana Marculescu
MQ
63
0
0
28 Mar 2025
PTSBench: A Comprehensive Post-Training Sparsity Benchmark Towards Algorithms and Models
Zining Wnag
J. Guo
Ruihao Gong
Yang Yong
Aishan Liu
Yushi Huang
Jiaheng Liu
X. Liu
71
1
0
10 Dec 2024
Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format
Chao Fang
Man Shi
Robin Geens
Arne Symons
Zhongfeng Wang
Marian Verhelst
69
0
0
24 Nov 2024
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Mengzhao Chen
Wenqi Shao
Peng Xu
Jiahao Wang
Peng Gao
Kaipeng Zhang
Yu Qiao
Ping Luo
MQ
36
21
0
10 Jul 2024
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
Dayou Du
Yijia Zhang
Shijie Cao
Jiaqi Guo
Ting Cao
Xiaowen Chu
Ningyi Xu
MQ
41
28
0
16 Feb 2024
Extreme Compression of Large Language Models via Additive Quantization
Vage Egiazarian
Andrei Panferov
Denis Kuznedelev
Elias Frantar
Artem Babenko
Dan Alistarh
MQ
98
87
0
11 Jan 2024
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
242
1,070
0
05 Oct 2022
Diversifying Sample Generation for Accurate Data-Free Quantization
Xiangguo Zhang
Haotong Qin
Yifu Ding
Ruihao Gong
Qing Yan
Renshuai Tao
Yuhang Li
F. Yu
Xianglong Liu
MQ
52
89
0
01 Mar 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
245
1,977
0
31 Dec 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
225
571
0
12 Sep 2019
1