2406.06385
Low-Rank Quantization-Aware Training for LLMs
10 June 2024
Yelysei Bondarenko
Riccardo Del Chiaro
Markus Nagel
MQ
Papers citing "Low-Rank Quantization-Aware Training for LLMs"
4 / 4 papers shown
Title
GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference
Chao Zeng
Songwei Liu
Shu Yang
Fangmin Chen
Xing Mei
Lean Fu
MQ
38
0
0
23 Dec 2024
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Yujun Lin
Haotian Tang
Shang Yang
Zhekai Zhang
Guangxuan Xiao
Chuang Gan
Song Han
77
71
0
07 May 2024
Massive Activations in Large Language Models
Mingjie Sun
Xinlei Chen
J. Zico Kolter
Zhuang Liu
60
64
0
27 Feb 2024
EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge
Xuan Shen
Zhenglun Kong
Changdi Yang
Zhaoyang Han
Lei Lu
...
Zhihao Shu
Wei Niu
Miriam Leeser
Pu Zhao
Yanzhi Wang
MQ
46
17
0
16 Feb 2024