2406.12311
Cited By
Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models
18 June 2024
Dongwon Jo, Taesu Kim, Yulhwa Kim, Jae-Joon Kim
Papers citing "Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models"
4 / 4 papers shown
Title
ARB-LLM: Alternating Refined Binarizations for Large Language Models
Zhiteng Li, X. Yan, Tianao Zhang, Haotong Qin, Dong Xie, Jiang Tian, Zhongchao Shi, Linghe Kong, Yulun Zhang, Xiaokang Yang
MQ | 26 | 2 | 0 | 04 Oct 2024

Evaluating Quantized Large Language Models
Shiyao Li, Xuefei Ning, Luning Wang, Tengxuan Liu, Xiangsheng Shi, Shengen Yan, Guohao Dai, Huazhong Yang, Yu-Xiang Wang
MQ | 41 | 42 | 0 | 28 Feb 2024

OneBit: Towards Extremely Low-bit Large Language Models
Yuzhuang Xu, Xu Han, Zonghan Yang, Shuo Wang, Qingfu Zhu, Zhiyuan Liu, Weidong Liu, Wanxiang Che
MQ | 51 | 36 | 0 | 17 Feb 2024

BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Wei Huang, Yangdong Liu, Haotong Qin, Ying Li, Shiming Zhang, Xianglong Liu, Michele Magno, Xiaojuan Qi
MQ | 77 | 63 | 0 | 06 Feb 2024