ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2407.11062
  4. Cited By
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

10 July 2024
Mengzhao Chen
Wenqi Shao
Peng Xu
Jiahao Wang
Peng Gao
Kaipeng Zhang
Yu Qiao
    MQ
ArXivPDFHTML

Papers citing "EfficientQAT: Efficient Quantization-Aware Training for Large Language Models"

24 / 24 papers shown
Title
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems
Yinsicheng Jiang
Yao Fu
Yeqi Huang
Ping Nie
Zhan Lu
...
Dayou Du
Tairan Xu
Kai Zou
Edoardo Ponti
Luo Mai
MoE
12
0
0
16 May 2025
Enhancing Ultra-Low-Bit Quantization of Large Language Models Through Saliency-Aware Partial Retraining
Enhancing Ultra-Low-Bit Quantization of Large Language Models Through Saliency-Aware Partial Retraining
Deyu Cao
Samin Aref
MQ
27
0
0
14 Apr 2025
QUAD: Quantization and Parameter-Efficient Tuning of LLM with Activation Decomposition
QUAD: Quantization and Parameter-Efficient Tuning of LLM with Activation Decomposition
Yuxuan Hu
Xiaodong Chen
C. Li
H. Chen
J. Zhang
MQ
60
0
0
25 Mar 2025
Binary Neural Networks for Large Language Model: A Survey
Binary Neural Networks for Large Language Model: A Survey
Liangdong Liu
Zhitong Zheng
Cong Wang
TianHuang Su
ZhenYu Yang
MQ
67
0
0
26 Feb 2025
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization
Zechun Liu
Changsheng Zhao
Hanxian Huang
Sijia Chen
Jing Zhang
...
Yuandong Tian
Bilge Soran
Raghuraman Krishnamoorthi
Tijmen Blankevoort
Vikas Chandra
MQ
73
3
0
04 Feb 2025
PrefixQuant: Eliminating Outliers by Prefixed Tokens for Large Language Models Quantization
PrefixQuant: Eliminating Outliers by Prefixed Tokens for Large Language Models Quantization
Mengzhao Chen
Yi Liu
Jiahao Wang
Yi Bin
Wenqi Shao
Ping Luo
MQ
61
2
0
28 Jan 2025
Taming Sensitive Weights : Noise Perturbation Fine-tuning for Robust LLM Quantization
Taming Sensitive Weights : Noise Perturbation Fine-tuning for Robust LLM Quantization
Dongwei Wang
Huanrui Yang
MQ
87
1
0
08 Dec 2024
Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped
  Activation Data Format
Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format
Chao Fang
Man Shi
Robin Geens
Arne Symons
Zhongfeng Wang
Marian Verhelst
71
0
0
24 Nov 2024
BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration
Yuzong Chen
Ahmed F. AbouElhamayed
Xilai Dai
Yang Wang
Marta Andronic
G. Constantinides
Mohamed S. Abdelfattah
MQ
105
1
0
18 Nov 2024
A Comprehensive Study on Quantization Techniques for Large Language
  Models
A Comprehensive Study on Quantization Techniques for Large Language Models
Jiedong Lang
Zhehao Guo
Shuyu Huang
MQ
44
10
0
30 Oct 2024
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Wei Huang
Yue Liao
Jianhui Liu
Ruifei He
Haoru Tan
Shiming Zhang
Hongsheng Li
Si Liu
Xiaojuan Qi
MoE
39
3
0
08 Oct 2024
ARB-LLM: Alternating Refined Binarizations for Large Language Models
ARB-LLM: Alternating Refined Binarizations for Large Language Models
Zhiteng Li
Xinyu Yan
Tianao Zhang
Haotong Qin
Dong Xie
Jiang Tian
Zhongchao Shi
Linghe Kong
Yulun Zhang
Xiaokang Yang
MQ
31
2
0
04 Oct 2024
Art and Science of Quantizing Large-Scale Models: A Comprehensive
  Overview
Art and Science of Quantizing Large-Scale Models: A Comprehensive Overview
Yanshu Wang
Tong Yang
Xiyan Liang
Guoan Wang
Hanning Lu
Xu Zhe
Yaoming Li
Li Weitao
MQ
39
3
0
18 Sep 2024
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Xingrun Xing
Boyan Gao
Zheng Zhang
David A. Clifton
Shitao Xiao
LI DU
Guoqi Li
Jiajun Zhang
52
5
0
05 Jul 2024
LLMC: Benchmarking Large Language Model Quantization with a Versatile
  Compression Toolkit
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit
Ruihao Gong
Yang Yong
Shiqiao Gu
Yushi Huang
Chentao Lv
Yunchen Zhang
Xianglong Liu
Dacheng Tao
MQ
34
7
0
09 May 2024
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Yujun Lin
Haotian Tang
Shang Yang
Zhekai Zhang
Guangxuan Xiao
Chuang Gan
Song Han
79
76
0
07 May 2024
OneBit: Towards Extremely Low-bit Large Language Models
OneBit: Towards Extremely Low-bit Large Language Models
Yuzhuang Xu
Xu Han
Zonghan Yang
Shuo Wang
Qingfu Zhu
Zhiyuan Liu
Weidong Liu
Wanxiang Che
MQ
51
37
0
17 Feb 2024
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via
  Self-Distillation
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
Dayou Du
Yijia Zhang
Shijie Cao
Jiaqi Guo
Ting Cao
Xiaowen Chu
Ningyi Xu
MQ
46
30
0
16 Feb 2024
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and
  Lattice Codebooks
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks
Albert Tseng
Jerry Chee
Qingyao Sun
Volodymyr Kuleshov
Christopher De Sa
MQ
128
95
0
06 Feb 2024
Extreme Compression of Large Language Models via Additive Quantization
Extreme Compression of Large Language Models via Additive Quantization
Vage Egiazarian
Andrei Panferov
Denis Kuznedelev
Elias Frantar
Artem Babenko
Dan Alistarh
MQ
100
90
0
11 Jan 2024
A Speed Odyssey for Deployable Quantization of LLMs
A Speed Odyssey for Deployable Quantization of LLMs
Qingyuan Li
Ran Meng
Yiduo Li
Bo-Wen Zhang
Liang Li
Yifan Lu
Xiangxiang Chu
Yerui Sun
Yuchen Xie
MQ
58
7
0
16 Nov 2023
QUIK: Towards End-to-End 4-Bit Inference on Generative Large Language
  Models
QUIK: Towards End-to-End 4-Bit Inference on Generative Large Language Models
Saleh Ashkboos
Ilia Markov
Elias Frantar
Tingxuan Zhong
Xincheng Wang
Jie Ren
Torsten Hoefler
Dan Alistarh
MQ
SyDa
126
22
0
13 Oct 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
298
3,007
0
22 Mar 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science
  Question Answering
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
211
1,106
0
20 Sep 2022
1