
ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models

AAAI Conference on Artificial Intelligence (AAAI), 2024
16 August 2024
Chao Zeng, Songwei Liu, Yusheng Xie, Hong Liu, Xiaojian Wang, Miao Wei, Shu Yang, Fangmin Chen, Lean Fu
ArXiv (abs) · PDF · HTML · GitHub (243★)

Papers citing "ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models"

9 papers shown
Mixed-Precision Quantization for Language Models: Techniques and Prospects
M. Rakka, Marios Fournarakis, Olga Krestinskaya, Jinane Bazzi, K. Salama, Fadi J. Kurdahi, A. Eltawil, M. Fouda
19 Oct 2025

Error Propagation Mechanisms and Compensation Strategies for Quantized Diffusion
Songwei Liu, Hong Liu, Fangmin Chen, Xurui Peng, Chenqian Yan, Lean Fu, Xing Mei
16 Aug 2025

SLED: A Speculative LLM Decoding Framework for Efficient Edge Serving
Xiangchen Li, Dimitrios Spatharakis, Saeid Ghafouri, Jiakun Fan, Dimitrios Nikolopoulos, Deepu John, Bo Ji, Dimitrios S. Nikolopoulos
11 Jun 2025

Achieving binary weight and activation for LLMs using Post-Training Quantization
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Siqing Song, Chuang Wang, Ruiqi Wang, Yi Yang, Xuyao Zhang
07 Apr 2025

Membership Inference Risks in Quantized Models: A Theoretical and Empirical Study
Eric Aubinais, Philippe Formont, Pablo Piantanida, Elisabeth Gassiat
10 Feb 2025

GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference
Chao Zeng, Songwei Liu, Shu Yang, Fangmin Chen, Lean Fu, Xing Mei
23 Dec 2024

SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization
Runsheng Bai, Qiang Liu, B. Liu
05 Dec 2024

STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
International Conference on Learning Representations (ICLR), 2024
Peijie Dong, Lujun Li, Dayou Du, Yuhan Chen, Zhenheng Tang, ..., Wei Xue, Wenhan Luo, Qi-fei Liu, Yi-Ting Guo, Xiaowen Chu
03 Aug 2024

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Chengyue Wu, Haotian Tang, Shang Yang, Zhekai Zhang, Guangxuan Xiao, Chuang Gan, Song Han
07 May 2024