Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
8 October 2023
Cheng Zhang
Jianyi Cheng
Ilia Shumailov
George A. Constantinides
Yiren Zhao
    MQ

Papers citing "Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?"

8 / 8 papers shown
MX+: Pushing the Limits of Microscaling Formats for Efficient Large Language Model Serving
Jungi Lee
Junyong Park
Soohyun Cha
Jaehoon Cho
Jaewoong Sim
88
0
0
16 Oct 2025
Exploring and Reshaping the Weight Distribution in LLM
Chunming Ye
Songzhou Li
Xu Xu
141
0
0
24 Aug 2025
Knowledge Distillation of Domain-adapted LLMs for Question-Answering in Telecom
Rishika Sen
Sujoy Roychowdhury
Sumit Soman
H. G. Ranjani
Srikhetra Mohanty
327
1
0
28 Apr 2025
Scaling Laws for Floating Point Quantization Training
Xingwu Sun
Shuaipeng Li
Ruobing Xie
Weidong Han
Kan Wu
...
Yangyu Tao
Zhanhui Kang
C. Xu
Di Wang
Jie Jiang
MQ, AIFin
424
5
0
05 Jan 2025
Scaling Laws For Mixed Quantization
Zeyu Cao
Boyang Gu
Cheng Zhang
Pedro Gimenes
Jianqiao Lu
Jianyi Cheng
Xitong Gao
Yiren Zhao
MQ
289
1
0
09 Oct 2024
Exploring FPGA designs for MX and beyond
Ebby Samson
Naveen Mellempudi
Wayne Luk
George A. Constantinides
MQ
134
4
0
01 Jul 2024
Is Temperature the Creativity Parameter of Large Language Models?
Max Peeperkorn
Tom Kouwenhoven
Daniel G. Brown
Anna K. Jordanous
201
100
0
01 May 2024
LQER: Low-Rank Quantization Error Reconstruction for LLMs
Cheng Zhang
Jianyi Cheng
George A. Constantinides
Yiren Zhao
MQ
390
23
0
04 Feb 2024