ResearchTrend.AI
Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?

8 October 2023
Cheng Zhang
Jianyi Cheng
Ilia Shumailov
G. Constantinides
Yiren Zhao
    MQ

Papers citing "Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?"

9 / 9 papers shown
Knowledge Distillation of Domain-adapted LLMs for Question-Answering in Telecom
Rishika Sen
Sujoy Roychowdhury
Sumit Soman
H. G. Ranjani
Srikhetra Mohanty
61
0
0
28 Apr 2025
Scaling Laws for Floating Point Quantization Training
X. Sun
Shuaipeng Li
Ruobing Xie
Weidong Han
Kan Wu
...
Yangyu Tao
Zhanhui Kang
C. Xu
Di Wang
Jie Jiang
MQ
AIFin
58
0
0
05 Jan 2025
Scaling Laws for Mixed quantization in Large Language Models
Zeyu Cao
Cheng Zhang
Pedro Gimenes
Jianqiao Lu
Jianyi Cheng
Yiren Zhao
MQ
29
1
0
09 Oct 2024
Exploring FPGA designs for MX and beyond
Ebby Samson
Naveen Mellempudi
Wayne Luk
G. Constantinides
MQ
30
1
0
01 Jul 2024
Is Temperature the Creativity Parameter of Large Language Models?
Max Peeperkorn
Tom Kouwenhoven
Daniel G. Brown
Anna K. Jordanous
34
44
0
01 May 2024
LQER: Low-Rank Quantization Error Reconstruction for LLMs
Cheng Zhang
Jianyi Cheng
G. Constantinides
Yiren Zhao
MQ
19
9
0
04 Feb 2024
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
138
221
0
31 Dec 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
225
574
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,950
0
20 Apr 2018