CBQ: Cross-Block Quantization for Large Language Models
International Conference on Learning Representations (ICLR), 2023
13 December 2023 · arXiv: 2312.07950 · v5 (latest)
Xin Ding, Xiaoyu Liu, Zhijun Tu, Yun-feng Zhang, Wei Li, Jie Hu, Hanting Chen, Yehui Tang, Zhiwei Xiong, Baoqun Yin, Yunhe Wang
MQ
ArXiv (abs) · PDF · HTML · GitHub
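Most of the citing papers below carry the site's MQ (model quantization) community tag and study block-wise post-training quantization, the setting CBQ extends across blocks. As a rough illustration of the quantity such methods optimize, here is a minimal NumPy sketch, not CBQ's actual algorithm, of fake-quantizing a weight block and measuring the block-level reconstruction error that block-wise PTQ methods such as BRECQ minimize on calibration data; the function names and toy shapes are hypothetical.

```python
# Illustrative sketch only: generic symmetric uniform quantization plus a
# block-level reconstruction error, in the spirit of the block-wise PTQ
# papers listed here. Not CBQ's actual algorithm; names are hypothetical.
import numpy as np

def quantize_symmetric(w: np.ndarray, n_bits: int = 4) -> np.ndarray:
    """Fake-quantize a weight tensor to n_bits with one per-tensor scale."""
    qmax = 2 ** (n_bits - 1) - 1          # e.g. 7 for signed 4-bit
    scale = np.abs(w).max() / qmax        # map the largest weight to qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale                      # dequantized ("fake-quant") weights

def block_reconstruction_error(x, w_fp, w_q):
    """Output-space MSE of one linear block: what block-wise PTQ minimizes
    instead of raw weight error."""
    return np.mean((x @ w_fp - x @ w_q) ** 2)

rng = np.random.default_rng(0)
w = rng.normal(size=(512, 512)).astype(np.float32)   # toy block weights
x = rng.normal(size=(32, 512)).astype(np.float32)    # toy calibration batch
w_q = quantize_symmetric(w, n_bits=4)
print(f"block MSE at 4 bits: {block_reconstruction_error(x, w, w_q):.6f}")
```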

Papers citing "CBQ: Cross-Block Quantization for Large Language Models"

Showing 50 of 68 citing papers.
Layer-Wise High-Impact Parameter Ratio Optimization in Post-Training Quantization for Large Language Models
Cuong Pham, Hoang Anh Dung, Cuong C. Nguyen, Trung Le, G. Carneiro, Thanh-Toan Do
MQ · Citations: 0 · 21 Nov 2025

Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression
Xi Zhang, Xiaolin Wu, Jiamang Wang, W. Lin
MQ · Citations: 2 · 23 Oct 2025

Neural Weight Compression for Language Models
Jegwang Ryu, Minkyu Kim, Seungjun Shin, Hee Min Choi, Dokwan Oh, Jaeho Lee
Citations: 0 · 13 Oct 2025

Cat: Post-Training Quantization Error Reduction via Cluster-based Affine Transformation
Ali Zoljodi, Radu Timofte, Masoud Daneshtalab
MQ · Citations: 0 · 30 Sep 2025

DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling
Yubo Gao, Renbo Tu, Gennady Pekhimenko, Nandita Vijaykumar
MQ · Citations: 0 · 03 Sep 2025

Rethinking 1-bit Optimization Leveraging Pre-trained Large Language Models
Zhijun Tu, Hanting Chen, Siqi Liu, Chuanjian Liu, Jian Li, Jie Hu, Yunhe Wang
MQ · Citations: 0 · 09 Aug 2025

Unifying Block-wise PTQ and Distillation-based QAT for Progressive Quantization toward 2-bit Instruction-Tuned LLMs
Jung Hyun Lee, Seungjae Shin, Vinnam Kim, Jaeseong You, An Chen
MQ · Citations: 4 · 10 Jun 2025

Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
Yamato Arai, Yuma Ichikawa
MQ · Citations: 15 · 13 Apr 2025
PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Jiaqi Zhao, Miao Zhang, Ming Wang, Yuzhang Shang, Kaihao Zhang, Weili Guan, Yaowei Wang, Min Zhang
MQ · Citations: 6 · 18 Feb 2025

Benchmarking Post-Training Quantization in LLMs: Comprehensive Taxonomy, Unified Evaluation, and Comparative Analysis
Jiaqi Zhao, Ming Wang, Miao Zhang, Yuzhang Shang, Xuebo Liu, Yaowei Wang, Min Zhang, Liqiang Nie
MQ · Citations: 7 · 18 Feb 2025

QMamba: Post-Training Quantization for Vision State Space Models
Yinglong Li, Xiaoyu Liu, Jiacheng Li, Ruikang Xu, Yinda Chen, Zhiwei Xiong
Mamba · MQ · Citations: 3 · 23 Jan 2025

Interactions Across Blocks in Post-Training Quantization of Large Language Models
Khasmamad Shabanovi, Lukas Wiest, Vladimir Golkov, Daniel Cremers, Thomas Pfeil
MQ · Citations: 1 · 06 Nov 2024

Q-VLM: Post-training Quantization for Large Vision-Language Models
Neural Information Processing Systems (NeurIPS), 2024
Changyuan Wang, Ziwei Wang, Xiuwei Xu, Yansong Tang, Jie Zhou, Jiwen Lu
MQ · Citations: 29 · 10 Oct 2024

Multi-Granularity Semantic Revision for Large Language Model Distillation
Xiaoyu Liu, Yun-feng Zhang, Wei Li, Simiao Li, Xu Huang, Hanting Chen, Yehui Tang, Jie Hu, Zhiwei Xiong, Yunhe Wang
Citations: 3 · 14 Jul 2024
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Mengzhao Chen, Wenqi Shao, Peng Xu, Jiahao Wang, Shiyang Feng, Kaipeng Zhang, Ping Luo
MQ · Citations: 112 · 10 Jul 2024

Rethinking Pruning Large Language Models: Benefits and Pitfalls of Reconstruction Error Minimization
Sungbin Shin, Wonpyo Park, Jaeho Lee, Namhoon Lee
Citations: 6 · 21 Jun 2024

BiSup: Bidirectional Quantization Error Suppression for Large Language Models
Minghui Zou, Ronghui Guo, Sai Zhang, Xiaowang Zhang, Zhiyong Feng
MQ · Citations: 1 · 24 May 2024

How to Parameterize Asymmetric Quantization Ranges for Quantization-Aware Training
Jaeseong You, Minseop Park, Kyunggeun Lee, Seokjun An, Chirag I. Patel, Markus Nagel
MQ · Citations: 3 · 25 Apr 2024

A Survey on Transformer Compression
Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhijun Tu, Kai Han, Hailin Hu, Dacheng Tao
Citations: 73 · 05 Feb 2024

QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models
International Conference on Learning Representations (ICLR), 2023
Jing Liu, Yazhe Niu, Xiuying Wei, Zhiwei Dong, Jianfei Cai, Bohan Zhuang
MQ · Citations: 75 · 12 Oct 2023
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
International Conference on Learning Representations (ICLR), 2023
Wenqi Shao, Mengzhao Chen, Zhaoyang Zhang, Peng Xu, Lirui Zhao, Zhiqiang Li, Kaipeng Zhang, Shiyang Feng, Yu Qiao, Ping Luo
MQ · Citations: 378 · 25 Aug 2023

QuIP: 2-Bit Quantization of Large Language Models With Guarantees
Neural Information Processing Systems (NeurIPS), 2023
Jerry Chee, Yaohui Cai, Volodymyr Kuleshov, Chris De Sa
MQ · Citations: 362 · 25 Jul 2023

ZeroQuant-FP: A Leap Forward in LLMs Post-Training W4A8 Quantization Using Floating-Point Formats
Xiaoxia Wu, Z. Yao, Yuxiong He
MQ · Citations: 55 · 19 Jul 2023

Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron, Louis Martin, Kevin R. Stone, Peter Albert, Amjad Almahairi, ..., Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom
AI4MH · ALM · Citations: 16,448 · 18 Jul 2023

LLM-QAT: Data-Free Quantization Aware Training for Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zechun Liu, Barlas Oğuz, Changsheng Zhao, Ernie Chang, Pierre Stock, Yashar Mehdad, Yangyang Shi, Raghuraman Krishnamoorthi, Vikas Chandra
MQ · Citations: 349 · 29 May 2023

QLoRA: Efficient Finetuning of Quantized LLMs
Neural Information Processing Systems (NeurIPS), 2023
Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
ALM · Citations: 4,357 · 23 May 2023

GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Joshua Ainslie, James Lee-Thorp, Michiel de Jong, Yury Zemlyanskiy, Federico Lebrón, Sumit Sanghai
Citations: 1,313 · 22 May 2023
RPTQ: Reorder-based Post-training Quantization for Large Language Models
Zhihang Yuan, Lin Niu, Jia-Wen Liu, Wenyu Liu, Xinggang Wang, Yuzhang Shang, Guangyu Sun, Qiang Wu, Jiaxiang Wu, Bingzhe Wu
MQ · Citations: 120 · 03 Apr 2023

Token Merging for Fast Stable Diffusion
Daniel Bolya, Judy Hoffman
Citations: 220 · 30 Mar 2023

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Neural Information Processing Systems (NeurIPS), 2023
Hugo Laurençon, Lucile Saulnier, Thomas Wang, Christopher Akiki, Albert Villanova del Moral, ..., Violette Lepercq, Suzana Ilić, Margaret Mitchell, Sasha Luccioni, Yacine Jernite
AI4CE · AILaw · Citations: 209 · 07 Mar 2023

LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, ..., Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample
ALM · PILM · Citations: 19,547 · 27 Feb 2023

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
International Conference on Machine Learning (ICML), 2022
Guangxuan Xiao, Ji Lin, Mickael Seznec, Hao Wu, Julien Demouth, Song Han
MQ · Citations: 1,441 · 18 Nov 2022

GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
MQ · Citations: 1,896 · 31 Oct 2022

Q-ViT: Accurate and Fully Quantized Low-bit Vision Transformer
Neural Information Processing Systems (NeurIPS), 2022
Yanjing Li, Sheng Xu, Baochang Zhang, Xianbin Cao, Penglei Gao, Guodong Guo
MQ · ViT · Citations: 147 · 13 Oct 2022
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Neural Information Processing Systems (NeurIPS), 2022
Xiuying Wei, Yunchen Zhang, Xiangguo Zhang, Yazhe Niu, Shanghang Zhang, Tao Gui, F. Yu, Xianglong Liu
MQ · Citations: 212 · 27 Sep 2022

Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning
Neural Information Processing Systems (NeurIPS), 2022
Elias Frantar, Sidak Pal Singh, Dan Alistarh
MQ · Citations: 360 · 24 Aug 2022

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Tim Dettmers, M. Lewis, Younes Belkada, Luke Zettlemoyer
MQ · Citations: 975 · 15 Aug 2022

Emergent Abilities of Large Language Models
Jason W. Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, ..., Tatsunori Hashimoto, Oriol Vinyals, Abigail Z. Jacobs, J. Dean, W. Fedus
ELM · ReLM · LRM · Citations: 3,434 · 15 Jun 2022

ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers
Neural Information Processing Systems (NeurIPS), 2022
Z. Yao, Reza Yazdani Aminabadi, Minjia Zhang, Xiaoxia Wu, Conglong Li, Yuxiong He
VLM · MQ · Citations: 672 · 04 Jun 2022

OPT: Open Pre-trained Transformer Language Models
Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, ..., Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer
VLM · OSLM · AI4CE · Citations: 4,656 · 02 May 2022

Post-Training Quantization for Vision Transformer
Neural Information Processing Systems (NeurIPS), 2021
Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao
ViT · MQ · Citations: 473 · 27 Jun 2021
A White Paper on Neural Network Quantization
Markus Nagel, Marios Fournarakis, Rana Ali Amjad, Yelysei Bondarenko, M. V. Baalen, Tijmen Blankevoort
MQ · Citations: 823 · 15 Jun 2021

BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction
International Conference on Learning Representations (ICLR), 2021
Yuhang Li, Yazhe Niu, Xu Tan, Yang Yang, Peng Hu, Tao Gui, F. Yu, Wei Wang, Shi Gu
MQ · Citations: 611 · 10 Feb 2021

Measuring Massive Multitask Language Understanding
International Conference on Learning Representations (ICLR), 2020
Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Basel Alomair, Jacob Steinhardt
ELM · RALM · Citations: 7,570 · 07 Sep 2020

Aligning AI With Shared Human Values
Dan Hendrycks, Collin Burns, Steven Basart, Andrew Critch, Haibin Zhang, Basel Alomair, Jacob Steinhardt
Citations: 858 · 05 Aug 2020

EasyQuant: Post-training Quantization via Scale Optimization
Di Wu, Qingming Tang, Yongle Zhao, Ming Zhang, Ying Fu, Debing Zhang
MQ · Citations: 95 · 30 Jun 2020

Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming
Itay Hubara, Yury Nahshan, Y. Hanani, Ron Banner, Daniel Soudry
MQ · Citations: 153 · 14 Jun 2020

Language Models are Few-Shot Learners
Neural Information Processing Systems (NeurIPS), 2020
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
BDL · Citations: 57,120 · 28 May 2020

Up or Down? Adaptive Rounding for Post-Training Quantization
Markus Nagel, Rana Ali Amjad, M. V. Baalen, Christos Louizos, Tijmen Blankevoort
MQ · Citations: 804 · 22 Apr 2020

MuTual: A Dataset for Multi-Turn Dialogue Reasoning
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Leyang Cui, Yu-Huan Wu, Shujie Liu, Yue Zhang, Ming Zhou
LRM · Citations: 168 · 09 Apr 2020