QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
arXiv: 2309.14717

International Conference on Learning Representations (ICLR), 2023
26 September 2023
Yuhui Xu, Lingxi Xie, Xiaotao Gu, Xin Chen, Heng Chang, Hengheng Zhang, Zhensu Chen, Xiaopeng Zhang, Qi Tian
Links: arXiv (abs) · PDF · HTML · HuggingFace (44 upvotes) · GitHub (136★)

Papers citing "QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models"

39 of 89 citing papers shown.
ThinK: Thinner Key Cache by Query-Driven Pruning
International Conference on Learning Representations (ICLR), 2024
Yuhui Xu, Zhanming Jie, Hanze Dong, Lei Wang, Xudong Lu, Aojun Zhou, Amrita Saha, Caiming Xiong, Doyen Sahoo
30 Jul 2024

Accurate and Efficient Fine-Tuning of Quantized Large Language Models Through Optimal Balance
Ao Shen, Qiang Wang, Zhiquan Lai, Xionglve Li, Dongsheng Li
24 Jul 2024

Fault Diagnosis in Power Grids with Large Language Model
Liu Jing, Amirul Rahman
11 Jul 2024

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Mengzhao Chen, Wenqi Shao, Peng Xu, Jiahao Wang, Shiyang Feng, Kaipeng Zhang, Ping Luo
10 Jul 2024

A Survey on LoRA of Large Language Models
Yuren Mao, Yuhang Ge, Yijiang Fan, Wenyi Xu, Yu Mi, Zhonghao Hu, Yunjun Gao
08 Jul 2024

SBoRA: Low-Rank Adaptation with Regional Weight Updates
L. Po, Yuyang Liu, Haoxuan Wu, Tianqi Zhang, Weikang Yu, Zeyu Jiang, Kun Li
07 Jul 2024

SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Xingrun Xing, Boyan Gao, Zheng Zhang, David A. Clifton, Shitao Xiao, Li Du, Guoqi Li, Jiajun Zhang
05 Jul 2024

GPTQT: Quantize Large Language Models Twice to Push the Efficiency
Yipin Guo, Yilin Lang, Qinyuan Ren
03 Jul 2024

Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness
Khyathi Chandu, Linjie Li, Anas Awadalla, Ximing Lu, Jae Sung Park, Jack Hessel, Lijuan Wang, Yejin Choi
02 Jul 2024

HLQ: Fast and Efficient Backpropagation via Hadamard Low-rank Quantization
Seonggon Kim, Eunhyeok Park
21 Jun 2024

Bayesian-LoRA: LoRA based Parameter Efficient Fine-Tuning using Optimal Quantization levels and Rank Values trough Differentiable Bayesian Gates
Cristian Meo, Ksenia Sycheva, Anirudh Goyal, Justin Dauwels
18 Jun 2024

Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox
Yijun Liu, Yuan Meng, Fang Wu, Shenhao Peng, Hang Yao, Chaoyu Guan, Chen Tang, Cheng Wang, Zhi Wang, Wenwu Zhu
15 Jun 2024

Low-Rank Quantization-Aware Training for LLMs
Yelysei Bondarenko, Riccardo Del Chiaro, Markus Nagel
10 Jun 2024

Federated LoRA with Sparse Communication
Kevin Kuo, Arian Raje, Kousik Rajesh, Virginia Smith
07 Jun 2024

CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning
Neural Information Processing Systems (NeurIPS), 2024
Yibo Yang, Xiaojie Li, Zhongzhu Zhou, Shuaiwen Leon Song, Yue Yu, Liqiang Nie, Guohao Li
07 Jun 2024

Empirical Guidelines for Deploying LLMs onto Resource-constrained Edge Devices
Ruiyang Qin, Dancheng Liu, Zheyu Yan, Zhaoxuan Tan, Zixuan Pan, Zhenge Jia, Meng Jiang, Ahmed Abbasi, Jinjun Xiong, Yiyu Shi
06 Jun 2024

LCQ: Low-Rank Codebook based Quantization for Large Language Models
Wen-Pu Cai, Wu-Jun Li
31 May 2024

ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections
Massimo Bini, Karsten Roth, Zeynep Akata, Anna Khoreva
30 May 2024

30 May 2024
One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments
Ke Yi, Yuhui Xu, Heng Chang, Chen Tang, Yuan Meng, Tong Zhang, Jia Li
30 May 2024

HQ-DiT: Efficient Diffusion Transformer with FP4 Hybrid Quantization
Wenxuan Liu, Saiqian Zhang
30 May 2024

I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
Yan Chen, Yuan Cheng, Dawei Yang, Zhihang Yuan, Jiangyong Yu, Chen Xu, Sifan Zhou
28 May 2024

CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs
Haoyu Wang, Bei Liu, Hang Shao, Bo Xiao, Ke Zeng, Guanglu Wan, Yanmin Qian
27 May 2024

Bridging The Gap between Low-rank and Orthogonal Adaptation via Householder Reflection Adaptation
Shen Yuan, Haotian Liu, Hongteng Xu
24 May 2024

SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
Wei Huang, Haotong Qin, Yangdong Liu, Yawei Li, Qinshuo Liu, Xianglong Liu, Luca Benini, Michele Magno, Shiming Zhang, Xiaojuan Qi
23 May 2024

A Survey on Efficient Inference for Large Language Models
Zixuan Zhou, Xuefei Ning, Ke Hong, Tianyu Fu, Jiaming Xu, ..., Shengen Yan, Guohao Dai, Xiao-Ping Zhang, Yuhan Dong, Yu Wang
22 Apr 2024

An empirical study of LLaMA3 quantization: from LLMs to MLLMs
Wei Huang, Xingyu Zheng, Xudong Ma, Haotong Qin, Chengtao Lv, Hong Chen, Jie Luo, Xiaojuan Qi, Xianglong Liu, Michele Magno
22 Apr 2024

decoupleQ: Towards 2-bit Post-Training Uniform Quantization via decoupling Parameters into Integer and Floating Points
Yi Guo, Fanliu Kong, Xiaoyang Li, Hui Li, Wei Chen, Xiaogang Tian, Jinping Cai, Yang Zhang, Shouda Liu
19 Apr 2024

Weight Copy and Low-Rank Adaptation for Few-Shot Distillation of Vision Transformers
Diana-Nicoleta Grigore, Mariana-Iuliana Georgescu, J. A. Justo, T. Johansen, Andreea-Iuliana Ionescu, Radu Tudor Ionescu
14 Apr 2024

PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
Neural Information Processing Systems (NeurIPS), 2024
Fanxu Meng, Zhaohui Wang, Muhan Zhang
03 Apr 2024

Mixed-precision Supernet Training from Vision Foundation Models using Low Rank Adapter
Yuiko Sakuma, Masakazu Yoshimura, Junji Otsuka, Atsushi Irie, Takeshi Ohashi
29 Mar 2024

Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Zeyu Han, Chao Gao, Jinyang Liu, Jeff Zhang, Sai Qian Zhang
21 Mar 2024

Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks
European Conference on Computer Vision (ECCV), 2024
Tingyu Qu, Tinne Tuytelaars, Marie-Francine Moens
14 Mar 2024

DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation
Sunghyeon Woo, Baeseong Park, Byeongwook Kim, Minjung Jo, S. Kwon, Dongsuk Jeon, Dongsoo Lee
27 Feb 2024

Model Compression and Efficient Inference for Large Language Models: A Survey
Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, Xiaofei He
15 Feb 2024

L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models
Hyesung Jeon, Yulhwa Kim, Jae-Joon Kim
07 Feb 2024

Separable Multi-Concept Erasure from Diffusion Models
Mengnan Zhao, Lihe Zhang, Tianhang Zheng, Yuqiu Kong, Baocai Yin
03 Feb 2024

Zero-shot Generative Large Language Models for Systematic Review Screening Automation
European Conference on Information Retrieval (ECIR), 2024
Shuai Wang, Harrisen Scells, Shengyao Zhuang, Martin Potthast, Bevan Koopman, Guido Zuccon
12 Jan 2024

Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment
Lingling Xu, Haoran Xie, S. J. Qin, Xiaohui Tao, F. Wang
19 Dec 2023

NOLA: Compressing LoRA using Linear Combination of Random Basis
International Conference on Learning Representations (ICLR), 2023
Soroush Abbasi Koohpayegani, K. Navaneet, Parsa Nooralinejad, Soheil Kolouri, Hamed Pirsiavash
04 Oct 2023