2402.16319
Cited By
Data-free Weight Compress and Denoise for Large Language Models
26 February 2024
Runyu Peng
Yunhua Zhou
Qipeng Guo
Yang Gao
Hang Yan
Xipeng Qiu
Dahua Lin
Papers citing
"Data-free Weight Compress and Denoise for Large Language Models"
3 / 3 papers shown
BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference
Changwoo Lee
Soo Min Kwon
Qing Qu
Hun-Seok Kim
25
0
0
28 Oct 2024
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press
Noah A. Smith
M. Lewis
245
695
0
27 Aug 2021
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
225
574
0
12 Sep 2019