SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression

12 March 2024
Xin Wang
Yu Zheng
Zhongwei Wan
Mi Zhang
    MQ

Papers citing "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"

38 / 38 papers shown
Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition
Zhiyuan Chen
Keyi Li
Yifan Jia
Le Ye
Yufei Ma
DiffM
17
0
0
09 May 2025
ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations
Dmitriy Shopkhoev
Ammar Ali
Magauiya Zhussip
Valentin Malykh
Stamatios Lefkimmiatis
N. Komodakis
Sergey Zagoruyko
VLM
24
0
0
05 May 2025
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs
Yan Yang
Yixia Li
Hongru Wang
Xuetao Wei
Jianqiao Yu
Yun-Nung Chen
Guanhua Chen
MoMe
24
0
0
17 Apr 2025
Compression Laws for Large Language Models
Ayan Sengupta
Siddhant Chaudhary
Tanmoy Chakraborty
21
0
0
06 Apr 2025
Large Language Model Compression via the Nested Activation-Aware Decomposition
Jun Lu
Tianyi Xu
Bill Ding
David Li
Yu Kang
32
0
0
21 Mar 2025
Triad: Empowering LMM-based Anomaly Detection with Vision Expert-guided Visual Tokenizer and Manufacturing Process
Yuanze Li
Shihao Yuan
Haolin Wang
Qizhang Li
Ming-Yu Liu
Chen Xu
Guangming Shi
Wangmeng Zuo
48
0
0
17 Mar 2025
SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression
Xin Wang
Samiul Alam
Zhongwei Wan
H. Shen
M. Zhang
MQ
55
0
0
16 Mar 2025
CASP: Compression of Large Multimodal Models Based on Attention Sparsity
Mohsen Gholami
Mohammad Akbari
Kevin Cannons
Yong Zhang
61
0
0
07 Mar 2025
Optimizing Singular Spectrum for Large Language Model Compression
Dengjie Li
Tiancheng Shen
Yao Zhou
Baisong Yang
Zhongying Liu
Masheng Yang
Bernard Ghanem
Yibo Yang
Yujie Zhong
Ming-Hsuan Yang
58
0
0
24 Feb 2025
SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention
Hong Yankun
Li Xing
Zhen Hui-Ling
Yu Xianzhi
Liu Wulong
Yuan Mingxuan
MQ
69
0
0
24 Feb 2025
MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference
Zhongwei Wan
H. Shen
Xin Wang
C. Liu
Zheda Mai
M. Zhang
VLM
49
3
0
24 Feb 2025
LESA: Learnable LLM Layer Scaling-Up
Yifei Yang
Zouying Cao
Xinbei Ma
Yao Yao
L. Qin
Z. Chen
Hai Zhao
54
0
0
20 Feb 2025
MaskPrune: Mask-based LLM Pruning for Layer-wise Uniform Structures
Jiayu Qin
Jianchao Tan
K. Zhang
Xunliang Cai
Wei Wang
35
0
0
19 Feb 2025
Choose Your Model Size: Any Compression by a Single Gradient Descent
Martin Genzel
Patrick Putzky
Pengfei Zhao
S.
Mattes Mollenhauer
Robert Seidel
Stefan Dietzel
Thomas Wollmann
36
0
0
03 Feb 2025
LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation
Can Jin
Ying Li
Mingyu Zhao
Shiyu Zhao
Z. Wang
Xiaoxiao He
Ligong Han
Tong Che
Dimitris N. Metaxas
VPVLM
VLM
91
1
0
02 Feb 2025
You Only Prune Once: Designing Calibration-Free Model Compression With Policy Learning
Ayan Sengupta
Siddhant Chaudhary
Tanmoy Chakraborty
40
3
0
25 Jan 2025
KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models
Fan Wang
Juyong Jiang
Chansung Park
Sunghun Kim
Jing Tang
83
0
0
08 Dec 2024
SoftLMs: Efficient Adaptive Low-Rank Approximation of Language Models using Soft-Thresholding Mechanism
Priyansh Bhatnagar
Linfeng Wen
Mingu Kang
26
0
0
15 Nov 2024
ASER: Activation Smoothing and Error Reconstruction for Large Language Model Quantization
Weibo Zhao
Yubin Shi
Xinyu Lyu
Wanchen Sui
Shen Li
Yong Li
MQ
37
1
0
12 Nov 2024
HOBBIT: A Mixed Precision Expert Offloading System for Fast MoE Inference
Peng Tang
Jiacheng Liu
X. Hou
Yifei Pu
Jing Wang
Pheng-Ann Heng
C. Li
M. Guo
MoE
55
6
0
03 Nov 2024
MoE-I$^2$: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition
Cheng Yang
Yang Sui
Jinqi Xiao
Lingyi Huang
Yu Gong
Yuanlin Duan
Wenqi Jia
Miao Yin
Yu Cheng
Bo Yuan
MoE
60
3
0
01 Nov 2024
BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments
Xinghao Wang
Pengyu Wang
Bo Wang
Dong Zhang
Yunhua Zhou
Xipeng Qiu
MQ
31
2
0
31 Oct 2024
EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation
Shih-yang Liu
Huck Yang
Nai Chit Fung
Hongxu Yin
...
Jan Kautz
Yu-Chun Wang
Pavlo Molchanov
Min-Hung Chen
MQ
26
0
0
28 Oct 2024
Beware of Calibration Data for Pruning Large Language Models
Yixin Ji
Yang Xiang
Juntao Li
Qingrong Xia
Ping Li
Xinyu Duan
Zhefeng Wang
Min Zhang
26
2
0
23 Oct 2024
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression
Jingcun Wang
Yu-Guang Chen
Ing-Chao Lin
Bing Li
Grace Li Zhang
22
4
0
02 Oct 2024
Sparse Attention Decomposition Applied to Circuit Tracing
Gabriel Franco
Mark Crovella
26
0
0
01 Oct 2024
Famba-V: Fast Vision Mamba with Cross-Layer Token Fusion
Hui Shen
Zhongwei Wan
Xin Wang
Mi Zhang
Mamba
29
6
0
15 Sep 2024
MoDeGPT: Modular Decomposition for Large Language Model Compression
Chi-Heng Lin
Shangqian Gao
James Seale Smith
Abhishek Patel
Shikhar Tuli
Yilin Shen
Hongxia Jin
Yen-Chang Hsu
65
6
0
19 Aug 2024
FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models
Zhongyu Zhao
Menghang Dong
Rongyu Zhang
Wenzhao Zheng
Yunpeng Zhang
Huanrui Yang
Dalong Du
Kurt Keutzer
Shanghang Zhang
40
0
0
15 Aug 2024
Palu: Compressing KV-Cache with Low-Rank Projection
Chi-Chih Chang
Wei-Cheng Lin
Chien-Yu Lin
Chong-Yan Chen
Yu-Fang Hu
Pei-Shuo Wang
N. Huang
Luis Ceze
Kai-Chiang Wu
38
0
0
30 Jul 2024
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients
Ajay Jaiswal
Lu Yin
Zhenyu (Allen) Zhang
Shiwei Liu
Jiawei Zhao
Yuandong Tian
Zhangyang Wang
28
14
0
15 Jul 2024
SLIP: Securing LLMs IP Using Weights Decomposition
Yehonathan Refael
Adam Hakim
Lev Greenberg
T. Aviv
S. Lokam
Ben Fishman
Shachar Seidman
33
3
0
15 Jul 2024
Talking Heads: Understanding Inter-layer Communication in Transformer Language Models
Jack Merullo
Carsten Eickhoff
Ellie Pavlick
38
2
0
13 Jun 2024
A Survey on Efficient Inference for Large Language Models
Zixuan Zhou
Xuefei Ning
Ke Hong
Tianyu Fu
Jiaming Xu
...
Shengen Yan
Guohao Dai
Xiao-Ping Zhang
Yuhan Dong
Yu-Xiang Wang
38
78
0
22 Apr 2024
MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation
Zhongwei Wan
Che Liu
Xin Wang
Chaofan Tao
Hui Shen
Zhenwu Peng
Jie Fu
Rossella Arcucci
Huaxiu Yao
Mi Zhang
32
1
0
07 Mar 2024
ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models
Zhihang Yuan
Yuzhang Shang
Yue Song
Qiang Wu
Yan Yan
Guangyu Sun
MQ
21
41
0
10 Dec 2023
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Cheng-Yu Hsieh
Chun-Liang Li
Chih-Kuan Yeh
Hootan Nakhost
Yasuhisa Fujii
Alexander Ratner
Ranjay Krishna
Chen-Yu Lee
Tomas Pfister
ALM
198
283
0
03 May 2023
Tensor Networks Meet Neural Networks: A Survey and Future Perspectives
Maolin Wang
Y. Pan
Zenglin Xu
Xiangli Yang
Guangxi Li
Andrzej Cichocki
33
19
0
22 Jan 2023