GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking

18 June 2018
Patrick H. Chen
Si Si
Yang Li
Ciprian Chelba
Cho-Jui Hsieh

Papers citing "GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking"

44 / 44 papers shown
TropNNC: Structured Neural Network Compression Using Tropical Geometry
Konstantinos Fotopoulos
Petros Maragos
Panagiotis Misiakos
344
4
0
24 Dec 2025
CoSpaDi: Compressing LLMs via Calibration-Guided Sparse Dictionary Learning
Dmitriy Shopkhoev
Denis Makhov
Magauiya Zhussip
Ammar Ali
Stamatios Lefkimmiatis
244
3
0
26 Sep 2025
Importance-Aware Activation Space Reconstruction
Md Mokarram Chowdhury
Daniel Agyei Asante
E. Chang
Yang Li
199
0
0
04 Jul 2025
TensorSLM: Energy-efficient Embedding Compression of Sub-billion Parameter Language Models on Low-end Devices
Mingxue Xu
Y. Xu
Danilo Mandic
225
0
0
16 Jun 2025
ProcrustesGPT: Compressing LLMs with Structured Matrices and Orthogonal Transformations
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ekaterina Grishina
Mikhail Gorbunov
Maxim Rakhuba
237
0
0
03 Jun 2025
Zero-Trust Mobility-Aware Authentication Framework for Secure Vehicular Fog Computing Networks
Taimoor Ahmad
172
0
0
21 May 2025
RWKV-edge: Deeply Compressed RWKV for Resource-Constrained Devices
Wonkyo Choe
Yangfeng Ji
F. Lin
612
1
0
14 Dec 2024
MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers
International Conference on Computer Aided Design (ICCAD), 2024
Zebin Yang
Renze Chen
Taiqiang Wu
Ngai Wong
Yun Liang
Runsheng Wang
R. Huang
Meng Li
MQ
333
3
0
23 Oct 2024
Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models
Mingxue Xu
Sadia Sharmin
Danilo Mandic
352
3
0
03 Oct 2024
Reweighted Solutions for Weighted Low Rank Approximation
David P. Woodruff
T. Yasuda
266
3
0
04 Jun 2024
Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications
Yang Li
Changsheng Zhao
Ernie Chang
Yangyang Shi
Vikas Chandra
374
1
0
24 May 2024
Characterizing the Accuracy - Efficiency Trade-off of Low-rank Decomposition in Language Models
Chakshu Moar
Michael Pellauer
Hyoukjun Kwon
180
6
0
10 May 2024
DSFormer: Effective Compression of Text-Transformers by Dense-Sparse Weight Factorization
Rahul Chand
Yashoteja Prabhu
Pratyush Kumar
223
5
0
20 Dec 2023
Experimental Analysis of Large-scale Learnable Vector Storage Compression
Proceedings of the VLDB Endowment (PVLDB), 2023
Hailin Zhang
Penghao Zhao
Xupeng Miao
Yingxia Shao
Zirui Liu
Tong Yang
Tengjiao Wang
351
19
0
27 Nov 2023
PELA: Learning Parameter-Efficient Models with Low-Rank Approximation
Computer Vision and Pattern Recognition (CVPR), 2023
Yangyang Guo
Guangzhi Wang
Mohan S. Kankanhalli
270
12
0
16 Oct 2023
TensorGPT: Efficient Compression of the Embedding Layer in LLMs based on the Tensor-Train Decomposition
Mingxue Xu
Y. Xu
Danilo Mandic
328
30
0
02 Jul 2023
Low-Rank Prune-And-Factorize for Language Model Compression
International Conference on Language Resources and Evaluation (LREC), 2023
Siyu Ren
Kenny Q. Zhu
324
18
0
25 Jun 2023
Towards energy-efficient Deep Learning: An overview of energy-efficient approaches along the Deep Learning Lifecycle
Vanessa Mehlin
Sigurd Schacht
Carsten Lanquillon
HAIMedIm
285
28
0
05 Feb 2023
HEAT: Hardware-Efficient Automatic Tensor Decomposition for Transformer Compression
Jiaqi Gu
Ben Keller
Jean Kossaifi
Anima Anandkumar
Brucek Khailany
David Z. Pan
ViT
223
9
0
30 Nov 2022
Numerical Optimizations for Weighted Low-rank Estimation on Language Model
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Ting Hua
Yen-Chang Hsu
Felicity Wang
Qiang Lou
Yilin Shen
Hongxia Jin
261
21
0
02 Nov 2022
MorphTE: Injecting Morphology in Tensorized Embeddings
Neural Information Processing Systems (NeurIPS), 2022
Guobing Gan
Peng Zhang
Sunzhu Li
Xiuqing Lu
Benyou Wang
182
8
0
27 Oct 2022
Language model compression with weighted low-rank factorization
International Conference on Learning Representations (ICLR), 2022
Yen-Chang Hsu
Ting Hua
Sung-En Chang
Qiang Lou
Yilin Shen
Hongxia Jin
400
202
0
30 Jun 2022
Bottleneck Low-rank Transformers for Low-resource Spoken Language Understanding
Interspeech, 2022
Pu Wang
Hugo Van hamme
VLM
253
8
0
28 Jun 2022
LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models
International Conference on Learning Representations (ICLR), 2022
Gunho Park
Baeseong Park
Minsub Kim
Sungjae Lee
Jeonghoon Kim
Beomseok Kwon
S. Kwon
Byeongwook Kim
Youngjoo Lee
Dongsoo Lee
MQ
557
130
0
20 Jun 2022
Rank Diminishing in Deep Neural Networks
Neural Information Processing Systems (NeurIPS), 2022
Ruili Feng
Kecheng Zheng
Yukun Huang
Deli Zhao
Michael I. Jordan
Zhengjun Zha
294
49
0
13 Jun 2022
Efficient Mixed Dimension Embeddings for Matrix Factorization
D. Beloborodov
Andrei Zimovnov
Petr Molodyk
Dmitrii Kirillov
175
2
0
18 May 2022
A Survey on Green Deep Learning
Jingjing Xu
Wangchunshu Zhou
Zhiyi Fu
Hao Zhou
Lei Li
VLM
510
102
0
08 Nov 2021
Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition
Neural Information Processing Systems (NeurIPS), 2021
Lucas Liebenwein
Alaa Maalouf
O. Gal
Dan Feldman
Daniela Rus
305
54
0
23 Jul 2021
From Fully Trained to Fully Random Embeddings: Improving Neural Machine Translation with Compact Word Embedding Tables
AAAI Conference on Artificial Intelligence (AAAI), 2021
Krtin Kumar
Peyman Passban
Mehdi Rezagholizadeh
Yiu Sing Lau
Qun Liu
250
3
0
18 Apr 2021
Extremely Low Bit Transformer Quantization for On-Device Neural Machine Translation
Findings (Findings), 2020
Insoo Chung
Byeongwook Kim
Yoonjung Choi
S. Kwon
Yongkweon Jeon
Baeseong Park
Sangha Kim
Dongsoo Lee
MQ
273
29
0
16 Sep 2020
MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation
European Conference on Computer Vision (ECCV), 2020
Benlin Liu
Yongming Rao
Jiwen Lu
Jie Zhou
Cho-Jui Hsieh
218
43
0
27 Aug 2020
DeLighT: Deep and Light-weight Transformer
Sachin Mehta
Marjan Ghazvininejad
Srini Iyer
Luke Zettlemoyer
Hannaneh Hajishirzi
VLM
340
36
0
03 Aug 2020
DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Qingqing Cao
H. Trivedi
A. Balasubramanian
Niranjan Balasubramanian
219
71
0
02 May 2020
A Generic Network Compression Framework for Sequential Recommender Systems
Yang Sun
Fajie Yuan
Ming Yang
Guoao Wei
Zhou Zhao
Duo Liu
287
58
0
21 Apr 2020
LadaBERT: Lightweight Adaptation of BERT through Hybrid Model Compression
International Conference on Computational Linguistics (COLING), 2020
Yihuan Mao
Yujing Wang
Chufan Wu
Chen Zhang
Yang-Feng Wang
Yaming Yang
Quanlu Zhang
Yunhai Tong
Jing Bai
218
81
0
08 Apr 2020
One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation
International Conference on Learning Representations (ICLR), 2019
Matthew Shunshi Zhang
Bradly C. Stadie
169
34
0
30 Nov 2019
DeFINE: DEep Factorized INput Token Embeddings for Neural Sequence Modeling
International Conference on Learning Representations (ICLR), 2019
Sachin Mehta
Rik Koncel-Kedziorski
Mohammad Rastegari
Hannaneh Hajishirzi
AI4TS
367
28
0
27 Nov 2019
Fully Quantized Transformer for Machine Translation
Findings (Findings), 2019
Gabriele Prato
Ella Charlaix
Mehdi Rezagholizadeh
MQ
404
72
0
17 Oct 2019
Improving Word Embedding Factorization for Compression Using Distilled Nonlinear Neural Decomposition
Vasileios Lioutas
Ahmad Rashid
Krtin Kumar
Md. Akmal Haidar
Mehdi Rezagholizadeh
265
9
0
02 Oct 2019
A Tensorized Transformer for Language Modeling
Neural Information Processing Systems (NeurIPS), 2019
Xindian Ma
Peng Zhang
Shuai Zhang
Nan Duan
Yuexian Hou
D. Song
M. Zhou
409
193
0
24 Jun 2019
Learning Low-Rank Approximation for CNNs
Dongsoo Lee
S. Kwon
Byeongwook Kim
Gu-Yeon Wei
331
25
0
24 May 2019
Network Pruning for Low-Rank Binary Indexing
Dongsoo Lee
S. Kwon
Byeongwook Kim
Parichay Kapoor
Gu-Yeon Wei
228
6
0
14 May 2019
Tensorized Embedding Layers for Efficient Model Compression
Oleksii Hrinchuk
Valentin Khrulkov
L. Mirvakhabova
Elena Orlova
Ivan Oseledets
327
76
0
30 Jan 2019
WEST: Word Encoded Sequence Transducers
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018
Ehsan Variani
A. Suresh
M. Weintraub
196
9
0
20 Nov 2018