GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking

18 June 2018
Patrick H. Chen
Si Si
Yang Li
Ciprian Chelba
Cho-Jui Hsieh

Papers citing "GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking"

41 / 41 papers shown
RWKV-Lite: Deeply Compressed RWKV for Resource-Constrained Devices
Wonkyo Choe
Yangfeng Ji
F. Lin
77
0
0
14 Dec 2024
MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers
Zebin Yang
Renze Chen
Taiqiang Wu
Ngai Wong
Yun Liang
Runsheng Wang
R. Huang
Meng Li
MQ
31
1
0
23 Oct 2024
Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models
Mingxue Xu
Sadia Sharmin
Danilo Mandic
35
2
0
03 Oct 2024
TropNNC: Structured Neural Network Compression Using Tropical Geometry
Konstantinos Fotopoulos
Petros Maragos
Panagiotis Misiakos
38
0
0
05 Sep 2024
Reweighted Solutions for Weighted Low Rank Approximation
David P. Woodruff
T. Yasuda
42
1
0
04 Jun 2024
Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications
Yang Li
Changsheng Zhao
Hyungtak Lee
Ernie Chang
Yangyang Shi
Vikas Chandra
40
0
0
24 May 2024
Characterizing the Accuracy - Efficiency Trade-off of Low-rank Decomposition in Language Models
Chakshu Moar
Michael Pellauer
Hyoukjun Kwon
38
1
0
10 May 2024
DSFormer: Effective Compression of Text-Transformers by Dense-Sparse Weight Factorization
Rahul Chand
Yashoteja Prabhu
Pratyush Kumar
20
3
0
20 Dec 2023
Experimental Analysis of Large-scale Learnable Vector Storage Compression
Hailin Zhang
Penghao Zhao
Xupeng Miao
Yingxia Shao
Zirui Liu
Tong Yang
Bin Cui
34
11
0
27 Nov 2023
PELA: Learning Parameter-Efficient Models with Low-Rank Approximation
Yangyang Guo
Guangzhi Wang
Mohan S. Kankanhalli
21
3
0
16 Oct 2023
TensorGPT: Efficient Compression of the Embedding Layer in LLMs based on the Tensor-Train Decomposition
Mingxue Xu
Y. Xu
Danilo Mandic
10
2
0
02 Jul 2023
Low-Rank Prune-And-Factorize for Language Model Compression
Siyu Ren
Kenny Q. Zhu
14
9
0
25 Jun 2023
Towards energy-efficient Deep Learning: An overview of energy-efficient approaches along the Deep Learning Lifecycle
Vanessa Mehlin
Sigurd Schacht
Carsten Lanquillon
HAI
MedIm
33
19
0
05 Feb 2023
HEAT: Hardware-Efficient Automatic Tensor Decomposition for Transformer Compression
Jiaqi Gu
Ben Keller
Jean Kossaifi
Anima Anandkumar
Brucek Khailany
David Z. Pan
ViT
35
8
0
30 Nov 2022
Numerical Optimizations for Weighted Low-rank Estimation on Language Model
Ting Hua
Yen-Chang Hsu
Felicity Wang
Qiang Lou
Yilin Shen
Hongxia Jin
27
13
0
02 Nov 2022
MorphTE: Injecting Morphology in Tensorized Embeddings
Guobing Gan
Peng Zhang
Sunzhu Li
Xiuqing Lu
Benyou Wang
36
5
0
27 Oct 2022
Language model compression with weighted low-rank factorization
Yen-Chang Hsu
Ting Hua
Sung-En Chang
Qiang Lou
Yilin Shen
Hongxia Jin
16
93
0
30 Jun 2022
Bottleneck Low-rank Transformers for Low-resource Spoken Language Understanding
Pu Wang
Hugo Van hamme
VLM
29
4
0
28 Jun 2022
LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models
Gunho Park
Baeseong Park
Minsub Kim
Sungjae Lee
Jeonghoon Kim
Beomseok Kwon
S. Kwon
Byeongwook Kim
Youngjoo Lee
Dongsoo Lee
MQ
21
74
0
20 Jun 2022
Rank Diminishing in Deep Neural Networks
Ruili Feng
Kecheng Zheng
Yukun Huang
Deli Zhao
Michael I. Jordan
Zhengjun Zha
34
28
0
13 Jun 2022
Efficient Mixed Dimension Embeddings for Matrix Factorization
D. Beloborodov
Andrei Zimovnov
Petr Molodyk
Dmitrii Kirillov
20
2
0
18 May 2022
A Survey on Green Deep Learning
Jingjing Xu
Wangchunshu Zhou
Zhiyi Fu
Hao Zhou
Lei Li
VLM
81
83
0
08 Nov 2021
Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition
Lucas Liebenwein
Alaa Maalouf
O. Gal
Dan Feldman
Daniela Rus
40
46
0
23 Jul 2021
From Fully Trained to Fully Random Embeddings: Improving Neural Machine Translation with Compact Word Embedding Tables
Krtin Kumar
Peyman Passban
Mehdi Rezagholizadeh
Yiu Sing Lau
Qun Liu
11
2
0
18 Apr 2021
Extremely Low Bit Transformer Quantization for On-Device Neural Machine Translation
Insoo Chung
Byeongwook Kim
Yoonjung Choi
S. Kwon
Yongkweon Jeon
Baeseong Park
Sangha Kim
Dongsoo Lee
MQ
29
27
0
16 Sep 2020
MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation
Benlin Liu
Yongming Rao
Jiwen Lu
Jie Zhou
Cho-Jui Hsieh
6
37
0
27 Aug 2020
DeLighT: Deep and Light-weight Transformer
Sachin Mehta
Marjan Ghazvininejad
Srini Iyer
Luke Zettlemoyer
Hannaneh Hajishirzi
VLM
33
32
0
03 Aug 2020
DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering
Qingqing Cao
H. Trivedi
A. Balasubramanian
Niranjan Balasubramanian
32
66
0
02 May 2020
A Generic Network Compression Framework for Sequential Recommender Systems
Yang Sun
Fajie Yuan
Ming Yang
Guoao Wei
Zhou Zhao
Duo Liu
26
54
0
21 Apr 2020
LadaBERT: Lightweight Adaptation of BERT through Hybrid Model Compression
Yihuan Mao
Yujing Wang
Chufan Wu
Chen Zhang
Yang-Feng Wang
Yaming Yang
Quanlu Zhang
Yunhai Tong
Jing Bai
22
72
0
08 Apr 2020
One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation
Matthew Shunshi Zhang
Bradly C. Stadie
8
32
0
30 Nov 2019
DeFINE: DEep Factorized INput Token Embeddings for Neural Sequence Modeling
Sachin Mehta
Rik Koncel-Kedziorski
Mohammad Rastegari
Hannaneh Hajishirzi
AI4TS
38
23
0
27 Nov 2019
Fully Quantized Transformer for Machine Translation
Gabriele Prato
Ella Charlaix
Mehdi Rezagholizadeh
MQ
13
68
0
17 Oct 2019
Improving Word Embedding Factorization for Compression Using Distilled Nonlinear Neural Decomposition
Vasileios Lioutas
Ahmad Rashid
Krtin Kumar
Md. Akmal Haidar
Mehdi Rezagholizadeh
29
9
0
02 Oct 2019
A Tensorized Transformer for Language Modeling
Xindian Ma
Peng Zhang
Shuai Zhang
Nan Duan
Yuexian Hou
D. Song
M. Zhou
18
163
0
24 Jun 2019
Learning Low-Rank Approximation for CNNs
Dongsoo Lee
S. Kwon
Byeongwook Kim
Gu-Yeon Wei
32
19
0
24 May 2019
Network Pruning for Low-Rank Binary Indexing
Dongsoo Lee
S. Kwon
Byeongwook Kim
Parichay Kapoor
Gu-Yeon Wei
22
6
0
14 May 2019
Tensorized Embedding Layers for Efficient Model Compression
Oleksii Hrinchuk
Valentin Khrulkov
L. Mirvakhabova
Elena Orlova
Ivan Oseledets
32
73
0
30 Jan 2019
WEST: Word Encoded Sequence Transducers
Ehsan Variani
A. Suresh
M. Weintraub
17
9
0
20 Nov 2018
Universal Deep Neural Network Compression
Yoojin Choi
Mostafa El-Khamy
Jungwon Lee
MQ
86
85
0
07 Feb 2018
OpenNMT: Open-Source Toolkit for Neural Machine Translation
Guillaume Klein
Yoon Kim
Yuntian Deng
Jean Senellart
Alexander M. Rush
273
1,896
0
10 Jan 2017