1711.02782
Block-Sparse Recurrent Neural Networks
8 November 2017
Sharan Narang
Eric Undersander
G. Diamos
Papers citing "Block-Sparse Recurrent Neural Networks"
17 / 17 papers shown
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving
Zihao Ye
Lequn Chen
Ruihang Lai
Wuwei Lin
Yineng Zhang
...
Tianqi Chen
Baris Kasikci
Vinod Grover
Arvind Krishnamurthy
Luis Ceze
65
21
0
02 Jan 2025
Rethinking the Relationship between Recurrent and Non-Recurrent Neural Networks: A Study in Sparsity
Quincy Hershey
Randy Paffenroth
Harsh Nilesh Pathak
Simon Tavener
61
1
0
01 Apr 2024
A One-Shot Reparameterization Method for Reducing the Loss of Tile Pruning on DNNs
Yancheng Li
Qingzhong Ai
Fumihiko Ino
25
0
0
29 Jul 2022
LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification
Sharath Girish
Kamal Gupta
Saurabh Singh
Abhinav Shrivastava
28
11
0
06 Apr 2022
1xN Pattern for Pruning Convolutional Neural Networks
Mingbao Lin
Yu-xin Zhang
Yuchao Li
Bohong Chen
Fei Chao
Mengdi Wang
Shen Li
Yonghong Tian
Rongrong Ji
3DPC
31
40
0
31 May 2021
Spectral Pruning for Recurrent Neural Networks
Takashi Furuya
Kazuma Suetake
K. Taniguchi
Hiroyuki Kusumoto
Ryuji Saiin
Tomohiro Daimon
22
4
0
23 May 2021
BRDS: An FPGA-based LSTM Accelerator with Row-Balanced Dual-Ratio Sparsification
Seyed Abolfazl Ghasemzadeh
E. Tavakoli
M. Kamal
A. Afzali-Kusha
Massoud Pedram
8
13
0
07 Jan 2021
Efficient Transformer-based Large Scale Language Representations using Hardware-friendly Block Structured Pruning
Bingbing Li
Zhenglun Kong
Tianyun Zhang
Ji Li
Z. Li
Hang Liu
Caiwen Ding
VLM
24
64
0
17 Sep 2020
Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity
Cong Guo
B. Hsueh
Jingwen Leng
Yuxian Qiu
Yue Guan
Zehuan Wang
Xiaoying Jia
Xipeng Li
M. Guo
Yuhao Zhu
32
82
0
29 Aug 2020
Term Revealing: Furthering Quantization at Run Time on Quantized DNNs
H. T. Kung
Bradley McDanel
S. Zhang
MQ
13
9
0
13 Jul 2020
FeatherWave: An efficient high-fidelity neural vocoder with multi-band linear prediction
Qiao Tian
Zewang Zhang
Heng Lu
Linghui Chen
Shan Liu
14
22
0
12 May 2020
CSB-RNN: A Faster-than-Realtime RNN Acceleration Framework with Compressed Structured Blocks
Runbin Shi
Peiyan Dong
Tong Geng
Yuhao Ding
Xiaolong Ma
Hayden Kwok-Hay So
Martin C. Herbordt
Ang Li
Yanzhi Wang
MQ
10
13
0
11 May 2020
Fully Quantized Transformer for Machine Translation
Gabriele Prato
Ella Charlaix
Mehdi Rezagholizadeh
MQ
13
68
0
17 Oct 2019
Fast Training of Sparse Graph Neural Networks on Dense Hardware
Matej Balog
B. V. Merrienboer
Subhodeep Moitra
Yujia Li
Daniel Tarlow
GNN
31
10
0
27 Jun 2019
Hardware-Guided Symbiotic Training for Compact, Accurate, yet Execution-Efficient LSTM
Hongxu Yin
Guoyang Chen
Yingmin Li
Shuai Che
Weifeng Zhang
N. Jha
22
10
0
30 Jan 2019
Efficient Neural Audio Synthesis
Nal Kalchbrenner
Erich Elsen
Karen Simonyan
Seb Noury
Norman Casagrande
Edward Lockhart
Florian Stimberg
Aaron van den Oord
Sander Dieleman
Koray Kavukcuoglu
21
863
0
23 Feb 2018
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,743
0
26 Sep 2016