arXiv:2204.00595
Monarch: Expressive Structured Matrices for Efficient and Accurate Training
1 April 2022
Tri Dao
Beidi Chen
N. Sohoni
Arjun D Desai
Michael Poli
Jessica Grogan
Alexander Liu
Aniruddh Rao
Atri Rudra
Christopher Ré
Papers citing
"Monarch: Expressive Structured Matrices for Efficient and Accurate Training"
16 / 66 papers shown
Title
Convolution-enhanced Evolving Attention Networks
Yujing Wang
Yaming Yang
Zhuowan Li
Jiangang Bai
Mingliang Zhang
Xiangtai Li
J. Yu
Ce Zhang
Gao Huang
Yu Tong
ViT
19
6
0
16 Dec 2022
Guiding continuous operator learning through Physics-based boundary constraints
Nadim Saad
Gaurav Gupta
S. Alizadeh
Danielle C. Maddix
AI4CE
40
20
0
14 Dec 2022
RSC: Accelerating Graph Neural Networks Training via Randomized Sparse Computations
Zirui Liu
Sheng-Wei Chen
Kaixiong Zhou
Daochen Zha
Xiao Huang
Xia Hu
29
14
0
19 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
29
47
0
13 Oct 2022
Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design
Hongxiang Fan
Thomas C. P. Chau
Stylianos I. Venieris
Royson Lee
Alexandros Kouris
Wayne Luk
Nicholas D. Lane
Mohamed S. Abdelfattah
29
56
0
20 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
28
109
0
31 Aug 2022
A Structured Sparse Neural Network and Its Matrix Calculations Algorithm
S. Sarayi
M. Bahrami
9
0
0
02 Jul 2022
Arithmetic Circuits, Structured Matrices and (not so) Deep Learning
Atri Rudra
11
1
0
24 Jun 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
56
2,020
0
27 May 2022
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models
Xuxi Chen
Tianlong Chen
Weizhu Chen
Ahmed Hassan Awadallah
Zhangyang Wang
Yu Cheng
MoE
ALM
14
10
0
30 Oct 2021
Efficient Identification of Butterfly Sparse Matrix Factorizations
Léon Zheng
E. Riccietti
Rémi Gribonval
28
6
0
04 Oct 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
247
2,600
0
04 May 2021
Initialization and Regularization of Factorized Neural Layers
M. Khodak
Neil A. Tenenholtz
Lester W. Mackey
Nicolò Fusi
63
56
0
03 May 2021
Fourier Neural Operator for Parametric Partial Differential Equations
Zong-Yi Li
Nikola B. Kovachki
Kamyar Azizzadenesheli
Burigede Liu
K. Bhattacharya
Andrew M. Stuart
Anima Anandkumar
AI4CE
203
2,281
0
18 Oct 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,460
0
23 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
243
1,817
0
17 Sep 2019