Monarch: Expressive Structured Matrices for Efficient and Accurate Training

1 April 2022

Christopher Ré

Papers citing "Monarch: Expressive Structured Matrices for Efficient and Accurate Training"

16 / 66 papers shown

Title | Authors | Tags | Counts | Date
Convolution-enhanced Evolving Attention Networks | Yujing Wang, Yaming Yang, Zhuowan Li, Jiangang Bai, Mingliang Zhang, Xiangtai Li, J. Yu, Ce Zhang, Gao Huang, Yu Tong | ViT | 19 6 0 | 16 Dec 2022
Guiding continuous operator learning through Physics-based boundary constraints | Nadim Saad, Gaurav Gupta, S. Alizadeh, Danielle C. Maddix | AI4CE | 40 20 0 | 14 Dec 2022
RSC: Accelerating Graph Neural Networks Training via Randomized Sparse Computations | Zirui Liu, Sheng-Wei Chen, Kaixiong Zhou, Daochen Zha, Xiao Huang, Xia Hu |  | 29 14 0 | 19 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities | Brian Bartoldson, B. Kailkhura, Davis W. Blalock |  | 29 47 0 | 13 Oct 2022
Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design | Hongxiang Fan, Thomas C. P. Chau, Stylianos I. Venieris, Royson Lee, Alexandros Kouris, Wayne Luk, Nicholas D. Lane, Mohamed S. Abdelfattah |  | 29 56 0 | 20 Sep 2022
Efficient Methods for Natural Language Processing: A Survey | Marcos Vinícius Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, ..., Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz |  | 28 109 0 | 31 Aug 2022
A Structured Sparse Neural Network and Its Matrix Calculations Algorithm | S. Sarayi, M. Bahrami |  | 9 0 0 | 02 Jul 2022
Arithmetic Circuits, Structured Matrices and (not so) Deep Learning | Atri Rudra |  | 11 1 0 | 24 Jun 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness | Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré | VLM | 56 2,020 0 | 27 May 2022
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models | Xuxi Chen, Tianlong Chen, Weizhu Chen, Ahmed Hassan Awadallah, Zhangyang Wang, Yu Cheng | MoE, ALM | 14 10 0 | 30 Oct 2021
Efficient Identification of Butterfly Sparse Matrix Factorizations | Léon Zheng, E. Riccietti, Rémi Gribonval |  | 28 6 0 | 04 Oct 2021
MLP-Mixer: An all-MLP Architecture for Vision | Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy |  | 247 2,600 0 | 04 May 2021
Initialization and Regularization of Factorized Neural Layers | M. Khodak, Neil A. Tenenholtz, Lester W. Mackey, Nicolò Fusi |  | 63 56 0 | 03 May 2021
Fourier Neural Operator for Parametric Partial Differential Equations | Zong-Yi Li, Nikola B. Kovachki, Kamyar Azizzadenesheli, Burigede Liu, K. Bhattacharya, Andrew M. Stuart, Anima Anandkumar | AI4CE | 203 2,281 0 | 18 Oct 2020
Scaling Laws for Neural Language Models | Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei |  | 226 4,460 0 | 23 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism | M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro | MoE | 243 1,817 0 | 17 Sep 2019