2108.06277
Towards Structured Dynamic Sparse Pre-Training of BERT
13 August 2021
A. Dietrich
Frithjof Gressmann
Douglas Orr
Ivan Chelombiev
Daniel Justus
Carlo Luschi
Papers citing "Towards Structured Dynamic Sparse Pre-Training of BERT"
8 / 8 papers shown
Navigating Extremes: Dynamic Sparsity in Large Output Spaces
Nasib Ullah
Erik Schultheis
Mike Lasby
Yani Andrew Ioannou
Rohit Babbar
33
0
0
05 Nov 2024
Reducing Memory Requirements for the IPU using Butterfly Factorizations
S. Shekofteh
Christian Alles
Holger Fröning
22
0
0
16 Sep 2023
Don't Be So Dense: Sparse-to-Sparse GAN Training Without Sacrificing Performance
Shiwei Liu
Yuesong Tian
Tianlong Chen
Li Shen
34
8
0
05 Mar 2022
Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity
Shiwei Liu
Tianlong Chen
Zahra Atashgahi
Xiaohan Chen
Ghada Sokar
Elena Mocanu
Mykola Pechenizkiy
Zhangyang Wang
D. Mocanu
OOD
28
49
0
28 Jun 2021
Carbon Emissions and Large Neural Network Training
David A. Patterson
Joseph E. Gonzalez
Quoc V. Le
Chen Liang
Lluís-Miquel Munguía
D. Rothchild
David R. So
Maud Texier
J. Dean
AI4CE
244
643
0
21 Apr 2021
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
Torsten Hoefler
Dan Alistarh
Tal Ben-Nun
Nikoli Dryden
Alexandra Peste
MQ
141
684
0
31 Jan 2021
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Zhangyang Wang
Michael Carbin
150
345
0
23 Jul 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
231
4,469
0
23 Jan 2020