ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2101.00063
  4. Cited By
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets
v1v2 (latest)

EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets

31 December 2020
Xiaohan Chen
Yu Cheng
Shuohang Wang
Zhe Gan
Zhangyang Wang
Jingjing Liu
ArXiv (abs)PDFHTMLGithub (18★)

Papers citing "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets"

14 / 64 papers shown
Title
NormFormer: Improved Transformer Pretraining with Extra Normalization
NormFormer: Improved Transformer Pretraining with Extra Normalization
Sam Shleifer
Jason Weston
Myle Ott
AI4CE
73
76
0
18 Oct 2021
SuperShaper: Task-Agnostic Super Pre-training of BERT Models with
  Variable Hidden Dimensions
SuperShaper: Task-Agnostic Super Pre-training of BERT Models with Variable Hidden Dimensions
Vinod Ganesan
Gowtham Ramesh
Pratyush Kumar
63
9
0
10 Oct 2021
The Low-Resource Double Bind: An Empirical Study of Pruning for
  Low-Resource Machine Translation
The Low-Resource Double Bind: An Empirical Study of Pruning for Low-Resource Machine Translation
Orevaoghene Ahia
Julia Kreutzer
Sara Hooker
188
55
0
06 Oct 2021
Shatter: An Efficient Transformer Encoder with Single-Headed
  Self-Attention and Relative Sequence Partitioning
Shatter: An Efficient Transformer Encoder with Single-Headed Self-Attention and Relative Sequence Partitioning
Ran Tian
Joshua Maynez
Ankur P. Parikh
ViT
56
2
0
30 Aug 2021
Towards Structured Dynamic Sparse Pre-Training of BERT
Towards Structured Dynamic Sparse Pre-Training of BERT
A. Dietrich
Frithjof Gressmann
Douglas Orr
Ivan Chelombiev
Daniel Justus
Carlo Luschi
66
17
0
13 Aug 2021
How much pre-training is enough to discover a good subnetwork?
How much pre-training is enough to discover a good subnetwork?
Cameron R. Wolfe
Fangshuo Liao
Qihan Wang
Junhyung Lyle Kim
Anastasios Kyrillidis
90
3
0
31 Jul 2021
Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win
  the Jackpot?
Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot?
Xiaolong Ma
Geng Yuan
Xuan Shen
Tianlong Chen
Xuxi Chen
...
Ning Liu
Minghai Qin
Sijia Liu
Zhangyang Wang
Yanzhi Wang
159
64
0
01 Jul 2021
Chasing Sparsity in Vision Transformers: An End-to-End Exploration
Chasing Sparsity in Vision Transformers: An End-to-End Exploration
Tianlong Chen
Yu Cheng
Zhe Gan
Lu Yuan
Lei Zhang
Zhangyang Wang
ViT
70
224
0
08 Jun 2021
Playing Lottery Tickets with Vision and Language
Playing Lottery Tickets with Vision and Language
Zhe Gan
Yen-Chun Chen
Linjie Li
Tianlong Chen
Yu Cheng
Shuohang Wang
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
142
56
0
23 Apr 2021
The Elastic Lottery Ticket Hypothesis
The Elastic Lottery Ticket Hypothesis
Xiaohan Chen
Yu Cheng
Shuohang Wang
Zhe Gan
Jingjing Liu
Zhangyang Wang
OOD
90
34
0
30 Mar 2021
Early-Bird GCNs: Graph-Network Co-Optimization Towards More Efficient GCN Training and Inference via Drawing Early-Bird Lottery Tickets
Early-Bird GCNs: Graph-Network Co-Optimization Towards More Efficient GCN Training and Inference via Drawing Early-Bird Lottery Tickets
Haoran You
Zhihan Lu
Zijian Zhou
Y. Fu
Yingyan Lin
GNN
109
33
0
01 Mar 2021
Data-Efficient GAN Training Beyond (Just) Augmentations: A Lottery
  Ticket Perspective
Data-Efficient GAN Training Beyond (Just) Augmentations: A Lottery Ticket Perspective
Tianlong Chen
Yu Cheng
Zhe Gan
Jingjing Liu
Zhangyang Wang
82
52
0
28 Feb 2021
AutoFreeze: Automatically Freezing Model Blocks to Accelerate
  Fine-tuning
AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning
Yuhan Liu
Saurabh Agarwal
Shivaram Venkataraman
OffRL
80
56
0
02 Feb 2021
Spending Your Winning Lottery Better After Drawing It
Spending Your Winning Lottery Better After Drawing It
Ajay Jaiswal
Haoyu Ma
Tianlong Chen
Ying Ding
Zhangyang Wang
48
6
0
08 Jan 2021
Previous
12