Insights into Pre-training via Simpler Synthetic Tasks

Neural Information Processing Systems (NeurIPS), 2022

Papers citing "Insights into Pre-training via Simpler Synthetic Tasks"

The Distributional Hypothesis Does Not Fully Explain the Benefits of Masked Language Model Pretraining
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
25 Oct 2023

SIP: Injecting a Structural Inductive Bias into a Seq2Seq Model by Simulation
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
01 Oct 2023

To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis
Neural Information Processing Systems (NeurIPS), 2023
22 May 2023

A Survey of Deep Learning for Mathematical Reasoning
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
20 Dec 2022

Synthetic Pre-Training Tasks for Neural Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
19 Dec 2022

Downstream Datasets Make Surprisingly Good Pretraining Corpora
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
28 Sep 2022

On the Importance and Applicability of Pre-Training for Federated Learning
International Conference on Learning Representations (ICLR), 2023
23 Jun 2022