Insights into Pre-training via Simpler Synthetic Tasks

Neural Information Processing Systems (NeurIPS), 2022

Papers citing "Insights into Pre-training via Simpler Synthetic Tasks"

The Distributional Hypothesis Does Not Fully Explain the Benefits of Masked Language Model Pretraining
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
25 Oct 2023

SIP: Injecting a Structural Inductive Bias into a Seq2Seq Model by Simulation
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
01 Oct 2023

To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis
Neural Information Processing Systems (NeurIPS), 2023
22 May 2023

A Survey of Deep Learning for Mathematical Reasoning
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
20 Dec 2022

Synthetic Pre-Training Tasks for Neural Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
19 Dec 2022

Downstream Datasets Make Surprisingly Good Pretraining Corpora
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
28 Sep 2022

On the Importance and Applicability of Pre-Training for Federated Learning
International Conference on Learning Representations (ICLR), 2023
23 Jun 2022