arXiv: 2301.11796
Call for Papers -- The BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus
27 January 2023
Alex Warstadt, Leshem Choshen, Aaron Mueller, Adina Williams, Ethan Gotlieb Wilcox, Chengxu Zhuang
Papers citing "Call for Papers -- The BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus" (43 / 43 papers shown)
Towards Developmentally Plausible Rewards: Communicative Success as a Learning Signal for Interactive Language Models
Lennart Stöpler, Rufat Asadli, Mitja Nikolaus, Ryan Cotterell, Alex Warstadt [LRM] (09 May 2025)

Model Connectomes: A Generational Approach to Data-Efficient Language Models
Klemen Kotar, Greta Tuckute (29 Apr 2025)

Position: The Most Expensive Part of an LLM should be its Training Data
Nikhil Kandpal, Colin Raffel (16 Apr 2025)

BabyVLM: Data-Efficient Pretraining of VLMs Inspired by Infant Learning
Shengao Wang, Arjun Chandra, Aoming Liu, Venkatesh Saligrama, Boqing Gong [MLLM, VLM] (13 Apr 2025)

Playpen: An Environment for Exploring Learning Through Conversational Interaction
Nicola Horst, Davide Mazzaccara, Antonia Schmidt, Michael Sullivan, Filippo Momentè, ..., Alexander Koller, Oliver Lemon, David Schlangen, Mario Giulianelli, Alessandro Suglia [OffRL] (11 Apr 2025)

Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Alex Warstadt, Aaron Mueller, Leshem Choshen, E. Wilcox, Chengxu Zhuang, ..., Rafael Mosquera, Bhargavi Paranjape, Adina Williams, Tal Linzen, Ryan Cotterell (10 Apr 2025)
Regional Tiny Stories: Using Small Models to Compare Language Learning and Tokenizer Performance
Nirvan Patil, Malhar Abhay Inamdar, Agnivo Gosai, Guruprasad Pathak, Anish Joshi, Aryan Sagavekar, Anish Joshirao, Raj Abhijit Dandekar, Rajat Dandekar, Sreedath Panat (07 Apr 2025)

Syntactic Learnability of Echo State Neural Language Models at Scale
Ryo Ueda, Tatsuki Kuribayashi, Shunsuke Kando, Kentaro Inui (03 Mar 2025)

Scaling LLM Pre-training with Vocabulary Curriculum
Fangyuan Yu (25 Feb 2025)

BERTtime Stories: Investigating the Role of Synthetic Story Data in Language Pre-training
Nikitas Theodoropoulos, Giorgos Filandrianos, Vassilis Lyberatos, Maria Lymperaiou, Giorgos Stamou [SyDa] (24 Feb 2025)

AntLM: Bridging Causal and Masked Language Models
Xinru Yu, Bin Guo, Shiwei Luo, J. Wang, Tao Ji, Yuanbin Wu [CLL] (04 Dec 2024)

BudgetMLAgent: A Cost-Effective LLM Multi-Agent system for Automating Machine Learning Tasks
Shubham Gandhi, Manasi S. Patwardhan, L. Vig, Gautam M. Shroff [LLMAG] (12 Nov 2024)

From Babble to Words: Pre-Training Language Models on Continuous Streams of Phonemes
Zébulon Goriely, Richard Diehl Martinez, Andrew Caines, Lisa Beinborn, P. Buttery [CLL] (30 Oct 2024)
Building, Reusing, and Generalizing Abstract Representations from Concrete Sequences
Shuchen Wu, Mirko Thalmann, Peter Dayan, Zeynep Akata, Eric Schulz [VLM] (27 Oct 2024)

Tracking Universal Features Through Fine-Tuning and Model Merging
Niels Horn, Desmond Elliott [MoMe] (16 Oct 2024)

Gradient-based inference of abstract task representations for generalization in neural networks
Ali Hummos, Felipe del-Rio, Brabeeba Mien Wang, Julio Hurtado, Cristian B. Calderon, G. Yang (24 Jul 2024)

From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks
Jacob Russin, Sam Whitman McGrath, Danielle J. Williams, Lotem Elber-Dorozko [AI4CE] (24 May 2024)

From Form(s) to Meaning: Probing the Semantic Depths of Language Models Using Multisense Consistency
Xenia Ohmer, Elia Bruni, Dieuwke Hupkes [AI4CE] (18 Apr 2024)

[Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus
Leshem Choshen, Ryan Cotterell, Michael Y. Hu, Tal Linzen, Aaron Mueller, Candace Ross, Alex Warstadt, Ethan Gotlieb Wilcox, Adina Williams, Chengxu Zhuang (09 Apr 2024)

Investigating grammatical abstraction in language models using few-shot learning of novel noun gender
Priyanka Sukumaran, Conor Houghton, N. Kazanina (15 Mar 2024)
Think Big, Generate Quick: LLM-to-SLM for Fast Autoregressive Decoding
Benjamin Bergner, Andrii Skliar, Amelie Royer, Tijmen Blankevoort, Yuki Markus Asano, B. Bejnordi (26 Feb 2024)

Emergent Word Order Universals from Cognitively-Motivated Language Models
Tatsuki Kuribayashi, Ryo Ueda, Ryosuke Yoshida, Yohei Oseki, Ted Briscoe, Timothy Baldwin (19 Feb 2024)

Limits of Transformer Language Models on Learning to Compose Algorithms
Jonathan Thomm, Aleksandar Terzić, Giacomo Camposampiero, Michael Hersche, Bernhard Schölkopf, Abbas Rahimi (08 Feb 2024)

Mission: Impossible Language Models
Julie Kallini, Isabel Papadimitriou, Richard Futrell, Kyle Mahowald, Christopher Potts [ELM, LRM] (12 Jan 2024)

DYAD: A Descriptive Yet Abjuring Density efficient approximation to linear neural network layers
S. Chandy, Varun Gangal, Yi Yang, Gabriel Maggiotti (11 Dec 2023)

WhisBERT: Multimodal Text-Audio Language Modeling on 100M Words
Lukas Wolf, Greta Tuckute, Klemen Kotar, Eghbal Hosseini, Tamar I. Regev, Ethan Gotlieb Wilcox, Alex Warstadt (05 Dec 2023)

Bit Cipher -- A Simple yet Powerful Word Representation System that Integrates Efficiently with Language Models
Haoran Zhao, Jake Ryland Williams (18 Nov 2023)
Explicit Foundation Model Optimization with Self-Attentive Feed-Forward Neural Units
Jake Ryland Williams, Haoran Zhao (13 Nov 2023)

Reducing the Need for Backpropagation and Discovering Better Optima With Explicit Optimizations of Neural Networks
Jake Ryland Williams, Haoran Zhao (13 Nov 2023)

Visual Grounding Helps Learn Word Meanings in Low-Data Regimes
Chengxu Zhuang, Evelina Fedorenko, Jacob Andreas (20 Oct 2023)

MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation
Qian Huang, Jian Vora, Percy Liang, J. Leskovec [ELM, LLMAG] (05 Oct 2023)

A Benchmark for Learning to Translate a New Language from One Grammar Book
Garrett Tanzer, Mirac Suzgun, Chenguang Xi, Dan Jurafsky, Luke Melas-Kyriazi (28 Sep 2023)

ToddlerBERTa: Exploiting BabyBERTa for Grammar Learning and Language Understanding
Omer Veysel Cagatan (30 Aug 2023)

Efficient Benchmarking of Language Models
Yotam Perlitz, Elron Bandel, Ariel Gera, Ofir Arviv, L. Ein-Dor, Eyal Shnarch, Noam Slonim, Michal Shmueli-Scheuer, Leshem Choshen [ALM] (22 Aug 2023)

On the Unexpected Abilities of Large Language Models
S. Nolfi [LRM] (09 Aug 2023)

Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty
I. Timiryasov, J. Tastet (03 Aug 2023)
Baby's CoThought: Leveraging Large Language Models for Enhanced Reasoning in Compact Models
Zheyu Zhang, Han Yang, Bolei Ma, David Rügamer, Ercong Nie [LRM] (03 Aug 2023)

Generative Models as a Complex Systems Science: How can we make sense of large language model behavior?
Ari Holtzman, Peter West, Luke Zettlemoyer [AI4CE] (31 Jul 2023)

ChatGPT in the Age of Generative AI and Large Language Models: A Concise Survey
S. Mohamadi, G. Mujtaba, Ngan Le, Gianfranco Doretto, Don Adjeroh [LM&MA, AI4MH] (09 Jul 2023)

BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models
Marvin Lavechin, Yaya Sy, Hadrien Titeux, María Andrea Cruz Blandón, Okko Rasanen, H. Bredin, Emmanuel Dupoux, Alejandrina Cristià [AuLLM] (02 Jun 2023)

Injecting structural hints: Using language models to study inductive biases in language learning
Isabel Papadimitriou, Dan Jurafsky (25 Apr 2023)

The MiniPile Challenge for Data-Efficient Language Models
Jean Kaddour [MoE, ALM] (17 Apr 2023)

GreenPLM: Cross-Lingual Transfer of Monolingual Pre-Trained Language Models at Almost No Cost
Qingcheng Zeng, Lucas Garay, Peilin Zhou, Dading Chong, Yining Hua, Jiageng Wu, Yi-Cheng Pan, Han Zhou, Rob Voigt, Jie Yang [VLM] (13 Nov 2022)