Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models

17 March 2022

Aaron Mueller

Papers citing "Coloring the Blank Slate: Pre-training Imparts a Hierarchical Inductive Bias to Sequence-to-sequence Models"


Title | Authors | Tags | Counts | Date
Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora | Alex Warstadt, Aaron Mueller, Leshem Choshen, E. Wilcox, Chengxu Zhuang, ..., Rafael Mosquera, Bhargavi Paranjape, Adina Williams, Tal Linzen, Ryan Cotterell | - | 38 106 0 | 10 Apr 2025
Tree Transformers are an Ineffective Model of Syntactic Constituency | Michael Ginn | - | 62 0 0 | 25 Nov 2024
Can Language Models Induce Grammatical Knowledge from Indirect Evidence? | Miyu Oba, Yohei Oseki, Akiyo Fukatsu, Akari Haga, Hiroki Ouchi, Taro Watanabe, Saku Sugawara | - | 32 1 0 | 08 Oct 2024
How Does Code Pretraining Affect Language Model Task Performance? | Jackson Petty, Sjoerd van Steenkiste, Tal Linzen | - | 60 8 0 | 06 Sep 2024
Strengthening Structural Inductive Biases by Pre-training to Perform Syntactic Transformations | Matthias Lindemann, Alexander Koller, Ivan Titov | AI4CE, NAI | 25 2 0 | 05 Jul 2024
WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models | Peng Wang, Zexi Li, Ningyu Zhang, Ziwen Xu, Yunzhi Yao, Yong-jia Jiang, Pengjun Xie, Fei Huang, Huajun Chen | KELM, CLL | 45 20 0 | 23 May 2024
Learning Syntax Without Planting Trees: Understanding Hierarchical Generalization in Transformers | Kabir Ahuja, Vidhisha Balachandran, Madhur Panwar, Tianxing He, Noah A. Smith, Navin Goyal, Yulia Tsvetkov | - | 34 8 0 | 25 Apr 2024
Experimental Contexts Can Facilitate Robust Semantic Property Inference in Language Models, but Inconsistently | Kanishka Misra, Allyson Ettinger, Kyle Mahowald | - | 16 4 0 | 12 Jan 2024
Exploiting Representation Bias for Data Distillation in Abstractive Text Summarization | Yash Kumar Atri, Vikram Goyal, Tanmoy Chakraborty | - | 19 1 0 | 10 Dec 2023
In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax | Aaron Mueller, Albert Webson, Jackson Petty, Tal Linzen | ReLM, LRM | 19 13 0 | 13 Nov 2023
How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure | Michael Wilson, Jackson Petty, Robert Frank | - | 24 15 0 | 08 Nov 2023
The Impact of Depth on Compositional Generalization in Transformer Language Models | Jackson Petty, Sjoerd van Steenkiste, Ishita Dasgupta, Fei Sha, Daniel H Garrette, Tal Linzen | AI4CE, VLM | 12 16 0 | 30 Oct 2023
Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns | Brian DuSell, David Chiang | - | 28 12 0 | 03 Oct 2023
How to Plant Trees in Language Models: Data and Architectural Effects on the Emergence of Syntactic Inductive Biases | Aaron Mueller, Tal Linzen | AI4CE | 8 20 0 | 31 May 2023
Grokking of Hierarchical Structure in Vanilla Transformers | Shikhar Murty, Pratyusha Sharma, Jacob Andreas, Christopher D. Manning | - | 15 43 0 | 30 May 2023
Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations | Chenglei Si, Dan Friedman, Nitish Joshi, Shi Feng, Danqi Chen, He He | - | 8 42 0 | 22 May 2023
Injecting structural hints: Using language models to study inductive biases in language learning | Isabel Papadimitriou, Dan Jurafsky | - | 16 12 0 | 25 Apr 2023
A Theory of Emergent In-Context Learning as Implicit Structure Induction | Michael Hahn, Navin Goyal | LRM | 8 73 0 | 14 Mar 2023
Does Vision Accelerate Hierarchical Generalization of Neural Language Learners? | Tatsuki Kuribayashi | VLM | 11 1 0 | 01 Feb 2023
How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech | Aditya Yedetore, Tal Linzen, Robert Frank, R. Thomas McCoy | - | 20 16 0 | 26 Jan 2023
Language model acceptability judgements are not always robust to context | Koustuv Sinha, Jon Gauthier, Aaron Mueller, Kanishka Misra, Keren Fuentes, R. Levy, Adina Williams | - | 11 17 0 | 18 Dec 2022
Causal Analysis of Syntactic Agreement Neurons in Multilingual Language Models | Aaron Mueller, Yudi Xia, Tal Linzen | MILM | 34 9 0 | 25 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review | Dieuwke Hupkes, Mario Giulianelli, Verna Dankers, Mikel Artetxe, Yanai Elazar, ..., Leila Khalatbari, Maria Ryskina, Rita Frieske, Ryan Cotterell, Zhijing Jin | - | 106 92 0 | 06 Oct 2022
Unit Testing for Concepts in Neural Networks | Charles Lovering, Ellie Pavlick | - | 13 28 0 | 28 Jul 2022
Transformers Generalize Linearly | Jackson Petty, Robert Frank | AI4CE | 208 16 0 | 24 Sep 2021
How Can We Accelerate Progress Towards Human-like Linguistic Generalization? | Tal Linzen | - | 218 188 0 | 03 May 2020