The Impact of Depth on Compositional Generalization in Transformer Language Models
arXiv:2310.19956 · 30 October 2023
Jackson Petty, Sjoerd van Steenkiste, Ishita Dasgupta, Fei Sha, Daniel H Garrette, Tal Linzen
Papers citing "The Impact of Depth on Compositional Generalization in Transformer Language Models" (6 of 6 papers shown):

| Title | Authors | Tags | Counts | Date |
|---|---|---|---|---|
| A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers | William Merrill, Ashish Sabharwal | — | 53 · 4 · 0 | 05 Mar 2025 |
| Distributional Scaling Laws for Emergent Capabilities | Rosie Zhao, Tian Qin, David Alvarez-Melis, Sham Kakade, Naomi Saphra | LRM | 37 · 0 · 0 | 24 Feb 2025 |
| Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data? | Yutong Yin, Zhaoran Wang | LRM, ReLM | 107 · 0 · 0 | 27 Jan 2025 |
| How Does Code Pretraining Affect Language Model Task Performance? | Jackson Petty, Sjoerd van Steenkiste, Tal Linzen | — | 60 · 8 · 0 | 06 Sep 2024 |
| Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers | Yi Tay, Mostafa Dehghani, J. Rao, W. Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler | — | 188 · 110 · 0 | 22 Sep 2021 |
| Scaling Laws for Neural Language Models | Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei | — | 226 · 4,453 · 0 | 23 Jan 2020 |