1909.12673
A Constructive Prediction of the Generalization Error Across Scales
27 September 2019
Jonathan S. Rosenfeld
Amir Rosenfeld
Yonatan Belinkov
Nir Shavit
Papers citing "A Constructive Prediction of the Generalization Error Across Scales"
50 / 159 papers shown
Title
Impact of dataset size and long-term ECoG-based BCI usage on deep learning decoders performance
Maciej Śliwowski
Matthieu Martin
Antoine Souloumiac
P. Blanchart
T. Aksenova
24
6
0
08 Sep 2022
The BUTTER Zone: An Empirical Study of Training Dynamics in Fully Connected Neural Networks
Charles Edison Tripp
J. Perr-Sauer
L. Hayne
M. Lunacek
Jamil Gafur
AI4CE
21
0
0
25 Jul 2022
How Much More Data Do I Need? Estimating Requirements for Downstream Tasks
Rafid Mahmood
James Lucas
David Acuna
Daiqing Li
Jonah Philion
Jose M. Alvarez
Zhiding Yu
Sanja Fidler
M. Law
14
26
0
04 Jul 2022
Beyond neural scaling laws: beating power law scaling via data pruning
Ben Sorscher
Robert Geirhos
Shashank Shekhar
Surya Ganguli
Ari S. Morcos
17
417
0
29 Jun 2022
Limitations of the NTK for Understanding Generalization in Deep Learning
Nikhil Vyas
Yamini Bansal
Preetum Nakkiran
30
31
0
20 Jun 2022
On Data Scaling in Masked Image Modeling
Zhenda Xie
Zheng-Wei Zhang
Yue Cao
Yutong Lin
Yixuan Wei
Qi Dai
Han Hu
29
52
0
09 Jun 2022
Evaluating the Diversity, Equity and Inclusion of NLP Technology: A Case Study for Indian Languages
Simran Khanuja
Sebastian Ruder
Partha P. Talukdar
32
16
0
25 May 2022
Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing
Linlu Qiu
Peter Shaw
Panupong Pasupat
Tianze Shi
Jonathan Herzig
Emily Pitler
Fei Sha
Kristina Toutanova
AI4CE
LRM
25
52
0
24 May 2022
Investigating classification learning curves for automatically generated and labelled plant images
Michael A. Beck
C. Bidinosti
Christopher J. Henry
Manisha Ajmani
12
0
0
22 May 2022
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Kushal Tirumala
Aram H. Markosyan
Luke Zettlemoyer
Armen Aghajanyan
TDI
29
185
0
22 May 2022
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Yuntao Bai
Andy Jones
Kamal Ndousse
Amanda Askell
Anna Chen
...
Jack Clark
Sam McCandlish
C. Olah
Benjamin Mann
Jared Kaplan
72
2,318
0
12 Apr 2022
Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
Greg Yang
J. E. Hu
Igor Babuschkin
Szymon Sidor
Xiaodong Liu
David Farhi
Nick Ryder
J. Pachocki
Weizhu Chen
Jianfeng Gao
26
148
0
07 Mar 2022
Deconstructing Distributions: A Pointwise Framework of Learning
Gal Kaplun
Nikhil Ghosh
Saurabh Garg
Boaz Barak
Preetum Nakkiran
OOD
25
21
0
20 Feb 2022
Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments
Maor Ivgi
Y. Carmon
Jonathan Berant
11
17
0
13 Feb 2022
Compute Trends Across Three Eras of Machine Learning
J. Sevilla
Lennart Heim
A. Ho
T. Besiroglu
Marius Hobbhahn
Pablo Villalobos
20
269
0
11 Feb 2022
Data Scaling Laws in NMT: The Effect of Noise and Architecture
Yamini Bansal
Behrooz Ghorbani
Ankush Garg
Biao Zhang
M. Krikun
Colin Cherry
Behnam Neyshabur
Orhan Firat
37
47
0
04 Feb 2022
Unified Scaling Laws for Routed Language Models
Aidan Clark
Diego de Las Casas
Aurelia Guy
A. Mensch
Michela Paganini
...
Oriol Vinyals
Jack W. Rae
Erich Elsen
Koray Kavukcuoglu
Karen Simonyan
MoE
27
177
0
02 Feb 2022
Error Scaling Laws for Kernel Classification under Source and Capacity Conditions
Hugo Cui
Bruno Loureiro
Florent Krzakala
Lenka Zdeborová
46
10
0
29 Jan 2022
Auto-Compressing Subset Pruning for Semantic Image Segmentation
Konstantin Ditschuneit
Johannes Otterbach
21
5
0
26 Jan 2022
A General Language Assistant as a Laboratory for Alignment
Amanda Askell
Yuntao Bai
Anna Chen
Dawn Drain
Deep Ganguli
...
Tom B. Brown
Jack Clark
Sam McCandlish
C. Olah
Jared Kaplan
ALM
11
714
0
01 Dec 2021
Turing-Universal Learners with Optimal Scaling Laws
Preetum Nakkiran
21
2
0
09 Nov 2021
Is the Number of Trainable Parameters All That Actually Matters?
A. Chatelain
Amine Djeghri
Daniel Hesslow
Julien Launay
Iacopo Poli
51
7
0
24 Sep 2021
Scaling Laws for Neural Machine Translation
Behrooz Ghorbani
Orhan Firat
Markus Freitag
Ankur Bapna
M. Krikun
Xavier Garcia
Ciprian Chelba
Colin Cherry
32
99
0
16 Sep 2021
A Scaling Law for Synthetic-to-Real Transfer: How Much Is Your Pre-training Effective?
Hiroaki Mikami
Kenji Fukumizu
Shogo Murai
Shuji Suzuki
Yuta Kikuchi
Taiji Suzuki
S. Maeda
Kohei Hayashi
40
12
0
25 Aug 2021
Dataset Distillation with Infinitely Wide Convolutional Networks
Timothy Nguyen
Roman Novak
Lechao Xiao
Jaehoon Lee
DD
35
229
0
27 Jul 2021
Redundant representations help generalization in wide neural networks
Diego Doimo
Aldo Glielmo
Sebastian Goldt
A. Laio
AI4CE
17
9
0
07 Jun 2021
Self-Supervision is All You Need for Solving Rubik's Cube
Kyo Takano
13
1
0
06 Jun 2021
Benchmarking down-scaled (not so large) pre-trained language models
Matthias Aßenmacher
P. Schulze
C. Heumann
6
1
0
11 May 2021
Scaling Scaling Laws with Board Games
Andrew Jones
8
38
0
07 Apr 2021
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark
Nicholas Lourie
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
LRM
22
137
0
24 Mar 2021
The Shape of Learning Curves: a Review
T. Viering
Marco Loog
18
122
0
19 Mar 2021
Is it enough to optimize CNN architectures on ImageNet?
Lukas Tuggener
Jürgen Schmidhuber
Thilo Stadelmann
25
23
0
16 Mar 2021
Revisiting ResNets: Improved Training and Scaling Strategies
Irwan Bello
W. Fedus
Xianzhi Du
E. D. Cubuk
A. Srinivas
Tsung-Yi Lin
Jonathon Shlens
Barret Zoph
29
297
0
13 Mar 2021
Explaining Neural Scaling Laws
Yasaman Bahri
Ethan Dyer
Jared Kaplan
Jaehoon Lee
Utkarsh Sharma
27
250
0
12 Feb 2021
Towards More Fine-grained and Reliable NLP Performance Prediction
Zihuiwen Ye
Pengfei Liu
Jinlan Fu
Graham Neubig
11
33
0
10 Feb 2021
Learning Curve Theory
Marcus Hutter
135
58
0
08 Feb 2021
Scaling Laws for Transfer
Danny Hernandez
Jared Kaplan
T. Henighan
Sam McCandlish
18
237
0
02 Feb 2021
Meta-learning with negative learning rates
A. Bernacchia
17
17
0
01 Feb 2021
Analysis of the Scalability of a Deep-Learning Network for Steganography "Into the Wild"
Hugo Ruiz
Marc Chaumont
Mehdi Yedroudj
A. Amara
Frédéric Comby
Gérard Subsol
21
9
0
29 Dec 2020
*-CFQ: Analyzing the Scalability of Machine Learning on a Compositional Task
Dmitry Tsarkov
Tibor Tihon
Nathan Scales
Nikola Momchev
Danila Sinopalnikov
Nathanael Scharli
16
17
0
15 Dec 2020
Generalization bounds for deep learning
Guillermo Valle Pérez
A. Louis
BDL
13
44
0
07 Dec 2020
Learning Curves for Drug Response Prediction in Cancer Cell Lines
A. Partin
Thomas Brettin
Yvonne A. Evrard
Yitan Zhu
H. Yoo
...
Austin R. Clyde
Maulik Shukla
Michael Fonstein
J. Doroshow
Rick L. Stevens
10
19
0
25 Nov 2020
Scaling Laws for Autoregressive Generative Modeling
T. Henighan
Jared Kaplan
Mor Katz
Mark Chen
Christopher Hesse
...
Nick Ryder
Daniel M. Ziegler
John Schulman
Dario Amodei
Sam McCandlish
25
405
0
28 Oct 2020
Domain Divergences: a Survey and Empirical Analysis
Abhinav Ramesh Kashyap
Devamanyu Hazarika
Min-Yen Kan
Roger Zimmermann
170
37
0
23 Oct 2020
Learning Curves for Analysis of Deep Networks
Derek Hoiem
Tanmay Gupta
Zhizhong Li
Michal Shlapentokh-Rothman
8
24
0
21 Oct 2020
Automatic Feasibility Study via Data Quality Analysis for ML: A Case-Study on Label Noise
Cédric Renggli
Luka Rimanic
Luka Kolar
Wentao Wu
Ce Zhang
27
3
0
16 Oct 2020
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers
Preetum Nakkiran
Behnam Neyshabur
Hanie Sedghi
OffRL
29
11
0
16 Oct 2020
On Power Laws in Deep Ensembles
E. Lobacheva
Nadezhda Chirkova
M. Kodryan
Dmitry Vetrov
UQCV
10
40
0
16 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
28
130
0
30 Jun 2020
Is SGD a Bayesian sampler? Well, almost
Chris Mingard
Guillermo Valle Pérez
Joar Skalse
A. Louis
BDL
15
51
0
26 Jun 2020