ResearchTrend.AI
arXiv: 2306.13575
Scaling MLPs: A Tale of Inductive Bias

23 June 2023
Gregor Bachmann
Sotiris Anagnostidis
Thomas Hofmann

Papers citing "Scaling MLPs: A Tale of Inductive Bias"

15 / 15 papers shown
Exploring Kolmogorov-Arnold Networks for Interpretable Time Series Classification
Irina Barašin
Blaž Bertalanič
M. Mohorčič
Carolina Fortuna
AI4TS
94
2
0
22 Nov 2024
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Tomer Porian
Mitchell Wortsman
J. Jitsev
Ludwig Schmidt
Y. Carmon
50
19
0
27 Jun 2024
Neural Redshift: Random Networks are not Random Functions
Damien Teney
A. Nicolicioiu
Valentin Hartmann
Ehsan Abbasnejad
89
18
0
04 Mar 2024
GLIMPSE: Generalized Local Imaging with MLPs
AmirEhsan Khorashadizadeh
Valentin Debarnot
Tianlin Liu
Ivan Dokmanić
28
0
0
01 Jan 2024
Transformer Fusion with Optimal Transport
Moritz Imfeld
Jacopo Graldi
Marco Giordano
Thomas Hofmann
Sotiris Anagnostidis
Sidak Pal Singh
ViT
MoMe
22
16
0
09 Oct 2023
Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Eran Malach
Cyril Zhang
43
7
0
07 Sep 2023
The Curious Case of Benign Memorization
Sotiris Anagnostidis
Gregor Bachmann
Lorenzo Noci
Thomas Hofmann
AAML
32
7
0
25 Oct 2022
Patches Are All You Need?
Asher Trockman
J. Zico Kolter
ViT
214
400
0
24 Jan 2022
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,592
0
04 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
286
5,723
0
29 Apr 2021
ImageNet-21K Pretraining for the Masses
T. Ridnik
Emanuel Ben-Baruch
Asaf Noy
Lihi Zelnik-Manor
SSeg
VLM
CLIP
166
676
0
22 Apr 2021
Towards Learning Convolutions from Scratch
Behnam Neyshabur
SSL
214
70
0
27 Jul 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,424
0
23 Jan 2020
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
273
2,878
0
15 Sep 2016
Convolution by Evolution: Differentiable Pattern Producing Networks
Chrisantha Fernando
Dylan Banarse
Malcolm Reynolds
F. Besse
David Pfau
Max Jaderberg
Marc Lanctot
Daan Wierstra
191
102
0
08 Jun 2016