ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.13035
  4. Cited By
Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design

Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design

22 May 2023
Ibrahim M. Alabdulmohsin
Xiaohua Zhai
Alexander Kolesnikov
Lucas Beyer
    VLM
ArXivPDFHTML

Papers citing "Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design"

25 / 25 papers shown
Title
Don't be lazy: CompleteP enables compute-efficient deep transformers
Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey
Bin Claire Zhang
Lorenzo Noci
Mufan Bill Li
Blake Bordelon
Shane Bergsma
C. Pehlevan
Boris Hanin
Joel Hestness
37
0
0
02 May 2025
DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs
DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs
Z. Wang
Senthil Purushwalkam
Caiming Xiong
S.
Heng Ji
R. Xu
38
0
0
23 Apr 2025
DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies
DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies
Wei Song
Y. Wang
Zijia Song
Yadong Li
Haoze Sun
Weipeng Chen
Zenan Zhou
Jianhua Xu
Jiaqi Wang
Kaicheng Yu
60
2
0
18 Mar 2025
Mixtera: A Data Plane for Foundation Model Training
Mixtera: A Data Plane for Foundation Model Training
Maximilian Böther
Xiaozhe Yao
Tolga Kerimoglu
Ana Klimovic
Viktor Gsteiger
Ana Klimovic
MoE
78
0
0
27 Feb 2025
jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images
jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images
Andreas Koukounas
Georgios Mastrapas
Bo Wang
Mohammad Kalim Akram
Sedigheh Eslami
Michael Gunther
Isabelle Mohr
Saba Sturua
Scott Martens
Nan Wang
VLM
97
6
0
11 Dec 2024
TIPS: Text-Image Pretraining with Spatial awareness
TIPS: Text-Image Pretraining with Spatial awareness
Kevis-Kokitsi Maninis
Kaifeng Chen
Soham Ghosh
Arjun Karpur
Koert Chen
...
Jan Dlabal
Dan Gnanapragasam
Mojtaba Seyedhosseini
Howard Zhou
Andre Araujo
VLM
30
3
0
21 Oct 2024
Locality Alignment Improves Vision-Language Models
Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Y. Zou
Tatsunori Hashimoto
VLM
64
3
0
14 Oct 2024
Scaling Optimal LR Across Token Horizons
Scaling Optimal LR Across Token Horizons
Johan Bjorck
Alon Benhaim
Vishrav Chaudhary
Furu Wei
Xia Song
46
4
0
30 Sep 2024
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Tomer Porian
Mitchell Wortsman
J. Jitsev
Ludwig Schmidt
Y. Carmon
48
19
0
27 Jun 2024
ColPali: Efficient Document Retrieval with Vision Language Models
ColPali: Efficient Document Retrieval with Vision Language Models
Manuel Faysse
Hugues Sibille
Tony Wu
Bilel Omrani
Gautier Viaud
C´eline Hudelot
Pierre Colombo
VLM
57
21
0
27 Jun 2024
Just How Flexible are Neural Networks in Practice?
Just How Flexible are Neural Networks in Practice?
Ravid Shwartz-Ziv
Micah Goldblum
Arpit Bansal
C. B. Bruss
Yann LeCun
Andrew Gordon Wilson
28
4
0
17 Jun 2024
Infinite Limits of Multi-head Transformer Dynamics
Infinite Limits of Multi-head Transformer Dynamics
Blake Bordelon
Hamza Tahir Chaudhry
C. Pehlevan
AI4CE
37
9
0
24 May 2024
LookHere: Vision Transformers with Directed Attention Generalize and
  Extrapolate
LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
A. Fuller
Daniel G. Kyrollos
Yousef Yassin
James R. Green
34
2
0
22 May 2024
Sigmoid Loss for Language Image Pre-Training
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIP
VLM
14
917
0
27 Mar 2023
The effectiveness of MAE pre-pretraining for billion-scale pretraining
The effectiveness of MAE pre-pretraining for billion-scale pretraining
Mannat Singh
Quentin Duval
Kalyan Vasudev Alwala
Haoqi Fan
Vaibhav Aggarwal
...
Piotr Dollár
Christoph Feichtenhofer
Ross B. Girshick
Rohit Girdhar
Ishan Misra
LRM
102
62
0
23 Mar 2023
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training
  Efficiency
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
Vithursan Thangarasa
Shreyas Saxena
Abhay Gupta
Sean Lie
21
3
0
21 Mar 2023
Revisiting Neural Scaling Laws in Language and Vision
Revisiting Neural Scaling Laws in Language and Vision
Ibrahim M. Alabdulmohsin
Behnam Neyshabur
Xiaohua Zhai
148
101
0
13 Sep 2022
UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes
UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes
Alexander Kolesnikov
André Susano Pinto
Lucas Beyer
Xiaohua Zhai
Jeremiah Harmsen
N. Houlsby
103
67
0
20 May 2022
Scale Efficiently: Insights from Pre-training and Fine-tuning
  Transformers
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
Yi Tay
Mostafa Dehghani
J. Rao
W. Fedus
Samira Abnar
Hyung Won Chung
Sharan Narang
Dani Yogatama
Ashish Vaswani
Donald Metzler
185
110
0
22 Sep 2021
Carbon Emissions and Large Neural Network Training
Carbon Emissions and Large Neural Network Training
David A. Patterson
Joseph E. Gonzalez
Quoc V. Le
Chen Liang
Lluís-Miquel Munguía
D. Rothchild
David R. So
Maud Texier
J. Dean
AI4CE
239
626
0
21 Apr 2021
Learning Curve Theory
Learning Curve Theory
Marcus Hutter
128
56
0
08 Feb 2021
Scaling Laws for Neural Language Models
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,424
0
23 Jan 2020
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision
  Applications
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
948
20,214
0
17 Apr 2017
Wider or Deeper: Revisiting the ResNet Model for Visual Recognition
Wider or Deeper: Revisiting the ResNet Model for Visual Recognition
Zifeng Wu
Chunhua Shen
A. Hengel
SSeg
243
1,476
0
30 Nov 2016
ImageNet Large Scale Visual Recognition Challenge
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
279
39,083
0
01 Sep 2014
1