ResearchTrend.AI
Scaling Vision Transformers to 22 Billion Parameters


10 February 2023
Mostafa Dehghani
Josip Djolonga
Basil Mustafa
Piotr Padlewski
Jonathan Heek
Justin Gilmer
Andreas Steiner
Mathilde Caron
Robert Geirhos
Ibrahim M. Alabdulmohsin
Rodolphe Jenatton
Lucas Beyer
Michael Tschannen
Anurag Arnab
Xiao Wang
C. Riquelme
Matthias Minderer
J. Puigcerver
Utku Evci
Manoj Kumar
Sjoerd van Steenkiste
Gamaleldin F. Elsayed
Aravindh Mahendran
F. I. F. Richard Yu
Avital Oliver
Fantine Huot
Jasmijn Bastings
Mark Collier
A. Gritsenko
Vighnesh Birodkar
C. N. Vasconcelos
Yi Tay
Thomas Mensink
Alexander Kolesnikov
Filip Pavetić
Dustin Tran
Thomas Kipf
Mario Lučić
Xiaohua Zhai
Daniel Keysers
Jeremiah Harmsen
N. Houlsby
    MLLM

Papers citing "Scaling Vision Transformers to 22 Billion Parameters"

50 / 416 papers shown
ADAPT to Robustify Prompt Tuning Vision Transformers
Masih Eskandar
Tooba Imtiaz
Zifeng Wang
Jennifer Dy
VPVLM
VLM
AAML
36
0
0
19 Mar 2024
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation
Wangbo Zhao
Jiasheng Tang
Yizeng Han
Yibing Song
Kai Wang
Gao Huang
F. Wang
Yang You
35
11
0
18 Mar 2024
Frozen Feature Augmentation for Few-Shot Image Classification
Andreas Bär
N. Houlsby
Mostafa Dehghani
Manoj Kumar
VLM
20
4
0
15 Mar 2024
Depth-induced Saliency Comparison Network for Diagnosis of Alzheimer's Disease via Jointly Analysis of Visual Stimuli and Eye Movements
Yu Liu
Wenlin Zhang
Shaochu Wang
Fangyu Zuo
Peiguang Jing
Yong Ji
24
0
0
15 Mar 2024
Generalizing Denoising to Non-Equilibrium Structures Improves Equivariant Force Fields
Yi-Lun Liao
Tess E. Smidt
Abhishek Das
DiffM
AI4CE
25
10
0
14 Mar 2024
Language models scale reliably with over-training and on downstream tasks
S. Gadre
Georgios Smyrnis
Vaishaal Shankar
Suchin Gururangan
Mitchell Wortsman
...
Y. Carmon
Achal Dave
Reinhard Heckel
Niklas Muennighoff
Ludwig Schmidt
ALM
ELM
LRM
103
40
0
13 Mar 2024
Not just Birds and Cars: Generic, Scalable and Explainable Models for Professional Visual Recognition
Junde Wu
Jiayuan Zhu
Min Xu
Yueming Jin
25
0
0
08 Mar 2024
Rule-driven News Captioning
Ning Xu
Tingting Zhang
Hongshuo Tian
An-An Liu
49
0
0
08 Mar 2024
ACC-ViT : Atrous Convolution's Comeback in Vision Transformers
Nabil Ibtehaz
Ning Yan
Masood S. Mortazavi
Daisuke Kihara
ViT
19
3
0
07 Mar 2024
Batch size invariant Adam
Xi Wang
Laurence Aitchison
33
2
0
29 Feb 2024
Disentangling the Causes of Plasticity Loss in Neural Networks
Clare Lyle
Zeyu Zheng
Khimya Khetarpal
H. V. Hasselt
Razvan Pascanu
James Martens
Will Dabney
AI4CE
53
31
0
29 Feb 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Yixin Liu
Kai Zhang
Yuan Li
Zhiling Yan
Chujie Gao
...
Yue Huang
Hanchi Sun
Jianfeng Gao
Lifang He
Lichao Sun
VLM
VGen
EGVM
65
254
0
27 Feb 2024
Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang
Congliang Chen
Tian Ding
Ziniu Li
Ruoyu Sun
Zhimin Luo
21
39
0
26 Feb 2024
Pretrained Visual Uncertainties
Michael Kirchhof
Mark Collier
Seong Joon Oh
Enkelejda Kasneci
UQCV
387
8
1
26 Feb 2024
StochCA: A Novel Approach for Exploiting Pretrained Models with Cross-Attention
SeungWon Seo
Suho Lee
Sangheum Hwang
25
0
0
25 Feb 2024
Parameter-efficient Prompt Learning for 3D Point Cloud Understanding
Hongyu Sun
Yongcai Wang
Wang Chen
Haoran Deng
Deying Li
VPVLM
39
5
0
24 Feb 2024
Genie: Generative Interactive Environments
Jake Bruce
Michael Dennis
Ashley D. Edwards
Jack Parker-Holder
Yuge Shi
...
Konrad Zolna
Jeff Clune
Nando de Freitas
Satinder Singh
Tim Rocktäschel
VGen
VLM
61
139
0
23 Feb 2024
Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability
Xue-Qing Qian
Yu Wang
Simian Luo
Yinda Zhang
Ying Tai
...
Xiangyang Xue
Bo Zhao
Tiejun Huang
Yunsheng Wu
Yanwei Fu
27
6
0
19 Feb 2024
Linear Transformers with Learnable Kernel Functions are Better In-Context Models
Yaroslav Aksenov
Nikita Balagansky
Sofia Maria Lo Cicero Vaina
Boris Shaposhnikov
Alexey Gorbatovski
Daniil Gavrilov
KELM
28
5
0
16 Feb 2024
Bridging Associative Memory and Probabilistic Modeling
Rylan Schaeffer
Nika Zahedi
Mikail Khona
Dhruv Pai
Sang T. Truong
...
Sarthak Chandra
Andres Carranza
Ila Rani Fiete
Andrey Gromov
Oluwasanmi Koyejo
DiffM
43
4
0
15 Feb 2024
HEAL-ViT: Vision Transformers on a spherical mesh for medium-range weather forecasting
Vivek Ramavajjala
14
2
0
14 Feb 2024
For Better or For Worse? Learning Minimum Variance Features With Label Augmentation
Muthuraman Chidambaram
Rong Ge
AAML
13
0
0
10 Feb 2024
Time-, Memory- and Parameter-Efficient Visual Adaptation
Otniel-Bogdan Mercea
Alexey Gritsenko
Cordelia Schmid
Anurag Arnab
VLM
35
13
0
05 Feb 2024
ClipFormer: Key-Value Clipping of Transformers on Memristive Crossbars for Write Noise Mitigation
Abhiroop Bhattacharjee
Abhishek Moitra
Priyadarshini Panda
CLIP
11
6
0
04 Feb 2024
A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer
Zhangyang Gao
Daize Dong
Cheng Tan
Jun-Xiong Xia
Bozhen Hu
Stan Z. Li
41
6
0
04 Feb 2024
Revisiting the Power of Prompt for Visual Tuning
Yuzhu Wang
Lechao Cheng
Chaowei Fang
Dingwen Zhang
Manni Duan
Meng Wang
VLM
46
14
0
04 Feb 2024
A General Framework for Learning from Weak Supervision
Hao Chen
Jindong Wang
Lei Feng
Xiang Li
Yidong Wang
Xing Xie
Masashi Sugiyama
Rita Singh
Bhiksha Raj
19
1
0
02 Feb 2024
Leveraging Large Language Models for Analyzing Blood Pressure Variations Across Biological Sex from Scientific Literature
Yuting Guo
Seyedeh Somayyeh Mousavi
Reza Sameni
Abeed Sarker
11
0
0
02 Feb 2024
Simulation of Graph Algorithms with Looped Transformers
Artur Back de Luca
K. Fountoulakis
43
14
0
02 Feb 2024
Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model
Zihan Zhong
Zhiqiang Tang
Tong He
Haoyang Fang
Chun Yuan
33
40
0
31 Jan 2024
VIALM: A Survey and Benchmark of Visually Impaired Assistance with Large Models
Yi Zhao
Yilin Zhang
Rong Xiang
Jing Li
Hillming Li
26
16
0
29 Jan 2024
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Katherine Crowson
Stefan Andreas Baumann
Alex Birch
Tanishq Mathew Abraham
Daniel Z. Kaplan
Enrico Shippole
18
48
0
21 Jan 2024
Accelerating Heterogeneous Tensor Parallelism via Flexible Workload Control
Zhigang Wang
Xu Zhang
Ning Wang
Chuanfei Xu
Jie Nie
Zhiqiang Wei
Yu Gu
Ge Yu
11
0
0
21 Jan 2024
Exploring scalable medical image encoders beyond text supervision
Fernando Pérez-García
Harshita Sharma
Sam Bond-Taylor
Kenza Bouzid
Valentina Salvatelli
...
Maria T. A. Wetscherek
Noel Codella
Stephanie L. Hyland
Javier Alvarez-Valle
Ozan Oktay
LM&MA
MedIm
48
9
0
19 Jan 2024
Scalable Pre-training of Large Autoregressive Image Models
Alaaeldin El-Nouby
Michal Klein
Shuangfei Zhai
Miguel Angel Bautista
Alexander Toshev
Vaishaal Shankar
J. Susskind
Armand Joulin
VLM
12
70
0
16 Jan 2024
Transformer for Object Re-Identification: A Survey
Mang Ye
Shuo Chen
Chenyue Li
Wei-Shi Zheng
David J. Crandall
Bo Du
ViT
90
13
0
13 Jan 2024
OTAS: An Elastic Transformer Serving System via Token Adaptation
Jinyu Chen
Wenchao Xu
Zicong Hong
Song Guo
Haozhao Wang
Jie Zhang
Deze Zeng
14
4
0
10 Jan 2024
Revisiting Adversarial Training at Scale
Zeyu Wang
Xianhang Li
Hongru Zhu
Cihang Xie
21
15
0
09 Jan 2024
Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers
Aleksandar Stanić
Sergi Caelles
Michael Tschannen
LRM
VLM
23
9
0
03 Jan 2024
SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization
Xixu Hu
Runkai Zheng
Jindong Wang
Cheuk Hang Leung
Qi Wu
Xing Xie
11
1
0
02 Jan 2024
Analyzing Local Representations of Self-supervised Vision Transformers
Ani Vanyan
Alvard Barseghyan
Hakob Tamazyan
Vahan Huroyan
Hrant Khachatrian
Martin Danelljan
28
2
0
31 Dec 2023
MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining
Jacob P. Portes
Alex Trott
Sam Havens
Daniel King
Abhinav Venigalla
Moin Nadeem
Nikhil Sardana
D. Khudia
Jonathan Frankle
13
16
0
29 Dec 2023
An Empirical Study of Scaling Law for OCR
Miao Rang
Zhenni Bi
Chuanjian Liu
Yunhe Wang
Kai Han
23
6
0
29 Dec 2023
Learning Vision from Models Rivals Learning Vision from Data
Yonglong Tian
Lijie Fan
Kaifeng Chen
Dina Katabi
Dilip Krishnan
Phillip Isola
13
43
0
28 Dec 2023
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
Jiasen Lu
Christopher Clark
Sangho Lee
Zichen Zhang
Savya Khosla
Ryan Marten
Derek Hoiem
Aniruddha Kembhavi
VLM
MLLM
27
144
0
28 Dec 2023
Make Me a BNN: A Simple Strategy for Estimating Bayesian Uncertainty from Pre-trained Models
Gianni Franchi
Olivier Laurent
Maxence Leguéry
Andrei Bursuc
Andrea Pilzer
Angela Yao
UQCV
BDL
15
3
0
23 Dec 2023
How Smooth Is Attention?
Valérie Castin
Pierre Ablin
Gabriel Peyré
AAML
24
9
0
22 Dec 2023
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen
Jiannan Wu
Wenhai Wang
Weijie Su
Guo Chen
...
Bin Li
Ping Luo
Tong Lu
Yu Qiao
Jifeng Dai
VLM
MLLM
156
895
0
21 Dec 2023
Layerwise complexity-matched learning yields an improved model of cortical area V2
Nikhil Parthasarathy
Olivier J. Hénaff
Eero P. Simoncelli
27
1
0
18 Dec 2023
Data-Efficient Multimodal Fusion on a Single GPU
Noël Vouitsis
Zhaoyan Liu
S. Gorti
Valentin Villecroze
Jesse C. Cresswell
Guangwei Yu
G. Loaiza-Ganem
M. Volkovs
35
3
0
15 Dec 2023