ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2106.04560
  4. Cited By
Scaling Vision Transformers

Scaling Vision Transformers

8 June 2021
Xiaohua Zhai
Alexander Kolesnikov
N. Houlsby
Lucas Beyer
    ViT
ArXivPDFHTML

Papers citing "Scaling Vision Transformers"

50 / 243 papers shown
Title
Multimodal Web Navigation with Instruction-Finetuned Foundation Models
Multimodal Web Navigation with Instruction-Finetuned Foundation Models
Hiroki Furuta
Kuang-Huei Lee
Ofir Nachum
Yutaka Matsuo
Aleksandra Faust
S. Gu
Izzeddin Gur
LM&Ro
36
90
0
19 May 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited
  Modalities
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
28
114
0
18 May 2023
A Comprehensive Survey on Segment Anything Model for Vision and Beyond
A Comprehensive Survey on Segment Anything Model for Vision and Beyond
Chunhui Zhang
Li Liu
Yawen Cui
Guanjie Huang
Weilin Lin
Yiqian Yang
Yuehong Hu
VLM
34
90
0
14 May 2023
Finding Meaningful Distributions of ML Black-boxes under Forensic
  Investigation
Finding Meaningful Distributions of ML Black-boxes under Forensic Investigation
Jiyi Zhang
Hansheng Fang
Hwee Kuan Lee
E. Chang
16
1
0
10 May 2023
DPSeq: A Novel and Efficient Digital Pathology Classifier for Predicting
  Cancer Biomarkers using Sequencer Architecture
DPSeq: A Novel and Efficient Digital Pathology Classifier for Predicting Cancer Biomarkers using Sequencer Architecture
M. Cen
Xingyu Li
Bangwei Guo
J. Jonnagaddala
Hong Zhang
Xuesong Xu
MedIm
22
0
0
03 May 2023
A Strong and Reproducible Object Detector with Only Public Datasets
A Strong and Reproducible Object Detector with Only Public Datasets
Tianhe Ren
Jianwei Yang
Siyi Liu
Ailing Zeng
Feng Li
Hao Zhang
Hongyang Li
Zhaoyang Zeng
Lei Zhang
ObjD
28
11
0
25 Apr 2023
DINOv2: Learning Robust Visual Features without Supervision
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
101
3,017
0
14 Apr 2023
On the Opportunities and Challenges of Foundation Models for Geospatial
  Artificial Intelligence
On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence
Gengchen Mai
Weiming Huang
Jin Sun
Suhang Song
Deepak Mishra
...
Yingjie Hu
Chris Cundy
Ziyuan Li
Rui Zhu
Ni Lao
AI4CE
24
121
0
13 Apr 2023
Improving Image Recognition by Retrieving from Web-Scale Image-Text Data
Improving Image Recognition by Retrieving from Web-Scale Image-Text Data
Ahmet Iscen
Alireza Fathi
Cordelia Schmid
VLM
3DV
33
25
0
11 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature
  Review
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
28
40
0
07 Apr 2023
Training Strategies for Vision Transformers for Object Detection
Training Strategies for Vision Transformers for Object Detection
Apoorv Singh
23
4
0
05 Apr 2023
Effective Theory of Transformers at Initialization
Effective Theory of Transformers at Initialization
Emily Dinan
Sho Yaida
Susan Zhang
20
14
0
04 Apr 2023
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Kunchang Li
Yali Wang
Yizhuo Li
Yi Wang
Yinan He
Limin Wang
Yu Qiao
VGen
43
154
0
28 Mar 2023
Sigmoid Loss for Language Image Pre-Training
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIP
VLM
19
935
0
27 Mar 2023
Text-to-Image Diffusion Models are Zero-Shot Classifiers
Text-to-Image Diffusion Models are Zero-Shot Classifiers
Kevin Clark
P. Jaini
DiffM
VLM
22
106
0
27 Mar 2023
Spatio-Temporal driven Attention Graph Neural Network with Block
  Adjacency matrix (STAG-NN-BA)
Spatio-Temporal driven Attention Graph Neural Network with Block Adjacency matrix (STAG-NN-BA)
U. Nazir
W. Islam
M. Taj
16
3
0
25 Mar 2023
An Extended Study of Human-like Behavior under Adversarial Training
An Extended Study of Human-like Behavior under Adversarial Training
Paul Gavrikov
J. Keuper
M. Keuper
AAML
28
9
0
22 Mar 2023
Can We Scale Transformers to Predict Parameters of Diverse ImageNet
  Models?
Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?
Boris Knyazev
Doha Hwang
Simon Lacoste-Julien
AI4CE
24
17
0
07 Mar 2023
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Shiwei Liu
Tianlong Chen
Zhenyu (Allen) Zhang
Xuxi Chen
Tianjin Huang
Ajay Jaiswal
Zhangyang Wang
26
29
0
03 Mar 2023
The Dormant Neuron Phenomenon in Deep Reinforcement Learning
The Dormant Neuron Phenomenon in Deep Reinforcement Learning
Ghada Sokar
Rishabh Agarwal
P. S. Castro
Utku Evci
CLL
40
88
0
24 Feb 2023
Scaling Laws for Multilingual Neural Machine Translation
Scaling Laws for Multilingual Neural Machine Translation
Patrick Fernandes
Behrooz Ghorbani
Xavier Garcia
Markus Freitag
Orhan Firat
30
28
0
19 Feb 2023
Tuning computer vision models with task rewards
Tuning computer vision models with task rewards
André Susano Pinto
Alexander Kolesnikov
Yuge Shi
Lucas Beyer
Xiaohua Zhai
VLM
25
40
0
16 Feb 2023
Symbolic Discovery of Optimization Algorithms
Symbolic Discovery of Optimization Algorithms
Xiangning Chen
Chen Liang
Da Huang
Esteban Real
Kaiyuan Wang
...
Xuanyi Dong
Thang Luong
Cho-Jui Hsieh
Yifeng Lu
Quoc V. Le
50
350
0
13 Feb 2023
Quantum Neuron Selection: Finding High Performing Subnetworks With
  Quantum Algorithms
Quantum Neuron Selection: Finding High Performing Subnetworks With Quantum Algorithms
Tim Whitaker
25
1
0
12 Feb 2023
Scaling Vision Transformers to 22 Billion Parameters
Scaling Vision Transformers to 22 Billion Parameters
Mostafa Dehghani
Josip Djolonga
Basil Mustafa
Piotr Padlewski
Jonathan Heek
...
Mario Luvcić
Xiaohua Zhai
Daniel Keysers
Jeremiah Harmsen
N. Houlsby
MLLM
61
569
0
10 Feb 2023
SimCon Loss with Multiple Views for Text Supervised Semantic
  Segmentation
SimCon Loss with Multiple Views for Text Supervised Semantic Segmentation
Yash J. Patel
Yusheng Xie
Yi Zhu
Srikar Appalaraju
R. Manmatha
27
4
0
07 Feb 2023
AIM: Adapting Image Models for Efficient Video Action Recognition
AIM: Adapting Image Models for Efficient Video Action Recognition
Taojiannan Yang
Yi Zhu
Yusheng Xie
Aston Zhang
C. L. P. Chen
Mu Li
ViT
44
144
0
06 Feb 2023
Adaptive Computation with Elastic Input Sequence
Adaptive Computation with Elastic Input Sequence
Fuzhao Xue
Valerii Likhosherstov
Anurag Arnab
N. Houlsby
Mostafa Dehghani
Yang You
29
18
0
30 Jan 2023
A Closer Look at Few-shot Classification Again
A Closer Look at Few-shot Classification Again
Xu Luo
Hao Wu
Ji Zhang
Lianli Gao
Jing Xu
Jingkuan Song
24
48
0
28 Jan 2023
SWARM Parallelism: Training Large Models Can Be Surprisingly
  Communication-Efficient
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
Max Ryabinin
Tim Dettmers
Michael Diskin
Alexander Borzunov
MoE
22
31
0
27 Jan 2023
Enhancing Self-Training Methods
Enhancing Self-Training Methods
Aswathnarayan Radhakrishnan
Jim Davis
Zachary Rabin
Benjamin Lewis
Matthew Scherreik
R. Ilin
19
1
0
18 Jan 2023
GOHSP: A Unified Framework of Graph and Optimization-based Heterogeneous
  Structured Pruning for Vision Transformer
GOHSP: A Unified Framework of Graph and Optimization-based Heterogeneous Structured Pruning for Vision Transformer
Miao Yin
Burak Uzkent
Yilin Shen
Hongxia Jin
Bo Yuan
ViT
24
13
0
13 Jan 2023
Principled and Efficient Transfer Learning of Deep Models via Neural
  Collapse
Principled and Efficient Transfer Learning of Deep Models via Neural Collapse
Xiao Li
Sheng Liu
Jin-li Zhou
Xin Lu
C. Fernandez‐Granda
Zhihui Zhu
Q. Qu
AAML
23
18
0
23 Dec 2022
REVEAL: Retrieval-Augmented Visual-Language Pre-Training with
  Multi-Source Multimodal Knowledge Memory
REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory
Ziniu Hu
Ahmet Iscen
Chen Sun
Zirui Wang
Kai-Wei Chang
Yizhou Sun
Cordelia Schmid
David A. Ross
Alireza Fathi
RALM
VLM
38
88
0
10 Dec 2022
Deep Incubation: Training Large Models by Divide-and-Conquering
Deep Incubation: Training Large Models by Divide-and-Conquering
Zanlin Ni
Yulin Wang
Jiangwei Yu
Haojun Jiang
Yu Cao
Gao Huang
VLM
18
11
0
08 Dec 2022
Differentially Private Image Classification from Features
Differentially Private Image Classification from Features
Harsh Mehta
Walid Krichene
Abhradeep Thakurta
Alexey Kurakin
Ashok Cutkosky
46
7
0
24 Nov 2022
Multi-Environment Pretraining Enables Transfer to Action Limited
  Datasets
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets
David Venuto
Sherry Yang
Pieter Abbeel
Doina Precup
Igor Mordatch
Ofir Nachum
OffRL
20
5
0
23 Nov 2022
Powderworld: A Platform for Understanding Generalization via Rich Task
  Distributions
Powderworld: A Platform for Understanding Generalization via Rich Task Distributions
Kevin Frans
Phillip Isola
OffRL
39
9
0
23 Nov 2022
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Mingg-Ming Cheng
Jiashi Feng
ViT
28
129
0
22 Nov 2022
Towards All-in-one Pre-training via Maximizing Multi-modal Mutual
  Information
Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information
Weijie Su
Xizhou Zhu
Chenxin Tao
Lewei Lu
Bin Li
Gao Huang
Yu Qiao
Xiaogang Wang
Jie Zhou
Jifeng Dai
34
41
0
17 Nov 2022
GLAMI-1M: A Multilingual Image-Text Fashion Dataset
GLAMI-1M: A Multilingual Image-Text Fashion Dataset
Vaclav Kosar
A. Hoskovec
Milan Šulc
Radek Bartyzal
VLM
26
3
0
17 Nov 2022
Prompt Tuning for Parameter-efficient Medical Image Segmentation
Prompt Tuning for Parameter-efficient Medical Image Segmentation
Marc Fischer
Alexander Bartler
Bin Yang
SSeg
14
18
0
16 Nov 2022
Contextual Transformer for Offline Meta Reinforcement Learning
Contextual Transformer for Offline Meta Reinforcement Learning
Runji Lin
Ye Li
Xidong Feng
Zhaowei Zhang
Xian Hong Wu Fung
Haifeng Zhang
Jun Wang
Yali Du
Yaodong Yang
OffRL
18
6
0
15 Nov 2022
EVA: Exploring the Limits of Masked Visual Representation Learning at
  Scale
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang
Wen Wang
Binhui Xie
Quan-Sen Sun
Ledell Yu Wu
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
CLIP
61
673
0
14 Nov 2022
Language models are good pathologists: using attention-based sequence
  reduction and text-pretrained transformers for efficient WSI classification
Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification
Juan Pisula
Katarzyna Bozek
VLM
MedIm
28
3
0
14 Nov 2022
Harmonizing the object recognition strategies of deep neural networks
  with humans
Harmonizing the object recognition strategies of deep neural networks with humans
Thomas Fel
Ivan Felipe
Drew Linsley
Thomas Serre
30
71
0
08 Nov 2022
Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining
Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining
Qiang Chen
Jian Wang
Chuchu Han
Shangang Zhang
Zexian Li
...
Haocheng Feng
Kun Yao
Junyu Han
Errui Ding
Jingdong Wang
ViT
VLM
31
45
0
07 Nov 2022
Broken Neural Scaling Laws
Broken Neural Scaling Laws
Ethan Caballero
Kshitij Gupta
Irina Rish
David M. Krueger
19
74
0
26 Oct 2022
The Curious Case of Benign Memorization
The Curious Case of Benign Memorization
Sotiris Anagnostidis
Gregor Bachmann
Lorenzo Noci
Thomas Hofmann
AAML
41
8
0
25 Oct 2022
The Robustness Limits of SoTA Vision Models to Natural Variation
The Robustness Limits of SoTA Vision Models to Natural Variation
Mark Ibrahim
Q. Garrido
Ari S. Morcos
Diane Bouchacourt
VLM
35
16
0
24 Oct 2022
Previous
12345
Next