Training data-efficient image transformers & distillation through attention

23 December 2020
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
    ViT

Papers citing "Training data-efficient image transformers & distillation through attention"

50 / 1,132 papers shown
Masked Autoencoders that Listen
Po-Yao (Bernie) Huang
Hu Xu
Juncheng Billy Li
Alexei Baevski
Michael Auli
Wojciech Galuba
Florian Metze
Christoph Feichtenhofer
15
268
0
13 Jul 2022
Eliminating Gradient Conflict in Reference-based Line-Art Colorization
Zekun Li
Zhengyang Geng
Zhao Kang
Wenyu Chen
Yibo Yang
21
35
0
13 Jul 2022
Dual Vision Transformer
Ting Yao
Yehao Li
Yingwei Pan
Yu Wang
Xiaoping Zhang
Tao Mei
ViT
141
75
0
11 Jul 2022
Facilitated machine learning for image-based fruit quality assessment
Manuel Knott
F. Pérez-Cruz
T. Defraeye
16
47
0
10 Jul 2022
Beyond Transfer Learning: Co-finetuning for Action Localisation
Anurag Arnab
Xuehan Xiong
A. Gritsenko
Rob Romijnders
Josip Djolonga
Mostafa Dehghani
Chen Sun
Mario Lucic
Cordelia Schmid
30
8
0
08 Jul 2022
VidConv: A modernized 2D ConvNet for Efficient Video Recognition
Chuong H. Nguyen
Su Huynh
Vinh Nguyen
Ngoc-Khanh Nguyen
ViT
27
3
0
08 Jul 2022
Efficient Lung Cancer Image Classification and Segmentation Algorithm Based on Improved Swin Transformer
Ruinan Sun
Yu Pang
ViT
MedIm
14
18
0
04 Jul 2022
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Zhikai Li
Qingyi Gu
MQ
48
95
0
04 Jul 2022
DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning
Zhuo Chen
Yufen Huang
Jiaoyan Chen
Yuxia Geng
Wen Zhang
Yin Fang
Jeff Z. Pan
Huajun Chen
VLM
29
64
0
04 Jul 2022
Benchmarking the Robustness of Deep Neural Networks to Common Corruptions in Digital Pathology
Yunlong Zhang
Yuxuan Sun
Honglin Li
S. Zheng
Chenglu Zhu
L. Yang
OOD
59
27
0
30 Jun 2022
Toward Clinically Assisted Colorectal Polyp Recognition via Structured Cross-modal Representation Consistency
Weijie Ma
Ye Zhu
Ruimao Zhang
Jie-jin Yang
Yiwen Hu
Zhuguo Li
Lijuan Xiang
ViT
MedIm
9
3
0
23 Jun 2022
EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications
Muhammad Maaz
Abdelrahman M. Shaker
Hisham Cholakkal
Salman Khan
Syed Waqas Zamir
Rao Muhammad Anwer
F. Khan
ViT
27
184
0
21 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
34
32
0
19 Jun 2022
SimA: Simple Softmax-free Attention for Vision Transformers
Soroush Abbasi Koohpayegani
Hamed Pirsiavash
16
25
0
17 Jun 2022
Learning Implicit Feature Alignment Function for Semantic Segmentation
Hanzhe Hu
Yinbo Chen
Jiarui Xu
Shubhankar Borse
H. Cai
Fatih Porikli
X. Wang
31
47
0
17 Jun 2022
Rectify ViT Shortcut Learning by Visual Saliency
Chong Ma
Lin Zhao
Yuzhong Chen
David Liu
Xi Jiang
Tuo Zhang
Xintao Hu
Dinggang Shen
Dajiang Zhu
Tianming Liu
ViT
30
20
0
17 Jun 2022
Backdoor Attacks on Vision Transformers
Akshayvarun Subramanya
Aniruddha Saha
Soroush Abbasi Koohpayegani
Ajinkya Tejankar
Hamed Pirsiavash
ViT
AAML
8
16
0
16 Jun 2022
OmniMAE: Single Model Masked Pretraining on Images and Videos
Rohit Girdhar
Alaaeldin El-Nouby
Mannat Singh
Kalyan Vasudev Alwala
Armand Joulin
Ishan Misra
ViT
35
97
0
16 Jun 2022
Patch-level Representation Learning for Self-supervised Vision Transformers
Sukmin Yun
Hankook Lee
Jaehyung Kim
Jinwoo Shin
ViT
22
64
0
16 Jun 2022
Masked Siamese ConvNets
L. Jing
Jiachen Zhu
Yann LeCun
SSL
35
34
0
15 Jun 2022
SP-ViT: Learning 2D Spatial Priors for Vision Transformers
Yuxuan Zhou
Wangmeng Xiang
C. Li
Biao Wang
Xihan Wei
Lei Zhang
M. Keuper
Xia Hua
ViT
29
15
0
15 Jun 2022
Rethinking Generalization in Few-Shot Classification
Markus Hiller
Rongkai Ma
Mehrtash Harandi
Tom Drummond
OCL
VLM
17
55
0
15 Jun 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
50
525
0
13 Jun 2022
MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing
Zhaofan Qiu
Ting Yao
Chong-Wah Ngo
Tao Mei
ViT
24
15
0
13 Jun 2022
INDIGO: Intrinsic Multimodality for Domain Generalization
Puneet Mangla
Shivam Chandhok
Milan Aggarwal
V. Balasubramanian
Balaji Krishnamurthy
VLM
35
2
0
13 Jun 2022
NR-DFERNet: Noise-Robust Network for Dynamic Facial Expression Recognition
Hanting Li
Ming-Fa Sui
Zhaoqing Zhu
Feng Zhao
25
27
0
10 Jun 2022
Masked Autoencoders are Robust Data Augmentors
Haohang Xu
Shuangrui Ding
Xiaopeng Zhang
H. Xiong
32
27
0
10 Jun 2022
GateHUB: Gated History Unit with Background Suppression for Online Action Detection
Junwen Chen
Gaurav Mittal
Ye Yu
Yu Kong
Mei Chen
36
33
0
09 Jun 2022
Extreme Masking for Learning Instance and Distributed Visual Representations
Zhirong Wu
Zihang Lai
Xiao Sun
Stephen Lin
32
22
0
09 Jun 2022
SimVP: Simpler yet Better Video Prediction
Zhangyang Gao
Cheng Tan
Lirong Wu
Stan Z. Li
33
212
0
09 Jun 2022
Which models are innately best at uncertainty estimation?
Ido Galil
Mohammed Dabbah
Ran El-Yaniv
UQCV
26
5
0
05 Jun 2022
CVNets: High Performance Library for Computer Vision
Sachin Mehta
Farzad Abdolhosseini
Mohammad Rastegari
21
18
0
04 Jun 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
18
347
0
02 Jun 2022
Optimizing Relevance Maps of Vision Transformers Improves Robustness
Hila Chefer
Idan Schwartz
Lior Wolf
ViT
32
37
0
02 Jun 2022
CVM-Cervix: A Hybrid Cervical Pap-Smear Image Classification Framework Using CNN, Visual Transformer and Multilayer Perceptron
Wanli Liu
Chen Li
N. Xu
Tao Jiang
M. Rahaman
...
Weiming Hu
Hao Chen
Changhao Sun
Yudong Yao
M. Grzegorzek
7
132
0
02 Jun 2022
Squeezeformer: An Efficient Transformer for Automatic Speech Recognition
Sehoon Kim
A. Gholami
Albert Eaton Shaw
Nicholas Lee
K. Mangalam
Jitendra Malik
Michael W. Mahoney
Kurt Keutzer
19
99
0
02 Jun 2022
XBound-Former: Toward Cross-scale Boundary Modeling in Transformers
Jiacheng Wang
Fei Chen
Yuxi Ma
Liansheng Wang
Zhaodong Fei
Jia Shuai
Xiangdong Tang
Qichao Zhou
Jing Qin
ViT
MedIm
19
63
0
02 Jun 2022
Exact Feature Collisions in Neural Networks
Utku Ozbulak
Manvel Gasparyan
Shodhan Rao
W. D. Neve
Arnout Van Messem
AAML
19
1
0
31 May 2022
ViT-BEVSeg: A Hierarchical Transformer Network for Monocular Birds-Eye-View Segmentation
Pramit Dutta
Ganesh Sistu
S. Yogamani
E. López
J. McDonald
ViT
11
16
0
31 May 2022
Few-Shot Diffusion Models
Giorgio Giannone
Didrik Nielsen
Ole Winther
DiffM
183
49
0
30 May 2022
Self-Supervised Pre-training of Vision Transformers for Dense Prediction Tasks
Jaonary Rabarisoa
Valentin Belissen
Florian Chabot
Q. C. Pham
VLM
ViT
SSL
MDE
15
2
0
30 May 2022
GMML is All you Need
Sara Atito
Muhammad Awais
J. Kittler
ViT
VLM
46
18
0
30 May 2022
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
Xiaosong Zhang
Yunjie Tian
Wei Huang
QiXiang Ye
Qi Dai
Lingxi Xie
Qi Tian
61
26
0
30 May 2022
SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners
Feng Liang
Yangguang Li
Diana Marculescu
SSL
TPM
ViT
51
22
0
28 May 2022
A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang
Jin Gao
Zeming Li
Jian-jun Sun
Weiming Hu
ViT
67
41
0
28 May 2022
WaveMix: A Resource-efficient Neural Network for Image Analysis
Pranav Jeevan
Kavitha Viswanathan
Anandu A S
A. Sethi
15
20
0
28 May 2022
Multi-Task Learning with Multi-Query Transformer for Dense Prediction
Yangyang Xu
Xiangtai Li
Haobo Yuan
Yibo Yang
Lefei Zhang
ViT
23
45
0
28 May 2022
Object-wise Masked Autoencoders for Fast Pre-training
Jiantao Wu
Shentong Mo
ViT
OCL
17
15
0
28 May 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
58
2,023
0
27 May 2022
Spartan: Differentiable Sparsity via Regularized Transportation
Kai Sheng Tai
Taipeng Tian
Ser-Nam Lim
23
11
0
27 May 2022