ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2012.12877
  4. Cited By
Training data-efficient image transformers & distillation through
  attention

Training data-efficient image transformers & distillation through attention

23 December 2020
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
    ViT
ArXivPDFHTML

Papers citing "Training data-efficient image transformers & distillation through attention"

50 / 1,004 papers shown
Title
A Comparative Study of Deep Learning Classification Methods on a Small
  Environmental Microorganism Image Dataset (EMDS-6): from Convolutional Neural
  Networks to Visual Transformers
A Comparative Study of Deep Learning Classification Methods on a Small Environmental Microorganism Image Dataset (EMDS-6): from Convolutional Neural Networks to Visual Transformers
Penghui Zhao
Chen Li
M. Rahaman
Hao Xu
Hechen Yang
Hongzan Sun
Tao Jiang
M. Grzegorzek
VLM
22
39
0
16 Jul 2021
Align before Fuse: Vision and Language Representation Learning with
  Momentum Distillation
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
Junnan Li
Ramprasaath R. Selvaraju
Akhilesh Deepak Gotmare
Shafiq R. Joty
Caiming Xiong
S. Hoi
FaML
48
1,881
0
16 Jul 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
53
254
0
14 Jul 2021
Transformer with Peak Suppression and Knowledge Guidance for
  Fine-grained Image Recognition
Transformer with Peak Suppression and Knowledge Guidance for Fine-grained Image Recognition
Xinda Liu
Lili Wang
Xiaoguang Han
ViT
34
66
0
14 Jul 2021
Visual Parser: Representing Part-whole Hierarchies with Transformers
Visual Parser: Representing Part-whole Hierarchies with Transformers
Shuyang Sun
Xiaoyu Yue
S. Bai
Philip H. S. Torr
50
27
0
13 Jul 2021
TransClaw U-Net: Claw U-Net with Transformers for Medical Image
  Segmentation
TransClaw U-Net: Claw U-Net with Transformers for Medical Image Segmentation
Yao Chang
Menghan Hu
Zhai Guangtao
Xiao-Ping Zhang
MedIm
ViT
68
96
0
12 Jul 2021
Trans4Trans: Efficient Transformer for Transparent Object Segmentation
  to Help Visually Impaired People Navigate in the Real World
Trans4Trans: Efficient Transformer for Transparent Object Segmentation to Help Visually Impaired People Navigate in the Real World
Jiaming Zhang
Kailun Yang
Angela Constantinescu
Kunyu Peng
Karin Muller
Rainer Stiefelhagen
ViT
31
61
0
07 Jul 2021
Learning Vision Transformer with Squeeze and Excitation for Facial
  Expression Recognition
Learning Vision Transformer with Squeeze and Excitation for Facial Expression Recognition
Mouath Aouayeb
W. Hamidouche
Catherine Soladié
K. Kpalma
Renaud Séguier
ViT
28
57
0
07 Jul 2021
GLiT: Neural Architecture Search for Global and Local Image Transformer
GLiT: Neural Architecture Search for Global and Local Image Transformer
Boyu Chen
Peixia Li
Chuming Li
Baopu Li
Lei Bai
Chen Lin
Ming-hui Sun
Junjie Yan
Wanli Ouyang
ViT
24
85
0
07 Jul 2021
Feature Fusion Vision Transformer for Fine-Grained Visual Categorization
Feature Fusion Vision Transformer for Fine-Grained Visual Categorization
Jun Wang
Xiaohan Yu
Yongsheng Gao
ViT
25
105
0
06 Jul 2021
AutoFormer: Searching Transformers for Visual Recognition
AutoFormer: Searching Transformers for Visual Recognition
Minghao Chen
Houwen Peng
Jianlong Fu
Haibin Ling
ViT
36
259
0
01 Jul 2021
Focal Self-attention for Local-Global Interactions in Vision
  Transformers
Focal Self-attention for Local-Global Interactions in Vision Transformers
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
42
428
0
01 Jul 2021
Improving the Efficiency of Transformers for Resource-Constrained
  Devices
Improving the Efficiency of Transformers for Resource-Constrained Devices
Hamid Tabani
Ajay Balasubramaniam
Shabbir Marzban
Elahe Arani
Bahram Zonooz
33
20
0
30 Jun 2021
Rethinking Token-Mixing MLP for MLP-based Vision Backbone
Rethinking Token-Mixing MLP for MLP-based Vision Backbone
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
40
26
0
28 Jun 2021
Deep Ensembling with No Overhead for either Training or Testing: The
  All-Round Blessings of Dynamic Sparsity
Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity
Shiwei Liu
Tianlong Chen
Zahra Atashgahi
Xiaohan Chen
Ghada Sokar
Elena Mocanu
Mykola Pechenizkiy
Zhangyang Wang
D. Mocanu
OOD
23
49
0
28 Jun 2021
Post-Training Quantization for Vision Transformer
Post-Training Quantization for Vision Transformer
Zhenhua Liu
Yunhe Wang
Kai Han
Siwei Ma
Wen Gao
ViT
MQ
39
321
0
27 Jun 2021
PVT v2: Improved Baselines with Pyramid Vision Transformer
PVT v2: Improved Baselines with Pyramid Vision Transformer
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
AI4TS
13
1,606
0
25 Jun 2021
Probing Inter-modality: Visual Parsing with Self-Attention for
  Vision-Language Pre-training
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training
Hongwei Xue
Yupan Huang
Bei Liu
Houwen Peng
Jianlong Fu
Houqiang Li
Jiebo Luo
22
88
0
25 Jun 2021
DnS: Distill-and-Select for Efficient and Accurate Video Indexing and
  Retrieval
DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval
Giorgos Kordopatis-Zilos
Christos Tzelepis
Symeon Papadopoulos
I. Kompatsiaris
Ioannis Patras
22
33
0
24 Jun 2021
IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision
  Transformers
IA-RED2^22: Interpretability-Aware Redundancy Reduction for Vision Transformers
Bowen Pan
Rameswar Panda
Yifan Jiang
Zhangyang Wang
Rogerio Feris
A. Oliva
VLM
ViT
39
153
0
23 Jun 2021
Co-advise: Cross Inductive Bias Distillation
Co-advise: Cross Inductive Bias Distillation
Sucheng Ren
Zhengqi Gao
Tianyu Hua
Zihui Xue
Yonglong Tian
Shengfeng He
Hang Zhao
42
53
0
23 Jun 2021
Vision Permutator: A Permutable MLP-Like Architecture for Visual
  Recognition
Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition
Qibin Hou
Zihang Jiang
Li-xin Yuan
Mingg-Ming Cheng
Shuicheng Yan
Jiashi Feng
ViT
MLLM
24
205
0
23 Jun 2021
P2T: Pyramid Pooling Transformer for Scene Understanding
P2T: Pyramid Pooling Transformer for Scene Understanding
Yu-Huan Wu
Yun-Hai Liu
Xin Zhan
Mingg-Ming Cheng
ViT
24
218
0
22 Jun 2021
Towards Biologically Plausible Convolutional Networks
Towards Biologically Plausible Convolutional Networks
Roman Pogodin
Yash Mehta
Timothy Lillicrap
P. Latham
24
22
0
22 Jun 2021
Structured Sparse R-CNN for Direct Scene Graph Generation
Structured Sparse R-CNN for Direct Scene Graph Generation
Yao Teng
Limin Wang
3DPC
GNN
16
53
0
21 Jun 2021
How to train your ViT? Data, Augmentation, and Regularization in Vision
  Transformers
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Andreas Steiner
Alexander Kolesnikov
Xiaohua Zhai
Ross Wightman
Jakob Uszkoreit
Lucas Beyer
ViT
34
613
0
18 Jun 2021
Efficient Self-supervised Vision Transformers for Representation
  Learning
Efficient Self-supervised Vision Transformers for Representation Learning
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
30
208
0
17 Jun 2021
Shuffle Transformer with Feature Alignment for Video Face Parsing
Shuffle Transformer with Feature Alignment for Video Face Parsing
Rui Zhang
Yang Han
Zilong Huang
Pei Cheng
Guozhong Luo
Gang Yu
Bin-Bin Fu
CVBM
ViT
22
1
0
16 Jun 2021
Physion: Evaluating Physical Prediction from Vision in Humans and
  Machines
Physion: Evaluating Physical Prediction from Vision in Humans and Machines
Daniel M. Bear
E. Wang
Damian Mrowca
Felix Binder
Hsiau-Yu Fish Tung
...
Li Fei-Fei
Nancy Kanwisher
J. Tenenbaum
Daniel L. K. Yamins
Judith E. Fan
OOD
45
86
0
15 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
12
2,742
0
15 Jun 2021
Survey: Image Mixing and Deleting for Data Augmentation
Survey: Image Mixing and Deleting for Data Augmentation
Humza Naveed
Saeed Anwar
Munawar Hayat
Kashif Javed
Ajmal Mian
26
78
0
13 Jun 2021
Space-time Mixing Attention for Video Transformer
Space-time Mixing Attention for Video Transformer
Adrian Bulat
Juan-Manuel Perez-Rua
Swathikiran Sudhakaran
Brais Martínez
Georgios Tzimiropoulos
ViT
25
124
0
10 Jun 2021
Scaling Vision with Sparse Mixture of Experts
Scaling Vision with Sparse Mixture of Experts
C. Riquelme
J. Puigcerver
Basil Mustafa
Maxim Neumann
Rodolphe Jenatton
André Susano Pinto
Daniel Keysers
N. Houlsby
MoE
12
574
0
10 Jun 2021
CAT: Cross Attention in Vision Transformer
CAT: Cross Attention in Vision Transformer
Hezheng Lin
Xingyi Cheng
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Qing Song
Wei Yuan
ViT
27
149
0
10 Jun 2021
MST: Masked Self-Supervised Transformer for Visual Representation
MST: Masked Self-Supervised Transformer for Visual Representation
Zhaowen Li
Zhiyang Chen
Fan Yang
Wei Li
Yousong Zhu
...
Rui Deng
Liwei Wu
Rui Zhao
Ming Tang
Jinqiao Wang
ViT
30
161
0
10 Jun 2021
Towards Training Stronger Video Vision Transformers for
  EPIC-KITCHENS-100 Action Recognition
Towards Training Stronger Video Vision Transformers for EPIC-KITCHENS-100 Action Recognition
Ziyuan Huang
Zhiwu Qing
Xiang Wang
Yutong Feng
Shiwei Zhang
Jianwen Jiang
Zhurong Xia
Mingqian Tang
Nong Sang
M. Ang
ViT
17
11
0
09 Jun 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
37
1,167
0
09 Jun 2021
MVT: Mask Vision Transformer for Facial Expression Recognition in the
  wild
MVT: Mask Vision Transformer for Facial Expression Recognition in the wild
Hanting Li
Ming-Fa Sui
Feng Zhao
Zhengjun Zha
Feng Wu
ViT
29
75
0
08 Jun 2021
Person Re-Identification with a Locally Aware Transformer
Person Re-Identification with a Locally Aware Transformer
Charu Sharma
S. R. Kapil
David Chapman
ViT
24
45
0
07 Jun 2021
Self-supervised Depth Estimation Leveraging Global Perception and
  Geometric Smoothness Using On-board Videos
Self-supervised Depth Estimation Leveraging Global Perception and Geometric Smoothness Using On-board Videos
Shaocheng Jia
Xin Pei
W. Yao
S. Wong
3DPC
MDE
33
19
0
07 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
48
329
0
07 Jun 2021
Rethinking Training from Scratch for Object Detection
Rethinking Training from Scratch for Object Detection
Yang Li
Hong Zhang
Yu Zhang
VLM
OnRL
ObjD
14
5
0
06 Jun 2021
Few-Shot Segmentation via Cycle-Consistent Transformer
Few-Shot Segmentation via Cycle-Consistent Transformer
Gengwei Zhang
Guoliang Kang
Yi Yang
Yunchao Wei
ViT
11
177
0
04 Jun 2021
Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
Jiangning Zhang
Chao Xu
Jian Li
Wenzhou Chen
Yabiao Wang
Ying Tai
Shuo Chen
Chengjie Wang
Feiyue Huang
Yong Liu
27
22
0
31 May 2021
Dual-stream Network for Visual Recognition
Dual-stream Network for Visual Recognition
Mingyuan Mao
Renrui Zhang
Honghui Zheng
Peng Gao
Teli Ma
Yan Peng
Errui Ding
Baochang Zhang
Shumin Han
ViT
18
63
0
31 May 2021
What Is Considered Complete for Visual Recognition?
What Is Considered Complete for Visual Recognition?
Lingxi Xie
Xiaopeng Zhang
Longhui Wei
Jianlong Chang
Qi Tian
VLM
18
4
0
28 May 2021
ResT: An Efficient Transformer for Visual Recognition
ResT: An Efficient Transformer for Visual Recognition
Qing-Long Zhang
Yubin Yang
ViT
13
229
0
28 May 2021
KVT: k-NN Attention for Boosting Vision Transformers
KVT: k-NN Attention for Boosting Vision Transformers
Pichao Wang
Xue Wang
F. Wang
Ming Lin
Shuning Chang
Hao Li
R. L. Jin
ViT
32
105
0
28 May 2021
Intriguing Properties of Vision Transformers
Intriguing Properties of Vision Transformers
Muzammal Naseer
Kanchana Ranasinghe
Salman Khan
Munawar Hayat
F. Khan
Ming-Hsuan Yang
ViT
251
619
0
21 May 2021
Vision Transformer for Fast and Efficient Scene Text Recognition
Vision Transformer for Fast and Efficient Scene Text Recognition
Rowel Atienza
ViT
9
144
0
18 May 2021
Previous
123...18192021
Next