ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2012.12877
  4. Cited By
Training data-efficient image transformers & distillation through
  attention

Training data-efficient image transformers & distillation through attention

23 December 2020
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
    ViT
ArXivPDFHTML

Papers citing "Training data-efficient image transformers & distillation through attention"

50 / 1,093 papers shown
Title
AudioTagging Done Right: 2nd comparison of deep learning methods for
  environmental sound classification
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification
Juncheng Billy Li
Shuhui Qu
Po-Yao (Bernie) Huang
Florian Metze
VLM
24
9
0
25 Mar 2022
Transformers Meet Visual Learning Understanding: A Comprehensive Review
Transformers Meet Visual Learning Understanding: A Comprehensive Review
Yuting Yang
Licheng Jiao
Xuantong Liu
F. Liu
Shuyuan Yang
Zhixi Feng
Xu Tang
ViT
MedIm
27
28
0
24 Mar 2022
Beyond Fixation: Dynamic Window Visual Transformer
Beyond Fixation: Dynamic Window Visual Transformer
Pengzhen Ren
Changlin Li
Guangrun Wang
Yun Xiao
Qing Du
Xiaodan Liang
Qing Du Xiaodan Liang Xiaojun Chang
ViT
22
32
0
24 Mar 2022
Unsupervised Salient Object Detection with Spectral Cluster Voting
Unsupervised Salient Object Detection with Spectral Cluster Voting
Gyungin Shin
Samuel Albanie
Weidi Xie
13
65
0
23 Mar 2022
VideoMAE: Masked Autoencoders are Data-Efficient Learners for
  Self-Supervised Video Pre-Training
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Zhan Tong
Yibing Song
Jue Wang
Limin Wang
ViT
137
1,122
0
23 Mar 2022
Focal Modulation Networks
Focal Modulation Networks
Jianwei Yang
Chunyuan Li
Xiyang Dai
Lu Yuan
Jianfeng Gao
3DPC
22
263
0
22 Mar 2022
Meta-attention for ViT-backed Continual Learning
Meta-attention for ViT-backed Continual Learning
Mengqi Xue
Haofei Zhang
Jie Song
Mingli Song
CLL
22
41
0
22 Mar 2022
Transformer-based HTR for Historical Documents
Transformer-based HTR for Historical Documents
Phillip Benjamin Strobel
Simon Clematide
M. Volk
Tobias Hodel
15
10
0
21 Mar 2022
Hyperbolic Vision Transformers: Combining Improvements in Metric
  Learning
Hyperbolic Vision Transformers: Combining Improvements in Metric Learning
Aleksandr Ermolov
L. Mirvakhabova
Valentin Khrulkov
N. Sebe
Ivan V. Oseledets
25
100
0
21 Mar 2022
ScalableViT: Rethinking the Context-oriented Generalization of Vision
  Transformer
ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer
Rui Yang
Hailong Ma
Jie Wu
Yansong Tang
Xuefeng Xiao
Min Zheng
Xiu Li
ViT
19
53
0
21 Mar 2022
Harnessing Hard Mixed Samples with Decoupled Regularizer
Harnessing Hard Mixed Samples with Decoupled Regularizer
Zicheng Liu
Siyuan Li
Ge Wang
Cheng Tan
Lirong Wu
Stan Z. Li
53
18
0
21 Mar 2022
Disentangling Architecture and Training for Optical Flow
Disentangling Architecture and Training for Optical Flow
Deqing Sun
Charles Herrmann
F. Reda
Michael Rubinstein
David Fleet
William T. Freeman
3DPC
OOD
58
34
0
21 Mar 2022
Delta Keyword Transformer: Bringing Transformers to the Edge through
  Dynamically Pruned Multi-Head Self-Attention
Delta Keyword Transformer: Bringing Transformers to the Edge through Dynamically Pruned Multi-Head Self-Attention
Zuzana Jelčicová
Marian Verhelst
26
5
0
20 Mar 2022
MatchFormer: Interleaving Attention in Transformers for Feature Matching
MatchFormer: Interleaving Attention in Transformers for Feature Matching
Qing Wang
Jiaming Zhang
Kailun Yang
Kunyu Peng
Rainer Stiefelhagen
ViT
33
141
0
17 Mar 2022
Towards Data-Efficient Detection Transformers
Towards Data-Efficient Detection Transformers
Wen Wang
Jing Zhang
Yang Cao
Yongliang Shen
Dacheng Tao
ViT
18
59
0
17 Mar 2022
Learning Audio Representations with MLPs
Learning Audio Representations with MLPs
Mashrur M. Morshed
Ahmad Omar Ahsan
H. Mahmud
Md. Kamrul Hasan
24
4
0
16 Mar 2022
WegFormer: Transformers for Weakly Supervised Semantic Segmentation
WegFormer: Transformers for Weakly Supervised Semantic Segmentation
Chunmeng Liu
Enze Xie
Wenjia Wang
Wenhai Wang
Guangya Li
Ping Luo
ViT
24
6
0
16 Mar 2022
Smoothing Matters: Momentum Transformer for Domain Adaptive Semantic
  Segmentation
Smoothing Matters: Momentum Transformer for Domain Adaptive Semantic Segmentation
Runfa Chen
Yu Rong
Shangmin Guo
Jiaqi Han
Fuchun Sun
Tingyang Xu
Wenbing Huang
ViT
15
20
0
15 Mar 2022
P-STMO: Pre-Trained Spatial Temporal Many-to-One Model for 3D Human Pose
  Estimation
P-STMO: Pre-Trained Spatial Temporal Many-to-One Model for 3D Human Pose Estimation
Wenkang Shan
Zhenhua Liu
Xinfeng Zhang
Shanshe Wang
Siwei Ma
Wen Gao
3DH
23
121
0
15 Mar 2022
All in One: Exploring Unified Video-Language Pre-training
All in One: Exploring Unified Video-Language Pre-training
Alex Jinpeng Wang
Yixiao Ge
Rui Yan
Yuying Ge
Xudong Lin
Guanyu Cai
Jianping Wu
Ying Shan
Xiaohu Qie
Mike Zheng Shou
16
200
0
14 Mar 2022
Deep Transformers Thirst for Comprehensive-Frequency Data
Deep Transformers Thirst for Comprehensive-Frequency Data
R. Xia
Chao Xue
Boyu Deng
Fang Wang
Jingchao Wang
ViT
25
0
0
14 Mar 2022
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
Xiaohan Ding
X. Zhang
Yi Zhou
Jungong Han
Guiguang Ding
Jian-jun Sun
VLM
49
528
0
13 Mar 2022
Efficient Long-Range Attention Network for Image Super-resolution
Efficient Long-Range Attention Network for Image Super-resolution
Xindong Zhang
Huiyu Zeng
Shi Guo
Lei Zhang
ViT
19
276
0
13 Mar 2022
Chart-to-Text: A Large-Scale Benchmark for Chart Summarization
Chart-to-Text: A Large-Scale Benchmark for Chart Summarization
Shankar Kanthara
Rixie Tiffany Ko Leong
Xiang Lin
Ahmed Masry
Megh Thakkar
Enamul Hoque
Shafiq R. Joty
14
135
0
12 Mar 2022
The Principle of Diversity: Training Stronger Vision Transformers Calls
  for Reducing All Levels of Redundancy
The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy
Tianlong Chen
Zhenyu (Allen) Zhang
Yu Cheng
Ahmed Hassan Awadallah
Zhangyang Wang
ViT
35
37
0
12 Mar 2022
Backbone is All Your Need: A Simplified Architecture for Visual Object
  Tracking
Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking
Boyu Chen
Peixia Li
Lei Bai
Leixian Qiao
Qiuhong Shen
Bo-wen Li
Weihao Gan
Wei Wu
Wanli Ouyang
ViT
VOT
22
182
0
10 Mar 2022
MVP: Multimodality-guided Visual Pre-training
MVP: Multimodality-guided Visual Pre-training
Longhui Wei
Lingxi Xie
Wen-gang Zhou
Houqiang Li
Qi Tian
28
105
0
10 Mar 2022
Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain
  Analysis: From Theory to Practice
Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice
Peihao Wang
Wenqing Zheng
Tianlong Chen
Zhangyang Wang
ViT
22
127
0
09 Mar 2022
NLX-GPT: A Model for Natural Language Explanations in Vision and
  Vision-Language Tasks
NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks
Fawaz Sammani
Tanmoy Mukherjee
Nikos Deligiannis
MILM
ELM
LRM
16
67
0
09 Mar 2022
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with
  Transformers
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers
Jiaming Zhang
Huayao Liu
Kailun Yang
Xinxin Hu
Ruiping Liu
Rainer Stiefelhagen
ViT
23
297
0
09 Mar 2022
FlexIT: Towards Flexible Semantic Image Translation
FlexIT: Towards Flexible Semantic Image Translation
Guillaume Couairon
Asya Grechka
Jakob Verbeek
Holger Schwenk
Matthieu Cord
DiffM
31
35
0
09 Mar 2022
Memory Efficient Continual Learning with Transformers
Memory Efficient Continual Learning with Transformers
B. Ermiş
Giovanni Zappella
Martin Wistuba
Aditya Rawal
Cédric Archambeau
CLL
21
42
0
09 Mar 2022
CP-ViT: Cascade Vision Transformer Pruning via Progressive Sparsity
  Prediction
CP-ViT: Cascade Vision Transformer Pruning via Progressive Sparsity Prediction
Zhuoran Song
Yihong Xu
Zhezhi He
Li Jiang
Naifeng Jing
Xiaoyao Liang
ViT
18
39
0
09 Mar 2022
ChiTransformer:Towards Reliable Stereo from Cues
ChiTransformer:Towards Reliable Stereo from Cues
Qing Su
Shihao Ji
MDE
ViT
16
12
0
09 Mar 2022
Dynamic Group Transformer: A General Vision Transformer Backbone with
  Dynamic Group Attention
Dynamic Group Transformer: A General Vision Transformer Backbone with Dynamic Group Attention
Kai Liu
Tianyi Wu
Cong Liu
Guodong Guo
ViT
33
17
0
08 Mar 2022
Exploring Dual-task Correlation for Pose Guided Person Image Generation
Exploring Dual-task Correlation for Pose Guided Person Image Generation
Peng Zhang
Lingxiao Yang
Jianhuang Lai
Xiaohua Xie
ViT
21
81
0
06 Mar 2022
Multi-class Token Transformer for Weakly Supervised Semantic
  Segmentation
Multi-class Token Transformer for Weakly Supervised Semantic Segmentation
Lian Xu
Wanli Ouyang
Bennamoun
F. Boussaïd
Dan Xu
ViT
28
209
0
06 Mar 2022
MetaFormer: A Unified Meta Framework for Fine-Grained Recognition
MetaFormer: A Unified Meta Framework for Fine-Grained Recognition
Qishuai Diao
Yi-Xin Jiang
Bin Wen
Jianxiang Sun
Zehuan Yuan
31
60
0
05 Mar 2022
DiT: Self-supervised Pre-training for Document Image Transformer
DiT: Self-supervised Pre-training for Document Image Transformer
Junlong Li
Yiheng Xu
Tengchao Lv
Lei Cui
Chaoxi Zhang
Furu Wei
ViT
VLM
35
159
0
04 Mar 2022
Correlation-Aware Deep Tracking
Correlation-Aware Deep Tracking
Fei Xie
Chunyu Wang
Guangting Wang
Yue Cao
Wankou Yang
Wenjun Zeng
VOT
26
118
0
03 Mar 2022
Recent Advances in Vision Transformer: A Survey and Outlook of Recent
  Work
Recent Advances in Vision Transformer: A Survey and Outlook of Recent Work
Khawar Islam
ViT
28
45
0
03 Mar 2022
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic
  Semantic Segmentation
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation
Jiaming Zhang
Kailun Yang
Chaoxiang Ma
Simon Reiß
Kunyu Peng
Rainer Stiefelhagen
ViT
29
72
0
02 Mar 2022
A Unified Query-based Paradigm for Point Cloud Understanding
A Unified Query-based Paradigm for Point Cloud Understanding
Zetong Yang
Li Jiang
Yanan Sun
Bernt Schiele
Jiaya Jia
3DPC
23
38
0
02 Mar 2022
What Makes Transfer Learning Work For Medical Images: Feature Reuse &
  Other Factors
What Makes Transfer Learning Work For Medical Images: Feature Reuse & Other Factors
Christos Matsoukas
Johan Fredin Haslum
Moein Sorkhei
Magnus P Soderberg
Kevin Smith
VLM
OOD
MedIm
24
85
0
02 Mar 2022
Adaptive Discriminative Regularization for Visual Classification
Adaptive Discriminative Regularization for Visual Classification
Qingsong Zhao
Yi Wang
Shuguang Dou
Chen Gong
Yin Wang
Cairong Zhao
18
0
0
02 Mar 2022
Temporal Perceiver: A General Architecture for Arbitrary Boundary
  Detection
Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection
Jing Tan
Yuhong Wang
Gangshan Wu
Limin Wang
43
14
0
01 Mar 2022
DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training
DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training
Joya Chen
Kai Xu
Yuhui Wang
Yifei Cheng
Angela Yao
19
7
0
28 Feb 2022
CTformer: Convolution-free Token2Token Dilated Vision Transformer for
  Low-dose CT Denoising
CTformer: Convolution-free Token2Token Dilated Vision Transformer for Low-dose CT Denoising
Dayang Wang
Fenglei Fan
Zhan Wu
R. Liu
Fei-Yue Wang
Hengyong Yu
ViT
MedIm
26
121
0
28 Feb 2022
Learn From the Past: Experience Ensemble Knowledge Distillation
Learn From the Past: Experience Ensemble Knowledge Distillation
Chaofei Wang
Shaowei Zhang
S. Song
Gao Huang
25
4
0
25 Feb 2022
Delving Deep into One-Shot Skeleton-based Action Recognition with
  Diverse Occlusions
Delving Deep into One-Shot Skeleton-based Action Recognition with Diverse Occlusions
Kunyu Peng
Alina Roitberg
Kailun Yang
Jiaming Zhang
Rainer Stiefelhagen
ViT
21
28
0
23 Feb 2022
Previous
123...151617...202122
Next