ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2012.12877
  4. Cited By
Training data-efficient image transformers & distillation through
  attention

Training data-efficient image transformers & distillation through attention

23 December 2020
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
    ViT
ArXivPDFHTML

Papers citing "Training data-efficient image transformers & distillation through attention"

50 / 1,106 papers shown
Title
Vision Transformers in 2022: An Update on Tiny ImageNet
Vision Transformers in 2022: An Update on Tiny ImageNet
Ethan Huynh
ViT
28
11
0
21 May 2022
Improvements to Self-Supervised Representation Learning for Masked Image
  Modeling
Improvements to Self-Supervised Representation Learning for Masked Image Modeling
Jia-ju Mao
Xuesong Yin
Yuan Chang
Honggu Zhou
SSL
27
1
0
21 May 2022
Learning to Count Anything: Reference-less Class-agnostic Counting with
  Weak Supervision
Learning to Count Anything: Reference-less Class-agnostic Counting with Weak Supervision
Michael A. Hobley
V. Prisacariu
39
36
0
20 May 2022
Masked Image Modeling with Denoising Contrast
Masked Image Modeling with Denoising Contrast
Kun Yi
Yixiao Ge
Xiaotong Li
Shusheng Yang
Dian Li
Jianping Wu
Ying Shan
Xiaohu Qie
VLM
30
51
0
19 May 2022
Cross-Enhancement Transformer for Action Segmentation
Cross-Enhancement Transformer for Action Segmentation
Jiahui Wang
Zhenyou Wang
Shanna Zhuang
Hui Wang
ViT
48
23
0
19 May 2022
MulT: An End-to-End Multitask Learning Transformer
MulT: An End-to-End Multitask Learning Transformer
Deblina Bhattacharjee
Tong Zhang
Sabine Süsstrunk
Mathieu Salzmann
ViT
36
62
0
17 May 2022
Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised
  Semantic Segmentation and Localization
Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization
Luke Melas-Kyriazi
Christian Rupprecht
Iro Laina
Andrea Vedaldi
28
159
0
16 May 2022
Diffusion Models for Adversarial Purification
Diffusion Models for Adversarial Purification
Weili Nie
Brandon Guo
Yujia Huang
Chaowei Xiao
Arash Vahdat
Anima Anandkumar
WIGM
197
418
0
16 May 2022
Discovering and Explaining the Representation Bottleneck of Graph Neural
  Networks from Multi-order Interactions
Discovering and Explaining the Representation Bottleneck of Graph Neural Networks from Multi-order Interactions
Fang Wu
Siyuan Li
Lirong Wu
Dragomir R. Radev
Stan Z. Li
27
2
0
15 May 2022
Dense residual Transformer for image denoising
Dense residual Transformer for image denoising
Chao Yao
Shuo Jin
Meiqin Liu
Xiaojuan Ban
ViT
33
29
0
14 May 2022
Simple Open-Vocabulary Object Detection with Vision Transformers
Simple Open-Vocabulary Object Detection with Vision Transformers
Matthias Minderer
A. Gritsenko
Austin Stone
Maxim Neumann
Dirk Weissenborn
...
Zhuoran Shen
Xiao Wang
Xiaohua Zhai
Thomas Kipf
N. Houlsby
ObjD
CLIP
VLM
ViT
OCL
8
307
0
12 May 2022
Automated Audio Captioning: An Overview of Recent Progress and New
  Challenges
Automated Audio Captioning: An Overview of Recent Progress and New Challenges
Xinhao Mei
Xubo Liu
Mark D. Plumbley
Wenwu Wang
27
37
0
12 May 2022
Activating More Pixels in Image Super-Resolution Transformer
Activating More Pixels in Image Super-Resolution Transformer
Xiangyu Chen
Xintao Wang
Jiantao Zhou
Yu Qiao
Chao Dong
ViT
61
600
0
09 May 2022
ConvMAE: Masked Convolution Meets Masked Autoencoders
ConvMAE: Masked Convolution Meets Masked Autoencoders
Peng Gao
Teli Ma
Hongsheng Li
Ziyi Lin
Jifeng Dai
Yu Qiao
ViT
19
121
0
08 May 2022
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision
  Transformers
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers
Junting Pan
Adrian Bulat
Fuwen Tan
Xiatian Zhu
L. Dudziak
Hongsheng Li
Georgios Tzimiropoulos
Brais Martínez
ViT
31
180
0
06 May 2022
BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection
BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection
Mingdong Yang
Guo Chen
Yin-Dong Zheng
Tong Lu
Limin Wang
33
45
0
05 May 2022
Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge
  Graph Completion
Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion
Xiang Chen
Ningyu Zhang
Lei Li
Shumin Deng
Chuanqi Tan
Changliang Xu
Fei Huang
Luo Si
Huajun Chen
18
126
0
04 May 2022
Better plain ViT baselines for ImageNet-1k
Better plain ViT baselines for ImageNet-1k
Lucas Beyer
Xiaohua Zhai
Alexander Kolesnikov
ViT
VLM
33
111
0
03 May 2022
A Comprehensive Survey of Image Augmentation Techniques for Deep
  Learning
A Comprehensive Survey of Image Augmentation Techniques for Deep Learning
Mingle Xu
Sook Yoon
A. Fuentes
D. Park
VLM
22
396
0
03 May 2022
Source Domain Subset Sampling for Semi-Supervised Domain Adaptation in
  Semantic Segmentation
Source Domain Subset Sampling for Semi-Supervised Domain Adaptation in Semantic Segmentation
Daehan Kim
Min-seok Seo
Jinsun Park
Dong-Geol Choi
TTA
32
3
0
30 Apr 2022
Coarse-to-Fine Video Denoising with Dual-Stage Spatial-Channel
  Transformer
Coarse-to-Fine Video Denoising with Dual-Stage Spatial-Channel Transformer
Wu Yun
Mengshi Qi
Chuanming Wang
Huiyuan Fu
Huadong Ma
ViT
11
6
0
30 Apr 2022
Improving Transferability for Domain Adaptive Detection Transformers
Improving Transferability for Domain Adaptive Detection Transformers
Kaixiong Gong
Shuang Li
Shugang Li
Rui Zhang
Chi Harold Liu
Qiang Chen
51
34
0
29 Apr 2022
A Challenging Benchmark of Anime Style Recognition
A Challenging Benchmark of Anime Style Recognition
Haotang Li
S. Guo
Kailin Lyu
Xiao Yang
Tianchen Chen
Jianqing Zhu
Huanqiang Zeng
CVBM
16
5
0
29 Apr 2022
Depth Estimation with Simplified Transformer
Depth Estimation with Simplified Transformer
John Yang
Le An
Anurag Dixit
Jinkyu Koo
Su Inn Park
MDE
28
21
0
28 Apr 2022
DearKD: Data-Efficient Early Knowledge Distillation for Vision
  Transformers
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
Xianing Chen
Qiong Cao
Yujie Zhong
Jing Zhang
Shenghua Gao
Dacheng Tao
ViT
32
76
0
27 Apr 2022
Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training
Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training
Dading Chong
Helin Wang
Peilin Zhou
Qingcheng Zeng
39
65
0
27 Apr 2022
Deeper Insights into the Robustness of ViTs towards Common Corruptions
Deeper Insights into the Robustness of ViTs towards Common Corruptions
Rui Tian
Zuxuan Wu
Qi Dai
Han Hu
Yu-Gang Jiang
ViT
AAML
21
4
0
26 Apr 2022
Fused Audio Instance and Representation for Respiratory Disease
  Detection
Fused Audio Instance and Representation for Respiratory Disease Detection
Tuan Truong
Matthias Lenga
A. Serrurier
Sadegh Mohammadi
16
0
0
22 Apr 2022
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented
  Visual Models
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models
Chunyuan Li
Haotian Liu
Liunian Harold Li
Pengchuan Zhang
J. Aneja
...
Ping Jin
Houdong Hu
Zicheng Liu
Yong Jae Lee
Jianfeng Gao
29
144
0
19 Apr 2022
CTCNet: A CNN-Transformer Cooperation Network for Face Image
  Super-Resolution
CTCNet: A CNN-Transformer Cooperation Network for Face Image Super-Resolution
Guangwei Gao
Zixiang Xu
Juncheng Li
Jian Yang
T. Zeng
Guo-Jun Qi
CVBM
ViT
SupR
34
80
0
19 Apr 2022
VSA: Learning Varied-Size Window Attention in Vision Transformers
VSA: Learning Varied-Size Window Attention in Vision Transformers
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
22
53
0
18 Apr 2022
Searching Intrinsic Dimensions of Vision Transformers
Searching Intrinsic Dimensions of Vision Transformers
Fanghui Xue
Biao Yang
Y. Qi
Jack Xin
ViT
31
2
0
16 Apr 2022
BYOL for Audio: Exploring Pre-trained General-purpose Audio
  Representations
BYOL for Audio: Exploring Pre-trained General-purpose Audio Representations
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
SSL
36
53
0
15 Apr 2022
MiniViT: Compressing Vision Transformers with Weight Multiplexing
MiniViT: Compressing Vision Transformers with Weight Multiplexing
Jinnian Zhang
Houwen Peng
Kan Wu
Mengchen Liu
Bin Xiao
Jianlong Fu
Lu Yuan
ViT
23
123
0
14 Apr 2022
Neighborhood Attention Transformer
Neighborhood Attention Transformer
Ali Hassani
Steven Walton
Jiacheng Li
Shengjia Li
Humphrey Shi
ViT
AI4TS
36
251
0
14 Apr 2022
Masked Siamese Networks for Label-Efficient Learning
Masked Siamese Networks for Label-Efficient Learning
Mahmoud Assran
Mathilde Caron
Ishan Misra
Piotr Bojanowski
Florian Bordes
Pascal Vincent
Armand Joulin
Michael G. Rabbat
Nicolas Ballas
SSL
28
311
0
14 Apr 2022
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
Wenqiang Zhang
Zilong Huang
Guozhong Luo
Tao Chen
Xinggang Wang
Wenyu Liu
Gang Yu
Chunhua Shen
ViT
22
198
0
12 Apr 2022
OutfitTransformer: Learning Outfit Representations for Fashion
  Recommendation
OutfitTransformer: Learning Outfit Representations for Fashion Recommendation
Rohan Sarkar
Navaneeth Bodla
Mariya I. Vasileva
Yen-Liang Lin
Anu Beniwal
Alan Lu
Gérard Medioni
19
35
0
11 Apr 2022
Fashionformer: A simple, Effective and Unified Baseline for Human
  Fashion Segmentation and Recognition
Fashionformer: A simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition
Shilin Xu
Xiangtai Li
Jingbo Wang
Guangliang Cheng
Yunhai Tong
Dacheng Tao
ViT
23
27
0
10 Apr 2022
Points to Patches: Enabling the Use of Self-Attention for 3D Shape
  Recognition
Points to Patches: Enabling the Use of Self-Attention for 3D Shape Recognition
Axel Berg
Magnus Oskarsson
Mark O'Connor
3DPC
ViT
24
26
0
08 Apr 2022
DaViT: Dual Attention Vision Transformers
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
30
240
0
07 Apr 2022
Solving ImageNet: a Unified Scheme for Training any Backbone to Top
  Results
Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results
T. Ridnik
Hussam Lawen
Emanuel Ben-Baruch
Asaf Noy
38
11
0
07 Apr 2022
An Empirical Study of Remote Sensing Pretraining
An Empirical Study of Remote Sensing Pretraining
Di Wang
Jing Zhang
Bo Du
Guisong Xia
Dacheng Tao
EDL
31
190
0
06 Apr 2022
Towards An End-to-End Framework for Flow-Guided Video Inpainting
Towards An End-to-End Framework for Flow-Guided Video Inpainting
Z. Li
Cheng Lu
Jia Qin
Chunle Guo
Mingg-Ming Cheng
41
149
0
06 Apr 2022
MixFormer: Mixing Features across Windows and Dimensions
MixFormer: Mixing Features across Windows and Dimensions
Qiang Chen
Qiman Wu
Jian Wang
Qinghao Hu
T. Hu
Errui Ding
Jian Cheng
Jingdong Wang
MDE
ViT
16
101
0
06 Apr 2022
Vision Transformer Equipped with Neural Resizer on Facial Expression
  Recognition Task
Vision Transformer Equipped with Neural Resizer on Facial Expression Recognition Task
Hyeonbin Hwang
Soyeon Kim
Wei-Jin Park
Jiho Seo
Kyungtae Ko
Hyeon Yeo
ViT
31
9
0
05 Apr 2022
Region Rebalance for Long-Tailed Semantic Segmentation
Region Rebalance for Long-Tailed Semantic Segmentation
Jiequan Cui
Yuhui Yuan
Zhisheng Zhong
Zhuotao Tian
Han Hu
Stephen Lin
Jiaya Jia
18
18
0
05 Apr 2022
Joint Hand Motion and Interaction Hotspots Prediction from Egocentric
  Videos
Joint Hand Motion and Interaction Hotspots Prediction from Egocentric Videos
Shao-Wei Liu
Subarna Tripathi
Somdeb Majumdar
Xiaolong Wang
EgoV
26
93
0
04 Apr 2022
MultiMAE: Multi-modal Multi-task Masked Autoencoders
MultiMAE: Multi-modal Multi-task Masked Autoencoders
Roman Bachmann
David Mizrahi
Andrei Atanov
Amir Zamir
35
265
0
04 Apr 2022
BatchFormerV2: Exploring Sample Relationships for Dense Representation
  Learning
BatchFormerV2: Exploring Sample Relationships for Dense Representation Learning
Zhi Hou
Baosheng Yu
Chaoyue Wang
Yibing Zhan
Dacheng Tao
ViT
20
11
0
04 Apr 2022
Previous
123...141516...212223
Next