Training data-efficient image transformers & distillation through attention

23 December 2020
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
    ViT

Papers citing "Training data-efficient image transformers & distillation through attention"

50 / 1,080 papers shown
BEV-SAN: Accurate BEV 3D Object Detection via Slice Attention Networks
Xiaowei Chi
Jiaming Liu
Ming Lu
Rongyu Zhang
Zhaoqing Wang
Yandong Guo
Shanghang Zhang
3DPC
38
19
0
02 Dec 2022
Part-based Face Recognition with Vision Transformers
Zhonglin Sun
Georgios Tzimiropoulos
ViT
15
15
0
30 Nov 2022
Rethinking Out-of-Distribution Detection From a Human-Centric Perspective
Yao Zhu
YueFeng Chen
Xiaodan Li
Rong Zhang
Hui Xue
Xiang Tian
Rongxin Jiang
Bo Zheng
Yao-wu Chen
OODD
19
7
0
30 Nov 2022
Hierarchical Transformer for Survival Prediction Using Multimodality Whole Slide Images and Genomics
Chunyuan Li
Xinliang Zhu
Jiawen Yao
Junzhou Huang
MedIm
30
11
0
29 Nov 2022
Exploiting Category Names for Few-Shot Classification with Vision-Language Models
Taihong Xiao
Zirui Wang
Liangliang Cao
Jiahui Yu
Shengyang Dai
Ming Yang
VLM
MLLM
27
5
0
29 Nov 2022
Finding Differences Between Transformers and ConvNets Using Counterfactual Simulation Testing
Nataniel Ruiz
Sarah Adel Bargal
Cihang Xie
Kate Saenko
Stan Sclaroff
ViT
30
5
0
29 Nov 2022
Lightweight Structure-Aware Attention for Visual Understanding
Heeseung Kwon
F. M. Castro
M. Marín-Jiménez
N. Guil
Karteek Alahari
26
2
0
29 Nov 2022
NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers
Yijiang Liu
Huanrui Yang
Zhen Dong
Kurt Keutzer
Li Du
Shanghang Zhang
MQ
29
45
0
29 Nov 2022
LUMix: Improving Mixup by Better Modelling Label Uncertainty
Shuyang Sun
Jieneng Chen
Ruifei He
Alan Yuille
Philip H. S. Torr
Song Bai
UQCV
NoLa
13
5
0
29 Nov 2022
Superpoint Transformer for 3D Scene Instance Segmentation
Jiahao Sun
Chunmei Qing
Junpeng Tan
Xiangmin Xu
3DPC
34
103
0
28 Nov 2022
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models
Lei Wang
Jian He
Xingdong Xu
Ning Liu
Hui-juan Liu
31
2
0
27 Nov 2022
Dynamic Feature Pruning and Consolidation for Occluded Person Re-Identification
Yuteng Ye
Hang Zhou
Jiale Cai
Chenxing Gao
Youjia Zhang
Junle Wang
Qiang Hu
Junqing Yu
Wei Yang
23
6
0
27 Nov 2022
Semantic-Aware Local-Global Vision Transformer
Jiatong Zhang
Zengwei Yao
Fanglin Chen
Guangming Lu
Wenjie Pei
ViT
23
0
0
27 Nov 2022
Exploring Consistency in Cross-Domain Transformer for Domain Adaptive Semantic Segmentation
Kaihong Wang
Donghyun Kim
Rogerio Feris
Kate Saenko
Margrit Betke
ViT
20
4
0
27 Nov 2022
CMC v2: Towards More Accurate COVID-19 Detection with Discriminative Video Priors
Junlin Hou
Jilan Xu
Nan Zhang
Yi Wang
Yuejie Zhang
X. Zhang
Rui Feng
16
2
0
26 Nov 2022
Meta Architecture for Point Cloud Analysis
Haojia Lin
Xiawu Zheng
Lijiang Li
Fei Chao
Sha Wang
Yan Wang
Yonghong Tian
Rongrong Ji
3DPC
25
45
0
26 Nov 2022
CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion
Zixiang Zhao
Hao Bai
Jiangshe Zhang
Yulun Zhang
Shuang Xu
Zudi Lin
Radu Timofte
Luc Van Gool
29
309
0
26 Nov 2022
Degenerate Swin to Win: Plain Window-based Transformer without Sophisticated Operations
Tan Yu
Ping Li
ViT
46
5
0
25 Nov 2022
Aggregated Text Transformer for Scene Text Detection
Zhao Zhou
Xiangcheng Du
Yingbin Zheng
Cheng Jin
ViT
25
1
0
25 Nov 2022
SVFormer: Semi-supervised Video Transformer for Action Recognition
Zhen Xing
Qi Dai
Hang-Rui Hu
Jingjing Chen
Zuxuan Wu
Yu-Gang Jiang
ViT
22
69
0
23 Nov 2022
SPCXR: Self-supervised Pretraining using Chest X-rays Towards a Domain Specific Foundation Model
Syed Muhammad Anwar
Abhijeet Parida
Sara Atito
Muhammad Awais
G. Nino
Josef Kittler
M. Linguraru
ViT
SSL
OOD
27
6
0
23 Nov 2022
Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket
Nianhui Guo
Joseph Bethge
Christoph Meinel
Haojin Yang
MQ
24
19
0
23 Nov 2022
Masked Autoencoding for Scalable and Generalizable Decision Making
Fangchen Liu
Hao Liu
Aditya Grover
Pieter Abbeel
OffRL
42
45
0
23 Nov 2022
On the Transferability of Visual Features in Generalized Zero-Shot Learning
Paola Cascante-Bonilla
Leonid Karlinsky
James Smith
Yanjun Qi
Vicente Ordonez
22
2
0
22 Nov 2022
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Ming-Ming Cheng
Jiashi Feng
ViT
25
129
0
22 Nov 2022
Boosting the Transferability of Adversarial Attacks with Global Momentum Initialization
Jiafeng Wang
Zhaoyu Chen
Kaixun Jiang
Dingkang Yang
Lingyi Hong
Pinxue Guo
Yan Wang
Wenqiang Zhang
AAML
18
27
0
21 Nov 2022
Unifying Tracking and Image-Video Object Detection
Peirong Liu
Rui Wang
Pengchuan Zhang
Omid Poursaeed
Yipin Zhou
Xuefei Cao
Sreya Dutta Roy
Ashish Shah
Ser-Nam Lim
13
0
0
20 Nov 2022
Hybrid Transformer Based Feature Fusion for Self-Supervised Monocular Depth Estimation
S. Tomar
Maitreya Suin
A. N. Rajagopalan
ViT
MDE
16
4
0
20 Nov 2022
Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training
Zhenglun Kong
Haoyu Ma
Geng Yuan
Mengshu Sun
Yanyue Xie
...
Tianlong Chen
Xiaolong Ma
Xiaohui Xie
Zhangyang Wang
Yanzhi Wang
ViT
26
22
0
19 Nov 2022
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
Maoyuan Ye
Jing Zhang
Shanshan Zhao
Juhua Liu
Tongliang Liu
Bo Du
Dacheng Tao
36
70
0
19 Nov 2022
Explanation on Pretraining Bias of Finetuned Vision Transformer
Bumjin Park
Jaesik Choi
ViT
29
1
0
18 Nov 2022
Compressing Transformer-based self-supervised models for speech processing
Tzu-Quan Lin
Tsung-Huan Yang
Chun-Yao Chang
Kuang-Ming Chen
Tzu-hsun Feng
Hung-yi Lee
Hao Tang
32
6
0
17 Nov 2022
Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information
Weijie Su
Xizhou Zhu
Chenxin Tao
Lewei Lu
Bin Li
Gao Huang
Yu Qiao
Xiaogang Wang
Jie Zhou
Jifeng Dai
34
41
0
17 Nov 2022
Multi-Camera Multi-Object Tracking on the Move via Single-Stage Global Association Approach
Pha Nguyen
Kha Gia Quach
C. Duong
S. L. Phung
Ngan Le
Khoa Luu
41
12
0
17 Nov 2022
CPT-V: A Contrastive Approach to Post-Training Quantization of Vision Transformers
N. Frumkin
Dibakar Gope
Diana Marculescu
ViT
MQ
21
1
0
17 Nov 2022
Hypergraph Transformer for Skeleton-based Action Recognition
Yuxuan Zhou
Zhi-Qi Cheng
C. Li
Yanwen Fang
Yifeng Geng
Xuansong Xie
M. Keuper
ViT
18
52
0
17 Nov 2022
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
Limin Wang
Yu Qiao
ViT
25
106
0
17 Nov 2022
HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers
Peiyan Dong
Mengshu Sun
Alec Lu
Yanyue Xie
Li-Yu Daisy Liu
...
Xin Meng
Z. Li
Xue Lin
Zhenman Fang
Yanzhi Wang
ViT
26
58
0
15 Nov 2022
Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling
Yu Wang
Xin Li
Shengzhao Wen
Fu-En Yang
Wanping Zhang
Gang Zhang
Haocheng Feng
Junyu Han
Errui Ding
37
5
0
15 Nov 2022
YORO -- Lightweight End to End Visual Grounding
Chih-Hui Ho
Srikar Appalaraju
Bhavan A. Jasani
R. Manmatha
Nuno Vasconcelos
ObjD
21
21
0
15 Nov 2022
ParCNetV2: Oversized Kernel with Enhanced Attention
Ruihan Xu
Haokui Zhang
Wenze Hu
Shiliang Zhang
Xiaoyu Wang
ViT
25
6
0
14 Nov 2022
BiViT: Extremely Compressed Binary Vision Transformer
Yefei He
Zhenyu Lou
Luoming Zhang
Jing Liu
Weijia Wu
Hong Zhou
Bohan Zhuang
ViT
MQ
18
28
0
14 Nov 2022
Training a Vision Transformer from scratch in less than 24 hours with 1 GPU
Saghar Irandoust
Thibaut Durand
Yunduz Rakhmangulova
Wenjie Zi
Hossein Hajimirsadeghi
ViT
33
6
0
09 Nov 2022
ViTALiTy: Unifying Low-rank and Sparse Approximation for Vision Transformer Acceleration with a Linear Taylor Attention
Jyotikrishna Dass
Shang Wu
Huihong Shi
Chaojian Li
Zhifan Ye
Zhongfeng Wang
Yingyan Lin
17
49
0
09 Nov 2022
Masked Vision-Language Transformers for Scene Text Recognition
Jie Wu
Ying Peng
Shenmin Zhang
Weigang Qi
Jian Andrew Zhang
27
3
0
09 Nov 2022
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation
Florian Schmid
Khaled Koutini
Gerhard Widmer
ViT
20
58
0
09 Nov 2022
Harmonizing the object recognition strategies of deep neural networks with humans
Thomas Fel
Ivan Felipe
Drew Linsley
Thomas Serre
30
71
0
08 Nov 2022
ViT-CX: Causal Explanation of Vision Transformers
Weiyan Xie
Xiao-hui Li
Caleb Chen Cao
Nevin L. Zhang
ViT
24
17
0
06 Nov 2022
MPCFormer: fast, performant and private Transformer inference with MPC
Dacheng Li
Rulin Shao
Hongyi Wang
Han Guo
Eric P. Xing
Haotong Zhang
13
79
0
02 Nov 2022
Fine-grained Visual-Text Prompt-Driven Self-Training for Open-Vocabulary Object Detection
Yanxin Long
Jianhua Han
Runhu Huang
Hang Xu
Yi Zhu
Chunjing Xu
Xiaodan Liang
VLM
ObjD
22
18
0
02 Nov 2022