ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias


7 June 2021
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
    ViT

Papers citing "ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias"

47 / 197 papers shown
Title
ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation
Yufei Xu
Jing Zhang
Qiming Zhang
Dacheng Tao
ViT
22
509
0
26 Apr 2022
Adaptive Split-Fusion Transformer
Zixuan Su
Hao Zhang
Jingjing Chen
Lei Pang
Chong-Wah Ngo
Yu-Gang Jiang
ViT
19
7
0
26 Apr 2022
VSA: Learning Varied-Size Window Attention in Vision Transformers
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
22
53
0
18 Apr 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
30
240
0
07 Apr 2022
An Empirical Study of Remote Sensing Pretraining
Di Wang
Jing Zhang
Bo Du
Guisong Xia
Dacheng Tao
EDL
23
190
0
06 Apr 2022
BMD: A General Class-balanced Multicentric Dynamic Prototype Strategy for Source-free Domain Adaptation
Sanqing Qu
Guang Chen
Jing Zhang
Zhijun Li
Wei He
Dacheng Tao
TTA
28
54
0
06 Apr 2022
Dynamic Focus-aware Positional Queries for Semantic Segmentation
Haoyu He
Jianfei Cai
Zizheng Pan
Jing Liu
Jing Zhang
Dacheng Tao
Bohan Zhuang
34
16
0
04 Apr 2022
Rethinking Portrait Matting with Privacy Preserving
Sihan Ma
Jizhizi Li
Jing Zhang
He-jun Zhang
Dacheng Tao
18
23
0
31 Mar 2022
Towards Data-Efficient Detection Transformers
Wen Wang
Jing Zhang
Yang Cao
Yongliang Shen
Dacheng Tao
ViT
18
57
0
17 Mar 2022
TransCAM: Transformer Attention-based CAM Refinement for Weakly Supervised Semantic Segmentation
Ruiwen Li
Zheda Mai
C. Trabelsi
Zhibo Zhang
Jongseong Jang
Scott Sanner
ViT
20
61
0
14 Mar 2022
SATr: Slice Attention with Transformer for Universal Lesion Detection
Hantao Li
Long Chen
H. Han
S. Kevin Zhou
ViT
MedIm
27
25
0
13 Mar 2022
Information-Theoretic Odometry Learning
Sen Zhang
Jing Zhang
Dacheng Tao
15
5
0
11 Mar 2022
Where Does the Performance Improvement Come From? -- A Reproducibility Concern about Image-Text Retrieval
Jun Rao
Fei-Yue Wang
Liang Ding
Shuhan Qi
Yibing Zhan
Weifeng Liu
Dacheng Tao
OOD
29
28
0
08 Mar 2022
Boosting Crowd Counting via Multifaceted Attention
Hui Lin
Zhiheng Ma
Rongrong Ji
Yaowei Wang
Xiaopeng Hong
23
145
0
05 Mar 2022
Ensembles of Vision Transformers as a New Paradigm for Automated Classification in Ecology
S. Kyathanahally
T. Hardeman
M. Reyes
E. Merz
T. Bulas
P. Brun
F. Pomati
M. Baity-Jesi
27
15
0
03 Mar 2022
ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
ViT
22
229
0
21 Feb 2022
Visual Attention Network
Meng-Hao Guo
Chengrou Lu
Zheng-Ning Liu
Ming-Ming Cheng
Shiyong Hu
ViT
VLM
17
636
0
20 Feb 2022
ReconFormer: Accelerated MRI Reconstruction Using Recurrent Transformer
Pengfei Guo
Yiqun Mei
Jinyuan Zhou
Shanshan Jiang
Vishal M. Patel
ViT
MedIm
76
65
0
23 Jan 2022
Siamese Network with Interactive Transformer for Video Object Segmentation
Meng Lan
Jing Zhang
Fengxiang He
Lefei Zhang
ViT
13
36
0
28 Dec 2021
MetaCVR: Conversion Rate Prediction via Meta Learning in Small-Scale Recommendation Scenarios
Xiaofeng Pan
Ming Li
Jing Zhang
Keren Yu
Luping Wang
Hong Wen
Chengjun Mao
Bo Cao
8
8
0
27 Dec 2021
Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition
Y. He
Chen Chen
Jing Zhang
Juhua Liu
Fengxiang He
Chaoyue Wang
Bo Du
34
55
0
24 Dec 2021
MPViT: Multi-Path Vision Transformer for Dense Prediction
Youngwan Lee
Jonghee Kim
Jeffrey Willette
Sung Ju Hwang
ViT
27
243
0
21 Dec 2021
Recurrent Glimpse-based Decoder for Detection with Transformer
Zhe Chen
Jing Zhang
Dacheng Tao
ViT
22
30
0
09 Dec 2021
PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation
Haobo Yuan
Xiangtai Li
Yibo Yang
Guangliang Cheng
Jing Zhang
Yunhai Tong
Lefei Zhang
Dacheng Tao
MDE
30
42
0
05 Dec 2021
FIBA: Frequency-Injection based Backdoor Attack in Medical Image Analysis
Yu Feng
Benteng Ma
Jing Zhang
Shanshan Zhao
Yong-quan Xia
Dacheng Tao
AAML
18
84
0
02 Dec 2021
Background Activation Suppression for Weakly Supervised Object Localization
Ping Wu
Wei Zhai
Yang Cao
WSOL
32
50
0
01 Dec 2021
Hierarchical Prototype Networks for Continual Graph Representation Learning
Xikun Zhang
Dongjin Song
Dacheng Tao
CLL
29
31
0
30 Nov 2021
GMFlow: Learning Optical Flow via Global Matching
Haofei Xu
Jing Zhang
Jianfei Cai
Hamid Rezatofighi
Dacheng Tao
51
342
0
26 Nov 2021
Pruning Self-attentions into Convolutional Layers in Single Path
Haoyu He
Jianfei Cai
Jing Liu
Zizheng Pan
Jing Zhang
Dacheng Tao
Bohan Zhuang
ViT
29
40
0
23 Nov 2021
Rethinking Query, Key, and Value Embedding in Vision Transformer under Tiny Model Constraints
Jaesin Ahn
Jiuk Hong
Jeongwoo Ju
Heechul Jung
ViT
24
3
0
19 Nov 2021
Convolutional Gated MLP: Combining Convolutions & gMLP
A. Rajagopal
V. Nirmala
26
14
0
06 Nov 2021
Ripple Attention for Visual Perception with Sub-quadratic Complexity
Lin Zheng
Huijie Pan
Lingpeng Kong
21
3
0
06 Oct 2021
RobustART: Benchmarking Robustness on Architecture Design and Training Techniques
Shiyu Tang
Ruihao Gong
Yan Wang
Aishan Liu
Jiakai Wang
...
Xianglong Liu
D. Song
Alan Yuille
Philip H. S. Torr
Dacheng Tao
VLM
AAML
21
106
0
11 Sep 2021
One-Shot Object Affordance Detection in the Wild
Wei Zhai
Hongcheng Luo
Jing Zhang
Yang Cao
Dacheng Tao
71
44
0
08 Aug 2021
I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection
Bo Du
Jian Ye
Jing Zhang
Juhua Liu
Dacheng Tao
VLM
26
29
0
03 Aug 2021
CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale Attention
Wenxiao Wang
Lulian Yao
Long Chen
Binbin Lin
Deng Cai
Xiaofei He
Wei Liu
32
256
0
31 Jul 2021
The USYD-JD Speech Translation System for IWSLT 2021
Liang Ding
Di Wu
Dacheng Tao
24
16
0
24 Jul 2021
DSP: Dual Soft-Paste for Unsupervised Domain Adaptive Semantic Segmentation
Li Gao
Jing Zhang
Lefei Zhang
Dacheng Tao
18
84
0
20 Jul 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
239
2,600
0
04 May 2021
End-to-end One-shot Human Parsing
Haoyu He
Bohan Zhuang
Jing Zhang
Jianfei Cai
Dacheng Tao
VLM
32
8
0
04 May 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
282
1,523
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,622
0
24 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Tsung-Yi Lin
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
270
979
0
27 Jan 2021
SIR: Self-supervised Image Rectification via Seeing the Same Scene from Multiple Different Lenses
Jinlong Fan
Jing Zhang
Dacheng Tao
19
10
0
30 Nov 2020
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
948
20,549
0
17 Apr 2017
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
282
10,214
0
16 Nov 2016
Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou
Hang Zhao
Xavier Puig
Tete Xiao
Sanja Fidler
Adela Barriuso
Antonio Torralba
SSeg
251
1,824
0
18 Aug 2016