Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2104.06399
Cited By
Co-Scale Conv-Attentional Image Transformers
13 April 2021
Weijian Xu
Yifan Xu
Tyler A. Chang
Z. Tu
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Co-Scale Conv-Attentional Image Transformers"
50 / 73 papers shown
Title
RingMo-Aerial: An Aerial Remote Sensing Foundation Model With A Affine Transformation Contrastive Learning
Wenhui Diao
Haichen Yu
Kaiyue Kang
Tong Ling
Di Liu
...
Hanbo Bi
Libo Ren
Xuexue Li
Yongqiang Mao
Xian Sun
34
1
0
20 Sep 2024
MacFormer: Semantic Segmentation with Fine Object Boundaries
Guoan Xu
Wenfeng Huang
Tao Wu
Ligeng Chen
Wenjing Jia
Guangwei Gao
Xiatian Zhu
Stuart W. Perry
40
0
0
11 Aug 2024
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Ali Hatamizadeh
Jan Kautz
Mamba
40
56
0
10 Jul 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
48
6
0
06 Jun 2024
LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
A. Fuller
Daniel G. Kyrollos
Yousef Yassin
James R. Green
46
2
0
22 May 2024
PAUMER: Patch Pausing Transformer for Semantic Segmentation
Evann Courdier
Prabhu Teja Sivaprasad
F. Fleuret
31
2
0
01 Nov 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
31
4
0
10 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
40
3
0
08 Oct 2023
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Tobias Christian Nauen
Sebastián M. Palacio
Federico Raue
Andreas Dengel
42
3
0
18 Aug 2023
Distributionally Robust Classification on a Data Budget
Ben Feuer
Ameya Joshi
Minh Pham
C. Hegde
OOD
32
2
0
07 Aug 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
37
28
0
01 Jun 2023
Two-Stream Regression Network for Dental Implant Position Prediction
Xinquan Yang
Xuguang Li
Xuechen Li
Wenting Chen
Linlin Shen
X. Li
Yongqiang Deng
20
6
0
17 May 2023
Diabetic Foot Ulcer Grand Challenge 2022 Summary
Connah Kendrick
B. Cassidy
N. Reeves
Joseph M Pappachan
C. O'Shea
Vishnu Chandrabalan
Moi Hoon Yap
17
4
0
24 Apr 2023
Pretrained ViTs Yield Versatile Representations For Medical Images
Christos Matsoukas
Johan Fredin Haslum
Magnus P Soderberg
Kevin Smith
MedIm
ViT
19
11
0
13 Mar 2023
MedViT: A Robust Vision Transformer for Generalized Medical Image Classification
Omid Nejati Manzari
Hamid Ahmadabadi
Hossein Kashiani
S. B. Shokouhi
Ahmad Ayatollahi
ViT
MedIm
28
177
0
19 Feb 2023
Efficiency 360: Efficient Vision Transformers
Badri N. Patro
Vijay Srinivas Agneeswaran
26
6
0
16 Feb 2023
Semantic Image Segmentation: Two Decades of Research
G. Csurka
Riccardo Volpi
Boris Chidlovskii
3DV
29
50
0
13 Feb 2023
Out of Distribution Performance of State of Art Vision Model
Salman Rahman
W. Lee
32
2
0
25 Jan 2023
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Chenhongyi Yang
Jiarui Xu
Shalini De Mello
Elliot J. Crowley
X. Wang
ViT
32
21
0
13 Dec 2022
Efficient Frequency Domain-based Transformers for High-Quality Image Deblurring
Lingshun Kong
Jiangxin Dong
Mingqiang Li
J. Ge
Jin-shan Pan
ViT
22
141
0
22 Nov 2022
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Mingg-Ming Cheng
Jiashi Feng
ViT
31
129
0
22 Nov 2022
Effective Audio Classification Network Based on Paired Inverse Pyramid Structure and Dense MLP Block
Yunhao Chen
Yunjie Zhu
Zihui Yan
Yifan Huang
Zhen Ren
Jianlu Shen
Lifang Chen
20
9
0
05 Nov 2022
Automatic Diagnosis of Myocarditis Disease in Cardiac MRI Modality using Deep Transformers and Explainable Artificial Intelligence
M. Jafari
A. Shoeibi
Navid Ghassemi
Jónathan Heras
Saiguang Ling
...
Shuihua Wang
R. Alizadehsani
Juan M Gorriz
U. Acharya
Hamid Alinejad-Rokny
MedIm
20
11
0
26 Oct 2022
Explicitly Increasing Input Information Density for Vision Transformers on Small Datasets
Xiangyu Chen
Ying Qin
Wenju Xu
A. Bur
Cuncong Zhong
Guanghui Wang
ViT
38
3
0
25 Oct 2022
Boosting vision transformers for image retrieval
Chull Hwan Song
Jooyoung Yoon
Shunghyun Choi
Yannis Avrithis
ViT
29
31
0
21 Oct 2022
TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers
Hyeong Kyu Choi
Joonmyung Choi
Hyunwoo J. Kim
ViT
28
35
0
14 Oct 2022
Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets
Zhiying Lu
Hongtao Xie
Chuanbin Liu
Yongdong Zhang
ViT
15
57
0
12 Oct 2022
SaiT: Sparse Vision Transformers through Adaptive Token Pruning
Ling Li
D. Thorsley
Joseph Hassoun
ViT
25
17
0
11 Oct 2022
Coded Residual Transform for Generalizable Deep Metric Learning
Shichao Kan
Yixiong Liang
Min Li
Yigang Cen
Jianxin Wang
Z. He
34
3
0
09 Oct 2022
Polyhistor: Parameter-Efficient Multi-Task Adaptation for Dense Vision Tasks
Yen-Cheng Liu
Chih-Yao Ma
Junjiao Tian
Zijian He
Z. Kira
120
47
0
07 Oct 2022
MobileViTv3: Mobile-Friendly Vision Transformer with Simple and Effective Fusion of Local, Global and Input Features
S. Wadekar
Abhishek Chaurasia
ViT
98
87
0
30 Sep 2022
Effective Vision Transformer Training: A Data-Centric Perspective
Benjia Zhou
Pichao Wang
Jun Wan
Yan-Ni Liang
Fan Wang
26
5
0
29 Sep 2022
Rethinking Blur Synthesis for Deep Real-World Image Deblurring
Hao Wei
Chenyang Ge
Xin Qiao
Pengchao Deng
19
0
0
28 Sep 2022
Jigsaw-ViT: Learning Jigsaw Puzzles in Vision Transformer
Yingyi Chen
Xiaoke Shen
Yahui Liu
Qinghua Tao
Johan A. K. Suykens
AAML
ViT
23
22
0
25 Jul 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
34
32
0
19 Jun 2022
GateHUB: Gated History Unit with Background Suppression for Online Action Detection
Junwen Chen
Gaurav Mittal
Ye Yu
Yu Kong
Mei Chen
33
33
0
09 Jun 2022
Which models are innately best at uncertainty estimation?
Ido Galil
Mohammed Dabbah
Ran El-Yaniv
UQCV
26
5
0
05 Jun 2022
Inception Transformer
Chenyang Si
Weihao Yu
Pan Zhou
Yichen Zhou
Xinchao Wang
Shuicheng Yan
ViT
26
187
0
25 May 2022
MulT: An End-to-End Multitask Learning Transformer
Deblina Bhattacharjee
Tong Zhang
Sabine Süsstrunk
Mathieu Salzmann
ViT
36
62
0
17 May 2022
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
30
240
0
07 Apr 2022
Learning Local and Global Temporal Contexts for Video Semantic Segmentation
Guolei Sun
Yun Liu
Henghui Ding
Min Wu
Luc Van Gool
25
32
0
07 Apr 2022
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
Yuxin Fang
Shusheng Yang
Shijie Wang
Yixiao Ge
Ying Shan
Xinggang Wang
23
55
0
06 Apr 2022
SepViT: Separable Vision Transformer
Wei Li
Xing Wang
Xin Xia
Jie Wu
Jiashi Li
Xuefeng Xiao
Min Zheng
Shiping Wen
ViT
26
39
0
29 Mar 2022
Beyond Fixation: Dynamic Window Visual Transformer
Pengzhen Ren
Changlin Li
Guangrun Wang
Yun Xiao
Qing Du
Xiaodan Liang
Qing Du Xiaodan Liang Xiaojun Chang
ViT
25
32
0
24 Mar 2022
Focal Modulation Networks
Jianwei Yang
Chunyuan Li
Xiyang Dai
Lu Yuan
Jianfeng Gao
3DPC
24
263
0
22 Mar 2022
ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer
Rui Yang
Hailong Ma
Jie Wu
Yansong Tang
Xuefeng Xiao
Min Zheng
Xiu Li
ViT
19
53
0
21 Mar 2022
HIPA: Hierarchical Patch Transformer for Single Image Super Resolution
Qing Cai
Yiming Qian
Jinxing Li
Junjie Lv
Yee-Hong Yang
Feng Wu
Dafan Zhang
19
28
0
19 Mar 2022
Visual Attention Network
Meng-Hao Guo
Chengrou Lu
Zheng-Ning Liu
Ming-Ming Cheng
Shiyong Hu
ViT
VLM
19
637
0
20 Feb 2022
How Do Vision Transformers Work?
Namuk Park
Songkuk Kim
ViT
32
465
0
14 Feb 2022
Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations
Amin Ghiasi
Hamid Kazemi
Steven Reich
Chen Zhu
Micah Goldblum
Tom Goldstein
34
15
0
31 Jan 2022
1
2
Next