ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2104.06399
  4. Cited By
Co-Scale Conv-Attentional Image Transformers

Co-Scale Conv-Attentional Image Transformers

13 April 2021
Weijian Xu
Yifan Xu
Tyler A. Chang
Z. Tu
    ViT
ArXivPDFHTML

Papers citing "Co-Scale Conv-Attentional Image Transformers"

50 / 74 papers shown
Title
RingMo-Aerial: An Aerial Remote Sensing Foundation Model With A Affine Transformation Contrastive Learning
RingMo-Aerial: An Aerial Remote Sensing Foundation Model With A Affine Transformation Contrastive Learning
Wenhui Diao
Haichen Yu
Kaiyue Kang
Tong Ling
Di Liu
...
Hanbo Bi
Libo Ren
Xuexue Li
Yongqiang Mao
Xian Sun
34
1
0
20 Sep 2024
MacFormer: Semantic Segmentation with Fine Object Boundaries
MacFormer: Semantic Segmentation with Fine Object Boundaries
Guoan Xu
Wenfeng Huang
Tao Wu
Ligeng Chen
Wenjing Jia
Guangwei Gao
Xiatian Zhu
Stuart W. Perry
40
0
0
11 Aug 2024
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Ali Hatamizadeh
Jan Kautz
Mamba
40
56
0
10 Jul 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
48
6
0
06 Jun 2024
LookHere: Vision Transformers with Directed Attention Generalize and
  Extrapolate
LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
A. Fuller
Daniel G. Kyrollos
Yousef Yassin
James R. Green
46
2
0
22 May 2024
PAUMER: Patch Pausing Transformer for Semantic Segmentation
PAUMER: Patch Pausing Transformer for Semantic Segmentation
Evann Courdier
Prabhu Teja Sivaprasad
F. Fleuret
34
2
0
01 Nov 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
31
4
0
10 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
40
3
0
08 Oct 2023
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Tobias Christian Nauen
Sebastián M. Palacio
Federico Raue
Andreas Dengel
42
3
0
18 Aug 2023
Distributionally Robust Classification on a Data Budget
Distributionally Robust Classification on a Data Budget
Ben Feuer
Ameya Joshi
Minh Pham
C. Hegde
OOD
32
2
0
07 Aug 2023
Lightweight Vision Transformer with Bidirectional Interaction
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
37
28
0
01 Jun 2023
Two-Stream Regression Network for Dental Implant Position Prediction
Two-Stream Regression Network for Dental Implant Position Prediction
Xinquan Yang
Xuguang Li
Xuechen Li
Wenting Chen
Linlin Shen
X. Li
Yongqiang Deng
20
6
0
17 May 2023
Diabetic Foot Ulcer Grand Challenge 2022 Summary
Diabetic Foot Ulcer Grand Challenge 2022 Summary
Connah Kendrick
B. Cassidy
N. Reeves
Joseph M Pappachan
C. O'Shea
Vishnu Chandrabalan
Moi Hoon Yap
17
4
0
24 Apr 2023
Pretrained ViTs Yield Versatile Representations For Medical Images
Pretrained ViTs Yield Versatile Representations For Medical Images
Christos Matsoukas
Johan Fredin Haslum
Magnus P Soderberg
Kevin Smith
MedIm
ViT
19
11
0
13 Mar 2023
MedViT: A Robust Vision Transformer for Generalized Medical Image
  Classification
MedViT: A Robust Vision Transformer for Generalized Medical Image Classification
Omid Nejati Manzari
Hamid Ahmadabadi
Hossein Kashiani
S. B. Shokouhi
Ahmad Ayatollahi
ViT
MedIm
28
177
0
19 Feb 2023
Efficiency 360: Efficient Vision Transformers
Efficiency 360: Efficient Vision Transformers
Badri N. Patro
Vijay Srinivas Agneeswaran
26
6
0
16 Feb 2023
Semantic Image Segmentation: Two Decades of Research
Semantic Image Segmentation: Two Decades of Research
G. Csurka
Riccardo Volpi
Boris Chidlovskii
3DV
29
50
0
13 Feb 2023
Out of Distribution Performance of State of Art Vision Model
Out of Distribution Performance of State of Art Vision Model
Salman Rahman
W. Lee
35
2
0
25 Jan 2023
Exploiting the Generative Adversarial Network Approach to Create a
  Synthetic Topography Corneal Image
Exploiting the Generative Adversarial Network Approach to Create a Synthetic Topography Corneal Image
S. Jameel
S. Aydin
N. Ghaeb
Jafar Majidpour
Tarik A. Rashid
Sinan Q. Salih
P. S. JosephNg
GAN
MedIm
14
16
0
25 Dec 2022
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group
  Propagation
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Chenhongyi Yang
Jiarui Xu
Shalini De Mello
Elliot J. Crowley
X. Wang
ViT
32
21
0
13 Dec 2022
Efficient Frequency Domain-based Transformers for High-Quality Image
  Deblurring
Efficient Frequency Domain-based Transformers for High-Quality Image Deblurring
Lingshun Kong
Jiangxin Dong
Mingqiang Li
J. Ge
Jin-shan Pan
ViT
22
142
0
22 Nov 2022
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Mingg-Ming Cheng
Jiashi Feng
ViT
31
129
0
22 Nov 2022
Effective Audio Classification Network Based on Paired Inverse Pyramid
  Structure and Dense MLP Block
Effective Audio Classification Network Based on Paired Inverse Pyramid Structure and Dense MLP Block
Yunhao Chen
Yunjie Zhu
Zihui Yan
Yifan Huang
Zhen Ren
Jianlu Shen
Lifang Chen
20
9
0
05 Nov 2022
Automatic Diagnosis of Myocarditis Disease in Cardiac MRI Modality using
  Deep Transformers and Explainable Artificial Intelligence
Automatic Diagnosis of Myocarditis Disease in Cardiac MRI Modality using Deep Transformers and Explainable Artificial Intelligence
M. Jafari
A. Shoeibi
Navid Ghassemi
Jónathan Heras
Saiguang Ling
...
Shuihua Wang
R. Alizadehsani
Juan M Gorriz
U. Acharya
Hamid Alinejad-Rokny
MedIm
20
11
0
26 Oct 2022
Explicitly Increasing Input Information Density for Vision Transformers
  on Small Datasets
Explicitly Increasing Input Information Density for Vision Transformers on Small Datasets
Xiangyu Chen
Ying Qin
Wenju Xu
A. Bur
Cuncong Zhong
Guanghui Wang
ViT
38
3
0
25 Oct 2022
Boosting vision transformers for image retrieval
Boosting vision transformers for image retrieval
Chull Hwan Song
Jooyoung Yoon
Shunghyun Choi
Yannis Avrithis
ViT
29
31
0
21 Oct 2022
TokenMixup: Efficient Attention-guided Token-level Data Augmentation for
  Transformers
TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers
Hyeong Kyu Choi
Joonmyung Choi
Hyunwoo J. Kim
ViT
28
35
0
14 Oct 2022
Bridging the Gap Between Vision Transformers and Convolutional Neural
  Networks on Small Datasets
Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets
Zhiying Lu
Hongtao Xie
Chuanbin Liu
Yongdong Zhang
ViT
20
57
0
12 Oct 2022
SaiT: Sparse Vision Transformers through Adaptive Token Pruning
SaiT: Sparse Vision Transformers through Adaptive Token Pruning
Ling Li
D. Thorsley
Joseph Hassoun
ViT
25
17
0
11 Oct 2022
Coded Residual Transform for Generalizable Deep Metric Learning
Coded Residual Transform for Generalizable Deep Metric Learning
Shichao Kan
Yixiong Liang
Min Li
Yigang Cen
Jianxin Wang
Z. He
34
3
0
09 Oct 2022
Polyhistor: Parameter-Efficient Multi-Task Adaptation for Dense Vision
  Tasks
Polyhistor: Parameter-Efficient Multi-Task Adaptation for Dense Vision Tasks
Yen-Cheng Liu
Chih-Yao Ma
Junjiao Tian
Zijian He
Z. Kira
120
47
0
07 Oct 2022
MobileViTv3: Mobile-Friendly Vision Transformer with Simple and
  Effective Fusion of Local, Global and Input Features
MobileViTv3: Mobile-Friendly Vision Transformer with Simple and Effective Fusion of Local, Global and Input Features
S. Wadekar
Abhishek Chaurasia
ViT
98
87
0
30 Sep 2022
Effective Vision Transformer Training: A Data-Centric Perspective
Effective Vision Transformer Training: A Data-Centric Perspective
Benjia Zhou
Pichao Wang
Jun Wan
Yan-Ni Liang
Fan Wang
26
5
0
29 Sep 2022
Rethinking Blur Synthesis for Deep Real-World Image Deblurring
Rethinking Blur Synthesis for Deep Real-World Image Deblurring
Hao Wei
Chenyang Ge
Xin Qiao
Pengchao Deng
22
0
0
28 Sep 2022
Jigsaw-ViT: Learning Jigsaw Puzzles in Vision Transformer
Jigsaw-ViT: Learning Jigsaw Puzzles in Vision Transformer
Yingyi Chen
Xiaoke Shen
Yahui Liu
Qinghua Tao
Johan A. K. Suykens
AAML
ViT
23
22
0
25 Jul 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary
  Algorithm
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
34
32
0
19 Jun 2022
GateHUB: Gated History Unit with Background Suppression for Online
  Action Detection
GateHUB: Gated History Unit with Background Suppression for Online Action Detection
Junwen Chen
Gaurav Mittal
Ye Yu
Yu Kong
Mei Chen
36
33
0
09 Jun 2022
Which models are innately best at uncertainty estimation?
Which models are innately best at uncertainty estimation?
Ido Galil
Mohammed Dabbah
Ran El-Yaniv
UQCV
26
5
0
05 Jun 2022
Inception Transformer
Inception Transformer
Chenyang Si
Weihao Yu
Pan Zhou
Yichen Zhou
Xinchao Wang
Shuicheng Yan
ViT
26
187
0
25 May 2022
MulT: An End-to-End Multitask Learning Transformer
MulT: An End-to-End Multitask Learning Transformer
Deblina Bhattacharjee
Tong Zhang
Sabine Süsstrunk
Mathieu Salzmann
ViT
36
62
0
17 May 2022
DaViT: Dual Attention Vision Transformers
DaViT: Dual Attention Vision Transformers
Mingyu Ding
Bin Xiao
Noel Codella
Ping Luo
Jingdong Wang
Lu Yuan
ViT
32
240
0
07 Apr 2022
Learning Local and Global Temporal Contexts for Video Semantic
  Segmentation
Learning Local and Global Temporal Contexts for Video Semantic Segmentation
Guolei Sun
Yun Liu
Henghui Ding
Min Wu
Luc Van Gool
25
32
0
07 Apr 2022
Unleashing Vanilla Vision Transformer with Masked Image Modeling for
  Object Detection
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
Yuxin Fang
Shusheng Yang
Shijie Wang
Yixiao Ge
Ying Shan
Xinggang Wang
23
55
0
06 Apr 2022
SepViT: Separable Vision Transformer
SepViT: Separable Vision Transformer
Wei Li
Xing Wang
Xin Xia
Jie Wu
Jiashi Li
Xuefeng Xiao
Min Zheng
Shiping Wen
ViT
26
40
0
29 Mar 2022
Beyond Fixation: Dynamic Window Visual Transformer
Beyond Fixation: Dynamic Window Visual Transformer
Pengzhen Ren
Changlin Li
Guangrun Wang
Yun Xiao
Qing Du
Xiaodan Liang
Qing Du Xiaodan Liang Xiaojun Chang
ViT
25
32
0
24 Mar 2022
Focal Modulation Networks
Focal Modulation Networks
Jianwei Yang
Chunyuan Li
Xiyang Dai
Lu Yuan
Jianfeng Gao
3DPC
24
263
0
22 Mar 2022
ScalableViT: Rethinking the Context-oriented Generalization of Vision
  Transformer
ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer
Rui Yang
Hailong Ma
Jie Wu
Yansong Tang
Xuefeng Xiao
Min Zheng
Xiu Li
ViT
19
53
0
21 Mar 2022
HIPA: Hierarchical Patch Transformer for Single Image Super Resolution
HIPA: Hierarchical Patch Transformer for Single Image Super Resolution
Qing Cai
Yiming Qian
Jinxing Li
Junjie Lv
Yee-Hong Yang
Feng Wu
Dafan Zhang
19
28
0
19 Mar 2022
Visual Attention Network
Visual Attention Network
Meng-Hao Guo
Chengrou Lu
Zheng-Ning Liu
Ming-Ming Cheng
Shiyong Hu
ViT
VLM
19
637
0
20 Feb 2022
How Do Vision Transformers Work?
How Do Vision Transformers Work?
Namuk Park
Songkuk Kim
ViT
32
465
0
14 Feb 2022
12
Next