Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2106.05786
Cited By
CAT: Cross Attention in Vision Transformer
10 June 2021
Hezheng Lin
Xingyi Cheng
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Qing Song
Wei Yuan
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CAT: Cross Attention in Vision Transformer"
18 / 18 papers shown
Title
Multimodal Graph Representation Learning for Robust Surgical Workflow Recognition with Adversarial Feature Disentanglement
Long Bai
Boyi Ma
Ruohan Wang
Guankun Wang
Beilei Cui
...
Mobarakol Islam
Zhe Min
Jiewen Lai
Nassir Navab
Hongliang Ren
41
0
0
03 May 2025
Learning Joint ID-Textual Representation for ID-Preserving Image Synthesis
Zichuan Liu
Liming Jiang
Qing Yan
Yumin Jia
Hao Kang
Xin Lu
DiffM
29
0
0
19 Apr 2025
ESIQA: Perceptual Quality Assessment of Vision-Pro-based Egocentric Spatial Images
Zhirui Kuai
Liu Yang
Huiyu Duan
Yuxing Han
Guoyu Tang
P. Callet
76
2
0
24 Feb 2025
BackdoorDM: A Comprehensive Benchmark for Backdoor Learning in Diffusion Model
Weilin Lin
Nanjun Zhou
Y. Wang
Jianze Li
Hui Xiong
Li Liu
AAML
DiffM
107
0
0
17 Feb 2025
Complementary Advantages: Exploiting Cross-Field Frequency Correlation for NIR-Assisted Image Denoising
Y. Wang
Hongyuan Wang
Lizhi Wang
X. Wang
Lin Zhu
Wanxuan Lu
Hua Huang
84
1
0
21 Dec 2024
Learning the Generalizable Manipulation Skills on Soft-body Tasks via Guided Self-attention Behavior Cloning Policy
XueTao Li
Fang Gao
Jun Yu
Shaodong Li
Feng Shuang
LM&Ro
16
0
0
08 Oct 2024
3Mformer: Multi-order Multi-mode Transformer for Skeletal Action Recognition
Lei Wang
Piotr Koniusz
ViT
21
45
0
25 Mar 2023
Towards Simultaneous Segmentation of Liver Tumors and Intrahepatic Vessels via Cross-attention Mechanism
Haopeng Kuang
Dingkang Yang
Shunli Wang
Xiaoying Wang
Lihua Zhang
MedIm
17
11
0
20 Feb 2023
Learning Cross-Image Object Semantic Relation in Transformer for Few-Shot Fine-Grained Image Classification
Bo-Wen Zhang
Jiakang Yuan
Baopu Li
Tao Chen
Jiayuan Fan
Botian Shi
ViT
9
31
0
02 Jul 2022
BiTr-Unet: a CNN-Transformer Combined Network for MRI Brain Tumor Segmentation
Qiran Jia
Hai Shu
ViT
MedIm
88
68
0
25 Sep 2021
CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale Attention
Wenxiao Wang
Lulian Yao
Long Chen
Binbin Lin
Deng Cai
Xiaofei He
Wei Liu
17
255
0
31 Jul 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
282
1,518
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,604
0
24 Feb 2021
High-Performance Large-Scale Image Recognition Without Normalization
Andrew Brock
Soham De
Samuel L. Smith
Karen Simonyan
VLM
220
510
0
11 Feb 2021
Conditional Convolutions for Instance Segmentation
Zhi Tian
Chunhua Shen
Hao Chen
ISeg
169
597
0
12 Mar 2020
K-BERT: Enabling Language Representation with Knowledge Graph
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Qi Ju
Haotang Deng
Ping Wang
229
776
0
17 Sep 2019
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
M. Andreetto
Hartwig Adam
3DH
948
20,471
0
17 Apr 2017
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
261
10,196
0
16 Nov 2016
1