ResearchTrend.AI
arXiv:2106.14881
Early Convolutions Help Transformers See Better


28 June 2021
Tete Xiao
Mannat Singh
Eric Mintun
Trevor Darrell
Piotr Dollár
Ross B. Girshick

Papers citing "Early Convolutions Help Transformers See Better"

50 / 114 papers shown
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
87
0
0
27 Feb 2025
iFormer: Integrating ConvNet and Transformer for Mobile Application
Chuanyang Zheng
ViT
70
0
0
26 Jan 2025
RobustBlack: Challenging Black-Box Adversarial Attacks on State-of-the-Art Defenses
Mohamed Djilani
Salah Ghamizi
Maxime Cordy
38
0
0
31 Dec 2024
PK-YOLO: Pretrained Knowledge Guided YOLO for Brain Tumor Detection in Multiplanar MRI Slices
Ming Kang
F. F. Ting
Raphaël C.-W. Phan
C. Ting
ViT
MedIm
57
1
0
29 Oct 2024
Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Y. Zou
Tatsunori Hashimoto
VLM
64
3
0
14 Oct 2024
SwinSF: Image Reconstruction from Spatial-Temporal Spike Streams
Liangyan Jiang
Chuang Zhu
Yanxu Chen
50
2
0
22 Jul 2024
Steerable Transformers
Soumyabrata Kundu
Risi Kondor
ViT
LLMSV
30
1
0
24 May 2024
PromptCIR: Blind Compressed Image Restoration with Prompt Learning
Bingchen Li
Xin Li
Yiting Lu
Ruoyu Feng
Mengxi Guo
Shijie Zhao
Li Zhang
Zhibo Chen
34
13
0
26 Apr 2024
Training Transformer Models by Wavelet Losses Improves Quantitative and Visual Performance in Single Image Super-Resolution
Cansu Korkmaz
A. Murat Tekalp
ViT
36
6
0
17 Apr 2024
DRCT: Saving Image Super-resolution away from Information Bottleneck
Chih-Chung Hsu
Chia-Ming Lee
Yi-Shiuan Chou
24
31
0
31 Mar 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
41
7
0
28 Mar 2024
Fusion Transformer with Object Mask Guidance for Image Forgery Analysis
Dimitrios Karageorgiou
Giorgos Kordopatis-Zilos
Symeon Papadopoulos
ViT
20
5
0
18 Mar 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
23
15
0
18 Mar 2024
D-Net: Dynamic Large Kernel with Dynamic Feature Fusion for Volumetric Medical Image Segmentation
Jin Yang
Peijie Qiu
Yichi Zhang
Daniel S. Marcus
Aristeidis Sotiras
MedIm
36
9
0
15 Mar 2024
Activating Wider Areas in Image Super-Resolution
Cheng Cheng
Hang Wang
Hongbin Sun
34
10
0
13 Mar 2024
Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention
Quang-Trung Truong
Duc Thanh Nguyen
Binh-Son Hua
Sai-Kit Yeung
VOS
34
1
0
25 Jan 2024
ClassLIE: Structure- and Illumination-Adaptive Classification for Low-Light Image Enhancement
Zixiang Wei
Yiting Wang
Lichao Sun
Athanasios V. Vasilakos
Lin Wang
36
0
0
20 Dec 2023
MIMIR: Masked Image Modeling for Mutual Information-based Adversarial Robustness
Xiaoyun Xu
Shujian Yu
Jingzheng Wu
S. Picek
AAML
35
0
0
08 Dec 2023
TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
Meng Lou
Hong-Yu Zhou
Sibei Yang
Yizhou Yu
Chuan Wu
ViT
41
36
0
30 Oct 2023
S2R: Exploring a Double-Win Transformer-Based Framework for Ideal and Blind Super-Resolution
Minghao She
Wendong Mao
Huihong Shi
Zhongfeng Wang
ViT
9
0
0
16 Aug 2023
Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation
Zunnan Xu
Zhihong Chen
Yong Zhang
Yibing Song
Xiang Wan
Guanbin Li
VLM
32
47
0
21 Jul 2023
Physics-Driven Turbulence Image Restoration with Stochastic Refinement
Ajay Jaiswal
Xingguang Zhang
Stanley H. Chan
Zhangyang Wang
18
21
0
20 Jul 2023
Random Position Adversarial Patch for Vision Transformers
Mingzhen Shao
ViT
AAML
23
2
0
09 Jul 2023
LLCaps: Learning to Illuminate Low-Light Capsule Endoscopy with Curved Wavelet Attention and Reverse Diffusion
Long Bai
Tong Chen
Yanan Wu
An-Chi Wang
Mobarakol Islam
Hongliang Ren
DiffM
MedIm
28
19
0
05 Jul 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
37
28
0
01 Jun 2023
MaxViT-UNet: Multi-Axis Attention for Medical Image Segmentation
Abdul Rehman Khan
Asifullah Khan
ViT
MedIm
34
14
0
15 May 2023
CAD-RADS scoring of coronary CT angiography with Multi-Axis Vision Transformer: a clinically-inspired deep learning pipeline
Alessia Gerbasi
A. Dagliati
Giuseppe Albi
M. Chiesa
D. Andreini
A. Baggiano
S. Mushtaq
G. Pontone
Riccardo Bellazzi
G. Colombo
MedIm
25
5
0
14 Apr 2023
Effective Theory of Transformers at Initialization
Emily Dinan
Sho Yaida
Susan Zhang
20
14
0
04 Apr 2023
PMatch: Paired Masked Image Modeling for Dense Geometric Matching
Shengjie Zhu
Xiaoming Liu
30
22
0
30 Mar 2023
Vision Transformer with Quadrangle Attention
Qiming Zhang
Jing Zhang
Yufei Xu
Dacheng Tao
ViT
19
38
0
27 Mar 2023
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization
Pavan Kumar Anasosalu Vasu
J. Gabriel
Jeff J. Zhu
Oncel Tuzel
Anurag Ranjan
ViT
34
151
0
24 Mar 2023
Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention
Paria Mehrani
John K. Tsotsos
25
24
0
02 Mar 2023
Applying Plain Transformers to Real-World Point Clouds
Lanxiao Li
M. Heizmann
3DPC
ViT
23
3
0
28 Feb 2023
Device Tuning for Multi-Task Large Model
Penghao Jiang
Xuanchen Hou
Y. Zhou
21
0
0
21 Feb 2023
STB-VMM: Swin Transformer Based Video Motion Magnification
Ricard Lado-Roigé
M. A. Pérez
16
13
0
20 Feb 2023
Efficient Attention via Control Variates
Lin Zheng
Jianbo Yuan
Chong-Jun Wang
Lingpeng Kong
26
18
0
09 Feb 2023
PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices
Yuji Chai
Devashree Tripathy
Chu Zhou
Dibakar Gope
Igor Fedorov
Ramon Matas
David Brooks
Gu-Yeon Wei
P. Whatmough
GNN
26
4
0
26 Jan 2023
RangeViT: Towards Vision Transformers for 3D Semantic Segmentation in Autonomous Driving
Angelika Ando
Spyros Gidaris
Andrei Bursuc
Gilles Puy
Alexandre Boulch
Renaud Marlet
ViT
3DPC
8
71
0
24 Jan 2023
Holistically Explainable Vision Transformers
Moritz D Boehle
Mario Fritz
Bernt Schiele
ViT
33
9
0
20 Jan 2023
Explainability and Robustness of Deep Visual Classification Models
Jindong Gu
AAML
31
2
0
03 Jan 2023
EIT: Enhanced Interactive Transformer
Tong Zheng
Bei Li
Huiwen Bao
Tong Xiao
Jingbo Zhu
24
2
0
20 Dec 2022
AbHE: All Attention-based Homography Estimation
Mingxiao Huo
Zhihao Zhang
Xinyang Ren
Xianqiang Yang
22
6
0
06 Dec 2022
Part-based Face Recognition with Vision Transformers
Zhonglin Sun
Georgios Tzimiropoulos
ViT
15
15
0
30 Nov 2022
Lightweight Structure-Aware Attention for Visual Understanding
Heeseung Kwon
F. M. Castro
M. Marín-Jiménez
N. Guil
Alahari Karteek
26
2
0
29 Nov 2022
Degenerate Swin to Win: Plain Window-based Transformer without Sophisticated Operations
Tan Yu
Ping Li
ViT
43
5
0
25 Nov 2022
Transformer Based Multi-Grained Features for Unsupervised Person Re-Identification
Jiacheng Li
Menglin Wang
Xiaojin Gong
ViT
13
16
0
22 Nov 2022
Spikeformer: A Novel Architecture for Training High-Performance Low-Latency Spiking Neural Network
Yudong Li
Yunlin Lei
Xu Yang
21
26
0
19 Nov 2022
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
Limin Wang
Yu Qiao
ViT
25
106
0
17 Nov 2022
ParCNetV2: Oversized Kernel with Enhanced Attention
Ruihan Xu
Haokui Zhang
Wenze Hu
Shiliang Zhang
Xiaoyu Wang
ViT
25
6
0
14 Nov 2022
Explicitly Increasing Input Information Density for Vision Transformers on Small Datasets
Xiangyu Chen
Ying Qin
Wenju Xu
A. Bur
Cuncong Zhong
Guanghui Wang
ViT
38
3
0
25 Oct 2022