ResearchTrend.AI
Exploring Plain Vision Transformer Backbones for Object Detection

30 March 2022
Yanghao Li
Hanzi Mao
Ross B. Girshick
Kaiming He
    ViT

Papers citing "Exploring Plain Vision Transformer Backbones for Object Detection"

50 / 117 papers shown
Perceptual MAE for Image Manipulation Localization: A High-level Vision Learner Focusing on Low-level Features
Xiaochen Ma
Jizhe Zhou
Xiong Xu
Zhuohang Jiang
Chi-Man Pun
24
0
0
10 Oct 2023
Training a Large Video Model on a Single Machine in a Day
Yue Zhao
Philipp Krahenbuhl
VLM
25
15
0
28 Sep 2023
Self-Supervised Masked Digital Elevation Models Encoding for Low-Resource Downstream Tasks
Priyam Mazumdar
Aiman Soliman
Volodymyr V. Kindratenko
Luigi Marini
Kenton McHenry
15
0
0
06 Sep 2023
UnLoc: A Unified Framework for Video Localization Tasks
Shengjia Yan
Xuehan Xiong
Arsha Nagrani
Anurag Arnab
Zhonghao Wang
Weina Ge
David A. Ross
Cordelia Schmid
19
53
0
21 Aug 2023
Exploring Part-Informed Visual-Language Learning for Person Re-Identification
Y. Lin
Cong Liu
Yehansen Chen
Jinshui Hu
Bing Yin
Baocai Yin
Zengfu Wang
60
7
0
04 Aug 2023
DETR Doesn't Need Multi-Scale or Locality Design
Yutong Lin
Yuhui Yuan
Zheng-Wei Zhang
Chen Li
Nanning Zheng
Han Hu
25
5
0
03 Aug 2023
IML-ViT: Benchmarking Image Manipulation Localization by Vision Transformer
Xiaochen Ma
Bo Du
Zhuohang Jiang
Ahmed Y. Al Hammadi
Jizhe Zhou
11
7
0
27 Jul 2023
COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts
Xiaofeng Mao
YueFeng Chen
Yao Zhu
Da Chen
Hang Su
Rong Zhang
H. Xue
ObjD
OOD
21
18
0
24 Jul 2023
Quantized Feature Distillation for Network Quantization
Kevin Zhu
Yin He
Jianxin Wu
MQ
24
8
0
20 Jul 2023
Hierarchical Open-vocabulary Universal Image Segmentation
Xudong Wang
Shufang Li
Konstantinos Kallidromitis
Yu Kato
Kazuki Kozuka
Trevor Darrell
VLM
OCL
30
36
0
03 Jul 2023
How can objects help action recognition?
Xingyi Zhou
Anurag Arnab
Chen Sun
Cordelia Schmid
30
14
0
20 Jun 2023
Semi-supervised Cell Recognition under Point Supervision
Zhongyi Shui
Yizhi Zhao
S. Zheng
Yunlong Zhang
Honglin Li
Shichuan Zhang
Xiaoxuan Yu
Chenglu Zhu
Lin Yang
18
1
0
14 Jun 2023
DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model
Xiuye Gu
Yin Cui
Jonathan Huang
Abdullah M. Rashwan
X. Yang
...
Golnaz Ghiasi
Weicheng Kuo
Huizhong Chen
Liang-Chieh Chen
David A. Ross
ISeg
24
26
0
02 Jun 2023
FIT: Far-reaching Interleaved Transformers
Ting-Li Chen
Lala Li
19
12
0
22 May 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
16
114
0
18 May 2023
Semantically Structured Image Compression via Irregular Group-Based Decoupling
V. Sheoran
Yixin Gao
Shreyansh Joshi
Tanisha R. Bhayani
Zhibo Chen
34
13
0
04 May 2023
Segment Anything Model for Medical Image Analysis: an Experimental Study
Maciej Mazurowski
Haoyu Dong
Han Gu
Jichen Yang
N. Konz
Yixin Zhang
MedIm
VLM
28
470
0
20 Apr 2023
Permutation Equivariance of Transformers and Its Applications
Hengyuan Xu
Liyao Xiang
Hang Ye
Dixi Yao
Pengzhi Chu
Baochun Li
17
13
0
16 Apr 2023
Efficient OCR for Building a Diverse Digital History
Jacob Carlson
Tom Bryan
Melissa Dell
16
11
0
05 Apr 2023
Industrial Anomaly Detection with Domain Shift: A Real-world Dataset and Masked Multi-scale Reconstruction
Zilong Zhang
Zhibin Zhao
Xingwu Zhang
Chuang Sun
Xuefeng Chen
19
47
0
05 Apr 2023
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
Xuanyao Chen
Zhijian Liu
Haotian Tang
Li Yi
Hang Zhao
Song Han
ViT
8
46
0
30 Mar 2023
MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks
Weicheng Kuo
A. Piergiovanni
Dahun Kim
Xiyang Luo
Benjamin Caine
...
Luowei Zhou
Andrew M. Dai
Zhifeng Chen
Claire Cui
A. Angelova
MLLM
VLM
23
23
0
29 Mar 2023
Vision Transformer with Quadrangle Attention
Qiming Zhang
Jing Zhang
Yufei Xu
Dacheng Tao
ViT
19
38
0
27 Mar 2023
Prompt-Guided Transformers for End-to-End Open-Vocabulary Object Detection
Hwanjun Song
Jihwan Bang
VLM
ObjD
18
14
0
25 Mar 2023
PoseRAC: Pose Saliency Transformer for Repetitive Action Counting
Ziyu Yao
Xuxin Cheng
Yuexian Zou
ViT
16
19
0
15 Mar 2023
RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose
Tao Jiang
Peng Lu
Li Zhang
Ning Ma
Rui Han
Chengqi Lyu
Yining Li
Kai-xiang Chen
3DH
31
155
0
13 Mar 2023
Token Sparsification for Faster Medical Image Segmentation
Lei Zhou
Huidong Liu
Joseph Bae
Junjun He
Dimitris Samaras
Prateek Prasanna
MedIm
11
3
0
11 Mar 2023
PixMIM: Rethinking Pixel Reconstruction in Masked Image Modeling
Yuan Liu
Songyang Zhang
Jiacheng Chen
Kai-xiang Chen
Dahua Lin
67
27
0
04 Mar 2023
Applying Plain Transformers to Real-World Point Clouds
Lanxiao Li
M. Heizmann
3DPC
ViT
16
3
0
28 Feb 2023
Hyneter: Hybrid Network Transformer for Object Detection
Dong Chen
Duoqian Miao
Xuepeng Zhao
ViT
27
3
0
18 Feb 2023
Hierarchical Cross-modal Transformer for RGB-D Salient Object Detection
Hao Chen
Feihong Shen
ViT
29
0
0
16 Feb 2023
Aerial Image Object Detection With Vision Transformer Detector (ViTDet)
Liya Wang
A. Tien
30
7
0
28 Jan 2023
Long Range Pooling for 3D Large-Scale Scene Understanding
Xiang-Li Li
Meng-Hao Guo
Tai-Jiang Mu
Ralph Robert Martin
Shiyong Hu
3DV
3DPC
14
2
0
17 Jan 2023
RILS: Masked Visual Reconstruction in Language Semantic Space
Shusheng Yang
Yixiao Ge
Kun Yi
Dian Li
Ying Shan
Xiaohu Qie
Xinggang Wang
CLIP
27
11
0
17 Jan 2023
Proposal Distribution Calibration for Few-Shot Object Detection
Bohao Li
Chang-rui Liu
Mengnan Shi
Xiaozhong Chen
Xiang Ji
QiXiang Ye
ObjD
19
5
0
15 Dec 2022
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Chenhongyi Yang
Jiarui Xu
Shalini De Mello
Elliot J. Crowley
X. Wang
ViT
27
21
0
13 Dec 2022
DETRs with Collaborative Hybrid Assignments Training
Zhuofan Zong
Guanglu Song
Yu Liu
ViT
19
304
0
22 Nov 2022
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang
Wen Wang
Binhui Xie
Quan-Sen Sun
Ledell Yu Wu
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
CLIP
49
673
0
14 Nov 2022
BiViT: Extremely Compressed Binary Vision Transformer
Yefei He
Zhenyu Lou
Luoming Zhang
Jing Liu
Weijia Wu
Hong Zhou
Bohan Zhuang
ViT
MQ
18
28
0
14 Nov 2022
Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining
Qiang Chen
Jian Wang
Chuchu Han
Shangang Zhang
Zexian Li
...
Haocheng Feng
Kun Yao
Junyu Han
Errui Ding
Jingdong Wang
ViT
VLM
22
44
0
07 Nov 2022
Rethinking Hierarchies in Pre-trained Plain Vision Transformer
Yufei Xu
Jing Zhang
Qiming Zhang
Dacheng Tao
11
1
0
03 Nov 2022
State-of-the-art Models for Object Detection in Various Fields of Application
S. A. G. Naqvi
Syed Shahnawaz Ali
ObjD
OOD
16
0
0
01 Nov 2022
Face Pyramid Vision Transformer
Khawar Islam
M. Zaheer
Arif Mahmood
ViT
CVBM
17
4
0
21 Oct 2022
Towards Sustainable Self-supervised Learning
Shanghua Gao
Pan Zhou
Ming-Ming Cheng
Shuicheng Yan
CLL
25
7
0
20 Oct 2022
CroCo: Self-Supervised Pre-training for 3D Vision Tasks by Cross-View Completion
Philippe Weinzaepfel
Vincent Leroy
Thomas Lucas
Romain Brégier
Yohann Cabon
Vaibhav Arora
L. Antsfeld
Boris Chidlovskii
G. Csurka
Jérôme Revaud
SSL
21
64
0
19 Oct 2022
Sequence and Circle: Exploring the Relationship Between Patches
Zhengyang Yu
Jochen Triesch
ViT
17
0
0
18 Oct 2022
1st Place Solutions for the UVO Challenge 2022
Jiajun Zhang
Boyu Chen
Zhilong Ji
Jinfeng Bai
Zonghai Hu
12
1
0
18 Oct 2022
SaiT: Sparse Vision Transformers through Adaptive Token Pruning
Ling Li
D. Thorsley
Joseph Hassoun
ViT
25
17
0
11 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang
Yuhui Yuan
Henghui Ding
Xiao Luo
Weihong Lin
Ding Jia
Zheng-Wei Zhang
Chao Zhang
Hanhua Hu
17
25
0
03 Oct 2022
Learning Hierarchical Image Segmentation For Recognition and By Recognition
Tsung-Wei Ke
Sangwoo Mo
Stella X. Yu
VLM
22
9
0
01 Oct 2022