ResearchTrend.AI
End-to-End Object Detection with Adaptive Clustering Transformer

18 November 2020
Minghang Zheng
Peng Gao
Renrui Zhang
Kunchang Li
Xiaogang Wang
Hongsheng Li
Hao Dong
    ViT
arXiv: 2011.09315

Papers citing "End-to-End Object Detection with Adaptive Clustering Transformer"

50 / 103 papers shown
A 2D Semantic-Aware Position Encoding for Vision Transformers
Xi Chen
Shiyang Zhou
Muqi Huang
Jiaxu Feng
Yun Xiong
...
Yuyao Zhang
Huishuai Bao
Sijia Peng
Chong Li
Feng Shi
ViT
31
0
0
14 May 2025
Context Aware Grounded Teacher for Source Free Object Detection
Tajamul Ashraf
Rajes Manna
Partha Sarathi Purkayastha
Tavaheed Tariq
Janibul Bashir
25
0
0
21 Apr 2025
Spectral-Adaptive Modulation Networks for Visual Perception
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Paul Hongsuck Seo
Dong Hwan Kim
42
0
0
31 Mar 2025
Efficient Token Compression for Vision Transformer with Spatial Information Preserved
Junzhu Mao
Yang Shen
Jinyang Guo
Yazhou Yao
Xiansheng Hua
ViT
36
0
0
30 Mar 2025
LD-DETR: Loop Decoder DEtection TRansformer for Video Moment Retrieval and Highlight Detection
Pengcheng Zhao
Zhixian He
Fuwei Zhang
Shujin Lin
Fan Zhou
42
1
0
18 Jan 2025
ENACT: Entropy-based Clustering of Attention Input for Improving the Computational Performance of Object Detection Transformers
Giorgos Savathrakis
Antonis Argyros
ViT
22
0
0
11 Sep 2024
A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships
Gracile Astlin Pereira
Muhammad Hussain
ViT
37
7
0
27 Aug 2024
Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation
Yifan Feng
Jiangang Huang
Shaoyi Du
Shihui Ying
Jun-Hai Yong
Yipeng Li
Guiguang Ding
Rongrong Ji
Yue Gao
ObjD
40
7
0
09 Aug 2024
Neural-based Video Compression on Solar Dynamics Observatory Images
Atefeh Khoshkhahtinat
Ali Zafari
P. Mehta
Nasser M. Nasrabadi
Barbara J. Thompson
M. Kirk
D. D. Silva
48
0
0
12 Jul 2024
Learning to Adapt Category Consistent Meta-Feature of CLIP for Few-Shot Classification
Jiaying Shi
Xuetong Xue
Shenghui Xu
VLM
37
0
0
08 Jul 2024
Diff3Dformer: Leveraging Slice Sequence Diffusion for Enhanced 3D CT Classification with Transformer Networks
Zihao Jin
Yingying Fang
Jiahao Huang
Caiwen Xu
Simon Walsh
Guang Yang
MedIm
70
0
0
24 Jun 2024
ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers
Narges Norouzi
Svetlana Orlova
Daan de Geus
Gijs Dubbelman
ViT
FedML
48
4
0
14 Jun 2024
Weight Copy and Low-Rank Adaptation for Few-Shot Distillation of Vision Transformers
Diana-Nicoleta Grigore
Mariana-Iuliana Georgescu
J. A. Justo
T. Johansen
Andreea-Iuliana Ionescu
Radu Tudor Ionescu
36
0
0
14 Apr 2024
CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers
Alex Ranne
Liming Kuang
Yordanka Velikova
Nassir Navab
F. R. Y. Baena
OOD
39
1
0
21 Mar 2024
PEEB: Part-based Image Classifiers with an Explainable and Editable Language Bottleneck
Thang M. Pham
Peijie Chen
Tin Nguyen
Seunghyun Yoon
Trung Bui
Anh Nguyen
VLM
49
7
0
08 Mar 2024
DEYO: DETR with YOLO for End-to-End Object Detection
Haodong Ouyang
21
7
0
26 Feb 2024
Semi-supervised Counting via Pixel-by-pixel Density Distribution Modelling
Hui Lin
Zhiheng Ma
Rongrong Ji
Yao Wang
Zhou Su
Xiaopeng Hong
Deyu Meng
33
2
0
23 Feb 2024
Weakly Supervised Open-Vocabulary Object Detection
Jianghang Lin
Yunhang Shen
Bingquan Wang
Shaohui Lin
Ke Li
Liujuan Cao
WSOD
33
7
0
19 Dec 2023
PixelLM: Pixel Reasoning with Large Multimodal Model
Zhongwei Ren
Zhicheng Huang
Yunchao Wei
Yao-Min Zhao
Dongmei Fu
Jiashi Feng
Xiaojie Jin
VLM
MLLM
LRM
28
82
0
04 Dec 2023
AiluRus: A Scalable ViT Framework for Dense Prediction
Jin Li
Yaoming Wang
Xiaopeng Zhang
Bowen Shi
Dongsheng Jiang
Chenglin Li
Wenrui Dai
Hongkai Xiong
Qi Tian
61
5
0
02 Nov 2023
Improving Robustness for Vision Transformer with a Simple Dynamic Scanning Augmentation
Shashank Kotyan
Danilo Vasconcellos Vargas
ViT
27
2
0
01 Nov 2023
UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting
Xu Liu
Junfeng Hu
Yuan N. Li
Shizhe Diao
Yuxuan Liang
Bryan Hooi
Roger Zimmermann
AI4TS
27
76
0
15 Oct 2023
Anchor-Intermediate Detector: Decoupling and Coupling Bounding Boxes for Accurate Object Detection
Yilong Lv
Min Li
Yujie He
Shaopeng Li
Zhuzhen He
Aitao Yang
26
1
0
09 Oct 2023
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
Pingzhi Li
Zhenyu (Allen) Zhang
Prateek Yadav
Yi-Lin Sung
Yu Cheng
Mohit Bansal
Tianlong Chen
MoMe
26
33
0
02 Oct 2023
ClusterFormer: Clustering As A Universal Visual Learner
James Liang
Yiming Cui
Qifan Wang
Tong Geng
Wenguan Wang
Dongfang Liu
VLM
37
8
0
22 Sep 2023
CL-MAE: Curriculum-Learned Masked Autoencoders
Neelu Madan
Nicolae-Cătălin Ristea
Kamal Nasrollahi
T. Moeslund
Radu Tudor Ionescu
19
10
0
31 Aug 2023
SPANet: Frequency-balancing Token Mixer using Spectral Pooling Aggregation Modulation
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Dong Hwan Kim
MoE
30
8
0
22 Aug 2023
ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation
Sheng-Hsiang Fu
Junkai Yan
Yipeng Gao
Xiaohua Xie
Wei-Shi Zheng
28
6
0
18 Aug 2023
Revisiting Vision Transformer from the View of Path Ensemble
Shuning Chang
Pichao Wang
Haowen Luo
Fan Wang
Mike Zheng Shou
ViT
34
3
0
12 Aug 2023
Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication
A. Jaiswal
Shiwei Liu
Tianlong Chen
Ying Ding
Zhangyang Wang
GNN
42
5
0
18 Jun 2023
Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
A. Jaiswal
Shiwei Liu
Tianlong Chen
Ying Ding
Zhangyang Wang
VLM
32
22
0
18 Jun 2023
The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter
Ajay Jaiswal
Shiwei Liu
Tianlong Chen
Zhangyang Wang
VLM
21
33
0
06 Jun 2023
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation
Shilin Yan
Renrui Zhang
Ziyu Guo
Wenchao Chen
Wei Zhang
Hongyang Li
Yu Qiao
Hao Dong
Zhongjiang He
Peng Gao
VOS
22
30
0
25 May 2023
SSD-MonoDETR: Supervised Scale-aware Deformable Transformer for Monocular 3D Object Detection
Xuan He
Fan Yang
Kailun Yang
Jiacheng Lin
Haolong Fu
Hao Wu
Jin Yuan
Zhiyong Li
ViT
20
12
0
12 May 2023
AutoFocusFormer: Image Segmentation off the Grid
Chen Ziwen
K. Patnaik
Shuangfei Zhai
Alvin Wan
Zhile Ren
A. Schwing
Alex Colburn
Li Fuxin
24
9
0
24 Apr 2023
Transformer-Based Visual Segmentation: A Survey
Xiangtai Li
Henghui Ding
Haobo Yuan
Wenwei Zhang
Jiangmiao Pang
Guangliang Cheng
Kai-xiang Chen
Ziwei Liu
Chen Change Loy
ViT
MedIm
42
132
0
19 Apr 2023
Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
Cong Wei
Brendan Duke
R. Jiang
P. Aarabi
Graham W. Taylor
Florian Shkurti
ViT
46
14
0
24 Mar 2023
OcTr: Octree-based Transformer for 3D Object Detection
Chao Zhou
Yanan Zhang
Jiaxin Chen
Di Huang
3DPC
ViT
27
42
0
22 Mar 2023
Making Vision Transformers Efficient from A Token Sparsification View
Shuning Chang
Pichao Wang
Ming Lin
Fan Wang
David Junhao Zhang
Rong Jin
Mike Zheng Shou
ViT
45
24
0
15 Mar 2023
PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection
Anthony Chen
Kevin Zhang
Renrui Zhang
Zihan Wang
Yuheng Lu
Yandong Guo
Shanghang Zhang
3DPC
70
60
0
14 Mar 2023
HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware Attention
Shijie Geng
Jianbo Yuan
Yu Tian
Yuxiao Chen
Yongfeng Zhang
CLIP
VLM
46
44
0
06 Mar 2023
Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers
Tianlong Chen
Zhenyu (Allen) Zhang
Ajay Jaiswal
Shiwei Liu
Zhangyang Wang
MoE
38
46
0
02 Mar 2023
iQuery: Instruments as Queries for Audio-Visual Sound Separation
Jiaben Chen
Renrui Zhang
Dongze Lian
Jiaqi Yang
Ziyao Zeng
Jianbo Shi
34
26
0
07 Dec 2022
Vision Transformer Computation and Resilience for Dynamic Inference
Kavya Sreedhar
Jason Clemons
Rangharajan Venkatesan
S. Keckler
M. Horowitz
26
2
0
06 Dec 2022
Vision Transformer with Super Token Sampling
Huaibo Huang
Xiaoqiang Zhou
Jie Cao
Ran He
Tieniu Tan
ViT
23
56
0
21 Nov 2022
Vision Transformers in Medical Imaging: A Review
Emerald U. Henry
Onyeka Emebob
C. Omonhinmin
ViT
MedIm
27
34
0
18 Nov 2022
Pair DETR: Contrastive Learning Speeds Up DETR Training
M. Iranmanesh
Xiaotong Chen
Kuo-Chin Lien
ViT
21
0
0
29 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang
Yuhui Yuan
Henghui Ding
Xiao Luo
Weihong Lin
Ding Jia
Zheng-Wei Zhang
Chao Zhang
Hanhua Hu
32
25
0
03 Oct 2022
CALIP: Zero-Shot Enhancement of CLIP with Parameter-free Attention
Ziyu Guo
Renrui Zhang
Longtian Qiu
Xianzheng Ma
Xupeng Miao
Xuming He
Bin Cui
VLM
AAML
59
109
0
28 Sep 2022
Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding
Yang Jin
Yongzhi Li
Zehuan Yuan
Yadong Mu
31
32
0
27 Sep 2022