Deformable DETR: Deformable Transformers for End-to-End Object Detection

International Conference on Learning Representations (ICLR), 2021
8 October 2020
Xizhou Zhu
Weijie Su
Lewei Lu
Bin Li
Xiaogang Wang
Jifeng Dai
    ViT
ArXiv (abs) | PDF | HTML | HuggingFace (1 upvote) | GitHub (3553★)

Papers citing "Deformable DETR: Deformable Transformers for End-to-End Object Detection"

50 / 2,782 papers shown
Vision-based 3D occupancy prediction in autonomous driving: a review and outlook
Yanan Zhang
Jinqing Zhang
Zengran Wang
Junhao Xu
Di Huang
377
26
0
04 May 2024
ViTALS: Vision Transformer for Action Localization in Surgical Nephrectomy
Soumyadeep Chandra
Sayeed Shafayet Chowdhury
Courtney Yong
Chandru P. Sundaram
Kaushik Roy
183
2
0
04 May 2024
Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey
Engineering Applications of Artificial Intelligence (EAAI), 2024
Guoping Xu
Xiaxia Wang
Xinglong Wu
Xuesong Leng
Yongchao Xu
3DPC
248
12
0
02 May 2024
Imagine the Unseen: Occluded Pedestrian Detection via Adversarial Feature Completion
Shanshan Zhang
Mingqian Ji
Yang Li
Jian Yang
320
3
0
02 May 2024
Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models
Xiaoshi Wu
Yiming Hao
Manyuan Zhang
Keqiang Sun
Zhaoyang Huang
Guanglu Song
Yu Liu
Jiaming Song
EGVM
249
43
0
01 May 2024
Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey
Dayou Du
Gu Gong
Xiaowen Chu
MQ
456
17
0
01 May 2024
Towards End-to-End Semi-Supervised Table Detection with Semantic Aligned Matching Transformer
Tahira Shehzadi
Shalini Sarode
Didier Stricker
Muhammad Zeshan Afzal
LMTD
332
6
0
30 Apr 2024
VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization
Yuliang Liu
Mingxin Huang
Hao Yan
Linger Deng
Weijia Wu
Hao Lu
Chunhua Shen
Lianwen Jin
Xiang Bai
241
3
0
30 Apr 2024
Reliable or Deceptive? Investigating Gated Features for Smooth Visual Explanations in CNNs
Soham Mitra
Atri Sukul
Swalpa Kumar Roy
Pravendra Singh
Vinay Kumar Verma
AAML, FAtt
175
1
0
30 Apr 2024
Robust Pedestrian Detection via Constructing Versatile Pedestrian Knowledge Bank
Sungjune Park
Hyunjun Kim
Y. Ro
239
19
0
30 Apr 2024
C2FDrone: Coarse-to-Fine Drone-to-Drone Detection using Vision Transformer Networks
Sairam VC Rebbapragada
Pranoy Panda
Vineeth N. Balasubramanian
ViT
236
14
0
30 Apr 2024
Dexterous Grasp Transformer
Guo-Hao Xu
Yi-Lin Wei
Dian Zheng
Xiao-Ming Wu
Wei-Shi Zheng
ViT
261
19
0
28 Apr 2024
A Hybrid Approach for Document Layout Analysis in Document images
Tahira Shehzadi
Didier Stricker
Muhammad Zeshan Afzal
226
15
0
27 Apr 2024
Efficient Bi-manipulation using RGBD Multi-model Fusion based on Attention Mechanism
Jian Shen
Jiaxin Huang
Zhigong Song
102
0
0
27 Apr 2024
Sparse Reconstruction of Optical Doppler Tomography with Alternative State Space Model and Attention
Zhenghong Li
Jiaxiang Ren
Wensheng Cheng
C. Du
Yingtian Pan
Haibin Ling
257
0
0
26 Apr 2024
UniRGB-IR: A Unified Framework for Visible-Infrared Semantic Tasks via Adapter Tuning
Maoxun Yuan
Bo Cui
Tianyi Zhao
Xingxing Wei
Shan Fu
Xue Yang
367
0
0
26 Apr 2024
Features Fusion for Dual-View Mammography Mass Detection
Arina Varlamova
Valery Belotsky
Grigory Novikov
Anton Konushin
Evgeny Sidorov
MedIm
160
2
0
25 Apr 2024
Multi-Scale Representations by Varying Window Attention for Semantic Segmentation
Haotian Yan
Ming Wu
Chuang Zhang
327
29
0
25 Apr 2024
BezierFormer: A Unified Architecture for 2D and 3D Lane Detection
Zhiwei Dong
Xi Zhu
Xiya Cao
Ran Ding
Wei Li
Caifa Zhou
Yongliang Wang
Qiangbo Liu
258
11
0
25 Apr 2024
ChEX: Interactive Localization and Region Description in Chest X-rays
Philip Muller
Georgios Kaissis
Daniel Rueckert
252
12
0
24 Apr 2024
SRAGAN: Saliency Regularized and Attended Generative Adversarial Network for Chinese Ink-wash Painting Style Transfer
Pattern Recognition (Pattern Recogn.), 2024
Yantao Du
Yuqi Zhang
GAN
336
0
0
24 Apr 2024
OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving
Guoqing Wang
Zhongdao Wang
Pin Tang
Jilai Zheng
Xiangxuan Ren
Bailan Feng
Chao Ma
DiffM
231
24
0
23 Apr 2024
DesignProbe: A Graphic Design Benchmark for Multimodal Large Language Models
Jieru Lin
Danqing Huang
Tiejun Zhao
Dechen Zhan
Chin-Yew Lin
VLM, MLLM
242
4
0
23 Apr 2024
Progressive Token Length Scaling in Transformer Encoders for Efficient Universal Segmentation
Abhishek Aich
Yumin Suh
S. Schulter
Manmohan Chandraker
452
1
0
23 Apr 2024
PM-VIS: High-Performance Box-Supervised Video Instance Segmentation
Zhangjing Yang
Dun Liu
Wensheng Cheng
Jinqiao Wang
Yi Wu
VLM
270
2
0
22 Apr 2024
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
Chuofan Ma
Yi Jiang
Jiannan Wu
Zehuan Yuan
Xiaojuan Qi
VLM, ObjD
271
107
0
19 Apr 2024
FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving
Xingtai Gui
Tengteng Huang
Haonan Shao
Haotian Yao
Chi Zhang
238
7
0
19 Apr 2024
Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery
Yona Falinie A. Gaus
Neelanjan Bhowmik
Brian K. S. Isaac-Medina
T. Breckon
VLM
210
5
0
18 Apr 2024
MLS-Track: Multilevel Semantic Interaction in RMOT
Zeliang Ma
Yang Song
Zhe Cui
Zhicheng Zhao
Fei Su
Delong Liu
Jingyu Wang
211
8
0
18 Apr 2024
Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation
Qiyuan Dai
Sibei Yang
218
25
0
18 Apr 2024
Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation
Song Wang
Jiawei Yu
Wentong Li
Wenyu Liu
Xiaolu Liu
Junbo Chen
Jianke Zhu
269
48
0
18 Apr 2024
Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition
Xunsong Li
Pengzhan Sun
Yangcen Liu
Lixin Duan
Wen Li
433
7
0
18 Apr 2024
TempBEV: Improving Learned BEV Encoders with Combined Image and BEV Space Temporal Aggregation
T. Monninger
Vandana Dokkadi
Md Zafar Anwar
Steffen Staab
181
7
0
17 Apr 2024
Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach
Mir Rayat Imtiaz Hossain
Mennatullah Siam
Leonid Sigal
James J. Little
VLM
281
21
0
17 Apr 2024
Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems
Luca Bompani
Manuele Rusci
Daniele Palossi
Francesco Conti
Luca Benini
MQ
196
3
0
17 Apr 2024
CarcassFormer: An End-to-end Transformer-based Framework for Simultaneous Localization, Segmentation and Classification of Poultry Carcass Defect
Minh Q. Tran
Sang Truong
Arthur F. A. Fernandes
Michael Kidd
Ngan Le
ViT
279
11
0
17 Apr 2024
Improving Hierarchical Representations of Vectorized HD Maps with Perspective Clues
Chi Zhang
Qi Song
Feifei Li
Yongquan Chen
Rui Huang
181
4
0
17 Apr 2024
OSR-ViT: A Simple and Modular Framework for Open-Set Object Detection and Discovery
Matthew J. Inkawhich
Nathan Inkawhich
Hao Yang
Jingyang Zhang
Randolph Linderman
Yiran Chen
ObjD
242
1
0
16 Apr 2024
No More Ambiguity in 360° Room Layout via Bi-Layout Estimation
Yu-Ju Tsai
Jin-Cheng Jhang
Jingjing Zheng
Wei Wang
Albert Y. C. Chen
Min Sun
Cheng-Hao Kuo
Ming-Hsuan Yang
3DV
209
7
0
15 Apr 2024
Design and Analysis of Efficient Attention in Transformers for Social Group Activity Recognition
Masato Tamura
150
4
0
15 Apr 2024
STMixer: A One-Stage Sparse Action Detector
Tao Wu
Mengqing Cao
Ziteng Gao
Gangshan Wu
Limin Wang
229
37
0
15 Apr 2024
SparseOcc: Rethinking Sparse Latent Representation for Vision-Based Semantic Occupancy Prediction
Pin Tang
Zhongdao Wang
Guoqing Wang
Jilai Zheng
Xiangxuan Ren
Bailan Feng
Chao Ma
230
78
0
15 Apr 2024
Q2A: Querying Implicit Fully Continuous Feature Pyramid to Align Features for Medical Image Segmentation
Jiahao Yu
Li Chen
294
0
0
15 Apr 2024
Weight Copy and Low-Rank Adaptation for Few-Shot Distillation of Vision Transformers
Diana-Nicoleta Grigore
Mariana-Iuliana Georgescu
J. A. Justo
T. Johansen
Andreea-Iuliana Ionescu
Radu Tudor Ionescu
340
1
0
14 Apr 2024
Arena: A Patch-of-Interest ViT Inference Acceleration System for Edge-Assisted Video Analytics
Haosong Peng
Wei Feng
Hao Li
Yufeng Zhan
Qihua Zhou
Yuanqing Xia
145
3
0
14 Apr 2024
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection
Lewei Yao
Renjie Pi
Jianhua Han
Xiaodan Liang
Hang Xu
Wei Zhang
Zhenguo Li
Dan Xu
VLM, ObjD
304
46
0
14 Apr 2024
MAProtoNet: A Multi-scale Attentive Interpretable Prototypical Part Network for 3D Magnetic Resonance Imaging Brain Tumor Classification
Binghua Li
Jie Mao
Zhe Sun
Chao Li
Qibin Zhao
Toshihisa Tanaka
231
2
0
13 Apr 2024
Sparse Laneformer
Ji Liu
Zifeng Zhang
Mingjie Lu
Hongyang Wei
Dong Li
Yile Xie
Jinzhang Peng
Lu Tian
Ashish Sirasao
E. Barsoum
188
4
0
11 Apr 2024
Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval
Minkuk Kim
Hyeon Bae Kim
Jinyoung Moon
Jinwoo Choi
Seong Tae Kim
181
40
0
11 Apr 2024
GLID: Pre-training a Generalist Encoder-Decoder Vision Model
Jihao Liu
Jinliang Zheng
Yu Liu
Jiaming Song
VLM
210
6
0
11 Apr 2024