ResearchTrend.AI
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection

7 March 2022
Hao Zhang
Feng Li
Shilong Liu
Lei Zhang
Hang Su
Jun Zhu
L. Ni
H. Shum
    ViT

Papers citing "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"

50 / 718 papers shown
Towards End-to-End Semi-Supervised Table Detection with Semantic Aligned Matching Transformer
Tahira Shehzadi
Shalini Sarode
Didier Stricker
Muhammad Zeshan Afzal
LMTD
25
4
0
30 Apr 2024
UniFS: Universal Few-shot Instance Perception with Point Representations
Sheng Jin
Ruijie Yao
Lumin Xu
Wentao Liu
Chao Qian
Ji Wu
Ping Luo
48
2
0
30 Apr 2024
MiPa: Mixed Patch Infrared-Visible Modality Agnostic Object Detection
H. R. Medeiros
David Latortue
Fidel Alejandro Guerrero Peña
Eric Granger
M. Pedersoli
19
1
0
29 Apr 2024
A Hybrid Approach for Document Layout Analysis in Document images
Tahira Shehzadi
Didier Stricker
Muhammad Zeshan Afzal
29
5
0
27 Apr 2024
Sparse Reconstruction of Optical Doppler Tomography with Alternative State Space Model and Attention
Zhenghong Li
Jiaxiang Ren
Wensheng Cheng
C. Du
Yingtian Pan
Haibin Ling
48
0
0
26 Apr 2024
AutoGluon-Multimodal (AutoMM): Supercharging Multimodal AutoML with Foundation Models
Zhiqiang Tang
Haoyang Fang
Su Zhou
Taojiannan Yang
Zihan Zhong
Tony Hu
Katrin Kirchhoff
George Karypis
40
11
0
24 Apr 2024
Progressive Token Length Scaling in Transformer Encoders for Efficient Universal Segmentation
Abhishek Aich
Yumin Suh
S. Schulter
Manmohan Chandraker
56
0
0
23 Apr 2024
On-the-Fly Point Annotation for Fast Medical Video Labeling
A. Meyer
J. Mazellier
Jérémy Dana
Nicolas Padoy
24
0
0
22 Apr 2024
MLP: Motion Label Prior for Temporal Sentence Localization in Untrimmed 3D Human Motions
Sheng Yan
Mengyuan Liu
Yong Wang
Yang Liu
C. L. P. Chen
Hong Liu
38
0
0
21 Apr 2024
MoVA: Adapting Mixture of Vision Experts to Multimodal Context
Zhuofan Zong
Bingqi Ma
Dazhong Shen
Guanglu Song
Hao Shao
Dongzhi Jiang
Hongsheng Li
Yu Liu
MoE
40
40
0
19 Apr 2024
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
Chuofan Ma
Yi-Xin Jiang
Jiannan Wu
Zehuan Yuan
Xiaojuan Qi
VLM
ObjD
37
51
0
19 Apr 2024
Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation
Qiyuan Dai
Sibei Yang
29
8
0
18 Apr 2024
The 8th AI City Challenge
Shuo Wang
D. Anastasiu
Zhenghang Tang
Ming-Ching Chang
Yue Yao
...
Xunlei Wu
S. Pusegaonkar
Yizhou Wang
Sujit Biswas
Rama Chellappa
33
31
0
15 Apr 2024
Arena: A Patch-of-Interest ViT Inference Acceleration System for Edge-Assisted Video Analytics
Haosong Peng
Wei Feng
Hao Li
Yufeng Zhan
Qihua Zhou
Yuanqing Xia
26
2
0
14 Apr 2024
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection
Lewei Yao
Renjie Pi
Jianhua Han
Xiaodan Liang
Hang Xu
Wei Zhang
Zhenguo Li
Dan Xu
VLM
ObjD
45
20
0
14 Apr 2024
Enhancing Mobile "How-to" Queries with Automated Search Results Verification and Reranking
Lei Ding
Jeshwanth Bheemanpally
Yi Zhang
27
1
0
13 Apr 2024
COCONut: Modernizing COCO Segmentation
XueQing Deng
Qihang Yu
Peng Wang
Xiaohui Shen
Liang-Chieh Chen
40
16
0
12 Apr 2024
Lightweight Deep Learning for Resource-Constrained Environments: A Survey
Hou-I Liu
Marco Galindo
Hongxia Xie
Lai-Kuan Wong
Hong-Han Shuai
Yung-Hui Li
Wen-Huang Cheng
50
47
0
08 Apr 2024
LOGO: A Long-Form Video Dataset for Group Action Quality Assessment
Shiyi Zhang
Wen-Dao Dai
Sujia Wang
Xiangwei Shen
Jiwen Lu
Jie Zhou
Yansong Tang
56
25
0
07 Apr 2024
Hyperbolic Learning with Synthetic Captions for Open-World Detection
Fanjie Kong
Yanbei Chen
Jiarui Cai
Davide Modolo
VLM
ObjD
31
7
0
07 Apr 2024
Mixed-Query Transformer: A Unified Image Segmentation Architecture
Pei Wang
Zhaowei Cai
Hao-Yu Yang
Ashwin Swaminathan
R. Manmatha
Stefano Soatto
70
2
0
06 Apr 2024
DQ-DETR: DETR with Dynamic Query for Tiny Object Detection
Yi-Xin Huang
Hou-I Liu
Hong-Han Shuai
Wen-Huang Cheng
29
15
0
04 Apr 2024
Effective Lymph Nodes Detection in CT Scans Using Location Debiased Query Selection and Contrastive Query Representation in Transformer
Qinji Yu
Yirui Wang
K. Yan
Haoshen Li
Dazhou Guo
...
Na Shen
Qifeng Wang
Xiaowei Ding
X. Ye
Dakai Jin
MedIm
66
2
0
04 Apr 2024
TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression
Ho-Joong Kim
Jung-Ho Hong
Heejo Kong
Seong-Whan Lee
47
5
0
03 Apr 2024
Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection
Tahira Shehzadi
K. Hashmi
Didier Stricker
Muhammad Zeshan Afzal
36
12
0
02 Apr 2024
Roadside Monocular 3D Detection via 2D Detection Prompting
Yechi Ma
Shuoquan Wei
Churun Zhang
Wei Hua
Yanan Li
Shu Kong
36
0
0
01 Apr 2024
Dual DETRs for Multi-Label Temporal Action Detection
Yuhan Zhu
Guozhen Zhang
Jing Tan
Gangshan Wu
Limin Wang
35
11
0
31 Mar 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
41
7
0
28 Mar 2024
Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation
Zhongliang Zhou
Jielu Zhang
Zihan Guan
Mengxuan Hu
Ni Lao
Lan Mu
Sheng R. Li
Gengchen Mai
VLM
41
12
0
28 Mar 2024
Illicit object detection in X-ray images using Vision Transformers
Jorgen Cani
Ioannis Mademlis
Adamantia Anna Rebolledo Chrysochoou
Georgios Th. Papadopoulos
ViT
28
2
0
27 Mar 2024
AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation
Qingping Sun
Yanjun Wang
Ailing Zeng
Wanqi Yin
Chen Wei
...
Haiyi Mei
Chi Sing Leung
Ziwei Liu
Lei Yang
Zhongang Cai
3DH
38
16
0
26 Mar 2024
DOCTR: Disentangled Object-Centric Transformer for Point Scene Understanding
Xiaoxuan Yu
Hao Wang
Weiming Li
Qiang Wang
Soonyong Cho
Younghun Sung
3DPC
ViT
27
0
0
25 Mar 2024
Data-Efficient 3D Visual Grounding via Order-Aware Referring
Tung-Yu Wu
Sheng-Yu Huang
Yu-Chiang Frank Wang
34
0
0
25 Mar 2024
Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement
Xiuquan Hou
Meiqin Liu
Senlin Zhang
Ping Wei
Badong Chen
37
24
0
24 Mar 2024
Segment Anything Model for Road Network Graph Extraction
Congrui Hetang
Haoru Xue
Cindy X. Le
Tianwei Yue
Wenping Wang
Yihui He
49
11
0
24 Mar 2024
T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Qing Jiang
Feng Li
Zhaoyang Zeng
Tianhe Ren
Shilong Liu
Lei Zhang
VLM
27
37
0
21 Mar 2024
RoDLA: Benchmarking the Robustness of Document Layout Analysis Models
Yufan Chen
Jiaming Zhang
Kunyu Peng
Junwei Zheng
Ruiping Liu
Philip H. S. Torr
Rainer Stiefelhagen
OOD
29
5
0
21 Mar 2024
Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection
Tim Salzmann
Markus Ryll
Alex Bewley
Matthias Minderer
40
4
0
21 Mar 2024
Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments
Yang Yang
Wenhai Wang
Zhe Chen
Jifeng Dai
Liang Zheng
41
2
0
20 Mar 2024
Rotary Position Embedding for Vision Transformer
Byeongho Heo
Song Park
Dongyoon Han
Sangdoo Yun
31
33
0
20 Mar 2024
TAPTR: Tracking Any Point with Transformers as Detection
Hongyang Li
Hao Zhang
Shilong Liu
Zhaoyang Zeng
Tianhe Ren
Feng Li
Lei Zhang
32
19
0
19 Mar 2024
VisionGPT: LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation
Hao Wang
Jiayou Qin
Ashish Bastola
Xiwen Chen
John Suchanek
Zihao Gong
Abolfazl Razi
35
15
0
19 Mar 2024
Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition
Jielin Qiu
William Jongwon Han
Winfred Wang
Zhengyuan Yang
Linjie Li
Jianfeng Wang
Christos Faloutsos
Lei Li
Lijuan Wang
VLM
56
2
0
19 Mar 2024
Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding
Chaolei Tan
Jian-Huang Lai
Wei-Shi Zheng
Jianfang Hu
AI4TS
36
5
0
18 Mar 2024
SimPB: A Single Model for 2D and 3D Object Detection from Multiple Cameras
Yingqi Tang
Zhaotie Meng
Guoliang Chen
Erkang Cheng
3DPC
24
0
0
15 Mar 2024
Animate Your Motion: Turning Still Images into Dynamic Videos
Mingxiao Li
Bo Wan
Marie-Francine Moens
Tinne Tuytelaars
VGen
DiffM
35
4
0
15 Mar 2024
GiT: Towards Generalist Vision Transformer through Universal Language Interface
Haiyang Wang
Hao Tang
Li Jiang
Shaoshuai Shi
Muhammad Ferjad Naeem
Hongsheng Li
Bernt Schiele
Liwei Wang
VLM
30
10
0
14 Mar 2024
Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring
Yufei Zhan
Yousong Zhu
Hongyin Zhao
Fan Yang
Ming Tang
Jinqiao Wang
ObjD
31
12
0
14 Mar 2024
Annotation Free Semantic Segmentation with Vision Foundation Models
Soroush Seifi
Daniel Olmeda Reino
Fabien Despinoy
Rahaf Aljundi
VLM
29
1
0
14 Mar 2024
GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing
Jing Wu
Jiawang Bian
Xinghui Li
Guangrun Wang
Ian D Reid
Philip H. S. Torr
V. Prisacariu
3DGS
27
33
0
13 Mar 2024