Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2203.03605
Cited By
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
7 March 2022
Hao Zhang
Feng Li
Shilong Liu
Lei Zhang
Hang Su
Jun Zhu
L. Ni
H. Shum
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
50 / 716 papers shown
Title
UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis
Jiawei Wang
Kai Hu
Qiang Huo
53
0
0
20 Mar 2025
Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology
Siyuan Yan
Ming Hu
Yiwen Jiang
X. Li
Hao Fei
P. Tschandl
Harald Kittler
Zongyuan Ge
VLM
62
0
0
19 Mar 2025
UltraFlwr -- An Efficient Federated Medical and Surgical Object Detection Framework
Yang Li
Soumya Snigdha Kundu
Maxence Boels
Toktam Mahmoodi
Sebastien Ourselin
Tom Vercauteren
Prokar Dasgupta
J. Shapey
Alejandro Granados
FedML
35
0
0
19 Mar 2025
LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation
Yang Zhou
Shiyu Zhao
Y. Chen
Z. Wang
Dimitris N. Metaxas
ObjD
56
0
0
18 Mar 2025
Beyond RGB: Adaptive Parallel Processing for RAW Object Detection
Shani Gamrian
Hila Barel
Feiran Li
Masakazu Yoshimura
Daisuke Iso
48
0
0
17 Mar 2025
STEP: Simultaneous Tracking and Estimation of Pose for Animals and Humans
Shashikant Verma
Harish Katti
Soumyaratna Debnath
Yamuna Swamy
S. Raman
100
0
0
17 Mar 2025
Action tube generation by person query matching for spatio-temporal action detection
Kazuki Omi
Jion Oshima
Toru Tamaki
60
0
0
17 Mar 2025
EditID: Training-Free Editable ID Customization for Text-to-Image Generation
Guandong Li
Zhaobin Chu
DiffM
57
0
0
16 Mar 2025
Cross-Modal Consistency Learning for Sign Language Recognition
Kepeng Wu
Zecheng Li
Weichao Zhao
Hezhen Hu
Wengang Zhou
SLR
42
0
0
16 Mar 2025
Modeling Variants of Prompts for Vision-Language Models
Ao Li
Zongfang Liu
Xinhua Li
Jinghui Zhang
Pengwei Wang
Hu Wang
VLM
48
0
0
13 Mar 2025
OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer
Jinyang Li
En Yu
Sijia Chen
Wenbing Tao
52
1
0
13 Mar 2025
TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction
Xuying Zhang
Yutong Liu
Yangguang Li
Renrui Zhang
Y. Liu
...
Wanli Ouyang
Zhiwei Xiong
Peng Gao
Qibin Hou
Ming-Ming Cheng
118
3
0
13 Mar 2025
Foundation X: Integrating Classification, Localization, and Segmentation through Lock-Release Pretraining Strategy for Chest X-ray Analysis
N. Islam
Dongao Ma
Jiaxuan Pang
Shivasakthi Senthil Velan
Michael B. Gotway
Jianming Liang
53
0
0
12 Mar 2025
From Slices to Sequences: Autoregressive Tracking Transformer for Cohesive and Consistent 3D Lymph Node Detection in CT Scans
Qinji Yu
Yirui Wang
K. Yan
Dandan Zheng
Dashan Ai
...
N. Shen
Xiaowei Ding
Le Lu
X. Ye
Dakai Jin
ViT
MedIm
57
0
0
11 Mar 2025
SparseVoxFormer: Sparse Voxel-based Transformer for Multi-modal 3D Object Detection
Hyeongseok Son
Jia He
Seung-In Park
Ying Min
Yunhao Zhang
ByungIn Yoo
50
0
0
11 Mar 2025
Pre-trained Models Succeed in Medical Imaging with Representation Similarity Degradation
Wenqiang Zu
Shenghao Xie
Hao Chen
Lei Ma
MedIm
42
0
0
11 Mar 2025
YOLOE: Real-Time Seeing Anything
Ao Wang
Lihao Liu
Hui Chen
Zijia Lin
J. Han
Guiguang Ding
VLM
ObjD
72
1
0
10 Mar 2025
SAQ-SAM: Semantically-Aligned Quantization for Segment Anything Model
Jing Zhang
Z. Li
Qingyi Gu
MQ
VLM
51
0
0
09 Mar 2025
Segment Anything, Even Occluded
Wei-En Tai
Yu-Lin Shih
Cheng Sun
Y. Wang
Hwann-Tzong Chen
VLM
60
0
0
08 Mar 2025
Fractional Correspondence Framework in Detection Transformer
Masoumeh Zareapoor
Pourya Shamsolmoali
Huiyu Zhou
Yue Lu
Salvador García
50
0
0
06 Mar 2025
A lightweight model FDM-YOLO for small target improvement based on YOLOv8
Xuerui Zhang
ObjD
53
0
0
06 Mar 2025
AHCPTQ: Accurate and Hardware-Compatible Post-Training Quantization for Segment Anything Model
Wenlun Zhang
Shimpei Ando
Kentaro Yoshioka
VLM
MQ
57
0
0
05 Mar 2025
Evaluating Stenosis Detection with Grounding DINO, YOLO, and DINO-DETR
Muhammad Musab Ansari
29
0
0
03 Mar 2025
WeGen: A Unified Model for Interactive Multimodal Generation as We Chat
Zhipeng Huang
Shaobin Zhuang
Canmiao Fu
Binxin Yang
Ying Zhang
Chong Sun
Zhizheng Zhang
Yali Wang
Chen Li
Zheng-Jun Zha
DiffM
69
1
0
03 Mar 2025
Object-Aware Video Matting with Cross-Frame Guidance
H. Zhang
Dongyue Wu
Yuanjie Shao
Nong Sang
Changxin Gao
VOS
72
0
0
03 Mar 2025
MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism
Zhixiong Nan
Xianghong Li
Jifeng Dai
Tao Xiang
46
0
0
03 Mar 2025
SAR-W-MixMAE: SAR Foundation Model Training Using Backscatter Power Weighting
Ali Caglayan
Nevrez Imamoglu
T. Kouyama
60
0
0
03 Mar 2025
Solving Instance Detection from an Open-World Perspective
Qianqian Shen
Yunhan Zhao
Nahyun Kwon
Jeeeun Kim
Yanan Li
Shu Kong
32
0
0
01 Mar 2025
OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection
Shuming Liu
Chen Zhao
Fatimah Zohra
Mattia Soldan
Alejandro Pardo
...
Juan Carlos León Alcázar
A. Cioppa
Silvio Giancola
Carlos Hinojosa
Bernard Ghanem
57
3
0
27 Feb 2025
WalnutData: A UAV Remote Sensing Dataset of Green Walnuts and Model Evaluation
Mingjie Wu
Chenggui Yang
Huihua Wang
Chen Xue
Yibo Wang
...
Yuqi Han
R. Li
Lijun Yun
Zaiqing Chen
S.
47
0
0
27 Feb 2025
New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM Collaboration
X. J. Yang
J. Liu
Peng Wang
Guoqing Wang
Y. Yang
H. Shen
ObjD
79
0
0
27 Feb 2025
K-LoRA: Unlocking Training-Free Fusion of Any Subject and Style LoRAs
Ziheng Ouyang
Zhen Li
Qibin Hou
MoMe
OffRL
95
2
0
25 Feb 2025
Vision Language Models in Medicine
Beria Chingnabe Kalpelbe
Angel Gabriel Adaambiik
Wei Peng
VLM
LM&MA
86
2
0
24 Feb 2025
Hierarchical Context Transformer for Multi-level Semantic Scene Understanding
Luoying Hao
Yan Hu
Yang Yue
Li Wu
Huazhu Fu
Jinming Duan
Jiang Liu
59
0
0
24 Feb 2025
EDocNet: Efficient Datasheet Layout Analysis Based on Focus and Global Knowledge Distillation
Hong Cai Chen
Longchang Wu
Yang Zhang
34
0
0
23 Feb 2025
MQADet: A Plug-and-Play Paradigm for Enhancing Open-Vocabulary Object Detection via Multimodal Question Answering
Caixiong Li
Xiongwei Zhao
Jinhang Zhang
Xing Zhang
Qihao Sun
Zhou Wu
ObjD
MLLM
VLM
51
0
0
23 Feb 2025
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection
Yuming Chen
Xinbin Yuan
Ruiqi Wu
Jiabao Wang
Qibin Hou
Mingg-Ming Cheng
Ming-Ming Cheng
ObjD
146
51
0
21 Feb 2025
Bridging Text and Vision: A Multi-View Text-Vision Registration Approach for Cross-Modal Place Recognition
Tianyi Shang
Zhenyu Li
Pengjie Xu
Jinwei Qiao
Gang Chen
Zihan Ruan
Weijun Hu
54
0
0
20 Feb 2025
Understanding and Evaluating Hallucinations in 3D Visual Language Models
Ruiying Peng
Kaiyuan Li
Weichen Zhang
Chen Gao
Xinlei Chen
Y. Li
38
0
0
18 Feb 2025
CLoCKDistill: Consistent Location-and-Context-aware Knowledge Distillation for DETRs
Qizhen Lan
Qing Tian
47
0
0
15 Feb 2025
SparseFormer: Detecting Objects in HRW Shots via Sparse Vision Transformer
Wenxi Li
Yuchen Guo
Jilai Zheng
Haozhe Lin
Chao Ma
Lu Fang
Xiaokang Yang
ViT
60
1
0
11 Feb 2025
Dense Object Detection Based on De-homogenized Queries
Yueming Huang
Chenrui Ma
Hao Zhou
Hao Wu
Guowu Yuan
120
0
0
11 Feb 2025
Foundation Model-Based Apple Ripeness and Size Estimation for Selective Harvesting
Keyi Zhu
Jiajia Li
Kaixiang Zhang
Chaaran Arunachalam
Siddhartha Bhattacharya
R. Lu
Zhaojian Li
63
0
0
03 Feb 2025
CSPCL: Category Semantic Prior Contrastive Learning for Deformable DETR-Based Prohibited Item Detectors
Mingyuan Li
Tong Jia
Hui Lu
Bowen Ma
Hao Wang
Dongyue Chen
70
0
0
28 Jan 2025
MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis
Mai A. Shaaban
Adnan Khan
Mohammad Yaqub
LM&MA
78
2
0
28 Jan 2025
DynamicEarth: How Far are We from Open-Vocabulary Change Detection?
Kaiyu Li
Xiangyong Cao
Yupeng Deng
Chao Pang
Zepeng Xin
Deyu Meng
Zhi Wang
ObjD
69
1
0
22 Jan 2025
See In Detail: Enhancing Sparse-view 3D Gaussian Splatting with Local Depth and Semantic Regularization
Zongqi He
Zhe Xiao
Kin-Chung Chan
Yushen Zuo
Jun Xiao
Kin-Man Lam
3DGS
53
0
0
20 Jan 2025
3rd Workshop on Maritime Computer Vision (MaCVi) 2025: Challenge Results
Benjamin Kiefer
Lojze Žust
Jon Muhovič
Matej Kristan
J. Pers
...
Ashraf Saleem
Ching-Heng Cheng
Yu-Fan Lin
Tzu-Yu Lin
Chih-Chung Hsu
38
0
0
20 Jan 2025
Enhancing Novel Object Detection via Cooperative Foundational Models
Rohit K Bharadwaj
Muzammal Naseer
Salman Khan
F. Khan
ObjD
VLM
121
1
0
17 Jan 2025
Enhancing Image Generation Fidelity via Progressive Prompts
Zhen Xiong
Yuqi Li
Chuanguang Yang
Tiao Tan
Zhihong Zhu
Siyuan Li
Yue Ma
43
1
0
13 Jan 2025
Previous
1
2
3
4
5
...
13
14
15
Next