arXiv: 2206.02777
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
6 June 2022
Feng Li
Hao Zhang
Hu-Sheng Xu
Siyi Liu
Lei Zhang
L. Ni
H. Shum
ISeg
Papers citing
"Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"
50 / 230 papers shown
Technical Report for ICRA 2025 GOOSE 2D Semantic Segmentation Challenge: Leveraging Color Shift Correction, RoPE-Swin Backbone, and Quantile-based Label Denoising Strategy for Robust Outdoor Scene Understanding
Chih-Chung Hsu
I-Hsuan Wu
Wen-Hai Tseng
Ching-Heng Cheng
Ming-Hsuan Wu
Jin-Hui Jiang
Yu-Jou Hsiao
13
0
0
11 May 2025
Global Collinearity-aware Polygonizer for Polygonal Building Mapping in Remote Sensing
Fahong Zhang
Yilei Shi
Xiao Xiang Zhu
34
0
0
02 May 2025
Temporal Propagation of Asymmetric Feature Pyramid for Surgical Scene Segmentation
Cheng Yuan
Yutong Ban
MedIm
28
0
0
18 Apr 2025
DGOcc: Depth-aware Global Query-based Network for Monocular 3D Occupancy Prediction
Xu Zhao
Pengju Zhang
Bo Liu
Yihong Wu
36
0
0
10 Apr 2025
UCS: A Universal Model for Curvilinear Structure Segmentation
Dianshuo Li
Li Chen
Y. Cao
Kai Zhu
Jun Cheng
33
0
0
05 Apr 2025
Rip Current Segmentation: A Novel Benchmark and YOLOv8 Baseline Results
Andrei Dumitriu
Florin Tatui
Florin Miron
Radu Tudor Ionescu
Radu Timofte
37
19
0
03 Apr 2025
v-CLR: View-Consistent Learning for Open-World Instance Segmentation
Chang-Bin Zhang
Jinhong Ni
Yujie Zhong
Kai Han
3DV
VLM
57
0
0
02 Apr 2025
Coca-Splat: Collaborative Optimization for Camera Parameters and 3D Gaussians
Jiamin Wu
Hongyang Li
Xiaoke Jiang
Yuan Yao
Lei Zhang
3DGS
49
0
0
01 Apr 2025
AnnoPage Dataset: Dataset of Non-Textual Elements in Documents with Fine-Grained Categorization
Martin Kiss
Michal Hradiš
Martina Dvořáková
Václav Jiroušek
Filip Kersch
36
1
0
28 Mar 2025
InteractionMap: Improving Online Vectorized HDMap Construction with Interaction
Kuang Wu
Chuan Yang
Zhanbin Li
53
0
0
27 Mar 2025
From Fragment to One Piece: A Survey on AI-Driven Graphic Design
Xingxing Zou
Wen Zhang
Nanxuan Zhao
54
0
0
24 Mar 2025
A Temporal Modeling Framework for Video Pre-Training on Video Instance Segmentation
Qing Zhong
Peng-Tao Jiang
Wen Wang
Guodong Ding
Lin Wu
Kaiqi Huang
VLM
48
0
0
22 Mar 2025
Structured-Noise Masked Modeling for Video, Audio and Beyond
Aritra Bhowmik
Fida Mohammad Thoker
Carlos Hinojosa
Bernard Ghanem
Cees G. M. Snoek
VGen
59
0
0
20 Mar 2025
UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis
Jiawei Wang
Kai Hu
Qiang Huo
53
0
0
20 Mar 2025
EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining
Boshen Xu
Yuting Mei
Xinbi Liu
Sipeng Zheng
Qin Jin
VLM
MDE
60
0
0
19 Mar 2025
Foundation X: Integrating Classification, Localization, and Segmentation through Lock-Release Pretraining Strategy for Chest X-ray Analysis
N. Islam
Dongao Ma
Jiaxuan Pang
Shivasakthi Senthil Velan
Michael B. Gotway
Jianming Liang
53
0
0
12 Mar 2025
From Slices to Sequences: Autoregressive Tracking Transformer for Cohesive and Consistent 3D Lymph Node Detection in CT Scans
Qinji Yu
Yirui Wang
K. Yan
Dandan Zheng
Dashan Ai
...
N. Shen
Xiaowei Ding
Le Lu
X. Ye
Dakai Jin
ViT
MedIm
57
0
0
11 Mar 2025
MegaSR: Mining Customized Semantics and Expressive Guidance for Image Super-Resolution
X. Li
Jianlong Wu
Xinchuan Huang
C. L. Philip Chen
Weili Guan
Xian-Sheng Hua
Liqiang Nie
DiffM
51
0
0
11 Mar 2025
YOLOE: Real-Time Seeing Anything
Ao Wang
Lihao Liu
Hui Chen
Zijia Lin
J. Han
Guiguang Ding
VLM
ObjD
66
1
0
10 Mar 2025
MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism
Zhixiong Nan
Xianghong Li
Jifeng Dai
Tao Xiang
44
0
0
03 Mar 2025
Object-Aware Video Matting with Cross-Frame Guidance
H. Zhang
Dongyue Wu
Yuanjie Shao
Nong Sang
Changxin Gao
VOS
69
0
0
03 Mar 2025
Autonomous Dissection in Robotic Cholecystectomy
K. Oh
Leonardo Borgioli
Milos Zefran
Valentina Valle
P. Giulianotti
31
0
0
01 Mar 2025
Beyond the Final Layer: Hierarchical Query Fusion Transformer with Agent-Interpolation Initialization for 3D Instance Segmentation
Jiahao Lu
Jiacheng Deng
Tianzhu Zhang
76
2
0
06 Feb 2025
UNIP: Rethinking Pre-trained Attention Patterns for Infrared Semantic Segmentation
Tao Zhang
Jinyong Wen
Zhen Chen
Kun Ding
S. Xiang
Chunhong Pan
70
1
0
04 Feb 2025
3rd Workshop on Maritime Computer Vision (MaCVi) 2025: Challenge Results
Benjamin Kiefer
Lojze Žust
Jon Muhovič
Matej Kristan
J. Pers
...
Ashraf Saleem
Ching-Heng Cheng
Yu-Fan Lin
Tzu-Yu Lin
Chih-Chung Hsu
38
0
0
20 Jan 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLM
VLM
LRM
88
45
0
03 Jan 2025
RoboCup@Home 2024 OPL Winner NimbRo: Anthropomorphic Service Robots using Foundation Models for Perception and Planning
Raphael Memmesheimer
Jan Nogga
Bastian Patzold
Evgenii Kruzhkov
S. Bultmann
...
Jonas Bode
Bertan Karacora
Juhui Park
A. Savinykh
Sven Behnke
67
2
0
19 Dec 2024
Expanded Comprehensive Robotic Cholecystectomy Dataset (CRCD)
K. Oh
Leonardo Borgioli
Alberto Mangano
Valentina Valle
Marco Di Pangrazio
...
Luciano Ambrosini
Alvaro Ducas
Milos Zefran
Liaohai Chen
P. Giulianotti
64
1
0
16 Dec 2024
SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation
Yunxiang Fu
Meng Lou
Yizhou Yu
112
1
0
16 Dec 2024
Towards Real-Time Open-Vocabulary Video Instance Segmentation
Bin Yan
Martin Sundermeyer
D. Tan
Huchuan Lu
F. Tombari
VLM
VOS
79
0
0
05 Dec 2024
Measure Anything: Real-time, Multi-stage Vision-based Dimensional Measurement using Segment Anything
Y. Lee
S. K. Panda
Wei Wang
M. Jawed
59
0
0
04 Dec 2024
Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images
Xuechao Zou
Shun Zhang
Kai Li
Shiying Wang
Junliang Xing
Lei Jin
Congyan Lang
Pin Tao
61
1
0
20 Nov 2024
Chanel-Orderer: A Channel-Ordering Predictor for Tri-Channel Natural Images
Shen Li
Lei Jiang
Wei Wang
Hongwei Hu
Liang Li
65
0
0
20 Nov 2024
Person Segmentation and Action Classification for Multi-Channel Hemisphere Field of View LiDAR Sensors
Svetlana Seliunina
Artem Otelepko
Raphael Memmesheimer
Sven Behnke
28
0
0
17 Nov 2024
Learning Generalizable 3D Manipulation With 10 Demonstrations
Yu Ren
Yang Cong
Ronghan Chen
Jiahao Long
SSL
41
1
0
15 Nov 2024
PLGS: Robust Panoptic Lifting with 3D Gaussian Splatting
Yu Wang
Xiaobao Wei
Ming Lu
Guoliang Kang
3DGS
23
5
0
23 Oct 2024
Masked Differential Privacy
David Schneider
Sina Sajadmanesh
Vikash Sehwag
Saquib Sarfraz
Rainer Stiefelhagen
Lingjuan Lyu
Vivek Sharma
28
1
0
22 Oct 2024
DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model
Zhixiong Nan
Xianghong Li
Tao Xiang
Jifeng Dai
ISeg
30
0
0
22 Oct 2024
big.LITTLE Vision Transformer for Efficient Visual Recognition
He Guo
Yulong Wang
Zixuan Ye
Jifeng Dai
Yuwen Xiong
ViT
34
0
0
14 Oct 2024
Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes
Jianqi Chen
Panwen Hu
Xiaojun Chang
Z. Shi
Michael C. Kampffmeyer
Xiaodan Liang
46
5
0
14 Oct 2024
UnSeg: One Universal Unlearnable Example Generator is Enough against All Image Segmentation
Ye Sun
Hao Zhang
Tiehua Zhang
Xingjun Ma
Yu-Gang Jiang
VLM
32
3
0
13 Oct 2024
Multi-Scale Deformable Transformers for Student Learning Behavior Detection in Smart Classroom
Zhifeng Wang
Minghui Wang
Chunyan Zeng
Longlong Li
19
1
0
10 Oct 2024
Shift and matching queries for video semantic segmentation
Tsubasa Mizuno
Toru Tamaki
15
0
0
10 Oct 2024
Iterative Optimization Annotation Pipeline and ALSS-YOLO-Seg for Efficient Banana Plantation Segmentation in UAV Imagery
Ang He
Ximei Wu
Xing Xu
Jing Chen
Xiaobin Guo
Sheng Xu
13
0
0
09 Oct 2024
Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts
Zhiwei Lin
Yongtao Wang
Zhi Tang
ObjD
VLM
16
2
0
08 Oct 2024
In-Place Panoptic Radiance Field Segmentation with Perceptual Prior for 3D Scene Understanding
Shenghao Li
22
1
0
06 Oct 2024
RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions
Ziyao Zeng
Yangchao Wu
Hyoungseob Park
Daniel Wang
Fengyu Yang
Stefano Soatto
Dong Lao
Byung-Woo Hong
Alex Wong
MDE
16
7
0
03 Oct 2024
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
Weifeng Lin
Xinyu Wei
Renrui Zhang
Le Zhuo
Shitian Zhao
...
Junlin Xie
Yu Qiao
Peng Gao
Hongsheng Li
MLLM
DiffM
50
10
0
23 Sep 2024
A Bottom-Up Approach to Class-Agnostic Image Segmentation
Sebastian Dille
Ari Blondal
Sylvain Paris
Yağız Aksoy
11
0
0
20 Sep 2024
COCO-OLAC: A Benchmark for Occluded Panoptic Segmentation and Image Understanding
Wenbo Wei
Jun Wang
Abhir Bhalerao
32
0
0
19 Sep 2024