1703.06870
Cited By
Mask R-CNN
20 March 2017
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
Papers citing "Mask R-CNN"
50 / 239 papers shown
Title
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations
Benedikt Alkin
Lukas Miklautz
Sepp Hochreiter
Johannes Brandstetter
VLM
129
8
0
24 Feb 2025
DeepInteraction++: Multi-Modality Interaction for Autonomous Driving
Zeyu Yang
Nan Song
Wei Li
Xiatian Zhu
Lefei Zhang
Philip H. S. Torr
107
4
0
24 Feb 2025
An Expert Ensemble for Detecting Anomalous Scenes, Interactions, and Behaviors in Autonomous Driving
Tianchen Ji
Neeloy Chakraborty
Andre Schreiber
Katherine Rose Driggs-Campbell
372
1
0
23 Feb 2025
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection
Yuming Chen
Xinbin Yuan
Ruiqi Wu
Jiabao Wang
Qibin Hou
Ming-Ming Cheng
ObjD
215
52
0
21 Feb 2025
SMITE: Segment Me In TimE
Amirhossein Alimohammadi
Sauradip Nag
Saeid Asgari Taghanaki
Andrea Tagliasacchi
Ghassan Hamarneh
Ali Mahdavi-Amiri
VLM
VOS
355
2
0
20 Feb 2025
Contrastive Localized Language-Image Pre-Training
Hong-You Chen
Zhengfeng Lai
Hao Zhang
Xiang Wang
Marcin Eichner
Keen You
Meng Cao
Bowen Zhang
Yue Yang
Zhe Gan
CLIP
VLM
73
9
0
20 Feb 2025
SAM-LAD: Segment Anything Model Meets Zero-Shot Logic Anomaly Detection
Yun Peng
Xiao Lin
Nachuan Ma
Jiayuan Du
Chuangwei Liu
Chengju Liu
Qi Chen
107
3
0
17 Feb 2025
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding
Mo Yu
Lemao Liu
J. Wu
Tsz Ting Chung
Shunchi Zhang
JiangNan Li
Dit-Yan Yeung
Jie Zhou
102
1
0
13 Feb 2025
SparseFormer: Detecting Objects in HRW Shots via Sparse Vision Transformer
Wenxi Li
Yuchen Guo
Jilai Zheng
Haozhe Lin
Chao Ma
Lu Fang
Xiaokang Yang
ViT
76
3
0
11 Feb 2025
Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances
Yi Yu
Botao Ren
Peiyuan Zhang
Mingxin Liu
Junwei Luo
Shaofeng Zhang
Feipeng Da
Junchi Yan
Xue Yang
3DPC
144
2
0
06 Feb 2025
Foundation Model-Based Apple Ripeness and Size Estimation for Selective Harvesting
Keyi Zhu
Jiajia Li
Kaixiang Zhang
Chaaran Arunachalam
Siddhartha Bhattacharya
R. Lu
Zhaojian Li
103
0
0
03 Feb 2025
Transfer Learning for Keypoint Detection in Low-Resolution Thermal TUG Test Images
Wei-Lun Chen
Chia-Yeh Hsieh
Yu-Hsiang Kao
Kai-Chun Liu
Sheng-Yu Peng
Yu Tsao
105
0
0
30 Jan 2025
Glissando-Net: Deep sinGLe vIew category level poSe eStimation ANd 3D recOnstruction
Bo Sun
Hao Kang
Li Guan
Haoxiang Li
Philippos Mordohai
Gang Hua
72
1
0
28 Jan 2025
SpineFM: Leveraging Foundation Models for Automatic Spine X-ray Segmentation
Samuel J. Simons
Bartłomiej W. Papież
MedIm
109
0
0
28 Jan 2025
iFormer: Integrating ConvNet and Transformer for Mobile Application
Chuanyang Zheng
ViT
89
0
0
26 Jan 2025
Towards Robust Unsupervised Attention Prediction in Autonomous Driving
Mengshi Qi
Xiaoyang Bi
Pengfei Zhu
Huadong Ma
104
0
0
25 Jan 2025
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
Weikang Meng
Yadan Luo
Xin Li
D. Jiang
Zheng Zhang
353
2
0
25 Jan 2025
TFLOP: Table Structure Recognition Framework with Layout Pointer Mechanism
Minsoo Khang
Teakgyu Hong
LMTD
122
0
0
21 Jan 2025
Surface-SOS: Self-Supervised Object Segmentation via Neural Surface Representation
Xiaoyun Zheng
Liwei Liao
Jianbo Jiao
Feng Gao
Ronggang Wang
99
6
0
20 Jan 2025
AgRegNet: A Deep Regression Network for Flower and Fruit Density Estimation, Localization, and Counting in Orchards
Uddhav Bhattarai
Santosh Bhusal
Qin Zhang
Manoj Karkee
119
2
0
20 Jan 2025
Enhancing Novel Object Detection via Cooperative Foundational Models
Rohit K Bharadwaj
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
ObjD
VLM
225
1
0
17 Jan 2025
Enhancing Skin Disease Diagnosis: Interpretable Visual Concept Discovery with SAM
Xin Hu
Janet Wang
Jihun Hamm
R. Yotsu
Zhengming Ding
110
0
0
17 Jan 2025
A method for estimating roadway billboard salience
Zuzana Berger Haladova
Michal Zrubec
Zuzana Cernekova
52
0
0
13 Jan 2025
Visual Semantic Navigation with Real Robots
Carlos Gutiérrez-Álvarez
Pablo Ríos-Navarro
Rafael Flor-Rodríguez
Francisco Javier Acevedo-Rodríguez
Roberto J. López-Sastre
70
2
0
10 Jan 2025
TipSegNet: Fingertip Segmentation in Contactless Fingerprint Imaging
L. Ruzicka
Bernhard Kohn
Clemens Heitzinger
104
0
0
10 Jan 2025
Geometry Restoration and Dewarping of Camera-Captured Document Images
Valery Istomin
Oleg Pereziabov
Ilya Afanasyev
66
0
0
10 Jan 2025
Solving the Catastrophic Forgetting Problem in Generalized Category Discovery
Xinzi Cao
Xiawu Zheng
G. Wang
Weijiang Yu
Yunhang Shen
Ke Li
Yutong Lu
Yonghong Tian
CLL
108
5
0
09 Jan 2025
Unified Coding for Both Human Perception and Generalized Machine Analytics with CLIP Supervision
Kangsheng Yin
Quan Liu
Xuelin Shen
Yulin He
Wenhan Yang
Shiqi Wang
VLM
60
0
0
08 Jan 2025
Rapid Automated Mapping of Clouds on Titan With Instance Segmentation
Zachary Yahn
Douglas Trent
Ethan Duncan
B. Seignovert
John Santerre
Conor A. Nixon
38
0
0
08 Jan 2025
Machine Learning for Identifying Grain Boundaries in Scanning Electron Microscopy (SEM) Images of Nanoparticle Superlattices
Aanish Paruchuri
Carl Thrasher
A. J. Hart
Robert Macfarlane
Arthi Jayaraman
53
0
0
07 Jan 2025
First-place Solution for Streetscape Shop Sign Recognition Competition
Bin Wang
Li Jing
350
0
0
06 Jan 2025
H-Net: A Multitask Architecture for Simultaneous 3D Force Estimation and Stereo Semantic Segmentation in Intracardiac Catheters
Pedram Fekri
M. Zadeh
Javad Dargahi
94
1
0
03 Jan 2025
Towards Visual Grounding: A Survey
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
120
4
0
31 Dec 2024
First qualitative observations on deep learning vision model YOLO and DETR for automated driving in Austria
Stefan Schoder
138
0
0
31 Dec 2024
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
184
657
0
31 Dec 2024
ERUP-YOLO: Enhancing Object Detection Robustness for Adverse Weather Condition by Unified Image-Adaptive Processing
Yuka Ogino
Yuho Shoji
Takahiro Toizumi
Atsushi Ito
66
2
0
31 Dec 2024
PTQ4VM: Post-Training Quantization for Visual Mamba
Younghyun Cho
Changhun Lee
Seonggon Kim
Eunhyeok Park
MQ
Mamba
75
2
0
29 Dec 2024
Enhancing Contrastive Learning Inspired by the Philosophy of "The Blind Men and the Elephant"
Yudong Zhang
Ruobing Xie
Jiansheng Chen
Xingwu Sun
Zhanhui Kang
Yu Wang
130
0
0
21 Dec 2024
TimeRefine: Temporal Grounding with Time Refining Video LLM
Xizi Wang
Feng Cheng
Ziyang Wang
Huiyu Wang
Md. Mohaiminul Islam
Lorenzo Torresani
Joey Tianyi Zhou
Gedas Bertasius
David J. Crandall
138
2
0
12 Dec 2024
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
Baorui Ma
Huachen Gao
Haoge Deng
Zhengxiong Luo
Tiejun Huang
Lulu Tang
Xinlong Wang
DiffM
VGen
137
14
0
09 Dec 2024
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Qing Jiang
Gen Luo
Yuqin Yang
Yuda Xiong
Yihao Chen
Zhaoyang Zeng
Tianhe Ren
Lei Zhang
VLM
LRM
130
8
0
27 Nov 2024
OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection
Zhongyu Xia
Jishuo Li
Zhiwei Lin
Xinhao Wang
Yansen Wang
Ming-Hsuan Yang
VLM
106
2
0
26 Nov 2024
CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
Xinhao Liu
Jiajian Li
Yichen Jiang
Niranjan Sujay
Zhiyong Yang
Juexiao Zhang
John Abanes
Jing Zhang
Chen Feng
129
2
0
26 Nov 2024
UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image
Xingyu Liu
Gu Wang
Ruida Zhang
Chenyangguang Zhang
F. Tombari
Xiangyang Ji
393
2
0
25 Nov 2024
TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for Zero-shot Object Navigation
Linqing Zhong
Chen Gao
Zihan Ding
Yue Liao
Si Liu
Shifeng Zhang
Xu Zhou
Si Liu
LRM
115
5
0
25 Nov 2024
Interpreting Object-level Foundation Models via Visual Precision Search
Ruoyu Chen
Siyuan Liang
Jingzhi Li
Shiming Liu
Maosen Li
Zheng Huang
Qichuan Geng
Xiaochun Cao
FAtt
121
4
0
25 Nov 2024
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
135
3
0
22 Nov 2024
CLIC: Contrastive Learning Framework for Unsupervised Image Complexity Representation
Shipeng Liu
Liang Zhao
Dengfeng Chen
SSL
130
1
0
19 Nov 2024
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
66
1
0
12 Nov 2024
MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data
Chika Maduabuchi
Ericmoore Jossou
Matteo Bucci
45
0
0
12 Nov 2024