ResearchTrend.AI
Mask R-CNN

20 March 2017
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
    ObjD
arXiv: 1703.06870

Papers citing "Mask R-CNN"

50 / 239 papers shown
Title
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations
Benedikt Alkin
Lukas Miklautz
Sepp Hochreiter
Johannes Brandstetter
VLM
129
8
0
24 Feb 2025
DeepInteraction++: Multi-Modality Interaction for Autonomous Driving
Zeyu Yang
Nan Song
Wei Li
Xiatian Zhu
Lefei Zhang
Philip H. S. Torr
107
4
0
24 Feb 2025
An Expert Ensemble for Detecting Anomalous Scenes, Interactions, and Behaviors in Autonomous Driving
Tianchen Ji
Neeloy Chakraborty
Andre Schreiber
Katherine Rose Driggs-Campbell
372
1
0
23 Feb 2025
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection
Yuming Chen
Xinbin Yuan
Ruiqi Wu
Jiabao Wang
Qibin Hou
Ming-Ming Cheng
ObjD
215
52
0
21 Feb 2025
SMITE: Segment Me In TimE
Amirhossein Alimohammadi
Sauradip Nag
Saeid Asgari Taghanaki
Andrea Tagliasacchi
Ghassan Hamarneh
Ali Mahdavi-Amiri
VLM
VOS
355
2
0
20 Feb 2025
Contrastive Localized Language-Image Pre-Training
Hong-You Chen
Zhengfeng Lai
Hao Zhang
Xiang Wang
Marcin Eichner
Keen You
Meng Cao
Bowen Zhang
Yue Yang
Zhe Gan
CLIP
VLM
73
9
0
20 Feb 2025
SAM-LAD: Segment Anything Model Meets Zero-Shot Logic Anomaly Detection
Yun Peng
Xiao Lin
Nachuan Ma
Jiayuan Du
Chuangwei Liu
Chengju Liu
Qi Chen
107
3
0
17 Feb 2025
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding
Mo Yu
Lemao Liu
J. Wu
Tsz Ting Chung
Shunchi Zhang
JiangNan Li
Dit-Yan Yeung
Jie Zhou
102
1
0
13 Feb 2025
SparseFormer: Detecting Objects in HRW Shots via Sparse Vision Transformer
Wenxi Li
Yuchen Guo
Jilai Zheng
Haozhe Lin
Chao Ma
Lu Fang
Xiaokang Yang
ViT
76
3
0
11 Feb 2025
Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances
Yi Yu
Botao Ren
Peiyuan Zhang
Mingxin Liu
Junwei Luo
Shaofeng Zhang
Feipeng Da
Junchi Yan
Xue Yang
3DPC
144
2
0
06 Feb 2025
Foundation Model-Based Apple Ripeness and Size Estimation for Selective Harvesting
Keyi Zhu
Jiajia Li
Kaixiang Zhang
Chaaran Arunachalam
Siddhartha Bhattacharya
R. Lu
Zhaojian Li
103
0
0
03 Feb 2025
Transfer Learning for Keypoint Detection in Low-Resolution Thermal TUG Test Images
Wei-Lun Chen
Chia-Yeh Hsieh
Yu-Hsiang Kao
Kai-Chun Liu
Sheng-Yu Peng
Yu Tsao
105
0
0
30 Jan 2025
Glissando-Net: Deep sinGLe vIew category level poSe eStimation ANd 3D recOnstruction
Bo Sun
Hao Kang
Li Guan
Haoxiang Li
Philippos Mordohai
Gang Hua
72
1
0
28 Jan 2025
SpineFM: Leveraging Foundation Models for Automatic Spine X-ray Segmentation
Samuel J. Simons
Bartłomiej W. Papież
MedIm
109
0
0
28 Jan 2025
iFormer: Integrating ConvNet and Transformer for Mobile Application
Chuanyang Zheng
ViT
89
0
0
26 Jan 2025
Towards Robust Unsupervised Attention Prediction in Autonomous Driving
Mengshi Qi
Xiaoyang Bi
Pengfei Zhu
Huadong Ma
104
0
0
25 Jan 2025
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
Weikang Meng
Yadan Luo
Xin Li
D. Jiang
Zheng Zhang
353
2
0
25 Jan 2025
TFLOP: Table Structure Recognition Framework with Layout Pointer Mechanism
Minsoo Khang
Teakgyu Hong
LMTD
122
0
0
21 Jan 2025
Surface-SOS: Self-Supervised Object Segmentation via Neural Surface Representation
Xiaoyun Zheng
Liwei Liao
Jianbo Jiao
Feng Gao
Ronggang Wang
99
6
0
20 Jan 2025
AgRegNet: A Deep Regression Network for Flower and Fruit Density Estimation, Localization, and Counting in Orchards
Uddhav Bhattarai
Santosh Bhusal
Qin Zhang
Manoj Karkee
119
2
0
20 Jan 2025
Enhancing Novel Object Detection via Cooperative Foundational Models
Rohit K Bharadwaj
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
ObjD
VLM
225
1
0
17 Jan 2025
Enhancing Skin Disease Diagnosis: Interpretable Visual Concept Discovery with SAM
Xin Hu
Janet Wang
Jihun Hamm
R. Yotsu
Zhengming Ding
110
0
0
17 Jan 2025
A method for estimating roadway billboard salience
Zuzana Berger Haladova
Michal Zrubec
Zuzana Cernekova
52
0
0
13 Jan 2025
Visual Semantic Navigation with Real Robots
Carlos Gutiérrez-Álvarez
Pablo Ríos-Navarro
Rafael Flor-Rodríguez
Francisco Javier Acevedo-Rodríguez
Roberto J. López-Sastre
70
2
0
10 Jan 2025
TipSegNet: Fingertip Segmentation in Contactless Fingerprint Imaging
L. Ruzicka
Bernhard Kohn
Clemens Heitzinger
104
0
0
10 Jan 2025
Geometry Restoration and Dewarping of Camera-Captured Document Images
Valery Istomin
Oleg Pereziabov
Ilya Afanasyev
66
0
0
10 Jan 2025
Solving the Catastrophic Forgetting Problem in Generalized Category Discovery
Xinzi Cao
Xiawu Zheng
G. Wang
Weijiang Yu
Yunhang Shen
Ke Li
Yutong Lu
Yonghong Tian
CLL
108
5
0
09 Jan 2025
Unified Coding for Both Human Perception and Generalized Machine Analytics with CLIP Supervision
Kangsheng Yin
Quan Liu
Xuelin Shen
Yulin He
Wenhan Yang
Shiqi Wang
VLM
60
0
0
08 Jan 2025
Rapid Automated Mapping of Clouds on Titan With Instance Segmentation
Zachary Yahn
Douglas Trent
Ethan Duncan
B. Seignovert
John Santerre
Conor A. Nixon
38
0
0
08 Jan 2025
Machine Learning for Identifying Grain Boundaries in Scanning Electron Microscopy (SEM) Images of Nanoparticle Superlattices
Aanish Paruchuri
Carl Thrasher
A. J. Hart
Robert Macfarlane
Arthi Jayaraman
53
0
0
07 Jan 2025
First-place Solution for Streetscape Shop Sign Recognition Competition
Bin Wang
Li Jing
350
0
0
06 Jan 2025
H-Net: A Multitask Architecture for Simultaneous 3D Force Estimation and Stereo Semantic Segmentation in Intracardiac Catheters
Pedram Fekri
M. Zadeh
Javad Dargahi
94
1
0
03 Jan 2025
Towards Visual Grounding: A Survey
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
120
4
0
31 Dec 2024
First qualitative observations on deep learning vision model YOLO and DETR for automated driving in Austria
Stefan Schoder
138
0
0
31 Dec 2024
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
184
657
0
31 Dec 2024
ERUP-YOLO: Enhancing Object Detection Robustness for Adverse Weather Condition by Unified Image-Adaptive Processing
Yuka Ogino
Yuho Shoji
Takahiro Toizumi
Atsushi Ito
66
2
0
31 Dec 2024
PTQ4VM: Post-Training Quantization for Visual Mamba
Younghyun Cho
Changhun Lee
Seonggon Kim
Eunhyeok Park
MQ
Mamba
75
2
0
29 Dec 2024
Enhancing Contrastive Learning Inspired by the Philosophy of "The Blind Men and the Elephant"
Yudong Zhang
Ruobing Xie
Jiansheng Chen
Xingwu Sun
Zhanhui Kang
Yu Wang
130
0
0
21 Dec 2024
TimeRefine: Temporal Grounding with Time Refining Video LLM
Xizi Wang
Feng Cheng
Ziyang Wang
Huiyu Wang
Md. Mohaiminul Islam
Lorenzo Torresani
Joey Tianyi Zhou
Gedas Bertasius
David J. Crandall
138
2
0
12 Dec 2024
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
Baorui Ma
Huachen Gao
Haoge Deng
Zhengxiong Luo
Tiejun Huang
Lulu Tang
Xinlong Wang
DiffM
VGen
137
14
0
09 Dec 2024
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
Qing Jiang
Gen Luo
Yuqin Yang
Yuda Xiong
Yihao Chen
Zhaoyang Zeng
Tianhe Ren
Lei Zhang
VLM
LRM
130
8
0
27 Nov 2024
OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection
Zhongyu Xia
Jishuo Li
Zhiwei Lin
Xinhao Wang
Yansen Wang
Ming-Hsuan Yang
VLM
106
2
0
26 Nov 2024
CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
Xinhao Liu
Jiajian Li
Yichen Jiang
Niranjan Sujay
Zhiyong Yang
Juexiao Zhang
John Abanes
Jing Zhang
Chen Feng
129
2
0
26 Nov 2024
UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image
Xingyu Liu
Gu Wang
Ruida Zhang
Chenyangguang Zhang
F. Tombari
Xiangyang Ji
393
2
0
25 Nov 2024
TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for Zero-shot Object Navigation
Linqing Zhong
Chen Gao
Zihan Ding
Yue Liao
Si Liu
Shifeng Zhang
Xu Zhou
Si Liu
LRM
115
5
0
25 Nov 2024
Interpreting Object-level Foundation Models via Visual Precision Search
Ruoyu Chen
Siyuan Liang
Jingzhi Li
Shiming Liu
Maosen Li
Zheng Huang
Qichuan Geng
Xiaochun Cao
FAtt
121
4
0
25 Nov 2024
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
135
3
0
22 Nov 2024
CLIC: Contrastive Learning Framework for Unsupervised Image Complexity Representation
Shipeng Liu
Liang Zhao
Dengfeng Chen
SSL
130
1
0
19 Nov 2024
Breaking the Low-Rank Dilemma of Linear Attention
Qihang Fan
Huaibo Huang
Ran He
66
1
0
12 Nov 2024
MSEG-VCUQ: Multimodal SEGmentation with Enhanced Vision Foundation Models, Convolutional Neural Networks, and Uncertainty Quantification for High-Speed Video Phase Detection Data
Chika Maduabuchi
Ericmoore Jossou
Matteo Bucci
45
0
0
12 Nov 2024