1703.06870
Cited By
Mask R-CNN
20 March 2017
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
Papers citing "Mask R-CNN"
50 / 239 papers shown
Title
What You Perceive Is What You Conceive: A Cognition-Inspired Framework for Open Vocabulary Image Segmentation
Jianghang Lin
Yue Hu
Jiangtao Shen
Yunhang Shen
Liujuan Cao
Shengchuan Zhang
Chia-Wen Lin
ObjD
VLM
82
0
0
26 May 2025
Locality-Aware Zero-Shot Human-Object Interaction Detection
Sanghyun Kim
Deunsol Jung
Minsu Cho
VLM
104
0
0
26 May 2025
Deformable Attentive Visual Enhancement for Referring Segmentation Using Vision-Language Model
Alaa Dalaq
Muzammil Behzad
VLM
26
0
0
25 May 2025
Track Anything Annotate: Video annotation and dataset generation of computer vision models
Nikita Ivanov
Mark Klimov
Dmitry Glukhikh
Tatiana Chernysheva
Igor Glukhikh
VGen
10
0
0
23 May 2025
Segment Anyword: Mask Prompt Inversion for Open-Set Grounded Segmentation
Zhihua Liu
Amrutha Saseendran
Lei Tong
Xilin He
Fariba Yousefi
...
Dino Oglic
Tom Diethe
Philip Teare
Huiyu Zhou
Chen Jin
VLM
134
0
0
23 May 2025
Detailed Evaluation of Modern Machine Learning Approaches for Optic Plastics Sorting
Vaishali Maheshkar
Aadarsh Anantha Ramakrishnan
Charuvahan Adhivarahan
Karthik Dantu
21
0
0
22 May 2025
Decoupling Classifier for Boosting Few-shot Object Detection and Instance Segmentation
Bin-Bin Gao
Xiaochen Chen
Z. Huang
Congchong Nie
Jun Liu
Jinxiang Lai
Guannan Jiang
Xi-Zhao Wang
Chengjie Wang
72
28
0
20 May 2025
FIGhost: Fluorescent Ink-based Stealthy and Flexible Backdoor Attacks on Physical Traffic Sign Recognition
Shuai Yuan
Guowen Xu
Hongwei Li
Rui Zhang
Xinyuan Qian
Wenbo Jiang
Hangcheng Cao
Qingchuan Zhao
AAML
45
0
0
17 May 2025
FG-CLIP: Fine-Grained Visual and Textual Alignment
Chunyu Xie
Bin Wang
Fanjing Kong
Jincheng Li
Dawei Liang
Gengshen Zhang
Dawei Leng
Yuhui Yin
CLIP
VLM
89
0
0
08 May 2025
InstanceGen: Image Generation with Instance-level Instructions
Etai Sella
Yanir Kleiman
Hadar Averbuch-Elor
40
0
0
08 May 2025
A Robust Deep Networks based Multi-Object MultiCamera Tracking System for City Scale Traffic
Muhammad Imran Zaman
Usama Ijaz Bajwa
Gulshan Saleem
Rana Hammad Raza
VOT
109
7
0
01 May 2025
Pack-PTQ: Advancing Post-training Quantization of Neural Networks by Pack-wise Reconstruction
Changjun Li
Runqing Jiang
Zhuo Song
Pengpeng Yu
Ye Zhang
Yulan Guo
MQ
75
0
0
01 May 2025
Motion Generation for Food Topping Challenge 2024: Serving Salmon Roe Bowl and Picking Fried Chicken
Koki Inami
Masashi Konosu
Koki Yamane
Nozomu Masuya
Yunhan Li
Yu-Han Shu
Hiroshi Sato
Shinnosuke Homma
S. Sakaino
96
0
0
28 Apr 2025
VCM: Vision Concept Modeling Based on Implicit Contrastive Learning with Vision-Language Instruction Fine-Tuning
Run Luo
Renke Shan
Longze Chen
Ziqiang Liu
Lu Wang
Min Yang
Xiaobo Xia
MLLM
VLM
119
1
0
28 Apr 2025
Instance-Adaptive Keypoint Learning with Local-to-Global Geometric Aggregation for Category-Level Object Pose Estimation
Wei Wei
Lu Zou
Tao Lu
Yuan Yao
Zhangjin Huang
Guoping Wang
3DPC
59
0
0
21 Apr 2025
Advancing Video Anomaly Detection: A Bi-Directional Hybrid Framework for Enhanced Single- and Multi-Task Approaches
Guodong Shen
Yuqi Ouyang
Junru Lu
Yixuan Yang
Victor Sanchez
112
1
0
20 Apr 2025
SG-Reg: Generalizable and Efficient Scene Graph Registration
Chuhao Liu
Zhijian Qiao
Jieqi Shi
Ke Wang
Peize Liu
Shaojie Shen
54
0
0
20 Apr 2025
LimitNet: Progressive, Content-Aware Image Offloading for Extremely Weak Devices & Networks
A. Hojjat
Janek Haberer
Tayyaba Zainab
Olaf Landsiedel
52
3
0
18 Apr 2025
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD
VOS
150
5
0
17 Apr 2025
RF-DETR Object Detection vs YOLOv12 : A Study of Transformer-based and CNN-based Architectures for Single-Class and Multi-Class Greenfruit Detection in Complex Orchard Environments Under Label Ambiguity
Ranjan Sapkota
Rahul Harsha Cheppally
Ajay Sharda
Manoj Karkee
53
0
0
17 Apr 2025
Evolved Hierarchical Masking for Self-Supervised Learning
Zhanzhou Feng
Shiliang Zhang
75
0
0
12 Apr 2025
AD-Det: Boosting Object Detection in UAV Images with Focused Small Objects and Balanced Tail Classes
Zhenteng Li
Sheng Lian
Dengfeng Pan
Yijiao Wang
Wei Liu
86
0
0
08 Apr 2025
Deliberate Planning of 3D Bin Packing on Packing Configuration Trees
Hang Zhao
Juzhan Xu
Kexiong Yu
Ruizhen Hu
Chenyang Zhu
K. Xu
83
2
0
06 Apr 2025
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
155
1
0
03 Apr 2025
Rip Current Segmentation: A Novel Benchmark and YOLOv8 Baseline Results
Andrei Dumitriu
Florin Tatui
Florin Miron
Radu Tudor Ionescu
Radu Timofte
98
23
0
03 Apr 2025
Leveraging Sparse Annotations for Leukemia Diagnosis on the Large Leukemia Dataset
Abdul Rehman
Talha Meraj
A. Minhas
A. Imran
Mohsen Ali
Waqas Sultani
M. Shah
75
0
0
03 Apr 2025
RipVIS: Rip Currents Video Instance Segmentation Benchmark for Beach Monitoring and Safety
Andrei Dumitriu
Florin Tatui
Florin Miron
Aakash Ralhan
Radu Tudor Ionescu
Radu Timofte
60
0
0
01 Apr 2025
CBIL: Collective Behavior Imitation Learning for Fish from Real Videos
Yifan Wu
Zhiyang Dou
Yuko Ishiwaka
Shun Ogawa
Yuke Lou
Wenping Wang
Lingjie Liu
Taku Komura
114
3
0
31 Mar 2025
PhysPose: Refining 6D Object Poses with Physical Constraints
Martin Malenický
Martin Cífka
Médéric Fourmy
Louis Montaut
Justin Carpentier
Josef Sivic
Vladimir Petrik
54
0
0
30 Mar 2025
A GAN-Enhanced Deep Learning Framework for Rooftop Detection from Historical Aerial Imagery
Pengyu Chen
Sicheng Wang
Cuizhen Wang
Senrong Wang
Beiao Huang
Lu Huang
Zhe Zang
54
0
0
29 Mar 2025
Segment Any Motion in Videos
Nan Huang
Wenzhao Zheng
Chenfeng Xu
Kurt Keutzer
Shanghang Zhang
Angjoo Kanazawa
Qianqian Wang
VOS
65
0
0
28 Mar 2025
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning
Aniket Didolkar
Andrii Zadaianchuk
Rabiul Awal
Maximilian Seitzer
E. Gavves
Aishwarya Agrawal
OCL
VLM
128
3
0
27 Mar 2025
vGamba: Attentive State Space Bottleneck for efficient Long-range Dependencies in Visual Recognition
Yunusa Haruna
A. Lawan
Mamba
73
0
0
27 Mar 2025
CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection
Zhichao Sun
Huazhang Hu
Yidong Ma
Gang Liu
Nemo Chen
Xu Tang
Feng-Long Xie
Yongchao Xu
ObjD
65
0
0
24 Mar 2025
UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis
Jiawei Wang
Kai Hu
Qiang Huo
70
0
0
20 Mar 2025
Learning Shape-Independent Transformation via Spherical Representations for Category-Level Object Pose Estimation
Huan Ren
Wenfei Yang
Xiang Liu
Shifeng Zhang
Tianzhu Zhang
98
2
0
18 Mar 2025
Action tube generation by person query matching for spatio-temporal action detection
Kazuki Omi
Jion Oshima
Toru Tamaki
98
0
0
17 Mar 2025
Segment Any-Quality Images with Generative Latent Space Enhancement
Guangqian Guo
Yoong Guo
Xuehui Yu
Wenbo Li
Yaoxing Wang
Shan Gao
VLM
106
0
0
16 Mar 2025
SPRINT: Script-agnostic Structure Recognition in Tables
Dhruv Kudale
Badri Vishal Kasuba
Venkatapathy Subramanian
P. Chaudhuri
Ganesh Ramakrishnan
LMTD
97
0
0
15 Mar 2025
APLA: A Simple Adaptation Method for Vision Transformers
Moein Sorkhei
Emir Konuk
Kevin Smith
Christos Matsoukas
64
0
0
14 Mar 2025
NF-SLAM: Effective, Normalizing Flow-supported Neural Field representations for object-level visual SLAM in automotive applications
Li Cui
Yang Ding
Richard Hartley
Zirui Xie
L. Kneip
Zhenghua Yu
82
0
0
14 Mar 2025
CyclePose -- Leveraging Cycle-Consistency for Annotation-Free Nuclei Segmentation in Fluorescence Microscopy
Jonas Utz
Stefan Vocht
Anne Tjorven Buessen
Dennis Possart
Fabian Wagner
Mareike Thies
Mingxuan Gu
S. Uderhardt
Katharina Breininger
MedIm
68
0
0
14 Mar 2025
Deep Perceptual Enhancement for Medical Image Analysis
S. Sharif
R. A. Naqvi
Mithun Biswas
Woong-Kee Loh
MedIm
131
21
0
11 Mar 2025
Referring to Any Person
Qing Jiang
Lin Wu
Zhaoyang Zeng
Tianhe Ren
Yuda Xiong
Yihao Chen
Qin Liu
Lei Zhang
346
0
0
11 Mar 2025
Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts
Shiu-hong Kao
Yu-Wing Tai
Chi-Keung Tang
LRM
MLLM
117
1
0
10 Mar 2025
USP: Unified Self-Supervised Pretraining for Image Generation and Understanding
Xiangxiang Chu
Renda Li
Yong Wang
108
0
0
08 Mar 2025
DarkDeblur: Learning single-shot image deblurring in low-light condition
S. Sharif
R. A. Naqvi
Farman Ali
Mithun Biswas
VLM
123
20
0
04 Mar 2025
Monocular visual simultaneous localization and mapping: (r)evolution from geometry to deep learning-based pipelines
Olaya Álvarez-Tunón
Yury Brodskiy
Erdal Kayacan
141
6
0
04 Mar 2025
Boltzmann Attention Sampling for Image Analysis with Small Objects
Theodore Zhao
Sid Kiblawi
Naoto Usuyama
Ho Hin Lee
Sam Preston
Hoifung Poon
Mu-Hsin Wei
MedIm
114
0
0
04 Mar 2025
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels
Meng Lou
Yizhou Yu
164
1
0
27 Feb 2025