Masked-attention Mask Transformer for Universal Image Segmentation

2 December 2021
Bowen Cheng
Ishan Misra
Alex Schwing
Alexander Kirillov
Rohit Girdhar
    ISeg

Papers citing "Masked-attention Mask Transformer for Universal Image Segmentation"

50 / 1,661 papers shown
Stitch: Training-Free Position Control in Multimodal Diffusion Transformers
Jessica Bader
Mateusz Pach
Maria A. Bravo
Serge Belongie
Zeynep Akata
124
1
0
30 Sep 2025
K-Prism: A Knowledge-Guided and Prompt Integrated Universal Medical Image Segmentation Model
Bangwei Guo
Yunhe Gao
Meng Ye
Difei Gu
Yang Zhou
L. Axel
Dimitris N. Metaxas
VLM
122
0
0
29 Sep 2025
IRIS: Intrinsic Reward Image Synthesis
Yihang Chen
Yuanhao Ban
Yunqi Hong
Cho-Jui Hsieh
65
0
0
29 Sep 2025
CORE-3D: Context-aware Open-vocabulary Retrieval by Embeddings in 3D
Mohamad Amin Mirzaei
Pantea Amoie
Ali Ekhterachian
Matin Mirzababaei
Babak Khalaj
3DPC
108
0
0
29 Sep 2025
HieraTok: Multi-Scale Visual Tokenizer Improves Image Reconstruction and Generation
Cong Chen
Ziyuan Huang
Cheng Zou
Huanyi Zheng
Kaixiang Ji
Jiajia Liu
Jingdong Chen
Hao Chen
Chunhua Shen
122
2
0
28 Sep 2025
Token Merging via Spatiotemporal Information Mining for Surgical Video Understanding
Xixi Jiang
Chen Yang
Dong Zhang
Pingcheng Dong
Xin Yang
Kwang-Ting Cheng
68
0
0
28 Sep 2025
CoPatch: Zero-Shot Referring Image Segmentation by Leveraging Untapped Spatial Knowledge in CLIP
Na Min An
Inha Kang
Minhyun Lee
Hyunjung Shim
VLM
129
0
0
27 Sep 2025
CubistMerge: Spatial-Preserving Token Merging For Diverse ViT Backbones
Wenyi Gong
Mieszko Lis
107
0
0
26 Sep 2025
UniMapGen: A Generative Framework for Large-Scale Map Construction from Multi-modal Data
Yujian Yuan
Changjie Wu
Xinyuan Chang
S. Wang
Hang Zhang
Shiyi Liang
Shuang Zeng
Mu Xu
Ning Guo
124
2
0
26 Sep 2025
Learning What To Hear: Boosting Sound-Source Association For Robust Audiovisual Instance Segmentation
Jinbae Seo
Hyeongjun Kwon
Kwonyoung Kim
Jiyoung Lee
Kwanghoon Sohn
VOS
176
0
0
26 Sep 2025
Boosting LiDAR-Based Localization with Semantic Insight: Camera Projection versus Direct LiDAR Segmentation
Sven Ochs
Philip Schorner
M. Zofka
Johann Marius Zöllner
60
0
0
24 Sep 2025
Queryable 3D Scene Representation: A Multi-Modal Framework for Semantic Reasoning and Robotic Task Planning
Xun Li
Rodrigo Santa Cruz
Mingze Xi
Hu Zhang
Madhawa Perera
...
Brandon J. Matthews
Feng Xu
Matt Adcock
Dadong Wang
Jiajun Liu
112
0
0
24 Sep 2025
MLF-4DRCNet: Multi-Level Fusion with 4D Radar and Camera for 3D Object Detection in Autonomous Driving
Yuzhi Wu
Li Xiao
Jun Liu
Guangfeng Jiang
X. Xia
104
0
0
23 Sep 2025
Frequency-Domain Decomposition and Recomposition for Robust Audio-Visual Segmentation
Yunzhe Shen
Kai Peng
Leiye Liu
Wei Ji
Jingjing Li
Miao Zhang
Yongri Piao
Huchuan Lu
VOS
168
0
0
23 Sep 2025
Surgical Video Understanding with Label Interpolation
Garam Kim
Tae Kyeong Jeong
Juyoun Park
60
0
0
23 Sep 2025
Visual Instruction Pretraining for Domain-Specific Foundation Models
Yuxuan Li
Y. Zhang
Wenhao Tang
Yimian Dai
Ming-Ming Cheng
Xiang Li
Jian Yang
LRM
239
3
0
22 Sep 2025
Region-Aware Deformable Convolutions
Abolfazl Saheban Maleki
Maryam Imani
118
0
0
18 Sep 2025
Improving Generalized Visual Grounding with Instance-aware Joint Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Ming Dai
Wenxuan Cheng
Jiang-Jiang Liu
Lingfeng Yang
Zhenhua Feng
Wankou Yang
Jingdong Wang
ObjD
ISeg
207
3
0
17 Sep 2025
Re-purposing SAM into Efficient Visual Projectors for MLLM-Based Referring Image Segmentation
Xiaobo Yang
Xiaojin Gong
VLM
95
0
0
17 Sep 2025
Mitigating Query Selection Bias in Referring Video Object Segmentation
Dingwei Zhang
Dong Zhang
Jinhui Tang
93
0
0
17 Sep 2025
White Aggregation and Restoration for Few-shot 3D Point Cloud Semantic Segmentation
Jiyun Im
Subeen Lee
Miso Lee
Jae-Pil Heo
3DPC
180
0
0
17 Sep 2025
Masked Feature Modeling Enhances Adaptive Segmentation
Wenlve Zhou
Zhiheng Zhou
Tiantao Xian
Yikui Zhai
Weibin Wu
Biyun Ma
96
0
0
17 Sep 2025
Road Obstacle Video Segmentation
Shyam Nandan Rai
Shyamgopal Karthik
Mariana-Iuliana Georgescu
Barbara Caputo
Carlo Masone
Zeynep Akata
VOS
177
0
0
16 Sep 2025
NavMoE: Hybrid Model- and Learning-based Traversability Estimation for Local Navigation via Mixture of Experts
Botao He
A. Shahidzadeh
Yu Chen
Jiayi Wu
Tianrui Guan
...
Howie Choset
Dinesh Manocha
Glen Chou
Cornelia Fermüller
Yiannis Aloimonos
MoE
219
0
0
16 Sep 2025
CLAIRE: A Dual Encoder Network with RIFT Loss and Phi-3 Small Language Model Based Interpretability for Cross-Modality Synthetic Aperture Radar and Optical Land Cover Segmentation
Debopom Sutradhar
Arefin Ittesafun Abian
M. R
Reem E. Mohamed
Sheikh Izzal Azid
Sami Azam
88
0
0
15 Sep 2025
Exploring Efficient Open-Vocabulary Segmentation in the Remote Sensing
Bingyu Li
Haocheng Dong
Da Zhang
Zhiyuan Zhao
Junyu Gao
Xuelong Li
121
3
0
15 Sep 2025
Microsurgical Instrument Segmentation for Robot-Assisted Surgery
Tae Kyeong Jeong
Garam Kim
Juyoun Park
64
0
0
15 Sep 2025
RailSafeNet: Visual Scene Understanding for Tram Safety
Portuguese Conference on Artificial Intelligence (EPIA), 2025
Ondřej Valach
Ivan Gruber
60
0
0
15 Sep 2025
Leveraging Multi-View Weak Supervision for Occlusion-Aware Multi-Human Parsing
Laura Bragagnolo
Matteo Terreran
Leonardo Barcellona
Stefano Ghidoni
3DH
96
0
0
12 Sep 2025
Multimodal SAM-adapter for Semantic Segmentation
IEEE Access, 2025
Iacopo Curti
Pierluigi Zama Ramirez
Alioscia Petrelli
Luigi Di Stefano
114
1
0
12 Sep 2025
I-Segmenter: Integer-Only Vision Transformer for Efficient Semantic Segmentation
Jordan Sassoon
Michal Szczepanski
Martyna Poreba
MQ
VLM
119
0
0
12 Sep 2025
DGFusion: Depth-Guided Sensor Fusion for Robust Semantic Perception
Tim Broedermann
Christos Sakaridis
Luigi Piccinelli
Wim Abbeloos
Luc Van Gool
MDE
260
1
0
11 Sep 2025
Prompt-Driven Image Analysis with Multimodal Generative AI: Detection, Segmentation, Inpainting, and Interpretation
Kaleem Ahmad
MLLM
74
0
0
10 Sep 2025
UNO: Unifying One-stage Video Scene Graph Generation via Object-Centric Visual Representation Learning
Huy Le
Nhat Chung
Tung Kieu
Jingkang Yang
Ngan Le
VOS
OCL
329
1
0
07 Sep 2025
A biologically inspired separable learning vision model for real-time traffic object perception in Dark
Expert Systems with Applications (ESWA), 2025
Hulin Li
Qiliang Ren
Jun Li
Hanbing Wei
Zheng Liu
Linfang Fan
ViT
VLM
96
1
0
05 Sep 2025
Skywork UniPic 2.0: Building Kontext Model with Online RL for Unified Multimodal Model
Hongyang Wei
Baixin Xu
Hongbo Liu
Cyrus Wu
J. Liu
...
Ying He
Yang Liu
Xuchen Song
Eric Li
Y. Zhou
153
10
0
04 Sep 2025
A Multidimensional AI-powered Framework for Analyzing Tourist Perception in Historic Urban Quarters: A Case Study in Shanghai
Kaizhen Tan
Yufan Wu
Yuxuan Liu
Haoran Zeng
40
0
0
04 Sep 2025
InstaDA: Augmenting Instance Segmentation Data with Dual-Agent System
Xianbao Hou
Yonghao He
Zeyd Boukhers
John See
Hu Su
Wei Sui
Cong Yang
DiffM
VLM
85
0
0
03 Sep 2025
SOPSeg: Prompt-based Small Object Instance Segmentation in Remote Sensing Imagery
Chenhao Wang
Yingrui Ji
Yu Meng
Yunjian Zhang
Yao Zhu
157
1
0
03 Sep 2025
Unsupervised Instance Segmentation with Superpixels
Pattern Recognition, 2025
Cuong Manh Hoang
SSeg
158
1
0
03 Sep 2025
MedDINOv3: How to adapt vision foundation models for medical image segmentation?
Yuheng Li
Yizhou Wu
Yuxiang Lai
Mingzhe Hu
Xiaofeng Yang
MedIm
242
6
0
02 Sep 2025
Can General-Purpose Omnimodels Compete with Specialists? A Case Study in Medical Image Segmentation
Yizhe Zhang
Qiang Chen
Tao Zhou
99
0
0
31 Aug 2025
No More Sibling Rivalry: Debiasing Human-Object Interaction Detection
Bin Yang
Yulin Zhang
Hong-Yu Zhou
Sibei Yang
128
0
0
31 Aug 2025
DGL-RSIS: Decoupling Global Spatial Context and Local Class Semantics for Training-Free Remote Sensing Image Segmentation
Boyi Li
Ce Zhang
Richard M. Timmerman
Wenxuan Bao
56
0
0
30 Aug 2025
Learning Yourself: Class-Incremental Semantic Segmentation with Language-Inspired Bootstrapped Disentanglement
Ruitao Wu
Yifan Zhao
Jia Li
CLL
140
1
0
30 Aug 2025
Inference-Time Alignment Control for Diffusion Models with Reinforcement Learning Guidance
Luozhijie Jin
Zijie Qiu
J. Liu
Zijie Diao
Lifeng Qiao
Ning Ding
Alex Lamb
Xipeng Qiu
AI4CE
90
2
0
28 Aug 2025
Image Quality Assessment for Machines: Paradigm, Large-scale Database, and Models
Xiaoqi Wang
Yun Zhang
Weisi Lin
120
0
0
27 Aug 2025
AutoQ-VIS: Improving Unsupervised Video Instance Segmentation via Automatic Quality Assessment
Kaixuan Lu
Mehmet Onurcan Kaya
Dim P. Papadopoulos
56
0
0
27 Aug 2025
FreeVPS: Repurposing Training-Free SAM2 for Generalizable Video Polyp Segmentation
Qiang Hu
Ying Zhou
Gepeng Ji
Nick Barnes
Qiang Li
Zhiwei Wang
112
0
0
27 Aug 2025
IELDG: Suppressing Domain-Specific Noise with Inverse Evolution Layers for Domain Generalized Semantic Segmentation
Qizhe Fan
Chaoyu Liu
Zhonghua Qiao
Xiaoqin Shen
96
0
0
27 Aug 2025