Masked-attention Mask Transformer for Universal Image Segmentation

2 December 2021
Bowen Cheng
Ishan Misra
Alex Schwing
Alexander Kirillov
Rohit Girdhar
    ISeg

Papers citing "Masked-attention Mask Transformer for Universal Image Segmentation"

50 / 1,661 papers shown
Title
GLD-Road: A global-local decoding road network extraction model for remote sensing images
Ligao Deng
Yupeng Deng
Yu Meng
Jingbo Chen
Zhihao Xi
Diyou Liu
Qifeng Chu
183
1
0
11 Jun 2025
Vision Generalist Model: A Survey
International Journal of Computer Vision (IJCV), 2025
Ziyi Wang
Yongming Rao
Shuofeng Sun
Xinrun Liu
Yi Wei
...
Zuyan Liu
Yanbo Wang
Hongmin Liu
Jie Zhou
Jiwen Lu
269
0
0
11 Jun 2025
JAFAR: Jack up Any Feature at Any Resolution
Paul Couairon
Loick Chambon
Louis Serrano
Jean-Emmanuel Haugeard
Matthieu Cord
Nicolas Thome
MDE
384
6
0
10 Jun 2025
Data-Efficient Challenges in Visual Inductive Priors: A Retrospective
Robert-Jan Bruintjes
A. Lengyel
O. Kayhan
Davide Zambrano
Nergis Tomen
Hadi Jamali Rad
Jan van Gemert
VLM
161
0
0
10 Jun 2025
DCD: A Semantic Segmentation Model for Fetal Ultrasound Four-Chamber View
Donglian Li
Hui Guo
Minglang Chen
Huizhen Chen
Jialing Chen
Bocheng Liang
Pengchen Liang
Ying Tan
109
0
0
10 Jun 2025
Flow-Anything: Learning Real-World Optical Flow Estimation from Large-Scale Single-view Images
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Yingping Liang
Ying Fu
Yutao Hu
Wenqi Shao
Jiaming Liu
Debing Zhang
116
3
0
09 Jun 2025
FreeGave: 3D Physics Learning from Dynamic Videos by Gaussian Velocity
Computer Vision and Pattern Recognition (CVPR), 2025
Jinxi Li
Ziyang Song
Siyuan Zhou
Bo Yang
AI4CE
211
4
0
09 Jun 2025
Cross-View Multi-Modal Segmentation @ Ego-Exo4D Challenges 2025
Yuqian Fu
Runze Wang
Yanwei Fu
Danda Pani Paudel
Luc Van Gool
118
3
0
06 Jun 2025
GS4: Generalizable Sparse Splatting Semantic SLAM
Mingqi Jiang
Chanho Kim
Chen Ziwen
Li Fuxin
3DGS
378
1
0
06 Jun 2025
Query Nearby: Offset-Adjusted Mask2Former enhances small-organ segmentation
Xin Zhang
Dongdong Meng
Sheng Li
MedIm
151
0
0
06 Jun 2025
Controlled Data Rebalancing in Multi-Task Learning for Real-World Image Super-Resolution
Shuchen Lin
Mingtao Feng
Weisheng Dong
Fangfang Wu
Jianqiao Luo
Yaonan Wang
Guangming Shi
121
1
0
05 Jun 2025
BiXFormer: A Robust Framework for Maximizing Modality Effectiveness in Multi-Modal Semantic Segmentation
Jialei Chen
Xu Zheng
Danda Pani Paudel
Luc Van Gool
Hiroshi Murase
Daisuke Deguchi
206
0
0
04 Jun 2025
Unified Attention Modeling for Efficient Free-Viewing and Visual Search via Shared Representations
Fatma Youssef Mohammed
Kostas Alexis
134
0
0
03 Jun 2025
Towards In-the-wild 3D Plane Reconstruction from a Single Image
Computer Vision and Pattern Recognition (CVPR), 2025
Jiachen Liu
Jingbo Xia
Sili Chen
Sharon X. Huang
Hengkai Guo
3DV
188
5
0
03 Jun 2025
Pan-Arctic Permafrost Landform and Human-built Infrastructure Feature Detection with Vision Transformers and Location Embeddings
Amal S. Perera
David Fernandez
C. Witharana
Elias Manos
Michael Pimenta
...
Yili Yang
Todd Nicholson
Chia-Yu Hsu
Wenwen Li
Guido Grosse
ViT
202
0
0
03 Jun 2025
G4Seg: Generation for Inexact Segmentation Refinement with Diffusion Models
Tianjiao Zhang
Fei Zhang
Jiangchao Yao
Ya Zhang
Yanfeng Wang
DiffM
305
4
0
02 Jun 2025
ADEPT: Adaptive Diffusion Environment for Policy Transfer Sim-to-Real
Youwei Yu
Junhong Xu
Lantao Liu
272
2
0
02 Jun 2025
unMORE: Unsupervised Multi-Object Segmentation via Center-Boundary Reasoning
Yafei Yang
Zihui Zhang
Bo Yang
OCL
258
1
0
02 Jun 2025
Seg2Any: Open-set Segmentation-Mask-to-Image Generation with Precise Shape and Semantic Control
Danfeng Li
Hui Zhang
Sheng Wang
Jiacheng Li
Zuxuan Wu
DiffM
VLM
318
0
0
31 May 2025
Understanding while Exploring: Semantics-driven Active Mapping
Liyan Chen
Huangying Zhan
Hairong Yin
Yi Tian Xu
Philippos Mordohai
206
0
0
30 May 2025
PixelThink: Towards Efficient Chain-of-Pixel Reasoning
Song Wang
Gongfan Fang
Lingdong Kong
Xiangtai Li
Jianyun Xu
Maochun Luo
Qiang Li
Jianke Zhu
Xinchao Wang
LRM
323
10
0
29 May 2025
S2AFormer: Strip Self-Attention for Efficient Vision Transformer
IEEE Transactions on Image Processing (IEEE TIP), 2025
Guoan Xu
Wenfeng Huang
Wenjing Jia
Jiamao Li
Guangwei Gao
Guo-Jun Qi
199
0
0
28 May 2025
CAST: Contrastive Adaptation and Distillation for Semi-Supervised Instance Segmentation
Pardis Taghavi
Tian Liu
Renjie Li
Reza Langari
Zhengzhong Tu
ISeg
394
0
0
28 May 2025
ObjectClear: Complete Object Removal via Object-Effect Attention
Jixin Zhao
Shangchen Zhou
Zhouxia Wang
Peiqing Yang
Chen Change Loy
DiffM
171
3
0
28 May 2025
On Geometry-Enhanced Parameter-Efficient Fine-Tuning for 3D Scene Segmentation
Liyao Tang
Zhe Chen
Dacheng Tao
3DPC
258
0
0
28 May 2025
SANSA: Unleashing the Hidden Semantics in SAM2 for Few-Shot Segmentation
Claudia Cuttano
Gabriele Trivigno
Giuseppe Averta
Carlo Masone
VLM
207
0
0
27 May 2025
The Missing Point in Vision Transformers for Universal Image Segmentation
Sajjad Shahabodini
Mobina Mansoori
Farnoush Bayatmakou
J. Abouei
Konstantinos N. Plataniotis
Arash Mohammadi
ViT
ISeg
242
0
0
26 May 2025
What You Perceive Is What You Conceive: A Cognition-Inspired Framework for Open Vocabulary Image Segmentation
Jianghang Lin
Yue Hu
Jiangtao Shen
Chunjiang Ge
Liujuan Cao
Shengchuan Zhang
Jiayi Ji
ObjD
VLM
319
0
0
26 May 2025
OmniGenBench: A Benchmark for Omnipotent Multimodal Generation across 50+ Tasks
Jiayu Wang
Yang Jiao
Yue Yu
Tianwen Qian
Shaoxiang Chen
Yue Yu
Yu Jiang
MLLM
LM&MA
ELM
238
0
0
24 May 2025
Reasoning Segmentation for Images and Videos: A Survey
Yiqing Shen
Chenjia Li
Fei Xiong
Jeong-O Jeong
Tianpeng Wang
Michael Latman
Mathias Unberath
VOS
396
8
0
24 May 2025
Semantic segmentation with reward
Xie Ting
Ye Huang
Zhilin Liu
Lixin Duan
471
0
0
23 May 2025
REN: Fast and Efficient Region Encodings from Patch-Based Image Encoders
Savya Khosla
Sethuraman TV
Barnett Lee
Alexander Schwing
Derek Hoiem
VGen
351
0
0
23 May 2025
Sketchy Bounding-box Supervision for 3D Instance Segmentation
Computer Vision and Pattern Recognition (CVPR), 2025
Qian Deng
Renjie He
Jin Xie
Zhiqiang Wang
ISeg
248
1
0
22 May 2025
Multi-View Projection for Unsupervised Domain Adaptation in 3D Semantic Segmentation
Andrew Caunes
Thierry Chateau
Vincent Frémont
3DPC
3DV
302
0
0
21 May 2025
A Methodology to Evaluate Strategies Predicting Rankings on Unseen Domains
Sébastien Piérard
Adrien Deliège
Anaïs Halin
Marc Van Droogenbroeck
116
0
0
21 May 2025
gen2seg: Generative Models Enable Generalizable Instance Segmentation
Om Khangaonkar
Hamed Pirsiavash
DiffM
VLM
413
0
0
21 May 2025
Advancing Marine Research: UWSAM Framework and UIIS10K Dataset for Precise Underwater Instance Segmentation
Hua Li
Shijie Lian
Zhiyuan Li
Runmin Cong
Sam Kwong
Laurence Tianruo Yang
Weidong Zhang
Sam Kwong
VLM
333
1
0
21 May 2025
Ground-V: Teaching VLMs to Ground Complex Instructions in Pixels
Computer Vision and Pattern Recognition (CVPR), 2025
Yongshuo Zong
Qin Zhang
Dongsheng An
Zhihua Li
Xiang Xu
Linghan Xu
Zhuowen Tu
Yifan Xing
Onkar Dabeer
ObjD
257
2
0
20 May 2025
Generalizable Multispectral Land Cover Classification via Frequency-Aware Mixture of Low-Rank Token Experts
Xi Chen
Shen Yan
Juelin Zhu
Chen Chen
Yu Liu
Maojun Zhang
221
1
0
20 May 2025
PiT: Progressive Diffusion Transformer
Jiafu Wu
Yabiao Wang
Jian Li
Jinlong Peng
Yun Cao
Chengjie Wang
Jiangning Zhang
516
0
0
19 May 2025
Industrial Synthetic Segment Pre-training
Shinichi Mae
Ryousuke Yamada
Hirokatsu Kataoka
VLM
255
0
0
19 May 2025
Is Semantic SLAM Ready for Embedded Systems? A Comparative Survey
Calvin Galagain
Martyna Poreba
François Goulette
267
5
0
18 May 2025
Pseudo-Label Quality Decoupling and Correction for Semi-Supervised Instance Segmentation
Jianghang Lin
Yilin Lu
Chunjiang Ge
Chaoyang Zhu
Shengchuan Zhang
Liujuan Cao
Rongrong Ji
ISeg
388
0
0
16 May 2025
StoryReasoning Dataset: Using Chain-of-Thought for Scene Understanding and Grounded Story Generation
Daniel A. P. Oliveira
David Martins de Matos
VGen
193
1
0
15 May 2025
Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving
Zongchuang Zhao
Haoyu Fu
Dingkang Liang
Xin Zhou
Dingyuan Zhang
Hongwei Xie
Bing Wang
Xiang Bai
MLLM
VLM
326
4
0
13 May 2025
MESSI: A Multi-Elevation Semantic Segmentation Image Dataset of an Urban Environment
Barak Pinkovich
Boaz Matalon
Ehud Rivlin
Hector Rotstein
269
0
0
13 May 2025
Technical Report for ICRA 2025 GOOSE 2D Semantic Segmentation Challenge: Leveraging Color Shift Correction, RoPE-Swin Backbone, and Quantile-based Label Denoising Strategy for Robust Outdoor Scene Understanding
Chih-Chung Hsu
I-Hsuan Wu
Wen-Hai Tseng
Ching-Heng Cheng
Ming-Hsuan Wu
Jin-Hui Jiang
Yu-Jou Hsiao
197
0
0
11 May 2025
UnfoldIR: Rethinking Deep Unfolding Network in Illumination Degradation Image Restoration
Chunming He
Rihan Zhang
Fengyang Xiao
Chengyu Fang
Longxiang Tang
Yuanxing Zhang
Sina Farsiu
290
12
0
10 May 2025
Split Matching for Inductive Zero-shot Semantic Segmentation
Jialei Chen
Xu Zheng
Dongyue Li
Chong Yi
Seigo Ito
D. Paudel
Luc Van Gool
Hiroshi Murase
Daisuke Deguchi
VLM
486
2
0
08 May 2025
Visual Affordance Prediction: Survey and Reproducibility
Tommaso Apicella
Alessio Xompero
Andrea Cavallaro
408
0
0
08 May 2025