2112.01527
Cited By
Masked-attention Mask Transformer for Universal Image Segmentation
2 December 2021
Bowen Cheng
Ishan Misra
Alex Schwing
Alexander Kirillov
Rohit Girdhar
ISeg
Papers citing
"Masked-attention Mask Transformer for Universal Image Segmentation"
50 / 1,661 papers shown
Title
The Revolution of Multimodal Large Language Models: A Survey
Davide Caffagni
Federico Cocchi
Luca Barsellotti
Nicholas Moratelli
Sara Sarto
Lorenzo Baraldi
Lorenzo Baraldi
Marcella Cornia
Rita Cucchiara
LRM
VLM
316
119
0
19 Feb 2024
MAL: Motion-Aware Loss with Temporal and Distillation Hints for Self-Supervised Depth Estimation
Yue-Jiang Dong
Fang-Lue Zhang
Song-Hai Zhang
181
5
0
18 Feb 2024
CoLLaVO: Crayon Large Language and Vision mOdel
Byung-Kwan Lee
Beomchan Park
Chae Won Kim
Yonghyun Ro
VLM
MLLM
407
23
0
17 Feb 2024
A Decoding Scheme with Successive Aggregation of Multi-Level Features for Light-Weight Semantic Segmentation
Jiwon Yoo
Jangwon Lee
Gyeonghwan Kim
201
0
0
17 Feb 2024
Is Continual Learning Ready for Real-world Challenges?
Theodora Kontogianni
Yuanwen Yue
Siyu Tang
Konrad Schindler
CLL
264
5
0
15 Feb 2024
Open-Vocabulary Segmentation with Unpaired Mask-Text Supervision
Zhaoqing Wang
Xiaobo Xia
Ziye Chen
Xiao He
Yandong Guo
Biwei Huang
Tongliang Liu
VLM
289
17
0
14 Feb 2024
M2fNet: Multi-modal Forest Monitoring Network on Large-scale Virtual Dataset
Yawen Lu
Yunhan Huang
Su Sun
Tansi Zhang
Xuewen Zhang
Songlin Fei
Yingjie Victor Chen
156
9
0
07 Feb 2024
Spatio-temporal Prompting Network for Robust Video Feature Extraction
Guanxiong Sun
Chi Wang
Zhaoyu Zhang
Jiankang Deng
Stefanos Zafeiriou
Yang Hua
ViT
182
7
0
04 Feb 2024
Generalizable Entity Grounding via Assistance of Large Language Model
Lu Qi
Yi-Wen Chen
Lehan Yang
Tiancheng Shen
Xiangtai Li
Weidong Guo
Yu-Syuan Xu
Ming-Hsuan Yang
VLM
216
13
0
04 Feb 2024
Region-Based Representations Revisited
Michal Shlapentokh-Rothman
Ansel Blume
Yao Xiao
Yuqun Wu
TV Sethuraman
Heyi Tao
Jae Yong Lee
Wilfredo Torres
Yu-Xiong Wang
Derek Hoiem
424
14
0
04 Feb 2024
Theoretical Understanding of In-Context Learning in Shallow Transformers with Unstructured Data
Yue Xing
Xiaofeng Lin
Chenheng Xu
Namjoon Suh
Qifan Song
Guang Cheng
215
4
0
01 Feb 2024
Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model
Zihan Zhong
Zhiqiang Tang
Tong He
Haoyang Fang
Chun Yuan
229
78
0
31 Jan 2024
SAGD: Boundary-Enhanced Segment Anything in 3D Gaussian via Gaussian Decomposition
Xu Hu
Yuxi Wang
Lue Fan
Junsong Fan
Junran Peng
Zhen Lei
Qing Li
Zhaoxiang Zhang
3DGS
467
22
0
31 Jan 2024
Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Shiyin Dong
Mingrui Zhu
Kun Cheng
Nannan Wang
Xinbo Gao
DiffM
129
4
0
29 Jan 2024
GEM: Boost Simple Network for Glass Surface Segmentation via Segment Anything Model and Data Synthesis
Jing Hao
Moyun Liu
Kuo Feng Hung
DiffM
124
2
0
27 Jan 2024
SAM-based instance segmentation models for the automation of structural damage detection
Advanced Engineering Informatics (AEI), 2024
Zehao Ye
Lucy Lovell
A. Faramarzi
Jelena Ninić
284
32
0
27 Jan 2024
SSR: SAM is a Strong Regularizer for domain adaptive semantic segmentation
IEEE Conference on Artificial Intelligence (CAI), 2024
Yanqi Ge
Ye Huang
Wen Li
Lixin Duan
101
1
0
26 Jan 2024
Rethinking Patch Dependence for Masked Autoencoders
Letian Fu
Long Lian
Renhao Wang
Baifeng Shi
Xudong Wang
Adam Yala
Trevor Darrell
Alexei A. Efros
Ken Goldberg
288
32
0
25 Jan 2024
MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty
European Conference on Computer Vision (ECCV), 2024
Tim Brödermann
David Brüggemann
Daniel Gehrig
Kevin Ta
Odysseas Liagouris
Jason Corkill
Luc Van Gool
256
22
0
23 Jan 2024
EEND-M2F: Masked-attention mask transformers for speaker diarization
Interspeech (Interspeech), 2024
Marc Härkönen
Samuel J. Broughton
Lahiru Samarakoon
250
19
0
23 Jan 2024
IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images
Computer Vision and Pattern Recognition (CVPR), 2024
Zhi-Hao Lin
Jia-Bin Huang
Zhengqin Li
Zhao Dong
Christian Richardt
Tuotuo Li
Michael Zollhöfer
Johannes Kopf
Shenlong Wang
Changil Kim
3DV
340
7
0
23 Jan 2024
Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis
Pattern Recognition (Pattern Recogn.), 2024
Jiawei Wang
Kai Hu
Zhuoyao Zhong
Lei-huan Sun
Qiang Huo
280
12
0
22 Jan 2024
Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Ci-Siang Lin
Chien-Yi Wang
Yu-Chiang Frank Wang
Min-Hung Chen
VLM
715
3
0
22 Jan 2024
S$^3$M-Net: Joint Learning of Semantic Segmentation and Stereo Matching for Autonomous Driving
IEEE Transactions on Intelligent Vehicles (TIV), 2024
Zhiyuan Wu
Yi Feng
Chuangwei Liu
Fisher Yu
Qijun Chen
Rui Fan
256
20
0
21 Jan 2024
Pixel-Wise Recognition for Holistic Surgical Scene Understanding
Nicolás Ayobi
Santiago Rodríguez
Alejandra Pérez
Isabela Hernández
Nicolás Aparicio
...
Sebastián Pena
J. Santander
J. Caicedo
Nicolás Fernández
Pablo Arbelaez
ViT
MedIm
185
31
0
20 Jan 2024
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Lihe Yang
Bingyi Kang
Zilong Huang
Xiaohan Li
Jiashi Feng
Hengshuang Zhao
VLM
624
1,348
0
19 Jan 2024
Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion
Zuoyue Li
Zhenqiang Li
Zhaopeng Cui
Marc Pollefeys
Martin R. Oswald
197
24
0
19 Jan 2024
Symbol as Points: Panoptic Symbol Spotting via Point-based Representation
Wenlong Liu
Tianyu Yang
Yuhan Wang
Qizhi Yu
Lei Zhang
3DPC
196
8
0
19 Jan 2024
OMG-Seg: Is One Model Good Enough For All Segmentation?
Xiangtai Li
Haobo Yuan
Wei Li
Henghui Ding
Size Wu
Wenwei Zhang
Yining Li
Kai Chen
Chen Change Loy
VLM
MLLM
ViT
270
103
0
18 Jan 2024
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
Wouter Van Gansbeke
Bert De Brabandere
DiffM
308
14
0
18 Jan 2024
Supervised Fine-tuning in turn Improves Visual Foundation Models
Xiaohu Jiang
Yixiao Ge
Yuying Ge
Dachuan Shi
Chun Yuan
Ying Shan
VLM
CLIP
198
12
0
18 Jan 2024
Image Translation as Diffusion Visual Programmers
Cheng Han
James Liang
Qifan Wang
Majid Rabbani
S. Dianat
Raghuveer M. Rao
Ying Nian Wu
Dongfang Liu
204
19
0
18 Jan 2024
Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation
Ze-Long Cheng
Kehan Li
Hao Li
Peng Jin
Chang Liu
Xiawu Zheng
Rongrong Ji
Jie Chen
VOS
241
4
0
18 Jan 2024
RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything
Shilin Xu
Haobo Yuan
Qingyu Shi
Lu Qi
Jingbo Wang
...
Kai Chen
Yunhai Tong
Guohao Li
Xiangtai Li
Ming-Hsuan Yang
VLM
86
8
0
18 Jan 2024
Dynamic Relation Transformer for Contextual Text Block Detection
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2024
Jiawei Wang
Shunchi Zhang
Kai Hu
Chixiang Ma
Zhuoyao Zhong
Lei-huan Sun
Qiang Huo
126
1
0
17 Jan 2024
MaskClustering: View Consensus based Mask Graph Clustering for Open-Vocabulary 3D Instance Segmentation
Computer Vision and Pattern Recognition (CVPR), 2024
Mi Yan
Jiazhao Zhang
Yan Zhu
Hongan Wang
3DV
ISeg
255
47
0
15 Jan 2024
MapNeXt: Revisiting Training and Scaling Practices for Online Vectorized HD Map Construction
Toyota Li
173
8
0
14 Jan 2024
Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering
International Conference on 3D Vision (3DV), 2024
Damien Robert
Hugo Raguet
Loic Landrieu
189
27
0
12 Jan 2024
Hyper-STTN: Hypergraph Augmented Spatial-Temporal Transformer Network for Trajectory Prediction
Weizheng Wang
Baijian Yang
Guohua Chen
Byung-Cheol Min
HAI
ViT
257
5
0
12 Jan 2024
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications
Computer Vision and Pattern Recognition (CVPR), 2024
Yuwen Xiong
Zhiqi Li
Yuntao Chen
Feng Wang
Xizhou Zhu
...
Jiaming Song
Yu Qiao
Lewei Lu
Jie Zhou
Jifeng Dai
137
125
0
11 Jan 2024
Distribution-aware Interactive Attention Network and Large-scale Cloud Recognition Benchmark on FY-4A Satellite Image
Jiaqing Zhang
Jie Lei
Weiying Xie
Kai Jiang
Mingxiang Cao
Yunsong Li
119
3
0
06 Jan 2024
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
Haobo Yuan
Xiangtai Li
Chong Zhou
Yining Li
Kai Chen
Chen Change Loy
VLM
249
85
0
05 Jan 2024
ODIN: A Single Model for 2D and 3D Segmentation
Ayush Jain
Pushkal Katara
N. Gkanatsios
Adam W. Harley
Gabriel H. Sarch
Kriti Aggarwal
Vishrav Chaudhary
Katerina Fragkiadaki
3DPC
445
16
0
04 Jan 2024
Towards Robust Semantic Segmentation against Patch-based Attack via Attention Refinement
International Journal of Computer Vision (IJCV), 2024
Zheng Yuan
Jie Zhang
Yude Wang
Shiguang Shan
Xilin Chen
AAML
447
2
0
03 Jan 2024
FullLoRA: Efficiently Boosting the Robustness of Pretrained Vision Transformers
IEEE Transactions on Image Processing (TIP), 2024
Zheng Yuan
Jie Zhang
Shiguang Shan
Xilin Chen
275
4
0
03 Jan 2024
PROMPT-IML: Image Manipulation Localization with Pre-trained Foundation Models Through Prompt Tuning
Xuntao Liu
Yuzhou Yang
Qichao Ying
Zhenxing Qian
Xinpeng Zhang
Sheng Li
VLM
166
4
0
01 Jan 2024
WoodScape Motion Segmentation for Autonomous Driving -- CVPR 2023 OmniCV Workshop Challenge
Saravanabalagi Ramachandran
Nathaniel Cibik
Ganesh Sistu
John L McDonald
343
1
0
31 Dec 2023
SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge
Dimitrios Psychogyios
Emanuele Colleoni
Beatrice van Amsterdam
Chih-Yang Li
Shu-Yu Huang
...
Santiago Rodriguez
Juanita Puentes
Pablo Arbelaez
Omid Mohareri
Danail Stoyanov
155
37
0
31 Dec 2023
Analyzing Local Representations of Self-supervised Vision Transformers
Ani Vanyan
Alvard Barseghyan
Hakob Tamazyan
Vahan Huroyan
Hrant Khachatrian
Martin Danelljan
265
7
0
31 Dec 2023
Generalizing Single-View 3D Shape Retrieval to Occlusions and Unseen Objects
International Conference on 3D Vision (3DV), 2023
Qirui Wu
Daniel E. Ritchie
Manolis Savva
Angel X. Chang
3DPC
181
5
0
31 Dec 2023