PolyMaX: General Dense Prediction with Mask Transformer

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
9 November 2023
Xuan S. Yang
Liangzhe Yuan
Kimberly Wilber
Astuti Sharma
Xiuye Gu
Siyuan Qiao
Stephanie Debats
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Liang-Chieh Chen

Papers citing "PolyMaX: General Dense Prediction with Mask Transformer"

14 / 14 papers shown
DiffPixelFormer: Differential Pixel-Aware Transformer for RGB-D Indoor Scene Segmentation
Yan Gong
J. Lu
Yongsheng Gao
Jie Zhao
X. Zhang
Susanto Rahardja
52
0
0
17 Nov 2025
Neural USD: An object-centric framework for iterative editing and control
Alejandro Escontrela
Shrinu Kushagra
Sjoerd van Steenkiste
Yulia Rubanova
Aleksander Holynski
Kelsey R. Allen
Kevin Murphy
Thomas Kipf
DiffM
108
0
0
28 Oct 2025
Towards In-the-wild 3D Plane Reconstruction from a Single Image
Computer Vision and Pattern Recognition (CVPR), 2025
Jiachen Liu
Jingbo Xia
Sili Chen
Sharon X. Huang
Hengkai Guo
3DV
176
5
0
03 Jun 2025
PDDM: Pseudo Depth Diffusion Model for RGB-PD Semantic Segmentation Based in Complex Indoor Scenes
AAAI Conference on Artificial Intelligence (AAAI), 2025
Xinhua Xu
Hong Liu
Jianbing Wu
Jinfu Liu
DiffM
175
1
0
24 Mar 2025
Enhancing Monocular Depth Estimation with Multi-Source Auxiliary Tasks
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025
Alessio Quercia
Erenus Yildiz
Zhuo Cao
Kai Krajsek
Abigail Morrison
Ira Assent
Hanno Scharr
258
0
0
22 Jan 2025
StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation
Yue Duan
Zhangxuan Gu
ZhenZhe Ying
Changhua Meng
Xuelong Li
283
17
0
02 Aug 2024
GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer
Ding Jia
Jianyuan Guo
Kai Han
Han Wu
Chao Zhang
Chang Xu
Xinghao Chen
ViT
464
48
0
03 Jun 2024
COCONut: Modernizing COCO Segmentation
XueQing Deng
Qihang Yu
Peng Wang
Xiaohui Shen
Liang-Chieh Chen
174
19
0
12 Apr 2024
HAPNet: Toward Superior RGB-Thermal Scene Parsing via Hybrid, Asymmetric, and Progressive Heterogeneous Feature Fusion
Jiahang Li
Peng Yun
Qijun Chen
Rui Fan
213
12
0
04 Apr 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era
Computer Vision and Pattern Recognition (CVPR), 2024
Jieneng Chen
Qihang Yu
Xiaohui Shen
Yaoyao Liu
Liang-Chieh Chen
3DV, VLM
357
47
0
02 Apr 2024
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Mu Hu
Wei Yin
C. Zhang
Yong Deng
Xiaoxiao Long
Kaixuan Wang
Gang Yu
Chunhua Shen
Shaojie Shen
3DGS
476
295
0
22 Mar 2024
Diffusion Models are Efficient Data Generators for Human Mesh Recovery
Yongtao Ge
Wenjia Wang
Yongfan Chen
Fanzhou Wang
Lei Yang
Hao Chen
Chunhua Shen
3DH
399
8
0
17 Mar 2024
EVP: Enhanced Visual Perception using Inverse Multi-Attentive Feature Refinement and Regularized Image-Text Alignment
M. Lavrenyuk
Shariq Farooq Bhat
Matthias Müller
Peter Wonka
ObjD, MDE
200
13
0
13 Dec 2023
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Hanrong Ye
Dan Xu
ViT
201
23
0
08 Jun 2023