arXiv: 2203.04838

CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers

9 March 2022
Jiaming Zhang
Huayao Liu
Kailun Yang
Xinxin Hu
Ruiping Liu
Rainer Stiefelhagen
    ViT

Papers citing "CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers"

34 / 34 papers shown
Title
Segment Any RGB-Thermal Model with Language-aided Distillation
Dong Xing
Xianxun Zhu
Wei Zhou
Qika Lin
Hang Yang
Yuqing Wang
VLM
49
0
0
04 May 2025
HDBFormer: Efficient RGB-D Semantic Segmentation with A Heterogeneous Dual-Branch Framework
Shuobin Wei
Zhuang Zhou
Zhengan Lu
Zizhao Yuan
Binghua Su
MDE
34
0
0
18 Apr 2025
MemorySAM: Memorize Modalities and Semantics with Segment Anything Model 2 for Multi-modal Semantic Segmentation
Chenfei Liao
Xu Zheng
Yuanhuiyi Lyu
Haiwei Xue
Yihong Cao
Jiawen Wang
Kailun Yang
Xuming Hu
VLM
45
2
0
09 Mar 2025
Rethinking Early-Fusion Strategies for Improved Multimodal Image Segmentation
Zhengwen Shen
Yulian Li
Han Zhang
Yuchen Weng
Jun Wang
35
0
0
19 Jan 2025
Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation
Yunzhi Zhuge
Hongyu Gu
Lu Zhang
Jinqing Qi
Huchuan Lu
VOS
55
2
0
14 Jan 2025
IRFusionFormer: Enhancing Pavement Crack Segmentation with RGB-T Fusion and Topological-Based Loss
Ruiqiang Xiao
Xiaohu Chen
29
0
0
31 Dec 2024
IV-tuning: Parameter-Efficient Transfer Learning for Infrared-Visible Tasks
Yaming Zhang
Chenqiang Gao
Fangcen Liu
Junjie Guo
Lan Wang
Xinggan Peng
Deyu Meng
83
0
0
21 Dec 2024
Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding
Yi Liu
Chengxin Li
Shoukun Xu
J. Han
ViT
17
1
0
19 Oct 2024
Order-aware Interactive Segmentation
Bin Wang
Anwesa Choudhuri
Meng Zheng
Zhongpai Gao
Benjamin Planche
Andong Deng
Qin Liu
Terrence Chen
Ulas Bagci
Ziyan Wu
VLM
40
1
0
16 Oct 2024
Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes
Jianqi Chen
Panwen Hu
Xiaojun Chang
Z. Shi
Michael C. Kampffmeyer
Xiaodan Liang
38
5
0
14 Oct 2024
IVGF: The Fusion-Guided Infrared and Visible General Framework
Fangcen Liu
Chenqiang Gao
Fang Chen
Pengcheng Li
Junjie Guo
Deyu Meng
19
0
0
02 Sep 2024
CSFNet: A Cosine Similarity Fusion Network for Real-Time RGB-X Semantic Segmentation of Driving Scenes
Danial Qashqai
Emad Mousavian
S. B. Shokouhi
S. Mirzakuchaki
28
0
0
01 Jul 2024
GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization
Y. Chen
X. Huang
Quan Zhang
Wei Li
Mingjian Zhu
...
Hanting Chen
Hailin Hu
J. Yang
W. Liu
Jie Hu
EGVM
41
1
0
24 Jun 2024
OmniBind: Teach to Build Unequal-Scale Modality Interaction for Omni-Bind of All
Yuanhuiyi Lyu
Xueye Zheng
Dahun Kim
Lin Wang
27
10
0
25 May 2024
RABBIT: A Robot-Assisted Bed Bathing System with Multimodal Perception and Integrated Compliance
Rishabh Madan
Skyler Valdez
David Kim
Sujie Fang
Luoyan Zhong
Diego Virtue
T. Bhattacharjee
12
14
0
26 Jan 2024
Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition
Xiao Wang
Yao Rong
Shiao Wang
Yuan Chen
Zhe Wu
Bowei Jiang
Yonghong Tian
Jin Tang
ViT
58
3
0
18 Dec 2023
PolyMaX: General Dense Prediction with Mask Transformer
Xuan S. Yang
Liangzhe Yuan
Kimberly Wilber
Astuti Sharma
Xiuye Gu
...
Stephanie Debats
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Liang-Chieh Chen
18
14
0
09 Nov 2023
OmniVec: Learning robust representations with cross modal sharing
Siddharth Srivastava
Gaurav Sharma
SSL
13
64
0
07 Nov 2023
Impact of Pseudo Depth on Open World Object Segmentation with Minimal User Guidance
Robin Schon
K. Ludwig
Rainer Lienhart
VLM
MDE
14
2
0
12 Apr 2023
Breaking Modality Disparity: Harmonized Representation for Infrared and Visible Image Registration
Zhiying Jiang
Zengxi Zhang
Jinyuan Liu
Xin-Yue Fan
Risheng Liu
9
2
0
12 Apr 2023
DepthFormer: Multimodal Positional Encodings and Cross-Input Attention for Transformer-Based Segmentation Networks
F. Barbato
Giulia Rizzoli
Pietro Zanuttigh
MDE
ViT
18
4
0
08 Nov 2022
HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection
Tim Broedermann
Christos Sakaridis
Dengxin Dai
Luc Van Gool
43
28
0
30 Jun 2022
Semantic Segmentation by Early Region Proxy
Yifan Zhang
Bo Pang
Cewu Lu
ViT
34
23
0
26 Mar 2022
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
Kunchang Li
Yali Wang
Junhao Zhang
Peng Gao
Guanglu Song
Yu Liu
Hongsheng Li
Yu Qiao
ViT
133
360
0
24 Jan 2022
Omnivore: A Single Model for Many Visual Modalities
Rohit Girdhar
Mannat Singh
Nikhil Ravi
Laurens van der Maaten
Armand Joulin
Ishan Misra
209
222
0
20 Jan 2022
CAVER: Cross-Modal View-Mixed Transformer for Bi-Modal Salient Object Detection
Youwei Pang
Xiaoqi Zhao
Lihe Zhang
Huchuan Lu
27
90
0
04 Dec 2021
Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction
Yikai Wang
Fuchun Sun
Wenbing Huang
Fengxiang He
Dacheng Tao
30
17
0
04 Dec 2021
ConvMLP: Hierarchical Convolutional MLPs for Vision
Jiachen Li
Ali Hassani
Steven Walton
Humphrey Shi
31
55
0
09 Sep 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,538
0
24 Feb 2021
Trear: Transformer-based RGB-D Egocentric Action Recognition
Xiangyu Li
Yonghong Hou
Pichao Wang
Zhimin Gao
Mingliang Xu
Wanqing Li
ViT
168
88
0
05 Jan 2021
Boundary-Aware Feature Propagation for Scene Segmentation
Henghui Ding
Xudong Jiang
A. Liu
N. Magnenat-Thalmann
G. Wang
130
253
0
31 Aug 2019
Deep High-Resolution Representation Learning for Visual Recognition
Jingdong Wang
Ke Sun
Tianheng Cheng
Borui Jiang
Chaorui Deng
...
Yadong Mu
Mingkui Tan
Xinggang Wang
Wenyu Liu
Bin Xiao
176
3,480
0
20 Aug 2019
Joint 2D-3D-Semantic Data for Indoor Scene Understanding
Iro Armeni
S. Sax
Amir Zamir
Silvio Savarese
3DV
3DPC
111
864
0
03 Feb 2017
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
Vijay Badrinarayanan
Alex Kendall
R. Cipolla
SSeg
420
15,438
0
02 Nov 2015