Cross-aware Early Fusion with Stage-divided Vision and Language Transformer Encoders for Referring Image Segmentation

IEEE Transactions on Multimedia (IEEE TMM), 2024

14 August 2024

Yubin Cho

Hyunwoo Yu

Suk-Ju Kang

Papers citing "Cross-aware Early Fusion with Stage-divided Vision and Language Transformer Encoders for Referring Image Segmentation"

Integrating Visual and X-Ray Machine Learning Features in the Study of Paintings by Goya

Hassan Ugail

Ismail Lujain Jaleel

110

02 Nov 2025

Latent Expression Generation for Referring Image Segmentation and Grounding

306

07 Aug 2025

Multimodal Referring Segmentation: A Survey

521

01 Aug 2025

RemoteSAM: Towards Segment Anything for Earth Observation

871

23 May 2025

BiPVL-Seg: Bidirectional Progressive Vision-Language Fusion with Global-Local Alignment for Medical Image Segmentation

292

30 Mar 2025

RSRefSeg: Referring Remote Sensing Image Segmentation with Foundation Models

340

12 Jan 2025

Cross-Modal Bidirectional Interaction Model for Referring Remote Sensing Image Segmentation

530

11 Oct 2024

Exploring Fine-Grained Image-Text Alignment for Referring Remote Sensing Image Segmentation

IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2024

Sen Lei

350

20 Sep 2024

Depth-Weighted Detection of Behaviours of Risk in People with Dementia using Cameras

318

28 Aug 2024

MetaSeg: MetaFormer-based Global Contexts-aware Network for Efficient Semantic Segmentation

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

322

14 Aug 2024

Embedding-Free Transformer with Inference Spatial Reduction for Efficient Semantic Segmentation

299

24 Jul 2024

Pseudo-RIS: Distinctive Pseudo-supervision Generation for Referring Image Segmentation

Seonghoon Yu

Paul Hongsuck Seo

Jeany Son

DiffM

479

10 Jul 2024

Bring Adaptive Binding Prototypes to Generalized Referring Expression Segmentation

602

24 May 2024

Fuse & Calibrate: A bi-directional Vision-Language Guided Framework for Referring Image Segmentation

Jing Liu

309

18 May 2024

TT-BLIP: Enhancing Fake News Detection Using BLIP and Tri-Transformer

Eunjee Choi

Jong-Kook Kim

333

19 Mar 2024

EAVL: Explicitly Align Vision and Language for Referring Image Segmentation

379

18 Aug 2023